III
Klaus J. Kohler
Terminal Intonation Patterns in Single-Accent Utterances of German:
Phonetics, Phonology and Semantics
1. Introduction
1.1 Hypotheses
This contribution deals with Hypotheses (2) and (3) outlined in 2.1.2 and 2.1.3 of Contribution I (Kohler, 1991b), i.e. with the alignment of an F0 peak relative to stressed vowel onset in terminal utterances containing one accent. Section 2 is concerned with the F0 peak positions in sentences that have a unique accent placement because they are made up of just one content word beside several reduced function words. Section 3 looks at F0 peaks in sentences with alternate accent places due to lexical stress oppositions or to different sentence focus. It delimits the stress and intonation functions of F0 peaks and discusses their interactions, also with reference to the perceptual data presented in Contribution VI (Hertrich, 1991b). As this will involve ambiguity between one and two accents, peak sequences will also have to be considered briefly.
1.2 Types of phonological structures for perceptual testing
The investigation is perceptual, aiming at the (phonological) categorization of phonetic F0 peak shift continua across a number of different syllable structures (long vs. short vowel, syllable-initial lateral vs. glide vs. voiced fricative, glottal stop (creaky voice), post-nuclear voiced vs. voiceless consonant), with reference to Contribution IV (Hertrich, 1991a), as well as two potential accent positions in words (prefix or stem stress) and sentences (subject or verb focus).
1.3 Stimulus generation¹
In all cases, several naturally produced tokens of the particular sentence type under scrutiny were recorded on analogue tape (Revox A77, 19 cm/s) by the same male speaker (KK, the author) under studio conditions in the Kiel Phonetics Institute. Although a medial peak position was to be the basis for stimulus manipulation in most experiments (but see 3.1 for the choice of an early peak as well), early and late peaks were also collected of each linguistic item to specify the ranges of F0 peaks from early to late that would have to be covered by the test series, and in order to provide information about the shapes of the different peaks to be taken into account in the synthesis. The recorded data were checked auditorily for successful rendering of the intended phonetic structures, and, after A/D conversion (10 kHz, 5 kHz low-pass filter), the acceptable tokens were processed on a Data General Eclipse S230 computer with the Kiel Phonetics Institute SSP programme package (as regards the pitch algorithm, see Schäfer-Vincent, 1982, 1983). Obvious F0 analysis errors (octave jumps, missing F0 values in spite of clear periodicity in the signal) were corrected manually.
¹ The stimuli for 2.1.1.2-5 and 3.1-2 were generated by Michael Weinhold.
Then one token containing an auditorily classified medial (or early) peak was selected and its peak contour shifted along the time axis to the left and to the right separately in a number of steps of fixed duration, determined for each utterance, to create new F0 versions. The transposition was effected either as a parallel shift of both branches of the peak contour, or the falling branch was time-expanded in left shifts as far as the original right-hand base point, to approximate natural productions by a less steep descent and to avoid a too long low F0 stretch in the LPC synthesis. The two types of left shift do not alter the basic categorical characteristics of medial to early peak changes; the parallel transposition of the whole peak configuration simply sounds more final than the one with the flattened fall. After the shift, the tail contour was joined to the new peak position by expansion or compression, and similarly the immediate precursor, and finally F0 was masked in voiceless stretches.
Fig. 1 illustrates the principles of generating F0 peak shift versions. The original utterances were then synthesized with the LPC analysis values and the new F0 versions obtained through the peak shift parameter manipulation.
1.4 Perception experiments
Two types of discrimination and of identification tests were performed:
(1) A quick serial discrimination test, in which listeners were presented with the ordered series of peak shift stimuli from left to right or right to left and asked four questions on prepared answer sheets; for each question they heard the series at least once.
(a) Do you perceive any changes in the melody of the sentence from one stimulus to the next?
No - one change - several changes.
Fig. 1
(a) Speech wave and fundamental frequency (linear scale) of a medial peak in the naturally produced utterance "Sie hat ja gelogen." ("She's been lying."). The end contour (on the syllable [gŋ]) was added by F0 parameter manipulation because the analysis did not provide it. The time marks A1, A2 delimit the F0 peak contour (coinciding approximately with /o:/), which was shifted left and right.
(b) The left- and right-most positions of the shifted F0 peak contour on the same time scale as in (a), approximating the natural productions of early and late peaks, respectively.
(b) At which stimulus in the series has the first change occurred? Encircle the relevant number.
(c) At which stimuli in the series have further changes occurred? Encircle the relevant numbers.
(d) What are the meanings of the original utterance and of utterances representing the first and further changes in the series?
The test tape construction had the following format:
    200-ms bleep
    800-ms pause
    stimulus 1 (or n)
    3-s pause
    stimulus 2 (or n - 1)
    3-s pause
    ...
    stimulus n (or 1).
(2) A formal randomized AX or XA discrimination test, in which all the pairs of one or two-step differences, as well as the pairs of identical stimuli (restricted to uneven rank to limit the test size), from the ordered peak shift series were presented for 'same/different' judgements on prepared answer sheets. Two test tapes were compiled, one for the ascending and one for the descending order of arrangement of stimuli within the pairs, and each containing a randomization of 2 repetitions of all the different as well as the identical stimulus pairs, with the following general format:
    200-ms bleep
    800-ms pause
    stimulus A (or X)
    2-s pause
    stimulus X (or A)
    4-s pause
and so on for all the stimulus pairs. After each block of 10 test items a further 500-ms bleep was added for orientation.
(3) A natural stimuli identification test, in which three different naturally spoken contexts were paired with sentences containing each of three naturally produced peak positions - early, medial, late - for subjects to judge, on prepared answer sheets, whether the context and (the melody of) the test item matched or not. The test tapes were compiled in a short version of 9 items (3 contexts x 3 peaks) and a long one of 90 items, with 10 repetitions of each of the 9 items. In each tape the stimuli were randomized and followed the same format as in the randomized discrimination test, with the only difference that the pause between context and test stimulus was 0.5 s.
(4) A synthesized stimuli identification test, in which one synthesized context sentence was paired with each stimulus from an early to medial F0 peak shift series, for subjects to judge, on prepared answer sheets, whether context and test item matched or not. The test tape contained a randomization of 10 repetitions of each context and test stimulus combination, following the same format as in the natural stimuli identification test.
The test files were compiled on the computer and output on analogue tape. The listening tests took place in the acoustically treated studio of the Kiel Phonetics Institute (except those in 2.1.2 and 2.1.3; see the separate descriptions there). The stimuli were presented via loudspeaker to variably sized groups of up to 8 persons, who were students of a variety of subjects including phonetics/linguistics/languages, as well as members of academic and technical staff, and "naive" outsiders, all with German of a northern variety as their native language (except for 2.1.2 and 2.1.3; see the separate descriptions there).²
² Michael Weinhold put together the test tapes, carried out the tests, and compiled the data, for 2.1.1.2-5 and 3.1-2.
1.5 Interactive perceptual testing at the computer
The development of an intonation model for German and its RULSYS TTS implementation (see Contributions I and VII; Kohler, 1991b, d) have made it possible to check the perceptual relevance of certain changes in F0 configurations very quickly by generating parametric displays and acoustic output from orthographic input (supplemented by additional symbolic markers, such as @ZZ for early or @ZZZ for late peaks) and by modifying the acoustic output interactively through systematic parameter changes in the graphic representation. This can be achieved in two ways:
(a) In a graphic display of the type illustrated in Fig. 2, F0 points are moved, inserted, deleted, or changed in value, and the speech signal is regenerated with the new parameter specification for auditory evaluation, also stored for auditory comparison with the original.
(b)
A pitch configuration is defined by the use of the free variables X and Y (for time and frequency) as, for example, in the rule
    00.01: <VOK,FSTRESS,TERMIN> → <TF0=TF0+(X-100)/2.5,T2F0=T2F0+(X-100)/2.5,
           T3F0=T3F0+(X-100)/2.5,2F0=Y>,
which means that a (medial) peak pattern <TERMIN> associated with an accented vowel <VOK,FSTRESS> and defined by three F0 points with the time values TF0, T2F0 and T3F0 is to be displaced in time by adding or subtracting the same variable time value X, and/or vertically expanded or compressed by varying the frequency value of the centre F0 point (2F0). An orthographic input is then processed by the system up to this rule, when an X-Y plane as shown in Fig. 3 appears on the screen, representing 250 time frames of 10 ms along the horizontal and 250 units of 1 Hz along the vertical. A cursor can now be moved, e.g. in 5-unit steps, to feed the variables X and Y in rule 00.01 with new values for further processing. In rule 00.01, the additive constant of -100 resets the zero point, and the factor 1/2.5 rescales the temporal step size from 5 x 10 ms to 5 x 10/2.5 ms = 20 ms, allowing parallel shifts of all F0 points by 20 ms with one cursor step along the horizontal to the right and to the left from the (medial) zero position. The peak pattern can thus be continually shifted along the time scale and the auditory consequences tested in a quick succession from stimulus to stimulus of the same sentence type. Similar changes can be made in the frequency axis.
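The arithmetic of rule 00.01 can be traced in a few lines. The following is only a sketch of the time displacement described in the text, not of RULSYS itself: it assumes that the time values of the F0 points are counted in the same 10-ms frames as the cursor variable X, and the function names and the example peak are hypothetical.

```python
FRAME_MS = 10  # one time frame on the horizontal axis lasts 10 ms (from the text)


def time_offset_frames(x):
    """Offset that rule 00.01 adds to TF0, T2F0 and T3F0 for cursor value X.

    The constant -100 puts the zero point at X = 100; the factor 1/2.5
    rescales a 5-unit cursor step to 2 frames (assumed unit: 10-ms frames).
    """
    return (x - 100) / 2.5


def shift_peak(time_points, x):
    """Displace all F0 time points in parallel by the same offset."""
    delta = time_offset_frames(x)
    return [t + delta for t in time_points]


if __name__ == "__main__":
    peak = [40, 50, 60]  # hypothetical TF0, T2F0, T3F0 in frames
    for x in (90, 95, 100, 105, 110):
        delta_ms = time_offset_frames(x) * FRAME_MS
        print(f"X = {x:3d}: shift = {delta_ms:+.0f} ms, points = {shift_peak(peak, x)}")
```

Under these assumptions one 5-unit cursor step to the right of X = 100 moves the whole peak by 20 ms, and one step to the left moves it back by the same amount, which is the step size stated in the text.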
Both procedures (a) and (b) are very effective for quick hypothesis testing and quick checking of points left open by the more elaborate perception experiments, and have been used a good deal in the Kiel Intonation Project to confirm and expand formal test results as well as to prepare the ground for new hypotheses and their evaluation in group listening tests.
Fig. 2
RULSYS development system output of the symbolic input "Sie hat ja gelogen @ZZ." with an early F0 peak. F0 (in Hz; square parameter and cosine interpolation between defined F0 points) and phonetic transcription aligned to the time scale (segment and cumulative durations in cs); cursor positioned on the peak value; E0 = ə.
Fig. 3
X-Y plane for providing variables, defined in a TTS rule (e.g. time and frequency), with new values by moving a cursor along the horizontal and/or the vertical axis.
2. F0 peak alignment
2.1 Phonetics and phonology
2.1.1 Kiel experiments on German
The first question to be asked with regard to F0 peak alignment is as to how the acoustic continuum of F0 maximum value position from early (well before the onset of the stressed vowel with which it is associated) to medial (around the stressed vowel centre) to late (at the end of the stressed vowel) is partitioned perceptually. Is the continual change of the temporal relation of the F0 maximum to stressed vowel onset correlated with a gradual perceptual change, or are there categorical breaks corresponding to phonological switches, and how many of these have to be recognized? The second question, which is closely linked with the first one, relates to whether the perceptual organization of the physical continuum is dependent on the segmental structure of the stressed syllable, in particular the duration of the stressed vowel, the clear acoustic segmentability of stressed syllable initial consonants (laterals or fricatives vs. glides or creaky onset) and the presence of post-vocalic voicing. To find answers to these questions peak shift series were created for the following five utterances:
(1) "Sie hat ja gelogen." [zi hat ja gəˈlo:gŋ] ("She's been lying.")
(2) "Es ist ja gelungen." [ɛs ɪst ja gəˈlʊŋən] ("It has worked.")
(3) "Sie hat ja gejodelt." [zi hat ja gəˈjo:dəlt] ("She's been yodelling.")
(4) "Sie muß wohl arbeiten." [zi mʊs vol ˈʔaɐbaɪtn] ("She will have to work.")
(5) "Er ist ja geritten." [ɛɐ ɪst ja gəˈʁɪtn] ("He's been riding.")
2.1.1.1 "Sie hat ja gelogen."
Taking the medial F0 peak position of the original utterance in Fig. 1 as a point of departure, the contour A1A2 was moved along the time axis in 6 equal steps of 30 ms each to the left and 4 corresponding steps to the right. In the transposition to the right, both branches were moved in parallel; in the one to the left, only the rising branch was, the falling one being expanded between the new maximum position and the original right base point. A series with complete parallel shift to the left was generated as well, but the LPC synthesis quality was inferior due to the long low-level F0, sounding rather "metallic", although the pitch pattern was not unnatural, conveying the meaning of greater finality in the statement and of less room for argument. Moreover, the natural productions of early peaks in this sentence showed the same flattened F0 descent. As informal listening did not suggest a different behaviour with regard to the perceptual assessment of shifts in the peak position in the two series, the one with the adjusted falling branch was chosen for the listening experiments.
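The construction of the 11-stimulus series can be illustrated with a small sketch. It only models the timing of three contour points (start of the rise, F0 maximum, right base point); the step size and step counts come from the text, whereas the function names and the base times of 400, 500 and 600 ms are hypothetical values chosen for illustration.

```python
STEP_MS = 30   # step size given in the text
N_LEFT = 6     # steps to the left of the original medial peak
N_RIGHT = 4    # steps to the right


def peak_offsets():
    """Offsets in ms for stimuli 1..11; stimulus 7 is the unshifted original."""
    return [STEP_MS * k for k in range(-N_LEFT, N_RIGHT + 1)]


def shift_contour(rise_start, peak, fall_end, offset_ms):
    """Return shifted (rise_start, peak, fall_end) times in ms.

    Right shifts move both branches in parallel. Left shifts move only the
    rising branch; the fall is stretched back to the original right base
    point, which gives the flattened descent described in the text.
    """
    if offset_ms >= 0:
        return (rise_start + offset_ms, peak + offset_ms, fall_end + offset_ms)
    return (rise_start + offset_ms, peak + offset_ms, fall_end)


if __name__ == "__main__":
    base = (400, 500, 600)  # hypothetical contour times in ms
    for number, offset in enumerate(peak_offsets(), start=1):
        print(number, offset, shift_contour(*base, offset))
```

With 6 left steps, the original and 4 right steps, the offsets run from -180 ms to +120 ms, so that stimulus 1 is the left-most and stimulus 11 the right-most position.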
2.1.1.1.1 Discrimination tests
The 11 stimuli entered into both discrimination tests (1) and (2) of 1.4 in the ascending as well as the descending order.
Results
Table I presents the responses by 60 listeners in the left-right sequence of the serial discrimination test, Table II the responses by 33 listeners in the right-left peak sequence.
Table I
Frequency distribution of 'change has occurred' responses by 60 listeners in the left-right sequence of the serial discrimination test across the 11 stimuli with F0 peak shifts in "Sie hat ja gelogen." (1 = left-most, 11 = right-most position)

Stimulus                     3    4    5    6    7    8    9   10   11
First change perceived       1    4   39   16
Further changes perceived              1    5   11   15   21   22   11
Total                        1    4   40   21   11   15   21   22   11
Table II
Frequency distribution of 'change has occurred' responses by 33 listeners in the right-left sequence of the serial discrimination test across the 11 stimuli with F0 peak shifts in "Sie hat ja gelogen." (1 = left-most, 11 = right-most position)
Stimulus
10
9
8
7
6
5
4
3
2
1
F i r s t change
perceived
5
7
13
4
3
1
1
4
2
6
16
6
10
5
1
1
11
15
10
19
7
10
5
1
F u r t h e r changes
perceived
Total
5
The randomized paired discrimination test in the ascending ordering was carried out with a group of 39 subjects, in the descending ordering with a different group of 34 subjects; each of the two tests contained the pairings of the identical stimuli at the uneven rank numbers in the series of 11 test items described in 2.1.1.1 and ordered from left-most to right-most F0 peak position. Fig. 4 shows the results.
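The composition of the two paired test tapes follows from the description in 1.4 (2) and can be sketched as below. The pair types, the restriction of identical pairs to uneven ranks, the 2 repetitions and the subject numbers are taken from the text; the function names, the seed and the resulting tape length are assumptions of this sketch and are not stated by the author.

```python
import random


def build_pairs(n_stimuli=11, ascending=True):
    """All 1-step and 2-step pairs plus identical pairs at uneven ranks."""
    pairs = []
    for step in (1, 2):
        for low in range(1, n_stimuli - step + 1):
            high = low + step
            pairs.append((low, high) if ascending else (high, low))
    # Identical pairings are restricted to stimuli 1, 3, 5, ... to limit test size.
    pairs += [(s, s) for s in range(1, n_stimuli + 1, 2)]
    return pairs


def build_tape(pairs, repetitions=2, seed=0):
    """Randomized order of all pairs, each occurring `repetitions` times."""
    items = list(pairs) * repetitions
    random.Random(seed).shuffle(items)  # the seed is arbitrary, for reproducibility
    return items


def judgements_per_point(n_subjects, repetitions=2):
    """Number of responses behind one data point of a discrimination function."""
    return n_subjects * repetitions


if __name__ == "__main__":
    ascending = build_pairs(ascending=True)
    print(len(ascending), "pair types,", len(build_tape(ascending)), "items per tape")
    print(judgements_per_point(39), judgements_per_point(34), judgements_per_point(39 + 34))
```

On this reading each tape holds 10 one-step, 9 two-step and 6 identical pair types. The response counts per data point come out as 78 and 68 for the two orderings and 146 for the identical pairs, which occur on both tapes; these are the n values quoted with Fig. 4.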
Fig. 4
Discrimination functions in the randomized paired discrimination test, showing percentage of 'different' judgements for utterance pairs of "Sie hat ja gelogen." with 0-step (a), 1-step (b), or 2-step (c) distances of F0 peak positions, in the ordering left-right (continuous line) or right-left (broken line). The stimulus number refers to the second stimulus in the ascending and to the first stimulus in the descending order. 73 sbs., n = 146 at each data point (a); 39 sbs., n = 78 in the left-to-right, 34 sbs., n = 68 in the right-to-left ordering of (b) and (c).
Discussion
Both types of test converge in demonstrating a major and a minor peak in the discrimination function - around stimuli 5/6 and 9/10, respectively, but also a strong order effect. On the one hand, discrimination is sharpest, and equally so in both orderings of different stimuli, if the 1-step distance is located between stimuli 5 and 6, or, correspondingly, the 2-step distance between stimuli 5 and 7 (i.e. for the pairs 5 - 6, 6 - 5, 5 - 7 and 7 - 5); on the other hand, the differentiation weakens if the distance is located at a lower position in the series for the descending sequence (5 - 4, 5 - 3) or at a higher position for the ascending one (6 - 7, 7 - 8, 6 - 8, 7 - 9). Stimulus 5 is highly discriminated if it comes second or is spanned in the pair, i.e. in 4 - 5, 3 - 5, 4 - 6, and this even occurs by way of 'false alarms' in identical pairings of stimulus 5.
So the question arises as to what there is in the signal that might mark stimulus 5 as different from all the others. Fig. 5 shows the positions of the F0 peaks in stimuli 4 and 5 in relation to the speech wave. Stimulus 5 is the first one in the series of 11 from left to right, where the F0 contour enters the accented vowel /o:/ on a rising slope; in all the preceding stimuli in the series, F0 falls throughout the vowel. In stimulus 5 the increase of acoustic energy in the transition from the consonant /l/ to the vowel /o:/ is thus coupled with a rising F0, the rising slope of the peak contour across /gəlo:/ being intensified over its final 30 ms. In stimulus 4 this does not happen, but a fall is intensified instead. As the peak is moved further to the right, the F0 rise becomes progressively more extensive over a progressively longer increase in acoustic energy up to the middle of the vowel, i.e. to the F0 peak position in stimulus 7, which coincides with the original production. In this continuum, distinctivity between successive stimuli will drop, if the increase in the F0 rise has reached perceptual saturation. This seems to happen after stimulus 6.
A further shift of the F0 peak to the right beyond stimulus 7 results in an increasing low F0 stretch (see Fig. 1), which receives the intensification, whereas, at the same time, the end of the rise is linked with a decrease of acoustic energy. When both parameter changes are large enough, successive stimuli have their distinctivity raised again. This seems to happen around stimuli 9 and 10 in the ascending order, but is obviously a much weaker effect than the change from falling to rising F0 in the stressed vowel, producing much lower peaks in the response functions.
Fig. 5
F0 peaks in stimuli 4 and 5 of the series of 11 "Sie hat ja gelogen." from left-most to right-most position, in relation to the speech wave. The vertical lines mark the F0 maximum.
These results suggest that there is a maximum of sensitivity in the peak shift continuum in the area of stimuli 5/6. So any pairings within or progressing towards this area are discriminated best, viz. 4 - 5, 5 - 6, 6 - 5, 7 - 6 (and even 8 - 7); 3 - 5, 4 - 6, 5 - 7, 7 - 5, 8 - 6, but not 5 - 4, 6 - 7, 7 - 8, 5 - 3, 6 - 8, where the progression is away from the area of high sensitivity. A second, weaker sensitivity peak is located at stimuli 9/10, but does not surface in the response functions for the descending order, because of the displacement to the right of the discrimination curve associated with stimuli 5/6. The acoustic continuum is thus perceptually partitioned into two clearly delimited sections with the boundary occurring between stimuli 4 and 6, and this perceptual division coincides with an acoustic change from falling to rising F0 across stressed vowel onset. Around the boundary between these two sections, discrimination is sharpest, and, as will be seen in 2.1.1.1.2 and 2.2, the two perceptually determined sections of the acoustic continuum correspond to two intonational categories related to a semantic differentiation between 'established' and 'new'.
So it appears that we are dealing here with an example of 'categorical perception' (see Repp, 1984), in the domain of pitch this time (Kohler, 1987a). The data point to an abrupt perceptual change when in the acoustic continuum the F0 peak is moved into the vowel of the stressed syllable. A further F0 peak shift along the acoustic continuum results in a more gradual auditory change, with a minor sensitivity maximum at a point where the initial stretch of low-level F0 and the final weakening of the rise-fall in the stressed vowel become large enough. The data thus support Hypothesis (2) (see Contribution I; Kohler, 1991b) as far as the abrupt vs. gradual changes in perception are concerned. This means that an early F0 peak must constitute a phonological category of German intonation, contrasting with a medial peak, whereas a late peak is less clearly separated, although the perceptual results may turn out to be different if in accordance with natural production the F0 peak shift to late positions were accompanied by a similar shift of the acoustic energy maximum to the right (whereas in the stimulus manipulation the energy profile of the original medial peak utterance, synchronized with F0 on the vowel centre, was used). The minor sensitivity maximum in the response function could then easily be boosted (see 2.1.1.5).
In 2.1.1.1.2 and 2.2, further support will be given to the organization of the semantic functions in parallel with the perceptual and phonological structuring of F0 peak alignment.
2.1.1.1.2 Identification tests
On the basis of the discrimination test results and of hypotheses concerning the semantics of early, medial and late F0 peaks, three contexts were constructed:
(1) "Wer einmal lügt, dem glaubt man nicht, auch wenn er gleich die Wahrheit spricht. Das gilt auch für Anna." ("Once a liar, always a liar. This also applies to Anne.")
This context sets the frame for an established fact and the summing up of an argument, which is brought to a close.
(2) "Jetzt versteh' ich das erst." ("Now I understand.")
This context presents a new fact and opens up a new argument.
(3) "Oh!"
This context introduces emphatic surprise.
Each of these contexts was spoken naturally and paired with each of the three naturally produced peaks in the sentence "Sie hat ja gelogen." to form a natural stimuli identification test according to 1.4 (3). Furthermore, a synthesized stimuli identification test (see 1.4 (4)) was performed with pairings of context (2) ("Jetzt") and each one of the first 8 stimuli in the continuum (from left to right) of 2.1.1.1.
Results
Table III and Fig. 6 present the results of the two tests.
Table III
Percentages of 'matching' responses for combinations of 3 contexts and early, medial or late F0 peaks in the sentence "Sie hat ja gelogen." in a natural stimuli identification test. 88 subjects

Context            (1) Wer   (2) Jetzt   (3) Oh
Peak position
early                87.5      27.3        8.0
medial               26.1      70.5       72.7
late                 13.6      67.0       76.1
Fig. 6
Identification function in the synthesized stimuli identification test, showing percentage 'matching' judgements for 8 stimuli "Sie hat ja gelogen." with F0 peak shift from left to right in the context "Jetzt versteh ich das erst." 19 subjects; for each stimulus n = 190.

Discussion
The results of combining the 3 contexts and 3 F0 peak positions show that subjects are able to make systematic judgements because the responses are significantly different from chance, being either more than 66% or less than 30% in favour of 'matching'. This means that the different F0 peak positions must be perceptually identifiable, and since in all cases the identification of an early versus a non-early peak is far more clearly differentiated than that of a medial versus a late one, this identification test reproduces the categorization of the discrimination tests. It is only in the "Wer" context that the medial vs. late F0 peaks yield a significant difference in the response patterns (χ² = 4.31, p = .05). Contrariwise, the early pattern fits least into the "Oh" context (difference between "Jetzt" and "Oh" contexts χ² = 31.07, p = .001).
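The size of the χ² value for the medial vs. late comparison in the "Wer" context can be approximated from Table III. The sketch below rests on assumptions that the text does not state: that each percentage corresponds to one judgement from each of the 88 subjects, and that an uncorrected Pearson χ² on the resulting 2 x 2 table of 'matching' vs. 'non-matching' counts was used. The function names are hypothetical, and the sketch is applied to this one comparison only.

```python
def matching_counts(percent, n_subjects=88):
    """Convert a 'matching' percentage into (matching, non-matching) counts."""
    matching = round(percent / 100 * n_subjects)
    return matching, n_subjects - matching


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square without continuity correction for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


if __name__ == "__main__":
    medial = matching_counts(26.1)  # "Wer" context, medial peak
    late = matching_counts(13.6)    # "Wer" context, late peak
    print(medial, late, chi_square_2x2(*medial, *late))
```

Under these assumptions the counts are 23 vs. 12 'matching' responses, and the statistic comes out at about 4.32 with one degree of freedom, close to the reported 4.31.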
The contextualization of the early to medial F0 peak continuum with the "Jetzt" introduction (Fig. 6) shows an abrupt change from 'matching' to 'non-matching' judgements in spite of the gradual change along the physical dimension, and thus adds support to the assumption of categorical perception advanced in connection with the discrimination tests. Stimuli 1 - 4 represent one perceptual identification category, stimuli 6 - 8 a different one. They may be regarded as two phonological categories, viz. 'early' and 'medial' F0 peaks. The discrimination of stimuli is sharpest between these identification categories, which is precisely what the theory of categorical perception postulates.
2.1.1.1.3 "Sie hat gelogen."
As the objection was raised that the responses in the natural stimuli identification test of 2.1.1.1.2 might have been influenced by the modal particle "ja" ("after all"; "I see") predetermining the judgement, a new set of 9 context-peak combinations was generated by excising the signal portions corresponding to "ja" from the existing ones used in the test of 2.1.1.1.2. This splicing was easy to perform because the word was bounded by silence (= voiceless occlusions in [t] and [g]). Then two long versions of the natural stimuli identification test according to 1.4 (3) were generated: one with the stimuli "Sie hat ja gelogen." and one with "Sie hat gelogen."
These two tests were run at one week's interval with two groups of subjects in the following sequence: Group I (17 subjects) did the test with the "ja" stimuli first, the other test second; for Group II (7 subjects) the order was reversed. Table IV presents the results.
Table IV
Percentages of 'matching' responses for combinations of 3 contexts and early, medial or late F0 peaks in the sentences "Sie hat (ja) gelogen." in a natural stimuli identification test with 10 repetitions and two groups of subjects (I: 17 sbs; II: 7 sbs); A with, B without "ja"

                  (1) Wer         (2) Jetzt        (3) Oh
                  I      II       I      II       I      II
early     A     82.9   81.4     31.8   11.4     19.4   15.7
          B     86.4   85.7     39.4   38.6     20.6   18.6
medial    A     41.2   28.6     84.1   94.3     65.9   92.9
          B     48.2   57.1     80.0   90.0     64.7   84.3
late      A     25.3   22.9     81.8   98.6     78.2   97.1
          B     31.2   37.1     85.3   87.1     87.6   90.0
As in 2.1.1.1.2 (see Table III), the responses to the "ja" stimuli (= A) are in all cases either clearly positive or negative, and significantly different from equal distribution. Again the 'medial' and 'late' peaks produce more similar judgement patterns than the 'medial' and 'early' ones, and they are only significantly different for Group I in the "Wer" context (χ² = 9.66, p = .01) and in the "Oh" context (χ² = 6.44, p = .05). The strong distinction between 'early' and 'medial' and the much weaker differentiation for 'medial' and 'late' has thus been confirmed. This finding once more supports the hypothesis of a categorical switch from 'early' to 'medial' and a gradual change from 'medial' to 'late' peak positions in the utterance "Sie hat ja gelogen."
For the stimuli without "ja", in principle the same data were obtained. Of the 18 comparisons of the results for utterances with/without "ja" only four are statistically significant according to χ² tests, the first one in Group I, the others in Group II:
(a) the 'late' peak in the "Oh" context; χ² = 5.32, p = .05,
(b) the 'medial' peak in the "Wer" context; χ² = 11.67, p = .001,
(c) the 'early' peak in the "Jetzt" context; χ² = 13.75, p = .001,
(d) the 'late' peak in the "Jetzt" context; χ² = 6.89, p = .01.
In (a), (b), (c) the difference implies an increase in 'matching' responses for stimuli without "ja", which is contrary to what would have to be expected if the objection were valid. In the remaining case (d), however, there is a decrease in 'matching' answers for stimuli without "ja", which may be taken as an indication of a strengthening through the modal particle "ja" of the meaning conveyed by intonation. But the results cannot be solely determined by the particle, all the less so since this pairwise testing increases the α error and may thus reject the null hypothesis of no distinction between the two utterance types, although it is correct.
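The four χ² values listed under (a) to (d) can be retraced from the percentages in Table IV. The following is a minimal sketch, not the procedure documented in the paper: it assumes a Pearson χ² on a 2x2 table without continuity correction, and it assumes that each cell rests on subjects x 10 repetitions (170 judgements in Group I, 70 in Group II). The function names are illustrative only.

```python
def pearson_chi2(match_a, n_a, match_b, n_b):
    """Pearson chi-square (1 d.f., no continuity correction) for a 2x2 table
    of 'matching' vs. 'non-matching' judgements in two conditions."""
    a, b = match_a, n_a - match_a
    c, d = match_b, n_b - match_b
    n = n_a + n_b
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def to_count(percent, n):
    """Convert a tabulated percentage back into a number of judgements."""
    return round(percent * n / 100)


# Assumed cell sizes: 17 or 7 subjects, 10 repetitions each.
N_GROUP_I, N_GROUP_II = 170, 70

# (percent with "ja", percent without "ja", judgements per cell), from Table IV.
cases = {
    "a": (78.2, 87.6, N_GROUP_I),   # 'late' peak, "Oh" context, Group I
    "b": (28.6, 57.1, N_GROUP_II),  # 'medial' peak, "Wer" context, Group II
    "c": (11.4, 38.6, N_GROUP_II),  # 'early' peak, "Jetzt" context, Group II
    "d": (98.6, 87.1, N_GROUP_II),  # 'late' peak, "Jetzt" context, Group II
}

chi2 = {
    label: pearson_chi2(to_count(with_ja, n), n, to_count(without_ja, n), n)
    for label, (with_ja, without_ja, n) in cases.items()
}

for label, value in chi2.items():
    print(label, round(value, 2))  # a 5.32, b 11.67, c 13.75, d 6.89
```

Under these assumptions the computed values agree with the reported ones, which suggests that the tests were run on pooled judgement counts rather than on per-subject scores.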
A further objection might be that the order of the two tests had an influence on the results: if the "ja" stimuli are tested first the pattern would also be set for the stimuli without "ja". Group II, for which the order was reversed, should thus produce a significantly smaller number of 'matching' responses for the stimuli without "ja" more frequently than Group I, but the above data do not support this assumption. Moreover, the stimuli without "ja" do not show significant distinctions between Groups I and II, with the one exception of the 'medial' peak in the "Oh" context (χ² = 9.64, p = .01). In view of the possible increase of the α error, we can thus say that the test order did not have a significant influence on the response patterns, which are basically determined by an intonational phonology, i.e. by 'early' vs. 'non-early' F0 peak positions - less strongly by 'medial' vs. 'late' ones -, and which may be heightened, but not replaced by, other formal means, such as modal particles.
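The remark that pairwise testing increases the α error can be made concrete with a small calculation. The sketch below is an illustration, not part of the original analysis: it assumes 18 independent tests at a nominal level of .05, which is a simplification because the same listeners contribute to all comparisons.

```python
def familywise_error(alpha, n_tests):
    """Probability of at least one false rejection among n_tests
    independent tests, each run at level alpha."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_level(alpha, n_tests):
    """Per-test level that keeps the family-wise error at or below alpha."""
    return alpha / n_tests


N_COMPARISONS = 18  # with/without "ja" comparisons mentioned in the text
ALPHA = 0.05

fwer = familywise_error(ALPHA, N_COMPARISONS)
per_test = bonferroni_level(ALPHA, N_COMPARISONS)
expected_false_rejections = ALPHA * N_COMPARISONS

print(f"family-wise error: {fwer:.3f}")           # 0.603
print(f"Bonferroni per-test level: {per_test:.4f}")  # 0.0028
print(f"expected false rejections: {expected_false_rejections:.1f}")  # 0.9
```

On this reading, roughly one of the 18 comparisons would be expected to reach p = .05 by chance alone, which is in line with the cautious interpretation of the isolated significant differences given above.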
2.1.1.2 "Es ist ja gelungen."
The question now arises as to whether the perceptually relevant differences between peak timing positions relative to stressed vowel onset are transferable to other syllable structures and in what ways they may have to be adjusted. The first different syllable structure selected was the one containing a short vowel, instead of a long one, in an otherwise phonologically comparable segment chain: "Es ist ja gelungen." (see 2.1.1). Fig. 7 shows the speech wave as well as the energy and F0 contours in the natural medial-peak token selected for F0 peak shift. The test stimulus generation followed the procedure of parallel shifts of both branches of the peak contour (see 1.3). The step size was 30 ms, and one peak was located at the boundary between the stressed-syllable initial consonant /l/ and the stressed vowel /u/. Fig. 8 shows the 9 different peak positions used for the stimulus generation. Only the quick serial discrimination test (see 1.4 (1)) was performed in the left-right sequence with 29 subjects.
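The stimulus grid described here (9 peak positions in 30 ms steps, one of them on the consonant/vowel boundary) can be written out as time offsets relative to stressed vowel onset. This is a sketch under one assumption: the text does not say which stimulus sits on the boundary, and position 4 is inferred from the Discussion of this section, which places positions 5 and 6 at 30 ms and 60 ms after vowel onset. The function name is hypothetical.

```python
def peak_offsets_ms(n_positions, step_ms, anchor_index):
    """Offset of each F0 peak position from stressed vowel onset in ms.
    Negative values lie before vowel onset, positive values after it."""
    return {
        index: (index - anchor_index) * step_ms
        for index in range(1, n_positions + 1)
    }


# anchor_index=4 is an inference, not a value stated in the text.
offsets = peak_offsets_ms(n_positions=9, step_ms=30, anchor_index=4)

for index, offset in offsets.items():
    print(f"stimulus {index}: {offset:+d} ms")
```

With this anchoring the continuum runs from 90 ms before to 150 ms after vowel onset; a different anchor would shift all values by a multiple of the step size.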
Results
Table V presents the results.
Table V
Distribution of 'change has occurred' responses by 29 listeners in the left-right sequence of the serial discrimination test across the 9 stimuli with F0 peak shifts in "Es ist ja gelungen." (1 = left-most, 9 = right-most position)
Stimulus
F i r s t change
perceived a t
4
5
6
5
19
5
21
10
F u r t h e r changes
perceived at
Total
Fig. 7
Speech wave, energy and F0 contours (linear scale) in the natural medial-peak token of "Es ist ja gelungen." selected for F0 peak shift. The time marks indicate on- and offsets of /g/, /ə/, /l/ and /u/. The broken lines mark the left and right base points as well as the maximum of the peak configuration to be shifted.
Discussion
There is again an abrupt change in the response pattern as the F0 peak is moved into the stressed vowel. The absolute timing of positions 5 and 6 after vowel onset, i.e. 30 ms and 60 ms, respectively, is exactly the same as in the stimuli "Sie hat ja gelogen." (see 2.1.1.1.1). These data point to an absolute time span of up to 60 ms into the stressed vowel that is responsible for a phonological change from 'early' to 'medial' peak, independent of the phonological vowel quantity and consequently of vowel duration following the F0 peak, at least in disyllables. This finding means that the 'medial' F0 peak has a later relative position in a short vowel than in a long one, viz. closer to its offset, and this ties in with the production and perception data in Contribution II (Gartenberg & Panzlaff-Reuter, 1991, 5.2). This means, furthermore, that the series of 9 stimuli did not include a proper 'late' peak: it would have had to be located well into the unstressed vowel /ə/.

Fig. 8
Speech wave and F0 contour (linear scale) in "Es ist ja gelungen." with time marks indicating the 9 F0 positions for complete-contour shift from left to right.
2.1.1.3 "Sie hat ja gejodelt."
The next syllable structure to be considered contains a long stressed vowel /o:/, as in "Sie hat ja gelogen.", but a glide /j/ with a much more gradual articulatory/acoustic transition into the stressed vowel, instead of the more abrupt lateral /l/: "Sie hat ja gejodelt." (see 2.1.1). The question is as to whether the more gradual spectral change associated with the initial transition influences the perception of the F0 peak position because the F0 position relative to vowel onset in the initial transition of the stressed syllable can be less clearly assessed. Fig. 9 shows the speech wave as well as the energy, F0 and spectrum displays in the natural medial-peak token selected for F0 peak shift. The test stimulus generation followed the same procedure as in 2.1.1.2, with a step size of 35 ms, and one peak (nr 5) being located at the temporal centre of the F2 formant transition in /əjo:/. Fig. 10 shows the 11 different peak positions used for the stimulus generation. Only the quick serial discrimination test (see 1.4 (1)) was performed in the left-right sequence with 24 subjects.

Fig. 9
Speech wave, energy, F0 (linear scale) and spectral displays in the natural medial-peak token of "Sie hat ja gejodelt." selected for F0 peak shift. The time marks indicate the left base point (appr. in the temporal centre of the F2 transition for /əjo:/), the maximum F0 value and the right base point.

Fig. 10
Speech wave and F0 contour (linear scale) in "Sie hat ja gejodelt." with time marks indicating the 11 F0 peak positions for complete-contour shift from left to right.
Results
Table VI presents the results.
Table VI
Distribution of 'change has occurred' responses by 24 listeners in the left-right sequence of the serial discrimination test across the 11 stimuli with F0 peak shifts in "Sie hat ja gejodelt." (1 = left-most, 11 = right-most position)
Stimulus                      4    5    6    7    8    9   10   11
First change perceived        5    9    8    2
Further changes perceived          1    8    2    3    6    1    6
Total                         5   10   16    4    3    6    1    6
Discussion
In this case the first change occurs less abruptly although it is still clearly marked and coincides with the temporal position of the F0 peak half-way in the F2 transition. Further changes in the perceptual profile also occur earlier than in the other stimulus types tested so far. All this goes to show that a glide transition does interfere with the categorization of F0 peaks, but the general pattern of a phonological separation of 'early' and 'medial' peaks and a gradual switch from 'medial' to 'late' stays.
2.1.1.4 "Sie muß wohl arbeiten."
The next syllable structure chosen has creaky voice (the phonetic realisation of a syllable-initial vowel prefixed by a glottal stop) before a stressed long vowel: "Sie muß wohl arbeiten." (see 2.1.1). The question is as to whether a creaky voice onset has the same effect on F0 peak categorization as a glide. Fig. 11 shows the speech wave as well as the energy and F0 contours in the token selected for F0 peak shift. The test stimulus generation followed the same procedure as in 2.1.1.2, with a step size of 35 ms and one peak (nr 5) being located at the onset of the more regular glottal vibrations at the transition from /l/ to /aɐ/. Fig. 12 shows the 11 different peak positions used for the stimulus generation. Only the quick serial discrimination test (see 1.4 (1)) was performed in the left-right sequence with 24 subjects.

Fig. 11
Speech wave, energy and F0 contours (linear scale) in the natural medial-peak token of "Sie muß wohl arbeiten." (with creaky voice transition instead of a glottal stop interruption of voicing) selected for F0 peak shift. The time marks delimit the F0 peak configuration that was shifted (left and right base points, and maximum).

Fig. 12
Speech wave and F0 contour (linear scale) in "Sie muß wohl arbeiten." with time marks indicating the 11 F0 peak positions for complete-contour shift from left to right.
Results
Table VII presents the results.
Table VII
Distribution of 'change has occurred' responses by 24 listeners in the left-right sequence of the serial discrimination test across the 11 stimuli with F0 peak shifts in "Sie muß wohl arbeiten." (1 = left-most, 11 = right-most position)
Stimulus                      4    5    6    7    8    9   10   11
First change perceived        3   20    1
Further changes perceived               1    2    5    5    8    2
Total                         3   20    2    2    5    5    8    2
Discussion
The first change occurs very abruptly in stimulus 5, i.e. about 35 ms after the creaky voice transition. The perception of later changes is spread over the remainder of the continuum without clear peaks in the response function. There is a minor maximum at stimulus 10, i.e. at a similar position as in the continuum across "Sie hat ja gelogen.". In every respect "Sie muß wohl arbeiten." thus patterns with the latter, rather than with the case of a glide transition in "Sie hat ja gejodelt.". What seems to be important for F0 peak perception is the abrupt articulatory change in the transfer function from /l/ to the stressed vowel (in "gelungen" as well as in "wohl arbeiten"), not the gradual change in phonation from voice to creak to voice, superimposed on the articulatory switch.
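The contrast between an abrupt and a less abrupt first change in Tables V, VI and VII can be summarised by asking what share of the listeners reported their first change at the most frequent stimulus. This measure is an illustration added here, not one used in the paper, and the assignment of counts to stimulus numbers is an assumption based on the flattened tables (the first-change counts sum to the number of listeners in each test).

```python
# First-change counts per stimulus number, as read from Tables V, VI and VII.
first_change = {
    "gelungen (Table V)": {4: 5, 5: 19, 6: 5},
    "gejodelt (Table VI)": {4: 5, 5: 9, 6: 8, 7: 2},
    "arbeiten (Table VII)": {4: 3, 5: 20, 6: 1},
}


def modal_share(counts):
    """Return the stimulus with most first-change responses and the
    proportion of all listeners who responded there."""
    total = sum(counts.values())
    modal_stimulus = max(counts, key=counts.get)
    return modal_stimulus, counts[modal_stimulus] / total


summary = {name: modal_share(counts) for name, counts in first_change.items()}

for name, (stimulus, share) in summary.items():
    print(f"{name}: stimulus {stimulus}, {share:.0%} of listeners")
```

On these counts the lateral and creaky-voice onsets concentrate about two thirds or more of the first changes on a single stimulus, whereas the glide onset spreads them over neighbouring stimuli, which matches the description of a less abrupt change for "gejodelt".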
2.1.1.5 "Er ist ja geritten."
In the stimuli examined so far the course of the peak contour has been manifested in the observable F0 values. This changes when the post- and/or pre-vocalic consonant in the stressed syllable associated with the peak is voiceless. Now part of the contour has to be reconstructed before a peak shift becomes possible. Fig. 13 shows the speech wave as well as the energy and F0 contours in a natural medial-peak token of "Er ist ja geritten." selected for F0 peak shift (see 2.1.1). The test stimulus generation followed the same procedure as in 2.1.1.2 with a step size of 30 ms and 15 peak positions from left to right starting at the beginning of /ə/. The quick serial discrimination test (see 1.4 (1)) was performed informally in the left-right sequence by the experimenter. Again there was an abrupt change in perception as the peak entered the stressed vowel. But as the peak was moved into the voiceless section of /t/ it lost its characteristics, becoming lower and lower in pitch. This proves that the maximum value of a peak contour must be present in the signal for identification: it is not reconstructed by a listener from surrounding values of the rising and falling branches, whereas a low right base point may be missing due to F0 contour truncation before voicelessness in final syllables (see Gartenberg & Panzlaff-Reuter, 1991, 3.) without detriment to the peak characteristics (on the contrary, there must be truncation in certain contexts to guarantee pattern identity).

Fig. 13
Speech wave, energy and F0 contours (linear scale) in the natural medial-peak token of "Er ist ja geritten." selected for F0 peak shift. The time marks indicate on- and offsets of /g/, /ə/, /ʁ/, /i/, /t/, /n/. The dotted line represents the reconstructed F0 interpolation of the right branch of the peak contour. The broken lines mark the left and right base points as well as the maximum of the peak configuration to be shifted.
A further shift of the peak contour maximum to the onset of voicing in the final /n/ approximates the F0 configuration found in natural productions of late peaks (see Fig. 14), but the auditory impression is still that of a medial peak, not of a late one. A comparison of Figs. 13 and 14 shows that final nasals in medial and late peaks differ in amplitude and mode of vocal fold vibration: In medial peaks (and the same would apply to early ones), the low F0 fall at the end of an utterance is accompanied by a drop in the sound source amplitude, which weakens unstressed vowels and sonorants considerably, often reducing them to creaky voice and to irregular glottal pulses. In late peaks, this decline is moved to the right, following the later F0 fall, thus keeping a high source amplitude at the onset of unstressed vowels and syllabic sonorants; on the other hand the low F0 stretch in the stressed vowel before the peak gets its intensity reduced. So there is a natural parallelism in the time courses of F0, source amplitude and intensity for the three peak contours. If it is destroyed, the perceptual pattern identity may be lost.

Fig. 14
Speech wave, energy and F0 contours (linear scale) in a natural late-peak token of "Er ist ja geritten." The time marks indicate on- and offsets of /g/, /ə/, /ʁ/, /i/, /t/, /n/.
Thus a late peak, positioned at the sonorant voicing onset after a voiceless obstruent, can only be successfully reconstructed by a listener if the F0 descent to the terminal low level in the final sonorant has a high enough source amplitude to guarantee sufficient intensity for the high falling F0 contour to be auditorily monitored. But a natural medial peak utterance with its low final intensity and glottal irregularity lacks these attributes and cannot be turned into a late peak percept simply by an F0 shift into the appropriate location. The amplitude and duration of the final sonorant have to be raised considerably at the same time and the mode of vibration changed. This can be achieved by transferring the final /n/ from the late peak stimulus. Contrariwise, with a late peak stimulus as point of departure a perceptually convincing medial peak pattern can only be generated if in addition to the peak shift the final sonorant is drastically lowered in amplitude and shortened. This has also been reproduced in a RULSYS TTS formant synthesis-by-rule (Kohler, 1991f).
2.1.1.6. "Sie hat ja gestritten."
If a short stressed vowel is not only followed but also preceded by voiceless obstruents the masking of peak height as the maximum value is moved into the voiceless section arises syllable-initially as well. Fig. 15 shows the speech wave as well as the energy and F0 contours in a natural medial-peak token of "Sie hat ja gestritten." ("She's been quarrelling."), [zi hat ja gəˈʃtʁitn], where F0 sets in higher in the stressed vowel compared with "geritten" of 2.1.1.5, because the initial cluster /ʃtʁ/ is much longer than the initial /ʁ/, and F0, rising from the left base point in /ə/, has thus reached a higher value at vowel onset. In addition, there is a CF0 increase caused by the preceding voiceless fricative (see Gartenberg & Panzlaff-Reuter, 1991, 3.). "geritten" and "gestritten" converge, however, in having their F0 maximum close to vowel offset, as is usual for medial peaks in short stressed vowels before an unstressed syllable (see loc. cit., 5.2.).

Fig. 15
Speech wave, energy and F0 contours (linear scale) in a natural medial-peak token of "Sie hat ja gestritten." The time marks indicate on- and offsets of /g/, /ə/, /ʃ/, /t/, /ʁ/, /i/, /t/, /n/.

The peak shift in "Sie hat ja gestritten." was tested interactively using the TTS research tool (see 1.5 (b)) with the rule-generated medial peak position as a point of departure (see Fig. 16a) and a step size of 20 ms in complete parallel shift. Significant F0 values in the rule-generated utterance at /ə/ and /i/ on- and offsets are 84 Hz, 88 Hz, 144 Hz and 140 Hz, respectively. When the peak is located 40 ms before the beginning of /i/, the peak is clearly 'early'; at /i/ onset (see Fig. 16b) it has changed to 'medial'. The corresponding significant F0 values in these two positions are 88 Hz, 104 Hz, 138 Hz, 86 Hz, and 84 Hz, 94 Hz, 148 Hz, 108 Hz. So the change from 'early' to 'medial' occurs quite abruptly in this syllable structure as well when the F0 rise across the voiceless cluster becomes more extensive than the fall, and the F0 offset in the stressed vowel is in the middle of the F0 range between maximum and minimum values in the utterance. Thus in this sentence, the switch from 'early' to 'medial' occurs before there is an initial F0 rise in the stressed vowel, which is different from all the other syllable structures, with initial voiced consonants, analysed so far. The reason for this difference lies in the CF0 interference, which is obviously accounted for in the perception process.
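The statement that the percept switches to 'medial' once the F0 rise across the voiceless cluster exceeds the fall can be checked against the two sets of F0 values quoted in this section. The sketch below assumes that each set lists F0 at /ə/ onset, /ə/ offset, stressed vowel onset and stressed vowel offset, in that order, as stated for the rule-generated utterance; the variable and function names are illustrative.

```python
# F0 values in Hz quoted in the text. Assumed order:
# schwa onset, schwa offset, stressed vowel onset, stressed vowel offset.
positions = {
    "40 ms before vowel onset ('early')": (88, 104, 138, 86),
    "at vowel onset ('medial')": (84, 94, 148, 108),
}


def rise_and_fall(values):
    """Return (rise across the voiceless cluster, fall within the vowel) in Hz."""
    _schwa_onset, schwa_offset, vowel_onset, vowel_offset = values
    rise = vowel_onset - schwa_offset   # spans the voiceless cluster
    fall = vowel_onset - vowel_offset   # descent inside the stressed vowel
    return rise, fall


extents = {name: rise_and_fall(values) for name, values in positions.items()}

for name, (rise, fall) in extents.items():
    relation = "rise > fall" if rise > fall else "rise <= fall"
    print(f"{name}: rise {rise} Hz, fall {fall} Hz ({relation})")
```

With these readings the 'early' position has a 34 Hz rise against a 52 Hz fall, and the 'medial' position a 54 Hz rise against a 40 Hz fall, which is consistent with the criterion formulated in the text.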
Fig. 16
(a) RULSYS output of "Sie hat ja gestritten." (= (default) medial peak), (b) 60 ms peak shift to the left (= first clear medial peak position in left-right move). F0 (in Hz; square parameter and cosine interpolation between set F0 points) and phonetic transcription aligned to the time scale (segment and cumulative durations in cs); EO = ə, SH = ʃ.
2.1.1.7 Conclusion
The discrimination and identification tests of 2.1.1.1-5 all point in the same direction, viz. the perceptual exploitation of different F0 peak synchronizations with stressed vowel onsets and of the ensuing low (falling) vs. high (rising) F0 as a psychophonetic basis for phonological categorization at the level of intonation. For an 'early' peak, F0 is low in the stressed vowel because it is on its descent at the vowel onset and, in complete parallel shift, also reaches its low end point early. If there is voicing before the stressed vowel, the F0 point at vowel onset is preceded by a higher F0 value so that F0 falls into the accented syllable. If there is no previous voicing in utterance-initial position of a stressed syllable beginning with voiceless consonants, F0 at vowel onset has as low a value as would result from an F0 descent across the stressed syllable periphery to strengthen the low F0 level in the accented vowel. The 'early' peak is thus characterized by a high prenuclear F0 - either directly observable or by extrapolation from the F0 start in the stressed syllable nucleus - and by a low F0 in the latter.
Contrariwise, the 'medial' peak has a low prenuclear F0, an F0 rise (of at least 2 semitones according to interactive testing), from nuclear vowel onset to peak value, and a subsequent descent to a low F0 at a later point in time than in an 'early' peak. The amount of descent depends on syllable structures, and the rise may be absent because of CF0 interference, resulting in a higher F0 starting point at nucleus onset. So in all cases, the 'medial' peak accentuates a higher F0 level in the stressed vowel than the 'early' peak. In a 'late' peak the rise is extended because it occurs later, but it is also prefixed by a stretch of low level F0.
Interactive perceptual testing (see 1.5) has further shown that the 'early' and 'medial' peak patterns do not lose their characteristic auditory difference if their right base points have the same F0 value at the same time after nucleus onset (due to a flattening of the F0 descent in the 'early' peak). It is thus the F0 differentiation in the initial part of the nuclear vowel that counts as the distinctive feature. If in a RULSYS generated 'early' peak of "Sie hat ja gelogen.", F0 is kept at the peak maximum value up to a point including the first 3 F0 frames of 10 ms each in the stressed vowel, instead of having an immediate fall, the auditory characteristics of the 'early' peak are not lost. On the other hand, if, starting from a 'medial' peak configuration in the above utterance (with /l/ onset = 88 Hz, /l/ offset = 116 Hz, stressed vowel onset consisting of the F0 sequence 122 - 128 - 128 - 130 Hz), the /l/ offset and all the vowel onset frames are raised to 130 Hz, the 'medial' peak is changed in the direction of an 'early' one although the only difference between the two patterns now lies in the rise being completed in the /l/ rather than continuing into the accented vowel, i.e. in the presence or absence of a nucleus-initial rise. Admittedly, the difference between the 'early' and 'medial' patterns is clearly weakened by this modification, but it shows that a 'medial' peak needs an F0 rise in the nucleus after a sonorant.
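The semitone criterion and the Hz values quoted in this section can be related with the standard conversion of a frequency ratio into semitones, 12 times the base-2 logarithm of the ratio. The sketch below only illustrates that conversion on the values given in the text; the function names are illustrative, and treating 122 Hz as the vowel-onset value from which the 2-semitone rise is measured is an assumption.

```python
import math


def semitones(f_from_hz, f_to_hz):
    """Interval from f_from_hz to f_to_hz in semitones (positive = rise)."""
    return 12 * math.log2(f_to_hz / f_from_hz)


def hz_after_rise(f_from_hz, n_semitones):
    """Frequency reached after rising n_semitones from f_from_hz."""
    return f_from_hz * 2 ** (n_semitones / 12)


# Values quoted for the 'medial' peak configuration of "Sie hat ja gelogen."
l_onset_hz, l_offset_hz = 88, 116
vowel_onset_frames_hz = [122, 128, 128, 130]

rise_in_lateral = semitones(l_onset_hz, l_offset_hz)
rise_in_onset_frames = semitones(vowel_onset_frames_hz[0], vowel_onset_frames_hz[-1])
peak_for_two_semitones = hz_after_rise(vowel_onset_frames_hz[0], 2)

print(f"rise within /l/: {rise_in_lateral:.2f} st")
print(f"rise over the vowel onset frames: {rise_in_onset_frames:.2f} st")
print(f"peak needed for a 2 st rise from 122 Hz: {peak_for_two_semitones:.1f} Hz")
```

On this calculation the four onset frames alone cover a little over one semitone, so the rise has to continue beyond them, to roughly 137 Hz, before it reaches the 2-semitone extent mentioned for the 'medial' peak.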
2.1.2 Munich experiments on German

The results of the Kiel experiments naturally prompted the question as to how widespread the phonological categorization of F0 peak positions in the intonation system of German is in general. Therefore the serial discrimination test in the ascending ordering (2.1.1.1.1) and the randomized paired discrimination test as well as the synthesized stimuli identification test (2.1.1.1.2) were repeated in the Phonetics Institute of Munich University³ with groups of listeners of a Bavarian dialect background. The tests were performed in the Institute language laboratory, and the stimuli were presented over headphones. Tapes and instructions were identical to the ones in the Kiel experiments. 11 listeners participated in the serial discrimination test and in the identification test, 14 in the randomized paired discrimination test.

³ I wish to thank Dr Anton Batliner for organizing the test runs.

Results

Table VIII presents the results of the serial discrimination test.
Table VIII
Distribution of 'change has occurred' responses of 11 Munich listeners in the left-right sequence of the serial discrimination test across the 11 stimuli with F0 peak shifts in "Sie hat ja gelogen." (1 = left-most, 11 = right-most)

Stimulus                     4    5    6    7    8    9   10   11
First change perceived       1    9    1
Further changes perceived                   4    1    3    2    1
Total                        1    9    1    4    1    3    2    1

Fig. 17 shows the results of the randomized paired discrimination test.

Fig. 17
Discrimination functions in the randomized paired discrimination test, showing percentage of 'different' judgements for utterance pairs of "Sie hat ja gelogen." with 0-step (a), 1-step (b), or 2-step (c) distances of F0 peak positions, in the ordering left-right. The stimulus numbers refer to the second stimulus. 14 sbs., n = 28 at each data point in (a), (b), (c).
Fig. 18 shows the results of the synthesized stimuli identification test.

Fig. 18
Identification function in the synthesized stimuli identification test, showing percentage 'matching' judgements for 8 stimuli "Sie hat ja gelogen." with F0 peak shift from left to right in the context "Jetzt versteh' ich das erst." 11 subjects; for each stimulus n = 110.
Discussion

The comparison of Tables I and VIII shows that the Munich group has the same type of response pattern with a maximum for stimulus 5 and a large scatter for further changes in the series from stimulus 7 to 11. However, due to the much smaller number of subjects in the Munich group, the minor peak in the response curve does not show up so clearly. The results of the serial discrimination test are supported by those of the randomized paired discrimination test of Fig. 17 (in comparison with Fig. 4). There is again a maximum of sensitivity in the F0 peak shift continuum in the area of stimuli 5/6 and a second, weaker sensitivity peak at stimulus 9, but the sensitivity area is narrower, with the pairings 5 - 6 and 3 - 5 not being included in the maxima, and there is no peak of 'false alarms' for the 5 - 5 pair. The identification function of Fig. 18 (in comparison with Fig. 5) points to the same two perceptual identification categories comprising stimuli 1 - 4, on the one hand, and stimuli 6 - 8, on the other, but with a lot more noise (an offset of about 20% - 30%) in the first category and at stimulus 5, the boundary between the two. We may again associate this partitioning with the two phonological categories of 'early' and 'medial' F0 peaks. The Munich results are thus in agreement with the Kiel data and allow the generalization of a perceptual and phonological categorization of the F0 peak positions relative to stressed vowel onset for the intonation of German across regional varieties.
2.1.3 Experiments on other languages

What remained an open issue after the very clear results of the experiments on different varieties of German was whether we are here dealing with a phonological categorization of German, albeit on a psychophonetic basis, or whether the phenomenon is more widespread or even based on a specific feature of human speech perception in general. The hypothesis that such a general, or universal, psychophonetic principle does operate in the perception of F0 patterns in human speech, irrespective of the phonological categorization and the linguistic functions it may serve in any particular language, leads to the assumption that native speakers of other languages than German listening to German utterances should be able to detect changes in F0 peak positions in relation to general consonant - vowel sequences, even without knowing any German at all, and therefore without assessing the stimuli semantically, but simply on the basis of general phonetic properties of human speech. If the results of such listening tests were to coincide with the results for German, this would be a strong indication of a language-independent psychophonetic mechanism. As a first step in this direction, the serial discrimination test in the ascending ordering of 2.1.1.1.1 was run with two groups of non-German speakers:

(a) 25 Russian speakers in Leningrad⁴, who had no knowledge of German and who either worked on Russian, English or French phonetics (11) or were students in their first or second year in the Philological Faculty (14). A copy, on standard cassette, of the original series of 11 stimuli "Sie hat ja gelogen." with F0 peak shifts from left to right was provided. The subjects listened to the series twice and then had to cross, on a prepared answer sheet, the number of the stimulus in the series that they perceived as being most clearly different from the rest.

⁴ I wish to thank Prof. Natalia Svetozarova of Leningrad University for administering the test in her Phonetics Laboratory.

(b) 40 native speakers of 13 different languages attending German language courses at beginners or advanced level at Kiel University. The original test tape was presented to them over loudspeaker in four subgroups (twice 14 and twice 6 listeners) in their relatively quiet but acoustically non-treated classroom. The answer-sheets and the procedure were the same as for the corresponding test with German listeners in 2.1.1.1.1. A great deal of time and care was spent on explaining
Table IX
Background information about the 40 foreign listeners in the discrimination test

Native language          Native country   Beginners   Advanced   Total
Farsi                    Iran                  9          1        10
Polish                                         4          2         6
Portuguese               Brazil                3          1         4
Korean                                         3          1         4
Spanish                  Chile                 2          1         3
Spanish                  Argentina                                  1
English                  USA                                        3
English                  England                                    2
Arabic                   Israel                                     1
Japanese                                                            1
Thai                                                                1
Nepali                                                              1
Chinese                                                             1
Singhalese (Sri Lanka)                                              1
Swedish                                                             1
Total                                         28         12        40
the test instructions in German.⁵ Table IX provides the background information about the 40 listeners.

Results

Table X presents the results of the Russian group. Although the instruction demanded a single response, some subjects indicated more than one stimulus as being clearly different.
Table X
Frequency distribution of 'clearly different' responses by 25 Russian listeners without any knowledge of German in the left-right sequence of the serial discrimination test across the 11 stimuli with F0 peak shifts in "Sie hat ja gelogen." (1 = left-most, 11 = right-most position)

Stimulus             2    4    5    6    7    8    9   10
Phoneticians              1    1    7    4    1    1    1
Non-phoneticians     1    1    2   11    3    1    1
Total                1    2    3   18    7    2    2    1

Table XI presents the results of the multilanguage group, restricted to the perception of the first change in the series. The one Chinese, one Farsi and one Korean speaker did not perceive any change at all, although the other three Korean speakers did.
Table XI
Frequency distribution of 'first change has occurred' responses by 40 listeners of 13 different languages, in the left-right sequence of the serial discrimination test across the 11 stimuli with F0 peak shifts in "Sie hat ja gelogen." (1 = left-most, 11 = right-most position)

Stimulus                  4    5    6    7    8    9   11
First change perceived    2    9   19    2    2    2    1
(3 listeners perceived no change at all.)

⁵ Robert Gartenberg carried out the tests and compiled the data.
Discussion

Both groups, in spite of their language diversity, converge in having a clear maximum of the response function for stimulus 6. This is a higher position than for the German listeners, who favoured stimulus 5, but who also provided a substantial portion of their answers for stimulus 6. These results are a very strong indication that the dichotomy between an 'early' and a 'medial' peak position is indeed a general psychophonetic, language-independent phenomenon, which may then be incorporated into the language-specific phonology at different levels.
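The comparison of response maxima across listener groups can be checked with a short Python sketch. It is a minimal illustration under stated assumptions: the counts are the ones read from Tables VIII, X and XI (the assignment of counts to stimulus columns follows the table layout as recovered from the print), the Munich first-change counts stand in for the German listeners, and the names `modal_stimulus` and `share` are hypothetical helpers, not part of the original analysis.

```python
# Response counts per stimulus number, as read from the tables in the text.
# The alignment of counts with stimulus columns is an assumption.
munich_first_change = {4: 1, 5: 9, 6: 1}                             # Table VIII
russian_total = {2: 1, 4: 2, 5: 3, 6: 18, 7: 7, 8: 2, 9: 2, 10: 1}   # Table X
multilingual_first_change = {4: 2, 5: 9, 6: 19, 7: 2, 8: 2, 9: 2, 11: 1}  # Table XI


def modal_stimulus(counts):
    """Stimulus number that received the largest number of responses."""
    return max(counts, key=counts.get)


def share(counts, stimulus):
    """Proportion of all responses that fall on the given stimulus."""
    return counts.get(stimulus, 0) / sum(counts.values())


groups = {
    "Munich": munich_first_change,
    "Russian": russian_total,
    "Multilingual": multilingual_first_change,
}
for name, counts in groups.items():
    peak = modal_stimulus(counts)
    print(f"{name}: maximum at stimulus {peak}, share {share(counts, peak):.2f}")
```

With these counts the maximum falls on stimulus 5 for the Munich group and on stimulus 6 for both non-German groups, in line with the description given in the discussion.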
Thus in Mandarin Chinese (see Gårding, Kratochvil, Svantesson & Zhang, 1985) it is put to use in the tone system, differentiating between the continuously (low) falling F0 of tone 3 (e.g. in ma3 'horse') and the (high) rising-falling F0 of tone 4 (e.g. in ma4 'to curse'). It is worth noting in this connection how a Chinese speaker (Dr Chilin Shih, research worker at Bell Labs, Murray Hill in 1986) classified the 11 stimuli of the left-right series "Sie hat ja gelogen." without any knowledge of German. Without the slightest doubt she associated stimuli 1 - 4 with tone 3, stimulus 5 with tone 4; later in the series tone 4 changed to the combined tone 2 - 4; but whereas the switch from tone 3 to tone 4 occurred abruptly in the succession of stimuli 4 and 5, the change from tone 4 to tone 2 - 4 was gradual and could be less easily located (at stimulus 9 the change had definitely taken place). This informal test shows (a) that tones 3 and 4 in Mandarin Chinese are differentiated by the F0 maximum relative to the vowel onset, a prenuclear F0 peak signalling the former, a nuclear F0 peak the latter, and (b) that these categorizations are possible on the language-independent basis of human speech perception in general.
A different case of exploiting the perceptual relevance of earlier vs. later peaks are the acute and grave tonal word accents in Norwegian and Swedish (Gårding, 1979, 1982). And finally intonation languages like English and French make use of the different peak patterns in their intonation phonologies and relate them to semantic distinctions along the 'closed/open to argument' dimension (see 2.2). English and French can differentiate the sentences corresponding to German, "She's been lying." and "Elle a menti.", in the same way as German in their intonation. It is an interesting research objective for the future to investigate the different linguistic functions the dichotomy can be put to in the world's languages.
2.2 Semantics

The question to be pursued now is what linguistic functions are carried by the phonological contrasts of 'early' vs. 'medial' vs. 'late' peaks in German. In particular, it is to be ascertained whether the categorical change from 'early' to 'medial' and the more gradual change from 'medial' to 'late' peak positions are mapped onto a semantic space in a congruent fashion. Some insight was gained from the data obtained through controlled dialogues (Gartenberg & Hertrich, 1988). Furthermore, in the Kiel and Munich serial discrimination tests (2.1.1.1 and 2.1.2) with "Sie hat ja gelogen.", subjects were also asked to paraphrase the meanings of the utterances corresponding to the three peak positions (see 1.4 (1)). Here are some of the answers.
Kiel
(a) = original utterance, (b) = first change, (c) = further change

(a) Statement of a fact or end of an argumentation.
(b) Introductory statement, beginning of argumentation.
(c) As (b), but greater insistence.

(a) Justifying statement, establishing causality relation to what precedes.
(b) Slight surprise and reproach over behaviour.
(c) Strong surprise.

(a) Report to a third party that she has been lying; the speaker stresses a fact resulting from the environment.
(b) Indignant statement.
(c) Surprise statement.

(a) Statement, explanation.
(b) Surprise, astonishment.

(a) Statement without surprise.
(b) Tendency towards indignation.

(a) Statement of a fact, e.g. in the context "the punishment is justified."
(b) Indignation, e.g. in the context "I would not have expected this of her."

(a) Justifying statement at the end of a chain of arguments.
(b) Beginning of an argumentation, slight indignation.
(c) Greater indignation.

(a) Statement, report.
(b) Sudden realisation of lying.
(c) Question.

(a) Matter-of-fact statement.
(b) Statement with expression of indignation.
(c) Statement with expression of astonishment.

(a) Declarative, expected, matter-of-fact.
(b) Unexpected, indignant.

(a) Confirmation of a fact; it is obvious that she's been lying.
(b) Surprising fact for the speaker, the lie is unexpected.

(a) Statement of fact that the speaker discovered long ago.
(b) Explanation of a fact.
(c) "I can't believe it." Indignation.
Munich
(a) = original utterance, (b) = first change, (c) = further change

(a) Confirmation of what is already known.
(b) Surprise statement, reproachful undertone (".. I would not have expected that.")
(c) Pure astonishment.

(a) Statement.
(b) Astonishment.
(c) Disappointment.

(a) Neutral statement.
(b) Exclamation.
(c) Exclamation with incredulity.

(a) Simple statement of a fact with which the speaker seems to be familiar.
(b) Surprise, indignation, speaker's reaction to a fact he did not know before.

(a) But we knew that before anyway.
(b) Amazement.
(c) Surprise, astonishment.

(a) Matter-of-fact statement.
(b) Unforgivable observation.
(c) Astonishment.

(a) Matter-of-fact statement, there was not really any doubt about her behaviour.
(b) It was not certain if she would tell the truth or not.
(c) Contrary to expectation she has been lying, comes as a surprise; gradual transition from (b) to (c).
Another paraphrasing experiment was carried out with the following sentences containing first an 'early' and then a 'late' peak (the underlined word received the peak accent):

(1) "Wer war das?" ("Who did that?", "Who was that?")
(2) "Mach' bitte das Fenster zu!" ("Shut the window, please.")
(3) "Was ist denn ein Atom?" ("What's an atom?")
(4) "Sehen wir uns also morgen?" ("So we are going to see each other tomorrow.")

Each pair of 'early/medial' peak utterances was selected from 10 naturally produced repetitions (speaker KK) and played to listeners as often as they liked. They had to write down their assessment of the situation or speaker attitude that fitted each sentence and peak pattern. Here are some of the answers:
(1) "Wer war das?"

'early': Several people are asked: which of you was it?
'medial': Speaker A asks speaker B the name of a third person.

'early': In sense of "who did that?"
'medial': Somebody unknown to the speaker is passing by and the speaker asks a listener the name of the unknown person.

'early': The speaker asks the listener a question, he knows the answer himself and urges the listener to give him the right answer.
'medial': The speaker wants to know something unknown to him, e.g. he has just seen somebody whose name he does not know.

'early': Reproachful question: the person concerned has to expect a reprimand for some mischief.
'medial': Neutral, positive question, e.g. a teacher's question: "Charles the Great, who was that?"

'early': Speaker sounds superior, demanding, tries to be distant.
'medial': Speaker asks in a familiar way, is on the same level as the person spoken to.

(2) "Mach' bitte das Fenster zu!"

'early': Speaker, rather annoyed, asks somebody standing at the open window to shut it.
'medial': Request to shut the window, not the door.

'early': Order to shut the window at once.
'medial': Friendly request to shut the window, not the door, because, e.g., otherwise grandmother might catch a cold.

'early': Tone of a command, slightly threatening, repeated order to the naughty son, object of shutting is self-evident, could be left unmentioned.
'medial': Opposed to "shut the door please," object has to be defined specially.

(3) "Was ist denn ein Atom?"

'early': Speaker asks the listener for specific information to test him, i.e. the speaker knows the answer to the question himself.
'medial': Speaker does not know the answer himself and asks the listener to give him the information.

'early': Speaker knows the answer already, e.g. could be a teacher.
'medial': Speaker does not know the answer, asks a real question.

'early': Teacher to his class, rhetorical.
'medial': After the teacher has provided the explanation a pupil not having heard it asks what an atom is.

'early': Exam question.
'medial': Continuation in a chain of questions.

(4) "Sehen wir uns also morgen?"

'early': Statement, all settled, routine utterance.
'medial': Tomorrow, not today or the day after tomorrow or any other day the speaker might prefer.

'early': At the end of an ordinary conversation, routine.
'medial': Speaker mentioned tomorrow for the next meeting and then changed it; to make sure he repeats the new arrangement at the end of the conversation.

'early': Statement.
'medial': Confirmation of tomorrow as against the day after.
The meanings that may be abstracted from the dialogue data and the paraphrases for the three peaks are:

(a) early: established fact; no room for discussion; final summing up of argument
(b) medial: new fact; open for discussion; starting a new argument
(c) late: emphasis on a new fact and contrast to what should exist or exists in the speaker's or hearer's idea.

The F0 peak differences are thus not associated with stress, which remains the same in all three cases, but with intonation, which is in turn linked to semantic categories expressing the speaker's evaluation of facts in respect of expectations. As regards the distinction between 'medial' and 'late' peaks similar categorizations have been proposed for English, the 'late' peak expressing the speaker's incredulity or his uncertainty (Ward & Hirschberg, 1985; Pierrehumbert & Steele, 1989).

The phonetic differentiation between the three peaks and the associated changes of meaning point to another instance of what Ohala (1983, 1984) has called the frequency code: low frequencies signal domination, high ones submissiveness. Of course, in the case under discussion, this link has been given linguistic plasticity in two ways:
- the synchronization with the syllable structure, i.e. with human sound articulation,
- a semantic denotation, rather than an expressive meaning.
But the semantics of 'closed vs. open to argumentation' are intimately related to 'domination vs. submissiveness'. It is, however, not necessarily the domination or submissiveness of the speaker that is signalled here, it may be that of the situation or of other communicative partners setting an established fact or leaving the door open for change and new things. These are the basic, underlying meanings of 'early' vs. 'non-early' peaks. The actual meanings observable on the surface in individual utterances and contexts depend on the interplay of these basic semantics of intonation contours with the semantics at the levels of syntactic structures, within and across sentences, and of the lexicon.
If an early peak is used in questions, whose semantics suggest openness, then the question gets special connotations in keeping with the semantics of the early peak intonation: the question is asked with a presumed knowledge of the answer, as in
- the teacher's question "Wer war das?" ("Who did that?" = I'll find out anyway; possible threat)
- the exam question "Was ist ein Phonem?" ("What's a phoneme?")
- the résumé asking for confirmation "Das Phonem ist also eine Lautklasse." ("So the phoneme is a sound class." = Can we keep that in mind and start from there, moving to the next question?)
If an imperative construction gets an early peak, there is again a contradiction between the signalling, through intonation, of the expected completion of an action, and, through syntax, of the order to carry it out. This contradiction produces the connotation of annoyance and impatience at the delay of an action. "Mach' bitte das Fenster zu." ("Shut the window, please.") may become a threat in spite of "bitte". The early peak can also get the connotation of resignation because nothing can be done to alter the established facts: "Nun gut. Wie Sie wollen." ("Alright. As you like."). The resignation is all the greater the earlier the F0 fall and the longer the low F0 tail on "gut" and "wollen". In either-or questions, an early peak in second position signals a choice within a closed set of alternatives, whereas a succession of medial peaks with low F0 in between refers to an open set of alternatives, which are simply given as possible examples from a longer list: "Willst du Tee oder Kaffee?" ("Would you like tea or coffee?"). Rising patterns instead of medial peaks convey the same open set but sound less categorical and more friendly.
In the late peak, the preceding low F0 interferes with the openness connotation of the rise and introduces the speaker's difference of opinion, which is rated very high in relation to observable facts. The speaker stresses the difference between his opinion or way of assessing things and the opinion of others or facts or beliefs as to how things should be. This leads to meanings of surprise, incredulity, "that can't be true", insinuation, talking down, changing in degree according to the amount of peak shift to the right. Very often the late peak is combined with modal particles, reinforcing their meanings, such as (word with late peak accent underlined)
- "ja" in exclamations: "Da steht ja eine Kirche!" ("Oh, there's a church!"), expressing surprise because reality differs from the speaker's view,
- "doch" in statements and imperatives/requests: "Er ist doch gekommen." ("He's come, what are you going on about."), "Setzen Sie sich doch." ("Do sit down.", "You are still standing, it is my opinion that you should be sitting."), expressing opposition to what the speaker is confronted with,
- "etwa" in questions: "Hast du das etwa gekauft?" ("You did not buy that, did you."), expressing incredulity, which is all the stronger the greater the emphasis signalled by peak height.
In these examples the modal particle may be missing, but the presence of a late peak still conveys the meaning of a contrast between the speaker's observation and his opinion on it. In utterances, such as "Ja." ("Yes."), "Natürlich." ("Of course.") the speaker stresses his own opinion and rejects any opinion to the contrary, producing a supercilious, arrogant, presumptuous undertone. Talking to a child, "Wie heißt du denn?" ("What's your name?") stresses the distance between the speaker and the addressee and gives the impression of talking down. In a sentence like "Hast du bei Christine übernachtet?" ("Did you spend the night with Christine?") there are indications that the addressee has done just what the speaker suggests, but should not have because this clashes with moral standards which the speaker purports to hold, resulting in reproach or insinuation; combined with a high peak it suggests incredulity.
The important lesson to be learnt from these data is that there is a direct link between particular F0 contours and specific meanings, but this link is not one on the surface, but underlies the actual meanings, which are the result of an interaction of various levels. Social psychologists (e.g. Scherer, 1985) who have been concerned with these direct substance/expressive meaning relations, have often lacked a detailed insight into the phonetic and semantic structures of language as a prerequisite to a successful interpretation. The corollary of the phonetic-semantic explanations offered for the use of different F0 peaks in intonation is that these phonological intonation categories in their association with meanings relatable in one form or another to the basic ones given must be at least widespread in languages, provided the phonological dichotomy has not already been booked at some other level, e.g. tone or word accent.

2.3 General discussion concerning Hypothesis (2)

The perception experiments of 2.1 and the semantic evaluation derived from paraphrasing tasks in 2.2 have largely confirmed Hypothesis (2) of Contribution I (Kohler, 1991b): the shift of an F0 peak in a single-accent terminal utterance between a prenucleus and a nucleus position results in a categorical change of perception, which is correlated with an equally categorical semantic switch along the dimension 'established/new' or 'closed/open to argumentation'; the corresponding realignment to the right produces a gradual auditory change correlated with a semantic continuum expressing degrees of distance which the speaker establishes between himself and the world as it presents itself to him. This degree of distance rather than the degree of emphasis, as formulated in Hypothesis (2), is the semantic basis of the 'medial' to 'late' peak positions, emphasis being correlated with peak height.
3. Intonation and stress

It has already been pointed out that the three F0 peak positions discussed in Section 2. represent different phonological categories of intonation associated with the same stressed syllable. So intonation must be differentiated from stress, through which a syllable in a chain is selected and marked for an intonation peak (or valley) to be hooked onto. But the stress feature may be chosen for different syllables in a sequence, and thus a shift of an F0 peak (or valley) position from one syllable to another can also change the stress position in a syllable chain, not just the intonation peak (or valley) associated with it. F0 peaks can therefore become cues to stress beside being cues to intonation. Then two questions arise:

(a) Under what conditions is an F0 peak shift (without concomitant changes in sound duration and intensity) sufficient to shift stress to a different syllable? Two cases have to be distinguished: the stress pattern changes, but the peak pattern stays, or both change. In principle, at each stress position three intonation peaks are possible.

(b) How can the stress and intonation functions of F0 peaks be differentiated, and in what ways do they interact?

These questions relate to the level of lexical stress or of sentence stress because words in sentences do not all retain their stresses. 3.1 deals with the former, 3.3 with the latter. In 3.2 the importance of duration for the signalling of stress, in addition to F0, will be discussed. Finally, 3.4 will deal with the perceptual ambiguity between one and two accents combined with conflicting intonation patterns, and 3.5 will enquire into the relevance of intensity for the cuing of stress and intonation.
3.1 Lexical stress

German offers good examples for testing the issues of stress signalled by F0 peak position and of stress and intonation interaction at the lexical level because it has minimal verb pairs, with either prefix or stem stress, which can occur in the same natural sentence frame, e.g. "Er wird's wohl umlagern." [ɛɐ viɐts vol 'umla:gɐn (um'la:gɐn)], with stress either on the prefix "um-", meaning "verlagern" ("He is presumably going to shift it to another place."), or on "-la-", meaning "belagern" ("He is presumably going to besiege it.").
Utterances of the above two sentences, (a) with stress on "um-" and a 'medial' intonation peak on this syllable, and (b) with stress on "-la-" and an 'early' intonation peak, which is actually located on the syllable "um-", were analysed, and Fig. 19 presents the waveforms together with their F0 displays. The F0 peak positions in the two utterances are practically identical in relation to the syllable structures of "umlagern": they occur at more or less the same time interval just before the beginning of /l/. The differences between the two are in the shapes of the F0 peak contours and in the syllable durations. In the utterance with stem stress in Fig. 19b the post-peak F0 descent is more gradual, the syllable "um-" shorter (135 ms in Fig. 19b vs. 222 ms in Fig. 19a) and therefore the F0 rise faster, starting at a structurally earlier point (beginning of the /l/ in "wohl" rather than at the "um-" syllable onset, as is the case in the utterance with prefix stress). The "-la-" syllables in the two utterances, on the other hand, have very similar durations in the stem and prefix stress words (268 ms in Fig. 19b vs. 258 ms in Fig. 19a). Two further stimuli were generated from the two illustrated in Fig. 19 by exchanging the F0 contours (see Fig. 20). These four stimuli (ST1 - ST4) were the basis for creating four series of F0 peak positions (P1 - P4):
P1  A series of 12: 6 left shifts (parallel transposition of the left branch and time expansion of the right branch) and 5 complete parallel right shifts of 30 ms each in the utterance of Fig. 19a.
P2  A series of 9: 8 complete parallel left shifts of 30 ms each in the utterance of Fig. 19b.
P3  A series of 12 in the utterance of Fig. 20a, following the procedure in P1.
P4  A series of 9 in the utterance of Fig. 20b, following the procedure in P2.
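The make-up of these series can be sketched in a few lines of Python. This is only an illustration, not the original stimulus-generation software: the function name `peak_offsets` and the representation of each stimulus as a signed offset in ms from the original peak position are assumptions made here. Only the numbers of left and right shifts and the 30 ms step are taken from the text, and the time expansion of the right branch in the left shifts is not modelled.

```python
STEP_MS = 30  # shift step stated in the text


def peak_offsets(n_left, n_right, step_ms=STEP_MS):
    """Return peak offsets (ms) relative to the original peak position,
    ordered from the left-most to the right-most stimulus.
    Negative = earlier than the original peak, 0 = original position."""
    return [k * step_ms for k in range(-n_left, n_right + 1)]


# P1 (and P3): 6 left shifts, the original position, 5 right shifts
p1 = peak_offsets(n_left=6, n_right=5)
# P2 (and P4): 8 left shifts plus the original position
p2 = peak_offsets(n_left=8, n_right=0)

for name, series in (("P1/P3", p1), ("P2/P4", p2)):
    original_nr = series.index(0) + 1  # stimulus numbers start at 1
    print(name, len(series), "stimuli; original peak at nr", original_nr)
```

Under this reading the original peak position is stimulus 7 in the 12-step series and stimulus 9 in the 9-step series, which agrees with the approximate original positions given in the captions of Figs. 21 and 22.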
P1 and P3 are based on the original prefix stress, P4 and P2 on the original stem stress utterance, and in each pairing the series form an opposition between more abruptly and slowly falling F0 peak contours, respectively. From these four sets of stimuli two tests were compiled: Test I combined the more sharply falling sets P1 and P4, Test II the slowly falling sets P2 and P3. Subjects were asked to identify the stimuli with the meanings of either "belagern" (stem stress) or "verlagern" (prefix stress). Further details about test stimulus generation, test tape construction and test administration can be found in Kohler (1990c).

Fig. 19
Speech waves and F0 contours (linear scale) of the original (a) prefix-stress utterance with 'medial' peak and (b) stem-stress utterance with 'early' peak in "Er wird's wohl umlagern." A, B, C mark the F0 base and peak points for peak contour shift.

Fig. 20
As in Fig. 19, but with exchanged F0 contours, adjusted to the different timing of the new utterance.

In P1 and P3 the series of F0 peak positions straddle the syllable structures where a change from prefix to stem stress is to be expected if F0 is a sufficient cue. The two sets differ in that the peak shape of P3, but not of P1, approximates the more slowly descending F0 configuration found in the 'early' peak of the original stem-stress utterance (cf. Fig. 19b). It is hypothesized, therefore, that if stress is perceptually shifted at all in P1 and P3, there will be a more clear-cut change in P1 because there is a higher probability in P3 that an F0 peak position on "um-" can be perceived not only as a 'medial' or 'late' peak prefix stress but also as an 'early' peak stem stress. Similarly, there would be a greater likelihood in P2 than in P4 for an 'early' peak stem stress to interfere with a 'medial' or 'late' peak prefix stress because of the slower F0 descent and its time expansion in the left shift of P2 as against P4.
Results and Discussion

Figs. 21 and 22 present the data of the two identification tests for the original prefix and stem stress series, respectively, each with slow and more sharply falling peak contours.

In the shift of the more sharply falling F0 peak contour through the original prefix-stress utterance there is a clear change from initial to stem stress, in spite of the duration of "um-" pointing to the former. F0 can thus override duration, particularly since the duration of the unstressed "-la-" syllable in the original utterance is very close to its duration under stress. In stimulus 10, which is the first in the ordering from 1 to 12 to yield an unequivocal stem-stress categorization with over 80% positive responses, the F0 peak position is 30 ms into the vowel of the syllable "-la-". This corresponds to the data discussed in Section 2. concerning the change from an 'early' to a 'medial' intonation peak on the stressed syllable. The fact that the change from one stress position to the other is gradual rather than categorical can be related to a residue of the duration cue. But we also have to consider some interaction of the stress and intonation functions of F0 because the F0 peak assumes positions before the beginning of the syllable nucleus /a:/ of "-la-" which can function simultaneously as the 'medial' or 'late' intonation peak in stressed "um-" and as the 'early' intonation peak in stressed "-la-". The relevance of this intonation interference with stress is confirmed by the finding that when the more slowly falling F0 peak is substituted the initial-stress category is not so clearly represented: the interpretation of an 'early' intonation peak for stem stress is then never completely precluded.

Fig. 21
Percentage stem-stress responses for "umlagern" (= "belagern", i.e. stem stress) in the series of 12 F0 peak positions (from left to right) combined with the original prefix-stress utterance of "Er wird's wohl umlagern." (nr. 7 appr. original peak position). Broken line = P3, slowly falling peak contour (n = 80 at each data point), continuous line = P1, sharply falling peak contour (n = 185 at each data point), dotted line = P1, sharply falling peak contour, but in Test III of 3.2, see text (n = 170 at each data point).

Fig. 22
Percentage stem-stress responses for "umlagern" (= "belagern", i.e. stem stress) in the series of 9 F0 peak positions (from left to right) combined with the original stem-stress utterance "Er wird's wohl umlagern." (nr. 9 appr. original peak position). Broken line = P2, slowly falling peak contour (n = 80 at each data point), continuous line = P4, sharply falling peak contour (n = 185 at each data point), dotted line = P4', sharply falling peak contour and durations of prefix stress (cf. Test III of 3.2, n = 170 at each data point).
When an F0 peak contour is shifted through the original stem-stress utterance there is no change between the stress categories (Fig. 22): the answers remain predominantly in favour of stem stress. In this case, F0 cannot override the duration cue completely because "um-" is too short in relation to "-la-" to signal initial stress. There is some effect of F0 when the more sharply falling F0 peak occurs within the syllable "um-". In stimuli 1 to 5 the F0 peak has been shifted leftward all the way into the preceding syllable "wohl", whereas in 6 to 8 it has been moved only as far back as some point within the prefix syllable "um-", and in these stimuli there are up to 30% judgements of prefix stress. This pattern suggests that the overriding salience of duration in the original stimulus is checked somewhat when the characteristic sharply falling prefix-stress contour occurs in the relevant syllable and is more narrowly limited to it, allowing the interpretation of a 'medial' or 'late' peak on "um-", rather than an 'early' one on the following "-la-". In the other series, however, the slowly falling and time-expanded F0 contour reduces the probability of interpreting the peak as a 'medial' or 'late' peak for a prefix stress, because of the stronger interference from an 'early' peak interpretation on "-la-", due to the wider span of the F0 peak descent.
The questions asked initially can now be answered as follows:
(a) An F0 peak shift by itself is sufficient to bring about a clear change from one stress position to another, provided the duration of the stressed-syllable-to-be toward which the F0 peak is shifted is not too short. But even when it is, there is a residual F0 effect.
(b) The intonation function of F0 interferes with its stress function if the latter is not supported by duration. This finds its expression in a gradual change from one stress position to another in abutting syllables where an ambiguity can arise between a 'medial' or 'late' intonation peak in one stressed syllable and an 'early' intonation peak related to a subsequent stressed syllable. This interaction is strengthened when the shape of the F0 peak contour approximates the more slowly falling one of the 'early' intonation peak of a later stress.

3.2 Duration as a feature in stress perception

It has been shown in 3.1 that although F0 is a strong cue in stress perception, duration can become an additional distinctive feature when vowels and postvocalic sonorants are shorter than would be associated with the production of a stressed syllable. On the other hand, if they are longer than would be associated with an unstressed syllable, the F0 cue may be disturbed, but never dominated by the duration cue.
3.2.1 Duration increase for inducing stress perception in F0 peaks

The importance of duration for stress perception was further investigated in an experiment that repeated Test I of 3.1 by using the peak series P1 and a modified peak series P4', i.e. the sets of stimuli based on the original prefix and stem stress utterances, respectively, both combined with the more sharply falling F0 contour derived from the prefix-stress utterance (see Figs. 19a and 20b). But this time a new basis stimulus ST4' for a series P4' was created by adjusting the durations of the syllable "um-" [um] and the vowel [a:] of the syllable "-la-" in the basis stimulus ST4 to the same values as in the basis stimulus ST1. By repeating some periods in [um] and deleting some in [a:], [u] was lengthened from 70 ms to 117 ms, [m] from 65 ms to 105 ms, and [a:] reduced from 210 ms to 189 ms. Then the F0 contour of the basis stimulus ST1 was transferred - sound segment by sound segment - to the modified basis stimulus ST4'. The series P4' was generated by shifting the F0 peak to the left as for P4.

Series P1 and P4' were then compiled to a new Test III, which only differs from Test I in the segment durations of P4' vs. P4. The first 7 stimuli of P1 and the last 7 of P4' occupy the same ranges of F0 peak positions, have very similar F0 contours and comparable segment durations (with [um] and [a:] being identical), but they differ in the basis stimulus, which is either the original prefix-stress utterance in P1 or the original stem-stress utterance in P4', implying spectral and intensity differences. The hypothesis connected with Test III was that the change of segment durations in P4' vs. P4 would be sufficient to reverse judgement from stem stress to prefix stress in all cases of the series, resulting in similar response functions for stimuli 1 - 7 of P1 and for stimuli 3 - 9 of P4', and would thus point to the low relevance of spectral and intensity features in German stress perception. Test III was run with 34 listeners.
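The duration adjustment from ST4 to ST4' can be illustrated with a small calculation. The segment durations below are the ones reported in the text; the pitch period of 8 ms (about 125 Hz) and the helper `periods_to_change` are assumptions introduced only for this sketch, since the text does not state how many periods were repeated or deleted.

```python
# Segment durations (ms) reported in the text for the two basis stimuli.
ST4 = {"u": 70, "m": 65, "a:": 210}    # original stem-stress utterance
ST1 = {"u": 117, "m": 105, "a:": 189}  # original prefix-stress utterance

PERIOD_MS = 8.0  # assumption: roughly 125 Hz; not a value from the paper


def periods_to_change(current_ms, target_ms, period_ms=PERIOD_MS):
    """Whole pitch periods to repeat (positive) or delete (negative)
    so that a segment approximates the target duration."""
    return round((target_ms - current_ms) / period_ms)


# Number of periods to repeat or delete per segment when turning ST4 into ST4'
plan = {seg: periods_to_change(ST4[seg], ST1[seg]) for seg in ST4}

# Syllable durations of "um-" before and after the adjustment
um_before = ST4["u"] + ST4["m"]
um_after = ST1["u"] + ST1["m"]

print(plan, um_before, um_after)
```

The two syllable totals, 135 ms and 222 ms, reproduce the "um-" durations given for Figs. 19b and 19a in 3.1, which confirms that ST4' carries the prefix-stress timing.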
Results and Discussion

The dotted lines in Figs. 21 and 22 present the results of identification Test III. The hypothesis of the complete reversal of judgements has been confirmed by P4 and P4' in Fig. 22 yielding ca. 80% and 20% "belagern" responses, respectively. The left shift of the response function for the identical P1 series in Test III, compared with Test I, may be due to the test design: the decrease of the number of clear stem-stress cases and the increase of the number of clear prefix-stress cases by swapping P4' for P4 may have pushed the responses to the more ambivalent cases in P1 in the direction of stem stress, but there is also more noise in the P1 response curve of Test III, as is shown by the offset of 10% - 20%.

3.2.2 Duration decrease for eliminating stress perception in F0 peaks

Parallel to generating ST4' from ST4, a new ST1' was generated from ST1 by shortening the durations of [um] to 70 ms - 65 ms (from 117 ms - 105 ms) and lengthening [a:] to 210 ms (from 189 ms), applying the same period splicing procedure. Then the same peak shifts to the left and right were performed as in P1, resulting in P1' with 12 peak positions and sharply falling F0 contours. Informal listening to the series P1' by phoneticians established that all the 12 stimuli were unequivocally perceived as stem stressed, even when the F0 peak position was on "um-". Because of this very clear evidence no further formal test was run. These results prove again that if the duration of a stressed-syllable-to-be is too short the F0 cue may not be sufficient to signal stress.
3.2.3 Conclusion

In German, stress is cued by two features, F0 and duration, which may be expressed in a distinctive feature notation as ±FSTRESS, ±DSTRESS. The F0 cue clearly dominates if the duration is not too short for stressed syllables; otherwise longer duration is required to signal stress. Syllables are thus marked as stressed/unstressed by the two stress features: (1) -FSTRESS, -DSTRESS = unstressed, (2) -FSTRESS, +DSTRESS = secondary stress, e.g. in non-initial components of compounds ("Ausfahrt" ['aus ,fa:ɐt] ("exit")), which receive increased duration, but no intonation peak (or valley), (3) +FSTRESS, +DSTRESS = primary stress, where the intonation points are hooked. The intonation associated with stressed syllables is, among other things, defined according to different peak positions, which may again be expressed in distinctive feature notation taking the primary dichotomy between 'early' and 'non-early' into account: ±EARLY, and -EARLY may then be ±LATE.

At each potential stress position, +FSTRESS, three intonation peaks are possible. But since the F0 of these peaks serves to signal the stressed syllable - as a stress cue - and at the same time the peak position in relation to such a stressed syllable - as an intonation cue, there may be interference between the two cue functions leading to ambiguity, if the temporal distance between successive potential stresses, as in lexical items of the type "umlagern", is small, particularly because of a lack of intervening unstressed syllables (e.g. containing /ə/) and even more so in the case of abutting syllables with short quantity vowels.
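The feature notation can be written out as a small lookup, given here as a sketch rather than as part of the original analysis. The function names are hypothetical. The combination +FSTRESS, -DSTRESS is not listed in the text and is therefore left unclassified, and reading -EARLY, -LATE as the 'medial' peak is an assumption that follows from the three peak positions but is not spelled out in the notation itself.

```python
def stress_category(fstress, dstress):
    """Map the two binary stress features to the categories named in the text."""
    if not fstress and not dstress:
        return "unstressed"
    if not fstress and dstress:
        return "secondary stress"
    if fstress and dstress:
        return "primary stress"
    # +FSTRESS, -DSTRESS is not listed in the text; left unclassified here.
    return None


def peak_category(early, late=False):
    """Map the peak-position features to 'early', 'medial' or 'late'.
    Reading -EARLY, -LATE as 'medial' is an assumption of this sketch."""
    if early:
        return "early"
    return "late" if late else "medial"


# Second component of "Ausfahrt": longer duration but no intonation peak
print(stress_category(fstress=False, dstress=True))
# A primary-stress syllable with a -EARLY, +LATE peak
print(stress_category(True, True), peak_category(early=False, late=True))
```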
3.3 Sentence stress

In sentences not every lexical item gets a +FSTRESS marking for the association with intonation peaks (and valleys), although at a more abstract level it has lexical stress, i.e. at least one syllable is phonologically marked as having the potential of receiving the features +FSTRESS and +DSTRESS. The rules of grammar and pragmatics determine which lexical-stress syllables are given the feature combinations +FSTRESS, +DSTRESS or -FSTRESS, +DSTRESS in sentences. In a sentence such as "Aber der Leo säuft." [abɐ dɐ 'le:o: 'zɔift] ("But Leo drinks.")^ either the subject "Leo" or the verb "säuft" may be in focus, receiving the features +FSTRESS, +DSTRESS, or both elements may be so characterized simultaneously. The question is whether the findings at the lexical level in 3.1 - 2 can be replicated at the sentence level, viz. whether a switch from one stress position to another can be brought about simply by F0 peak shift through the sentence. In this case it will also have to be checked whether at some intermediate section of the peak shift scale both stresses are realised. And finally, there is the issue to be answered of the perceptual manifestation of different intonation peaks ('early', 'medial', 'late') at each stress position, in parallel to what was found in the sentences of Section 2. with only one potential accent.
3.3.1 Stimulus preparation for perception experiments

A natural production of the utterance "Aber der Leo säuft." with sentence stress and 'medial' intonation peak on "Leo" was used for stimulus generation. Fig. 23 shows the speech wave, energy and F0 contours. A series of 7 left shifts (parallel transposition of the left branch and time expansion of the right branch) and of 11 complete parallel right shifts of 30 ms each were generated on the basis of the utterance in Fig. 23. An informal assessment of the series detected a poor quality in the synthesis of the segment /z/ and too strong a final aspiration; furthermore, the last stimuli of the series, from 15 to 19, with the accent on "säuft" sounded too strong at the beginning and husky at the end, obviously due to the wrong energy contour for a final peak position, i.e. to a desynchronization of F0 and energy (see 2.1.1.5). To remedy these defects and to create as natural synthetic versions as possible, almost the entire [z] was devoiced, the final aspiration reduced by lowering the dB-values, and the discrepancy between energy and F0 eliminated by lowering the energy in "Leo" and by raising it around the final F0 peaks. The peak series was then regenerated with these parameter modifications of the stimulus in Fig. 23; it formed the basis for identification and serial discrimination tests.

^ This sentence played an important role in some experiments of the Munich Intonation Project (see Altmann et al., 1989) and was taken as the basis of further experiments in the Kiel Intonation Project for purposes of cross-reference.

Fig. 23
Speech wave, energy and F0 contours (linear scale) of the utterance "Aber der Leo säuft." with subject stress and 'medial' peak. The time marks indicate the F0 base and peak points for peak contour shift.

Fig. 24
As Fig. 23, but with the 19 peak position points marked.

3.3.2 Identification test

Five repetitions of the 19 stimuli were randomized and presented to 31 listeners (in the format of 1.4 (2) for single stimuli) with the instruction to decide whether "Leo" or "säuft" was more strongly stressed.

Results and Discussion

Fig. 25 presents the results of the identification test, which demonstrate very clearly that a simple F0 peak shift causes a change from subject stress to verb stress. The transition in the response function between the two positions indicates - as was confirmed in phonetic expert listening - that as the peak is moved into the fricative [z] and therefore spans both words, giving a late F0 rise to "Leo" and an early F0 fall to "säuft", the perception of double stress results, which disappears again when the peak is located at the beginning of the vowel of the verb and the impression of focus stress on the latter is created.

Fig. 25
Identification function showing percentage 'subject stress' judgements for 19 stimuli "Aber der Leo säuft." with F0 peak shift from left to right (nr 8 appr. original peak position), n = 155 at each data point.
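How a data point of such an identification function comes about can be sketched as follows. The listener and repetition counts are those given in the text; the function `percent_subject_stress` and the example judgements are invented for illustration and are not results from the experiment.

```python
from collections import Counter

LISTENERS = 31     # from the text
REPETITIONS = 5    # from the text
N_PER_STIMULUS = LISTENERS * REPETITIONS  # judgements per data point


def percent_subject_stress(judgements):
    """Percentage of 'Leo' answers among the judgements for one stimulus.
    judgements: iterable of 'Leo' / 'säuft' labels."""
    counts = Counter(judgements)
    total = sum(counts.values())
    return 100.0 * counts["Leo"] / total


# Invented example data for a single stimulus (not results from the paper)
example = ["Leo"] * 124 + ["säuft"] * 31
print(N_PER_STIMULUS, percent_subject_stress(example))
```

The product of listeners and repetitions gives the 155 judgements per data point stated in the caption of Fig. 25.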
3.3.3 Serial discrimination tests

The series of 19 stimuli was partitioned into two sub-series: (a) stimuli 1 - 10 representing clear instances of the category of subject stress and (b) stimuli 14 - 19 representing clear instances of the category of verb stress, according to the results of the identification test. Each set in ascending (numerical) ordering was presented to 32 subjects for evaluating at which stimulus in the series the first and further changes in the speech melody had occurred.
Results and Discussion

Tables XII and XIII present the results of the serial discrimination tests (a) and (b).

Table XII
Frequency distribution of 'change has occurred' responses of 32 listeners in the left-right sequence of the serial discrimination test across the first 10 stimuli with F0 peak shifts in "Aber der Leo säuft." (1 = left-most, 10 = right-most position)

Stimulus                     2   4   5   6   7   8   9  10
First change perceived       1   5  12  10   2
Further changes perceived                4   7   5   7   7
Total                        1   5  12  14   9   5   7   7
(2 listeners perceived no change at all.)

Table XIII
Frequency distribution of 'change has occurred' responses of 32 listeners in the left-right sequence of the serial discrimination test across the last 6 stimuli with F0 peak shifts in "Aber der Leo säuft." (14 = left-most, 19 = right-most position)

Stimulus                    16  17  18  19
First change perceived      16  11   1   1
Further changes perceived        5   2   5
Total                       16  16   3   6
(3 listeners perceived no change at all.)

In both series the first perceptual change has a maximum frequency at the stimulus in which the F0 peak occupies the first position within the respective syllable nucleus (nr 5 in (a) and nr 16 in (b)). This result coincides with the data obtained in the peak alignment test in utterances containing a single potential accent (cf. 2.1.1). It points to the change from an 'early' to a 'medial' peak within each stress position.
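The counts in Tables XII and XIII can be checked for internal consistency with a short script. The assignment of counts to stimuli below is one reading of the printed tables (in particular the column for stimulus 9 and the empty cells taken as zero are assumptions), and the helper `summarise` is hypothetical.

```python
LISTENERS = 32  # from the text

# stimulus -> (first change, further changes); counts as read from the tables
table_xii = {2: (1, 0), 4: (5, 0), 5: (12, 0), 6: (10, 4), 7: (2, 7),
             8: (0, 5), 9: (0, 7), 10: (0, 7)}
table_xiii = {16: (16, 0), 17: (11, 5), 18: (1, 2), 19: (1, 5)}


def summarise(table, no_change):
    """Sum the first-change responses, check them against the listener
    count, and find the stimulus with the most first-change responses."""
    first_total = sum(first for first, _ in table.values())
    modal = max(table, key=lambda s: table[s][0])
    totals = {s: first + further for s, (first, further) in table.items()}
    return {"first_total": first_total,
            "accounted": first_total + no_change == LISTENERS,
            "modal_stimulus": modal,
            "totals": totals}


xii = summarise(table_xii, no_change=2)
xiii = summarise(table_xiii, no_change=3)
print(xii["first_total"], xii["modal_stimulus"], xii["accounted"])
print(xiii["first_total"], xiii["modal_stimulus"], xiii["accounted"])
```

With this reading every listener is accounted for in both tables, and the most frequent first change falls on stimulus 5 in series (a) and stimulus 16 in series (b), as stated in the text.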
A corresponding clear-cut switch was not observed in the "umlagern" series of 3.1.^ The reason for this difference lies in the shorter duration of [um] vs. [le:o:], which allows less separation of the intonation peak and stress positions and causes the F0 configuration to straddle both potential accent syllables, given the width of the shifted peak contour, across a greater number of stimuli. The more gradual transition from prefix to stem stress in the response function of Fig. 21, compared with that in Fig. 25, is a further indication of this stronger stress/intonation interaction across segment durations that are insufficient for the chosen peak timing to be restricted to them. To achieve a greater separation of the different intonation peaks within each accent, the peak descent would at least have to be faster to encroach less on the other peak and stress positions.
3.4 Perceptual ambiguity between single and double accent

In spite of the more adequate temporal structure in "Aber der Leo säuft." for separating the theoretically possible peak and stress positions, there is still an ambiguous transition period between the two potential accents, as shown in Fig. 25. And as was argued in 3.3.2, this ambivalence is not so much between either subject or verb focus stress, but between subject focus and double stress. In the latter case, the late rise on "Leo", followed by an early fall on "säuft", may be interpreted as belonging to two F0 peak configurations - 'late' followed by 'early' -, without an intervening dip between the two, or as a single 'late' F0 peak on the subject. In the first case, two accents are perceived, in the second only one. Because of the still close temporal proximity between the two potential stress positions, there must be a stretch along the peak shift scale where the signal is ambivalent between these two interpretations. That we are here dealing with a confusion of subject focus stress and double stress is proved by expert listening to the series of 19 F0 peak shifts in "Aber der Leo säuft.", establishing stress on "Leo" in stimuli 12 - 14, which may or may not be accompanied by stress on "säuft". In stimulus 15, however, the change to focus stress on the verb has taken place: the peak rise is now far enough away from the potential accent syllable in "Leo" and therefore no longer associated with the subject, F0 being low during the whole of the word "Leo".

^ The relevant serial discrimination tests were carried out but are not reported here in detail. The results were negative so that the summarising statement is considered sufficient.
The perceptual ambiguity between a single 'late' peak and a 'late' + 'early' peak combination is even stronger in cases where two potential accent syllables abut and the first contains a short vowel, as in "Der Ring glänzt.", as is shown in Contribution IV (Hertrich, 1991a). Even when in abutting accents the first vowel is long, or when a short or long vowel in the first potential accent position is followed by one unstressed vowel (/ə/ or /ɐ/), as in "Die Uhr tickt.", "Die Bremse quietscht.", "Die Maler malen." (see Hertrich, 1991a), a perceptual confusion between the two categories is possible.
The confusion can be avoided if for the single 'late' peak the descent is rapid to avoid trespassing on the second accent syllable domain, as was demonstrated for "Die Maler malen." (loc. cit.). So if the temporal distance between two potential accents is short enough, the F0 peak shift through the sequence produces perceptual changes from subject focus stress to dual stress to verb focus stress. And in the transition area between the two focus stresses, perception may first be ambiguous between double and single accents. This ambiguity disappears as the distance between potential accents gets longer, as in "Die Bäcker haben gebacken." or "Die Sekretärin hat die Briefe geschrieben." (loc. cit.).

In accent sequences at longer distances from each other double stress does not occur by simple F0 peak shift through the utterance; the peak contour has to be broadened at the same time to realise a 'medial' or 'late' rise on one accent syllable and an 'early' fall on the next one. In between these two intonation turns - rise and fall - associated with two stressed syllables, there may be an F0 dip of various degrees of extension, to generate two properly manifested F0 peak contours, or the two peak points are joined by a plateau or a slight monotone descent/ascent, creating a 'hat pattern' (cf. Cohen & 't Hart, 1967). Although the 'hat pattern' is perceptually and semantically different from a succession of complete peaks (as is shown in Contribution VI, Hertrich, 1991b, see also Contribution VII, Kohler, 1991d), there are strong arguments in favour of treating a 'hat pattern' as a succession of two peaks without an F0 dip:
(1) The timing of the initial rise is exactly the same as the rising part in a 'medial' or 'late' peak. There are rising patterns that are timed more slowly and have rises up to the beginning of the next stressed syllable (see Contribution VII, Kohler, 1991d). They have to be recognised as separate entities. So we would have to set up two rising patterns - slow and fast - but since the latter coincides with the rising part of the peak pattern it is more economical to have no new units 'fast rises'. The complementary solution to regard 'medial' or 'late' peaks, too, as being composed of two tonal entities each - rise and fall - is ruled out by the fact that they constitute one stress, whereas the 'hat pattern' rises and falls represent two stresses.
(2) The timing and syllable alignments of the final fall coincide with the falling section of an 'early' (or 'medial') peak.
(3) 'Hat patterns' can be derived from the corresponding dipped peak sequences by general phonetic rules changing the prominence relationships between the first and the second peak as a consequence of removing phonetic features characteristic of the definitions of the different F0 peaks. Two cases can be distinguished:
(a) In the sequence 'medial' (or 'late') + 'early' peaks, the elimination of the F0 dip does not affect the essential feature of the low falling F0 in the 'early' peak and also preserves the characteristic (low level +) rise in the 'medial' (or 'late') peak (see 2.1.1.7), but it modifies the complete manifestation of the latter by removing the separate F0 descent, thereby reducing its prominence.
(b) In the sequence 'medial' (or 'late') + 'medial' (or 'late') peaks, the elimination of the F0 dip results in a loss of the 'medial' or 'late' characteristics of the second peak because in a derived 'hat pattern' it lacks the essential F0 rise in the syllable nucleus (see 2.1.1.7), and since it cannot be associated with an 'early' peak either, not having the early low fall, it lacks the prominence-lending feature of the 'medial' peak rise as well as of the 'early' peak fall. But since, on the other hand, the first peak has it, the prominence of the second one is subordinated. Thus a principled relationship can be established between 'hat patterns' and peak sequences on the basis of general phonetic rules modifying the relative prominences of the peaks.
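The derivation of a 'hat pattern' from a dipped peak sequence can be illustrated with a minimal sketch. It assumes an F0 contour sampled at known time points and two known peak indices, and it simply joins the two peak points by a straight line, which yields a plateau or a slight monotone descent/ascent. The function name `remove_dip` and the sample values are hypothetical and not part of the original study or of any synthesis system; the sketch only illustrates the dip-removal step, not the perceptual prominence effects discussed above.

```python
def remove_dip(times, f0, peak1_idx, peak2_idx):
    """Return a copy of f0 in which the stretch between two peak points
    is replaced by a straight line joining them (no F0 dip)."""
    if not 0 <= peak1_idx < peak2_idx < len(f0):
        raise ValueError("peak indices must be ordered and within range")
    if len(times) != len(f0):
        raise ValueError("times and f0 must have the same length")

    out = list(f0)
    t1, t2 = times[peak1_idx], times[peak2_idx]
    y1, y2 = f0[peak1_idx], f0[peak2_idx]
    for i in range(peak1_idx + 1, peak2_idx):
        # Linear interpolation between the two peak points.
        frac = (times[i] - t1) / (t2 - t1)
        out[i] = y1 + frac * (y2 - y1)
    return out


# Hypothetical dipped sequence: peaks at index 1 and index 5, dip in between.
times = [0, 1, 2, 3, 4, 5, 6]
dipped = [100, 140, 110, 100, 110, 130, 90]
hat = remove_dip(times, dipped, 1, 5)
```

The rise up to the first peak point and the fall after the second are left untouched, in line with arguments (1) and (2); only the separate descent of the first peak and the separate rise of the second are removed.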
In both cases (3a) and (3b), the generation of a 'hat pattern' from a dipped peak sequence does not change the number of accents, but only the prominence relations between them. Thus when the sentence "Die Wählerinnen wählen." is combined either with a 'hat pattern' consisting of a medial (or late) rise on "Wählerinnen" plus a medial fall on "wählen", or with a single 'medial' (or 'late') peak on "Wählerinnen", only the second intonation represents focus stress on the subject and deaccentuation of the verb (see also Contribution VI, Hertrich, 1991b).

3.5 Intensity in the cuing of stress and intonation

The question now arises as to whether it is possible to change stress perception simply by varying intensity. Two test cases may be distinguished:
(a) Utterances that are ambiguous between one and two stresses in F0 peak shifts, such as "Aber der Leo säuft." in 3.4,
(b) 'hat patterns' in which a medial (or late) F0 rise is immediately followed by a medial F0 fall, reducing the prominence of the second stress compared with the sequence of two complete peaks (cf. 3.4).
If intensity alone can change stress perception, then it should be possible in (a) to produce a switch from double to initial focus stress simply by reducing the intensity in the second accent syllable and by simultaneously raising it in the first. Similarly in (b), it should be possible to alter the prominence relation by a comparable intensity adjustment in the two accent syllables. The issue has been tested interactively by changing the source amplitude values accordingly in the RULSYS TTS synthesis-by-rule. The result has been negative: the focussing, and consequently the number of stresses or the prominence relation, does not change. It is more the relative loudness that is affected (see also Kohler, 1991f). This is further support for the long-established finding that intensity has a low signalling value for stress compared with F0 and duration (Fry, 1958).
The situation is different as regards the contribution of intensity to the perception of intonation. Again two cases may be distinguished:
(a) It has already been discussed in 2.1.1.5 that a late F0 peak pattern requires a parallel timing of the intensity course to guarantee its perceptual identity.
(b) It is argued in Contribution V (Kohler & Gartenberg, 1991) that lower intensities around the F0 peaks in 'early' and 'late' patterns vis-à-vis 'medial' ones have to be offset by higher F0 to provide the same prominence across the different intonations. On the other hand, the 'early' peak pattern, which accentuates low F0, has its characteristics strengthened by not having a lower intensity around its prenucleus F0 maximum compensated for in a higher F0 peak value.
Finally, the disruption of the natural parallelism in the time courses of F0, source amplitude and sound intensity for the three terminal peak contours, as it is caused by the synthesis of F0 peak shifts across an original 'medial' peak utterance, may result in a degraded acoustic output quality. So, when a natural 'medial' peak speech signal of "Sie hat ja gelogen." is taken as a point of departure for LPC synthesis with a 'late' peak, the stress and intonation categories are signalled correctly, but the utterance sounds husky at the end and overloaded in the middle because F0 and intensity diverge in opposite directions in these two places. To improve the synthesis quality of 'late' peaks, appropriate corrections at these points in the intensity curve had to be carried out for "Aber der Leo säuft." in 3.3.1 (see also Kohler, 1991f).

3.6 General discussion concerning Hypothesis (3)
The perception experiments of 3.1-5 have largely confirmed Hypothesis (3) and its corollaries of Contribution I (Kohler, 1991b). If there is more than one potential accent in a single-accent terminal utterance - either at the lexical or at the sentence level - three phonological intonation peaks - 'early', 'medial', 'late' - are distinguished at each stress position, provided the temporal distance between the accent places allows the separation of the F0 peak configurations. Furthermore, an F0 peak shift alters the stress position as well, which can result in an interaction of stress and intonation categories if two accent syllables occur at such a short duration interval that the rising and falling branches of a peak contour can be at the same time associated with a single peak on the first accent syllable or with a succession of two peaks on two successive accent syllables, not separated by an F0 dip. This ambivalence of a stimulus between single and double stress results in a perceptual ambiguity between, e.g., prefix and stem word stress at the lexical level, or subject and verb stress at the sentence level. It is only when the F0 peak is moved out of the joint domains of both accent syllables in order to be exclusively in that of the second one that the ambiguity is resolved and second position focus stress results. A link has thus been established between 'hat patterns' and dipped F0 peak sequences, based on prominence relationships, as postulated by Hypothesis (4) in Contribution I. This point will be further discussed in Contributions VI (Hertrich, 1991b) and VII (Kohler, 1991d).
Duration is a further cue to stress in German, but usually subordinated to F0, unless it is too short for what is to be expected of stressed vowels. Intensity and spectral characteristics, on the other hand, do not seem to play a role in stress perception. Intensity intervenes as an important cue to intonation identity and to voice (speech) quality when the usually parallel time courses of F0 and intensity are disrupted, and it is, of course, the signal attribute of loudness. Finally, the height of an F0 peak cues prominence at the perceptual and emphasis at the semantic level (see 2.3, and Contributions I, V and VII, Gartenberg & Panzlaff-Reuter, 1991, Section 6.; Kohler & Gartenberg, 1991; Kohler, 1991d).

4. Conclusions for the Kiel Intonation Model of German (KIM)
The results of the experiments discussed in this Contribution III suggest a number of points that have to be taken into account in KIM as regards the intonation peak component of the model.

1. KIM must comprise the phonetic intonation model proper and the syntactic, semantic and pragmatic environment providing interpretations for symbolic representations of sentences as input to the model.
2. In particular, this environment must specify the lexical items that are to receive sentence stress, and it must provide semantic interpretations along the dimensions 'established/new', 'degree of distance between the speaker and the world', and 'emphasis'.
3. The basic categories of the phonetic model include
(a) a feature specification of stress with reference to the signal properties F0 and duration: ±FSTRESS, ±DSTRESS,
(b) a feature specification of intonation with reference to F0 peak position: ±EARLY, and ±LATE within -EARLY,
(c) the timing of the intonation peaks depending on syllable structures (mono/polysyllables, long/short vowels, voiced/voiceless consonant environment),
(d) a numerical scale of peak height with reference to degrees of prominence,
(e) IFO and CFO modifications of the basic peak contours,
(f) intensity adjustments to guarantee parallelism with F0 time course.
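The feature specification in 3(b) can be read as a small decision structure, sketched below under explicit assumptions. The class name `PeakFeatures`, the method `category` and the mapping of feature bundles to the three peak labels (+EARLY as 'early', -EARLY/-LATE as 'medial', -EARLY/+LATE as 'late') are an illustrative interpretation of "±EARLY, and ±LATE within -EARLY", not a definition taken from KIM itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeakFeatures:
    """Hypothetical encoding of the intonation features in 3(b)."""
    early: bool          # ±EARLY
    late: bool = False   # ±LATE, only meaningful within -EARLY

    def __post_init__(self):
        # Assumption: +EARLY together with +LATE is not a valid bundle,
        # because LATE is only specified within -EARLY.
        if self.early and self.late:
            raise ValueError("LATE is only specified within -EARLY")

    def category(self) -> str:
        """Map the feature bundle to one of the three peak categories."""
        if self.early:
            return "early"
        return "late" if self.late else "medial"


labels = [
    PeakFeatures(early=True).category(),
    PeakFeatures(early=False, late=False).category(),
    PeakFeatures(early=False, late=True).category(),
]
```

Nesting ±LATE inside -EARLY gives exactly three valid bundles for two binary features, which matches the three phonological peak categories distinguished in the experiments.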
4. After the introduction of the peak categories the model has to deal with their concatenation.
(a) An F0 descent from a peak position can be fast or slow. In the latter case double accentuation may result, e.g. a main accent followed by a secondary one. By default, the final participle in the sentence "Er hat einen Brief geschrieben." ("He's written a letter.") is deaccented in relation to "Brief", which gets the main or nuclear stress. But the deaccentuation may result in a secondary stress, or in no stress at all suggesting a contrast between, for example, "Brief" and "Karte" ("card"). This is the same phenomenon as what Kingdon (1965, p. 195) has called partial 'semantic stress' with reference to compounds of different degrees of semantic unity, e.g. "butter cup" (cup for butter) with secondary stress on "cup" vs. "buttercup" (ranunculus) with unstressed "cup". The phonetic manifestation of this difference is not only one of duration, but, first and foremost, of different timings of the F0 fall from the F0 peak.
(b) Besides peak sequences various 'hat patterns' have to be generated and the semantic and pragmatic differences evaluated.

These points will be developed in Contribution VII (Kohler, 1991d), supplemented by further model components derived from the empirical data collections in the other contributions and from interactive RULSYS TTS experimentation.