On the Relation Between Overlap and Release

Ipek
Abstract
This paper examines the relation between gestural overlap and acoustic release. The aim is
to define a set of temporal relations from which the presence or absence of release
can be predicted. To this end, an acoustic index of overlap (the closure duration ratio) and
the percentage of released tokens were measured for every binary combination of the
voiceless stop consonants {p, t, k} at two prosodic boundaries, word and sentence, and in two
vowel contexts, /a/ and /e/. The argument is that release results from specific
temporal relations between two consecutive gestures, and that these relations are part of the
phonological representation.
1. Introduction
Providing a principled link between speech and its phonological representation has been a
challenge for phonological theories. Some theories of phonology assume that
phonological representation consists of a linear sequence of segments where each
segment is represented as a set of features, and the difference between words results from a
difference in at least one of these features. These linear approaches contain no explicit
notion of time with which to explain the relation between speech output and its abstract
representation. Yet it is clear that speaking means organizing the vocal organs over a stretch of
time, and during this organization process ‘the vocal organs are all continually in motion,
and at any given point in time, they are likely to be executing gestures associated not with
one but with several of the segments of the utterance… This phenomenon of
COARTICULATION is completely pervasive […]’ (Anderson 1974: 5).
Lisker (1974) argues for the relevance of the temporal dimension in describing speech,
especially in describing acoustically complex segments such as prenasalised stops. These are
present in some languages (Ladefoged 1964, Ladefoged and Maddieson 1996), where they
behave like a single unit and therefore differ from a sequence of a nasal plus a
stop. The difference between these two groups of sounds can be
reduced to a difference in the relative timing of gestures, and the difference between
languages that do and do not have prenasalised stops can be explained as a difference
in the temporal organisation of the relevant gestures of adjacent sounds.
What is more, while languages systematically differ from one another in which phonetic
differences function as a phonological contrast, not all phonetic differences have this
function. One example, given in Anderson (1974:8), is the release of stops: languages
systematically differ from one another in whether stops are released in specific contexts;
however, stop release does not serve to contrast one utterance with another in any of these
languages. Yet since this phenomenon differentiates one language from another, it should
also be part of any linguistic theory that attempts to explain speech and its
representation. This difference between languages can also be explained as a result of the
relative timing of adjacent gestures.
Another example of such a difference is presented in Gafos (2002), using examples
from word formation in Moroccan Colloquial Arabic (MCA). The argument is
that the quality of the postvocalic CC cluster that results from satisfying the template
of a specific morphological category and the quality of the transition between these two
consonants can be explained as a result of temporal relations between gestures rather than
as a process of mapping segments to template positions.
These differences among languages could be analysed as the presence of a feature in one
language and its absence in another. However, such an approach adds a new
feature for every difference between languages and cannot capture the
relevance of the temporal dimension in distinguishing languages.
Articulatory Phonology, developed by Browman and Goldstein (1986 et seq.),
provides a solution to the missing link between the physical properties of speech and its
phonological representation. According to this approach, a linguistically relevant analysis
of speech describes an utterance as the organization of overlapping abstract
gestural primitives, where a gesture is a dynamically defined spatio-temporal unit, and
phonological representation is based on the systematic organisation of gestures
(Browman and Goldstein 1995). In this approach, the systematic differences between
languages, whether or not they have a phonologically contrastive function,
result from differences in the temporal organisation of the relevant gestures, and these
differences are linguistically relevant and represented.
Gestural Coordination. Gafos (2002) argues for the relevance of the temporal coordination of
gestures in phonological theory and claims that differences in sound patterns among
languages result from differences in temporal coordination among gestures, and that these
differences are relevant in constructing the grammars of different languages. A
specific example is the set of properties observed in postvocalic C1C2
sequences in the CCVC1C2 context that are derived by a word formation
process in Moroccan Colloquial Arabic (MCA). In certain contexts, the transition from
C1 to C2 involves a release, regardless of whether C1C2 is a
homorganic or a heterorganic cluster. Despite this surface resemblance, it is argued that
release within a homorganic sequence requires a different temporal
organisation from the one required for a heterorganic cluster. Moreover,
homorganic clusters do not involve release in every context in which they appear,
which implies a distinct temporal relation for those homorganic clusters that do involve
release. Such properties are suggested to result from the interaction between constraints
in the grammar that refer to temporal relations between gestures.
In this paper, the temporal relation between gestures¹ is expressed through the coordination
of a set of landmarks identified in the temporal domain of a gesture. These landmarks are
onset (‘o’), target (‘t’), c-center (‘cc’), release (‘r’), and release offset (‘roff’):
(1) [Diagram: the temporal structure of a gesture, marked with the landmarks o, t, cc, r, and
roff. From Gafos (2002).]
¹ In the rest of the paper, the term gesture refers only to the oral gesture of a
consonant unless otherwise stated.
Temporal organization of gestures is defined as a way of synchronizing one landmark
within the temporal structure of a gesture with another landmark in the following gesture.
Release within a heterorganic cluster, e.g. /t#b/, is possible with a temporal relation
where cc = o:
(2) [Diagram: two consecutive gestures, with the cc landmark of the first aligned with the o
landmark of the second; cc = o.]
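On the other hand, the presence of release within a homorganic cluster requires timing
the onset of the second gesture at some point ‘late in the release phase of the first’ gesture
(Gafos 2002:272):
(3) [Diagram: two consecutive gestures, with the roff landmark of the first aligned with the o
landmark of the second; roff = o.]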
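The two coordination relations, cc = o and roff = o, can be sketched in code. This is only an illustration and not Gafos's formalism: the `Gesture` class, the `coordinate` helper, and all landmark times (in ms) are hypothetical, chosen merely to show what synchronizing one landmark with another amounts to.

```python
from dataclasses import dataclass

LANDMARKS = ("o", "t", "cc", "r", "roff")


@dataclass(frozen=True)
class Gesture:
    """Landmark times (ms) of one oral gesture; the values used below are made up."""
    o: float     # onset
    t: float     # target
    cc: float    # c-center
    r: float     # release
    roff: float  # release offset

    def shifted(self, delta: float) -> "Gesture":
        """Return a copy with every landmark moved by delta ms."""
        return Gesture(*(getattr(self, name) + delta for name in LANDMARKS))


def coordinate(g1: Gesture, lm1: str, g2: Gesture, lm2: str) -> Gesture:
    """Shift g2 in time so that its landmark lm2 coincides with landmark lm1 of g1."""
    return g2.shifted(getattr(g1, lm1) - getattr(g2, lm2))


# Two gestures with the same (hypothetical) internal timing.
c1 = Gesture(o=0, t=40, cc=70, r=100, roff=130)
c2 = Gesture(o=0, t=40, cc=70, r=100, roff=130)

c2_cc_o = coordinate(c1, "cc", c2, "o")      # relation cc = o
c2_roff_o = coordinate(c1, "roff", c2, "o")  # relation roff = o

print(c2_cc_o.o, c2_cc_o.t)      # 70 110
print(c2_roff_o.o, c2_roff_o.t)  # 130 170
```

The only point of the sketch is that the later the landmark of the first gesture with which the onset of the second is synchronized, the later the second gesture reaches its own target.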
Consonantal Sequences and Release. A stop consonant between vowels is produced
by forming a complete closure in the vocal tract and then releasing it. It is assumed that while
the closure for the stop consonant is active, no air escapes through the
constriction, so that air pressure builds up in the vocal tract behind the
closure. As a result, when the closure is released, there is a burst and a flow of air
through the vocal tract (Stevens 1993). The transition from one
stop consonant to the next, on the other hand, does not always give rise to a release of air.
At this point, it is important to distinguish articulatory release from the presence/absence
of release burst in the acoustic record. The term release in general is used to refer to the
breaking of the contact of articulators as the articulators move from one constriction to
the next. However, the break of the constriction between two articulators does not
necessarily imply that there will be an acoustic consequence of that break (Henderson
and Repp 1982).
A transition during which there is a period with no constriction in the vocal tract is
sometimes referred to as an open transition, and one without such a break as a close
transition (Catford 1977).
In this context, overlap, release, and the acoustic evidence of release should be
evaluated differently in the two types of stop consonant sequences,
homorganic and heterorganic. In homorganic sequences, the consonants
involved have the same target position. The absence of articulatory release in homorganic
sequences can be attributed to the fact that, when the second gesture is activated to form
its constriction, the articulators that contribute to this constriction are already active,
because the previous constriction requires the same set of articulators. The already active
articulators then move little, if at all, and the transition between the two gestures does not
result in release; in other words, it is a close transition. In an open transition between the
members of a homorganic cluster, there is a period of open vocal tract. In heterorganic clusters, on
the other hand, a close transition is the result of articulatory overlap, whereas an open
transition is due to much less overlap (Catford 1977).
At this point, it is important to clarify what is meant by much less overlap and open
transition. In consonant sequences, especially the ones that require full closure at some
point in the vocal tract as in the case of stop-stop sequences, the complete closure
required for the second consonant might be achieved either while the closure for the first
consonant is still active or sometime after the closure for the first
consonant is released. In the second case, where the release of the first
gesture and the closure of the second gesture are sequential, there is a period of
open vocal tract, which means that the CLOSURE intervals of the two sounds do not overlap. If, on
the other hand, the closure for the second consonant is achieved while
the closure of the first consonant is still active, the two closures co-occur in time;
in other words, the closure durations overlap. In both cases, the
second gesture might become active while the articulators for the first
consonant are still active, which means that the consonants do overlap, but this does not
necessarily say anything about the co-occurrence of closure durations. The degree of
overlap, in terms of the point at which the second gesture begins, might be the same in
both cases; however, for reasons such as the nature of the prosodic boundary or
the type of the cluster, the target for the second consonant might be
achieved later in an open transition than in a close transition. This implies that clusters that are
released will have a longer total acoustic closure duration than those that are not released:
(4) [Diagram: C1 and C2 closure intervals along a time axis up to the C2 release. a. Release
between C1 and C2. b. No release.]
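The difference between overlapping and sequential closures can be made concrete with a small sketch. The function names and the closure times (in ms) are hypothetical; the first pair of intervals mimics a close transition and the second an open transition.

```python
def closure_overlap(c1, c2):
    """Duration (ms) for which two closure intervals (start, end) co-occur."""
    return max(0.0, min(c1[1], c2[1]) - max(c1[0], c2[0]))


def total_closure(c1, c2):
    """Time from the start of the C1 closure to the end (release) of the C2 closure."""
    return c2[1] - c1[0]


# Hypothetical closure intervals; each closure lasts 100 ms.
close_transition = ((0.0, 100.0), (60.0, 160.0))   # C2 closes while C1 is still closed
open_transition = ((0.0, 100.0), (120.0, 220.0))   # C2 closes after C1 is released

for name, (c1, c2) in [("close", close_transition), ("open", open_transition)]:
    print(name, closure_overlap(c1, c2), total_closure(c1, c2))
# close 40.0 160.0
# open 0.0 220.0
```

With identical individual closure durations, the sequential arrangement yields the longer total span, which is the pattern the hypothesis below builds on.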
This suggests the following hypothesis regarding the relation between
release and duration ratio. For a given type of cluster:
H1: Clusters that are released will have larger duration ratios than clusters
that are not released.
Measuring Duration Ratio. The overlap of two consecutive stop gestures is here
defined as the co-occurrence in time of their articulatory closure intervals. What is measured
in this paper, however, is the acoustic closure duration of stop consonants.
Accordingly, a technique for estimating overlap from acoustic data is needed. Zsiga (2000)
formulates an index of overlap, called the duration ratio, as:
\[ \text{duration ratio} = \frac{\text{C1\#C2}}{\text{VC1\#V} + \text{V\#C2V}} \]
In this method of measurement, the duration ratio equals the total closure duration of C1#C2
divided by the sum of the closure durations of coda C1 and onset C2 in intervocalic context. For the
measurement to be meaningful, it is assumed that the consonants are articulated similarly
and have similar articulatory closure durations in clusters and in intervocalic contexts. For
example, if C1 has duration t1 and C2 has duration t2 in intervocalic context, the total
duration in the denominator of the formula is t1 + t2. Based on the
assumption mentioned above, the duration of C1 in the numerator is also equal to t1 and
the duration of C2 is equal to t2. If the total duration of C1#C2 in the numerator is smaller than
t1 + t2 in the denominator, this is due to the overlap of C1 and C2:
(5) [Diagram: C1 and C2 closure durations along a time axis. Lines represent closure
durations only.]
So ratios equal to or greater than 1 are an indication of little or no overlap.
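As a minimal sketch of how the index is computed, the function below divides the cluster closure duration by the sum of the two intervocalic closure durations. The function name, argument names, and the durations in the usage lines are hypothetical.

```python
def duration_ratio(cluster_closure, coda_c1_closure, onset_c2_closure):
    """Zsiga's (2000) index: C1#C2 / (VC1#V + V#C2V), all durations in the same unit."""
    denominator = coda_c1_closure + onset_c2_closure
    if denominator <= 0:
        raise ValueError("intervocalic closure durations must sum to a positive value")
    return cluster_closure / denominator


# Hypothetical durations in ms: each consonant has an 80 ms closure between vowels.
print(duration_ratio(160, 80, 80))  # 1.0 -> the two closures simply add up
print(duration_ratio(128, 80, 80))  # 0.8 -> cluster closure shorter than t1 + t2
```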
This kind of measurement of overlap looks at the acoustic closure duration only, from the
point where the first gesture makes a full closure to the point of release of the second
gesture. It does not say anything about where the onsets or offsets of the articulatory
gestures are.
According to this formula the duration ratio might get smaller for reasons that do not
necessarily have to do with the greater degree of co-occurrence of the two closure
intervals. For example, if one of the consonants in the intervocalic context gets longer for
some reason, then the number in the denominator will get bigger and the ratio will get
smaller.
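A worked example with hypothetical durations illustrates the point. Suppose the closure of the C1#C2 cluster lasts 180 ms in both cases, while the summed intervocalic closure durations are 200 ms in one case and 225 ms in the other:

```latex
\[
\frac{180}{200} = 0.90,
\qquad
\frac{180}{225} = 0.80
\]
```

The ratio drops from 0.90 to 0.80 although nothing about the cluster itself has changed.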
Release. Now consider the presence of release in the acoustic recording in front-to-back
and back-to-front heterorganic clusters. This distinction is important because not all
acoustic evidence of release means that there was a period of open vocal tract. First,
consider a front-to-back sequence, /t#k/ for instance, and what it means to have
acoustic evidence of release in this type of heterorganic consonant sequence. When the
closure is formed for the first gesture, /t/, pressure builds up in the vocal tract during the
closure interval of this gesture, and even though the second gesture, /k/
in this case, is formed before the release of the first gesture, i.e. they overlap, there will
still be a release of air when the constriction for the first gesture is released, because of the air
trapped between the two closures in the vocal tract. In other words, even
with a great degree of overlap between the two consonants of a front-to-back
sequence, there might still be some indication of a release burst in the acoustic
record. In a back-to-front heterorganic sequence, /k#t/ for example,
the first closure is formed behind the closure of the second
gesture. Unlike in front-to-back sequences, with a great degree of overlap
there will be no indication of release in the acoustic record: by the time the
constriction for the first gesture is released, the constriction for the second gesture will
already be at its target, and this more anterior constriction
will block the escape of any air released by the first gesture.
From this perspective, acoustic evidence of release in
homorganic and back-to-front clusters means that there really is a period during the
transition from one consonant to the next in which there is no constriction in the vocal tract,
which is not necessarily the case in front-to-back sequences. On this basis, we
test the following hypothesis, which should hold even at lower duration ratios (more closure
duration overlap):
H2: For a given degree of overlap, clusters that are front-to-back will be
released more than the clusters that are back-to-front.
Boundary. One factor that has been found to have an effect on coarticulation is the type
of boundary in between two consecutive gestures, e.g. word vs phrase. With respect to
the effect of boundary, McClean (1973) finds a delay in the onset of forward
coarticulation to the nasal consonant in the presence of junctural boundaries. Hardcastle
(1985) presents data on overlap at clause or sentence boundary and finds less overlap at
sentence boundary only at normal speech rate. Byrd et al. (2000) report no significant
difference in the time between gestural onsets as a function of prosodic
boundary, but show that the time between the extrema of two gestures is
significantly affected by the boundary. Cho (2004) examines vowel-to-vowel
coarticulation at different prosodic boundaries and finds less overlap at stronger prosodic
boundaries. Bombien et al. (2006) look at the overlap properties of initial /kl/ clusters
in German as a function of prosodic boundary and find less overlap at
higher prosodic boundaries.
These studies on coarticulation look mostly at articulatory data, and they suggest that
prosodic boundary does have an effect on temporal coordination of gestures. No previous
studies have used duration ratio to look at the effect of boundary.
Based on these previous findings on the effect of boundary, we evaluate the following
hypothesis:
H3: Boundary will have an effect on duration ratio; specifically, two consonants
separated by a phrase boundary will have a larger duration ratio than two
consonants separated by a word boundary.
In this paper, we provide data from Turkish on stop-stop sequences and analyse the
presence or absence of release in between these stop consonants as a function of different
prosodic boundaries and vowel context, and provide a model where the presence/absence
of release can be predicted from a specific temporal organization of gestures. While there
are studies on the relation between overlap and release in other languages, no such study
has been done on Turkish.
This paper is organized as follows. Section 2 describes the experimental
procedure. Section 3 presents the data analysis. Section 4 discusses the relation between
overlap and release as a function of a small set of temporal relations between gestures.
Section 5 discusses the results and the model, and section 6
concludes with a summary of the main points.
2. Experimental Design
2.1. Subjects
Three female native speakers of Turkish participated in the study. The speakers’ ages range
from 20 to 38. All have been living in the U.S. for less than two years.
2.2. Materials
The C1#C2 sequences² are studied in Turkish. A set of two-word phrases was created in which
coda C1 = {p, t, k} and onset C2 = {p, t, k}, at two different prosodic boundaries and in two
different vowel contexts. At word boundary in the /a/ vowel context, C1 is always the
coda of an object and C2 is the onset of the verb. In the /e/ vowel context, either C1 is
the coda of a noun and C2 is the onset of an adjective, or C1 is the coda of an adjective and
C2 is the onset of a noun. This asymmetry results from the need to control the vowel
context preceding and following the C1#C2 sequence while still creating meaningful
sentences.
² # stands for some boundary in this study unless mentioned as a specific boundary.
At sentence boundary, C1 is the coda of a noun and C2 is the onset of a noun in both
vowel contexts.
There are 60 different sentences in total, and subjects are asked to repeat each three
times.
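The cluster conditions described above can be enumerated as a crossed design. The sketch below is illustrative only: the variable names are hypothetical, it covers only the C1#C2 conditions, and it makes no assumption about the remaining sentences in the set of 60.

```python
from itertools import product

STOPS = ("p", "t", "k")
BOUNDARIES = ("word", "sentence")
VOWELS = ("a", "e")

# Every coda C1 x onset C2 combination, at each boundary and in each vowel context.
cluster_conditions = [
    {"cluster": f"{c1}#{c2}", "boundary": b, "vowel": v}
    for c1, c2, b, v in product(STOPS, STOPS, BOUNDARIES, VOWELS)
]

print(len(cluster_conditions))  # 36
print(cluster_conditions[0])    # {'cluster': 'p#p', 'boundary': 'word', 'vowel': 'a'}

# Upper bound on cluster tokens with three speakers and three repetitions each.
print(len(cluster_conditions) * 3 * 3)  # 324
```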
2.3. Recording procedures
Subjects are recorded in a quiet room using a Logitech USB microphone and the software
Praat. Since the duration of a recording with the software is limited to 4 minutes, the data
set is divided into three parts. The subjects are given instructions in Turkish; they are told
that this is a study on Turkish sounds and are given no further information. The subjects
are approximately 5 inches away from the microphone and are asked not to move during
the recording procedure. The sentences are presented on a computer screen, and the
subjects are asked to repeat each sentence three times at their normal rate of speaking and
as naturally as possible. Each sentence remains on the screen while it is repeated.
Sentences are randomized using a random sequence generator (www.random.org).
2.4. Measurements
Using the acoustic waveform the following measurements are made:
1. C1#C2 duration: measured from the onset of the closure for C1 to the release of C2.
2. C1#V duration: measured from the onset of the C1 closure to the release of C1.
3. V#C2 duration: measured from the onset of the C2 closure to the release of C2.
4. Release: either ‘yes’ or ‘no’, depending on the presence or absence of evidence of release in the
acoustic signal.
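One way to store these measurements per token is sketched below. The class and field names are hypothetical, as are the time points; only the definition of the cluster duration (onset of the C1 closure to the release of C2) and the yes/no release coding come from the list above.

```python
from dataclasses import dataclass


@dataclass
class ClusterToken:
    """Hand-labelled time points (s) for one C1#C2 token; field names are hypothetical."""
    c1_closure_onset: float
    c2_release: float
    released: bool  # measurement 4: acoustic evidence of release, coded yes/no

    @property
    def cluster_duration(self) -> float:
        """Measurement 1: onset of the C1 closure to the release of C2."""
        return self.c2_release - self.c1_closure_onset


token = ClusterToken(c1_closure_onset=1.250, c2_release=1.410, released=True)
print(round(token.cluster_duration * 1000))  # 160 (ms)
```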
(6) [Diagram: total closure duration, spanning from the C1 onset to the C2 offset, with the
release marked.]
2.5. Statistical Analysis
The duration ratio of released vs. nonreleased clusters was compared in an analysis of
variance. Although release was not manipulated in advance in this experiment, it is used
as a predictor in the first analysis to compare the duration ratio in released and
nonreleased consonant sequences. Every token was classified as released or not. The
independent variables are release, boundary (word vs. sentence), and vowel (/a/ vs. /e/).
Because of the post-hoc nature of the release factor, the numbers of released and unreleased
cases are not equal and vary across the boundary × vowel cells of the design, so ANOVAs
with unequal Ns were employed. Because of large inequalities in release across cluster
type (e.g. homorganic clusters are almost never released), cluster type could not be used
as a factor in this analysis. The effect of cluster type (coded as homorganic, front-to-back,
back-to-front) on duration ratio is therefore tested in a separate ANOVA that
does not include release as a factor, but which does include boundary and vowel.
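The three-way coding of cluster type follows directly from the place of articulation of the two stops. A minimal sketch, in which the function name and the numeric place codes are hypothetical:

```python
# Order of the three places of articulation from the front to the back of the vocal tract.
PLACE_ORDER = {"p": 0, "t": 1, "k": 2}


def cluster_type(c1: str, c2: str) -> str:
    """Code a C1#C2 stop cluster as homorganic, front-to-back, or back-to-front."""
    if c1 == c2:
        return "homorganic"
    return "front-to-back" if PLACE_ORDER[c1] < PLACE_ORDER[c2] else "back-to-front"


for c1, c2 in [("t", "k"), ("k", "t"), ("p", "t"), ("k", "p"), ("t", "t")]:
    print(f"/{c1}#{c2}/", cluster_type(c1, c2))
```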
3. Results
3.1. Effect of release on duration ratio
Table 1 gives the results of the analysis of variance for duration ratio. The results indicate
significant main effects of release, boundary, and vowel on duration ratio, and no
significant interactions.
Table 1. Results of analysis of variance for duration ratio
Source                    d.f.   F       p-value
release                   1      11.29   0.0009
boundary                  1      7.07    0.0084
vowel                     1      9.16    0.0028
release*boundary          1      0.02    0.899
release*vowel             1      0.46    0.497
boundary*vowel            1      2.37    0.1249
release*boundary*vowel    1      1.72    0.1906
Error                     208
Figure 1 shows the mean duration ratio for released and nonreleased clusters; the
duration ratio of the released clusters is higher than that of the
nonreleased ones.
Figure 1. Mean duration ratio for released vs nonreleased clusters
Table 2 shows the mean duration ratio of released and nonreleased heterorganic
clusters at each boundary. The mean duration ratio of each released cluster at each
boundary is greater than that of its nonreleased counterpart, except for the /k#t/ cluster
at word boundary.
Since the heterorganic clusters that have /k/ as their C2 are almost always released, the
nonreleased column is not applicable (N/A) for those clusters.
Table 2. Released and nonreleased mean duration ratio for each cluster at each boundary
          WORD                      SENTENCE
Cluster   Released   Nonreleased    Released   Nonreleased
/p#t/     1.011      .8847          .9671      .7999
/p#k/     .8860      N/A            .84        N/A
/t#p/     1.0283     .8419          .8884      .8849
/t#k/     1.018      N/A            .95        N/A
/k#p/     .9451      .8967          1.0715     .8204
/k#t/     .9828      1.0424         .9543      .8768
Consistent with the first hypothesis of this experiment, results indicate that release has a
significant effect on duration ratio and that within a specific cluster, those that are
released have higher duration ratio than those that are not released, with one exception.
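The comparison within each cluster can be checked mechanically from the Table 2 means. In the sketch below, the dictionary layout and variable names are hypothetical, and the values are read from Table 2 as (released, nonreleased) pairs per cluster and boundary.

```python
# Mean duration ratios from Table 2: (released, nonreleased); None marks the N/A cells.
TABLE_2 = {
    ("p#t", "word"): (1.011, 0.8847), ("p#t", "sentence"): (0.9671, 0.7999),
    ("p#k", "word"): (0.8860, None), ("p#k", "sentence"): (0.84, None),
    ("t#p", "word"): (1.0283, 0.8419), ("t#p", "sentence"): (0.8884, 0.8849),
    ("t#k", "word"): (1.018, None), ("t#k", "sentence"): (0.95, None),
    ("k#p", "word"): (0.9451, 0.8967), ("k#p", "sentence"): (1.0715, 0.8204),
    ("k#t", "word"): (0.9828, 1.0424), ("k#t", "sentence"): (0.9543, 0.8768),
}

# Cells where the released mean is NOT higher than the nonreleased mean.
exceptions = [
    cell for cell, (released, nonreleased) in TABLE_2.items()
    if nonreleased is not None and released <= nonreleased
]
print(exceptions)  # [('k#t', 'word')]
```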
3.2. Effect of cluster type on release
Tables (3a) and (3b) show the number of released and nonreleased clusters, grouped into
back-to-front and front-to-back cluster types. The null hypothesis that the presence
of release is independent of the type of cluster can be rejected at the p < .0001 level using
a chi-square test.
(a)
                Released   Non-released
back-to-front   16         45
front-to-back   36         9
χ² = 22, p < .0001
(b)
                Released   Non-released
back-to-front   32         26
front-to-back   55         8
χ² = 15.43, p < .0001
Table 3. Number of released and nonreleased clusters in front-to-back and back-to-front
sequences. (a) Those where the duration ratio is < .9. (b) Those where the duration ratio is
> .9.
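A chi-square test of independence on a 2 × 2 table of counts can be sketched as follows. The text does not state which variant of the test was used; the sketch assumes the plain Pearson statistic without continuity correction, and the function name is hypothetical. It is applied here to the counts in Table (3b).

```python
import math


def pearson_chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) and p-value for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With one degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: back-to-front, front-to-back; columns: released, non-released (Table 3b).
table_3b = [[32, 26], [55, 8]]
chi2, p = pearson_chi_square_2x2(table_3b)
print(round(chi2, 2), p < 0.0001)  # 15.43 True
```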
In this experiment, the mean closure duration ratio is approximately .92.
Table (3a) gives the number of released and nonreleased tokens whose duration
ratio is smaller than .9. Even among these clusters, which have greater closure duration
overlap, front-to-back sequences are released in most cases (36 of 45 tokens). Table (3b) gives the
numbers for tokens whose duration ratio is greater than .9. The number of released
back-to-front tokens is higher here than among tokens with a duration ratio smaller
than .9, as would be expected given the relation between duration ratio
and release reported above. However, the number of released tokens in back-to-front clusters is
smaller than the number of released front-to-back tokens in both the < .9 and the > .9 cases.
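Because the two cluster types have different token totals, the counts in Table 3 are easier to compare as percentages. A minimal sketch, with hypothetical names, using the counts from Table 3:

```python
def release_percent(released, nonreleased):
    """Percentage of released tokens among all tokens in a cell."""
    return 100 * released / (released + nonreleased)


# Counts from Table 3: (released, non-released).
TABLE_3 = {
    "ratio < .9": {"back-to-front": (16, 45), "front-to-back": (36, 9)},
    "ratio > .9": {"back-to-front": (32, 26), "front-to-back": (55, 8)},
}

for ratio_band, rows in TABLE_3.items():
    for cluster, counts in rows.items():
        print(f"{ratio_band:<10} {cluster:<13} {release_percent(*counts):5.1f}%")
```

With these counts, back-to-front vs. front-to-back clusters are released 26.2% vs. 80.0% of the time below .9, and 55.2% vs. 87.3% of the time above .9.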
Figure 2 shows the release percentage for each cluster at the two
boundaries and in the two vowel contexts.
Figure 2. Values for percent released for each cluster at (a) word boundary and (b) sentence
boundary.
The effect of cluster type on release can be seen in these graphs, too. The first apparent
effect is that of homorganicity, especially at word boundary: homorganic clusters are
never released at word boundary in either vowel context. At sentence boundary,
there is some release in homorganic clusters, yet the percentage for each homorganic cluster
is very low compared to the heterorganic ones.
With respect to heterorganic clusters, the first thing to notice is that the clusters in which C2 is
/k/, which are front-to-back clusters, are almost always released. The other
front-to-back cluster, /p#t/, is released more than 50% of the time at word boundary in
both vowel contexts. At sentence boundary in the front vowel context, it is released almost all
of the time, yet the percentage is below 50% in the back vowel context.
In back-to-front sequences where C2 is /p/, the release percentage is below 50% for the /k#p/
cluster in every context. In general, the release percentage of /t#p/ is over 50%; however, it is still
lower than that of front-to-back clusters in most contexts. Unlike the other
back-to-front clusters, /k#t/ at word boundary in the /a/ vowel context is almost always
released, like the clusters that have /k/ as their C2. On the other hand, /k#t/ in the
/e/ vowel context at word boundary is almost never released. A possible reason for this
difference will be discussed in the discussion section.
Overall, the release percentages show that front-to-back clusters have
a higher release percentage than back-to-front clusters, providing
evidence for hypothesis H2.
3.3. Effect of boundary and cluster type on duration ratio
In the experiment reported here, based on the previous findings on the effect of boundary
on overlap, it is hypothesized that the duration ratio will be greater at sentence boundary.
Table (4) shows the results of the second analysis of variance. The results indicate a
marginal effect of boundary, a significant effect of vowel, and a significant 3-way
interaction of boundary by vowel by cluster type. There is no significant effect of cluster
type on duration ratio. While the effect of boundary was significant in the earlier analysis,
which included release but excluded cluster type, it does not reach significance here,
presumably because of the significant 3-way interaction.
Table 4. Results of analysis of variance for duration ratio
Source                        d.f.   F      p-value
boundary                      1      3.07   0.0808
vowel                         1      9.25   0.0026
clustertype                   2      0.16   0.8483
boundary*vowel                1      2.27   0.1325
boundary*clustertype          2      1.53   0.2173
vowel*clustertype             2      1.62   0.2
boundary*vowel*clustertype    2      3.81   0.0233
Error                         312
Figure (3) shows the graph of mean duration ratio for clusters at word and sentence
boundary.
Figure 3. Mean duration ratio at word and sentence boundary.
Contrary to H3, the duration ratio at word boundary is higher than the one at sentence
boundary.
Results indicate a main effect of vowel. Figure (4) shows the graph of mean duration
ratio at two different vowel contexts. The clusters that appear at /e/ vowel context have
higher duration ratio.
Figure 4. Mean duration ratio at two different vowel contexts.
However, there is a significant interaction of boundary by vowel by cluster type, so the main
effect of vowel and the marginal effect of boundary should be interpreted in this context.
The mean duration ratio of each cluster type at each boundary and in each vowel context is
given in Table 5. The numbers indicate that in the /e/ vowel context, the mean duration ratio
of every cluster type is greater at word boundary than at sentence boundary.
/a/ vowel
                Boundary
Cluster type    word     sentence
homorganic      .8665    .9291
back-front      .8968    .9191
front-back      .9263    .8273
/e/ vowel
                Boundary
Cluster type    word     sentence
homorganic      .9544    .9198
back-front      1.01     .8655
front-back      .9909    .9797
Table 5. Mean duration ratio of each cluster type at each boundary, each vowel context.
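The three-way pattern is easiest to see as word-minus-sentence differences per cell. The sketch below assumes that each pair of Table 5 values is read as (word, sentence) for one cluster type, with the first block belonging to the /a/ context and the second to the /e/ context; the dictionary layout and names are hypothetical.

```python
# Mean duration ratios from Table 5 as (word boundary, sentence boundary).
# The assignment of values to cells is an assumed reading of the table.
TABLE_5 = {
    "a": {
        "homorganic": (0.8665, 0.9291),
        "back-front": (0.8968, 0.9191),
        "front-back": (0.9263, 0.8273),
    },
    "e": {
        "homorganic": (0.9544, 0.9198),
        "back-front": (1.01, 0.8655),
        "front-back": (0.9909, 0.9797),
    },
}

for vowel, rows in TABLE_5.items():
    for cluster, (word, sentence) in rows.items():
        print(f"/{vowel}/ {cluster:<11} word - sentence = {word - sentence:+.4f}")
```

Under this reading, the difference is positive for all three cluster types in the /e/ context, but only for front-back clusters in the /a/ context.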
The duration ratio of each cluster at the two boundaries and in the two vowel
contexts is given in Figures (5a) and (5b). With respect to vowel context, the immediately
apparent effect is that in the /a/ vowel context no bar goes beyond the ratio level 1, whereas
in the /e/ vowel context almost half of the bars are greater than or equal to 1. This means that
while there are clusters in the /e/ vowel context that have no closure duration overlap, this
is not the case in the /a/ vowel context.
Figure 5. Values for duration ratio for each cluster at two different vowel contexts. (a) /a/
vowel context. (b) /e/ vowel context.
The marginal effect of boundary can be seen clearly, especially in the /e/ vowel context:
except for three clusters, /p#p/, /t#k/, and /p#t/, duration ratios at word boundary are
greater than those at sentence boundary.
With respect to the three-way interaction, Table (5) shows that the mean duration ratio of each
cluster type at word boundary in the /e/ vowel context is greater than that at sentence boundary.
The duration ratios of the individual clusters, given in Figure (5), indicate that in the /e/
vowel context the ratio of all back-to-front clusters, /t#p/, /k#p/, and /k#t/, is
higher at word boundary. The same tendency holds for homorganic clusters, except
for /p#p/. Among front-to-back clusters, however, only /p#k/ has a higher duration
ratio at word boundary in the /e/ vowel context.
4. Predicting Release from the Temporal Relation and the Type of Cluster
The aim of this section is to define temporal relations among gestures that will predict the
relation between overlap and release, based on the findings of this experiment. Temporal
relations are expressed through coordination relations among gestures by using
landmarks. These landmarks are onset (‘o’), target (‘t’), c-center (‘cc’), release (‘r’), and
release offset (‘roff’), as defined in Gafos (2002):
(1) [Diagram: the temporal structure of a gesture along a time axis, marked with the
landmarks o, t, cc, r, and roff.]
Before depicting possible temporal relations, I summarize the main findings of the
experiment that enter into the model:
1. For a given type of cluster, the duration ratio of clusters that are released is
higher (i.e. there is less closure duration overlap) than that of clusters that are not released.
2. There is no effect of cluster type on closure overlap.
3. Homorganic clusters at word boundary are never released.
4. Front-to-back clusters are almost always released.
5. There is less release in back-to-front clusters than in front-to-back clusters.
6. Some homorganic clusters at sentence boundary are released.
Based on these findings, the model should predict the relation
between release and closure overlap on the one hand, and between release and cluster type on the
other.
In Gafos (2002), coordination relations are depicted by synchronizing one landmark
within the temporal structure of one gesture with a landmark in the temporal structure of
another gesture, e.g. r = o or r = t. This point-to-point synchronization of landmarks,
however, leaves no room for variation within or across cluster types. In this context, first
consider homorganic clusters at word boundary. The results showed that homorganic clusters
at word boundary are never released, yet not all homorganic clusters have the same
duration ratio. Table (6) shows the mean duration ratio of homorganic clusters at word
boundary in the two vowel contexts:
         /a/     /e/
p#p      .84     .88
t#t      .88     1.02
k#k      .89     .98
Table 6. Mean duration ratio of homorganic clusters at word boundary
Now consider the difference between release percent in homorganic and heterorganic
clusters exemplified in table (7):
         Duration ratio    Release percent
/p#p/    .85               0
/p#k/    .87               100
/t#t/    .95               0
/t#k/    .96               100
Table 7. Mean duration ratio of two homorganic and heterorganic clusters
The duration ratios of /p#p/ and /p#k/, and of /t#t/ and /t#k/, are almost the same. However, while
release is never present in the homorganic cluster, the heterorganic cluster with nearly the same
duration ratio is always released.
At the other end, consider the difference in release percent between two different types of
heterorganic clusters as exemplified in table (8):
         Duration ratio    Release percent
/k#t/    .89               55
/t#k/    .8                100
/k#p/    .91               33
/p#k/    .87               100
Table 8. Mean duration ratio of two heterorganic cluster types
Table (8) indicates that there are cases where front-to-back clusters are released but
back-to-front clusters are not.
In order to capture these differences and similarities within and across cluster
types, I propose a window of possible temporal relations rather than the synchronization of one
landmark with another. In this kind of modelling, release results from where
a specific landmark of one gesture falls within the window of possible temporal relations,
on the one hand, and from the type of the cluster, on the other.
The fact that homorganic clusters at word boundary are never released will determine the
upper bound of the window of coordination relations. A close transition in a homorganic
cluster results from continuity of the articulatory stricture, whereas an open transition is
due to releasing the stricture and renewing it in the former position. This suggests that, for
a close transition to occur within a homorganic stop sequence, the onset of the
second gesture must not be initiated after the release of the first gesture. So, at the upper
bound of the window, the onset of the second gesture is synchronous with the release of the first
gesture:
(2) R1 = O2
[Figure: timeline in which the onset of the second gesture (O2) coincides with the release of the first gesture (R1)]
This coordination relation will give a value of duration ratio larger than 1 because of the
time it takes for C2 to go from the onset to its target in addition to closure durations for
C1 and C2.
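The reasoning behind this claim can be spelled out as a short derivation. It is only a sketch under assumptions that are not stated in this form in the text: the closure of a singleton consonant is taken to last from its target to its release, the closure of the cluster from the target of C1 (T1) to the release of C2 (R2), and the duration ratio is taken to be the cluster closure divided by the sum of the two singleton closures. The symbols T1 and R2 are introduced here for illustration.

```latex
\begin{align*}
D_{\text{cluster}} &= R_2 - T_1
   = (R_1 - T_1) + (T_2 - O_2) + (R_2 - T_2)
   \quad \text{since } O_2 = R_1, \\
\text{ratio} &= \frac{D_{\text{cluster}}}{(R_1 - T_1) + (R_2 - T_2)}
   = 1 + \frac{T_2 - O_2}{(R_1 - T_1) + (R_2 - T_2)} \; > \; 1 .
\end{align*}
```

Under these assumptions the excess over 1 is exactly the onset-to-target interval of C2 relative to the summed closure durations, so a short onset-to-target movement yields a ratio only slightly above 1.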
Now consider the difference in release percent between front-to-back and back-to-front
sequences. As discussed above, evidence of acoustic release in a back-to-front cluster
means that, while the articulators are moving from C1 to C2, there is a period of time
during which there is no constriction in the vocal tract. So, in order to have a
close transition, the target of C2 should precede the release of C1, so that the second stricture is
formed in the vocal tract before the first one is released:
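(3) T2 < R1
[Figure: timeline in which the target of the second gesture (T2) precedes the release of the first gesture (R1)]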
This coordination relation will give duration ratios smaller than 1, and no release in
back-to-front clusters. The presence of release in this cluster type results from a temporal
relation in which T2 follows R1, where O2 can at most be synchronous with R1.
Unlike in back-to-front sequences, the presence of acoustic release in front-to-back
sequences is not necessarily due to a moment of open vocal tract during the transition
from C1 to C2. In this type of cluster, the C1 stricture is more anterior than that of C2, so even
when the target of C2 precedes the release of C1, the air pressure built up
behind the stricture of C1 will still produce acoustic evidence of a release burst. So the
presence of release in a front-to-back sequence results from the nature of the cluster type
rather than from a specific temporal relation.
So the window of temporal relations depicted, from T2 < R1 to O2 = R1, can capture the
release properties of the different cluster types: homorganic clusters will never be
released, since the upper bound of the window synchronizes O2 with R1; front-to-back clusters will
always be released, since the presence of release is independent of the temporal relation;
and release in back-to-front clusters will depend on where the particular token falls
within the window.
Finally, the results of this experiment indicate that some homorganic clusters are released at
sentence boundary. As discussed above, an open transition in homorganic clusters is
due to relaxation of the articulatory stricture of C1 and its renewal in the same
position for C2. This suggests a coordination relation in which the onset of C2 follows the
release of C1:
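(4) O2 > R1
[Figure: timeline in which the onset of the second gesture (O2) follows the release of the first gesture (R1)]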
This implies a duration ratio larger than those within the window defined above. The mean duration
ratio of homorganic clusters that are released at sentence boundary is 1.369, which is
larger than the duration ratio of clusters that fall within the window.
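The predictions of this section can be summarized as a small decision procedure. The following is a minimal sketch, not an implementation used in the experiment: the names `Timing`, `cluster_type` and `predicts_release` are hypothetical, landmark times are arbitrary illustrative numbers, and the treatment of the boundary case T2 = R1 in back-to-front clusters (counted here as unreleased) is an assumption, since the text only specifies T2 < R1 and T2 following R1.

```python
from dataclasses import dataclass

# Place of articulation ordered from front (labial) to back (velar).
PLACE_ORDER = {"p": 0, "t": 1, "k": 2}


@dataclass(frozen=True)
class Timing:
    """Landmark times (arbitrary units) relevant to the C1#C2 transition."""
    r1: float  # release of the first gesture (R1)
    o2: float  # onset of the second gesture (O2)
    t2: float  # target of the second gesture (T2)


def cluster_type(c1: str, c2: str) -> str:
    """Classify a stop cluster by the relative place of C1 and C2."""
    first, second = PLACE_ORDER[c1], PLACE_ORDER[c2]
    if first == second:
        return "homorganic"
    return "front-to-back" if first < second else "back-to-front"


def predicts_release(c1: str, c2: str, timing: Timing) -> bool:
    """Predict acoustic release of C1 from cluster type and timing."""
    if timing.o2 > timing.t2:
        raise ValueError("the onset of C2 cannot follow its target")
    kind = cluster_type(c1, c2)
    if kind == "front-to-back":
        # Release is independent of the temporal relation.
        return True
    if kind == "homorganic":
        # Inside the window (O2 <= R1) the stricture is continuous;
        # release requires the relation O2 > R1.
        return timing.o2 > timing.r1
    # Back-to-front: no release if the C2 stricture is formed
    # before C1 is released (T2 < R1); T2 = R1 is assumed unreleased.
    return timing.t2 > timing.r1


if __name__ == "__main__":
    inside = Timing(r1=1.0, o2=0.6, t2=0.9)   # T2 < R1
    late = Timing(r1=1.0, o2=0.9, t2=1.1)     # O2 < R1 < T2
    beyond = Timing(r1=1.0, o2=1.1, t2=1.3)   # O2 > R1
    for c1, c2 in [("t", "t"), ("p", "k"), ("k", "t")]:
        for label, timing in [("inside", inside), ("late", late), ("beyond", beyond)]:
            print(c1, c2, label, predicts_release(c1, c2, timing))
```

The sketch makes explicit that timing matters only for homorganic and back-to-front clusters; the "almost always" of the front-to-back finding is idealized to "always", as in the model.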
5. Discussion
This experiment shows clearly that there is a relation between overlap and release on the one
hand, and between cluster type and release on the other. For a given type of cluster,
the duration ratio of released clusters is significantly larger (i.e. there is less closure
duration overlap) than that of unreleased clusters. With respect to the relation between cluster
type and release, homorganic clusters at word boundary are never released, front-to-back
sequences are almost always released, back-to-front clusters are released less often than
front-to-back clusters, and some homorganic clusters at sentence boundary are released.
Presence or absence of release is a property of stop consonants, and languages
differ systematically with respect to whether there is release within a stop
consonant cluster; yet this difference between languages does not have a phonologically
contrastive function. In a comparative study of English and Russian on the relation
between overlap and release, Zsiga (2000) finds that, in general, stops in clusters are released
less often in English than in Russian. In another study, Gafos (2002) discusses the
presence of release in homorganic and heterorganic stop clusters in postvocalic
contexts that are derived as a result of a word formation process. In the experiment
conducted for this study, release in stop clusters is found to be affected by the type of
cluster.
These differences within and across languages with respect to the release of stops in clusters
could be argued to be a function of the presence or absence of a feature, [±release] for
instance. Consider the properties of release in Turkish from this perspective. In such an
approach, the absence of release in homorganic clusters at word boundary might be said
to result from the feature [-release], and the presence of release in front-to-back
clusters would be due to the feature [+release]. However, this binary nature of features
falls short when it comes to explaining the release properties of back-to-front clusters,
since release in this type of cluster is neither ‘never’ nor ‘always’. So one or more
additional features would have to be added to the system to explain the fact that
back-to-front clusters are sometimes released.
On the other hand, these differences within and across languages can be seen as part of a
more general problem, in which they result from differences in the relative
timing of articulatory gestures. In this context, Zsiga (2000) explains the differences in the
release properties of English and Russian as resulting from differences in the possible
temporal relations between the relevant gestures: gestures overlap more in English, implying
a narrower window of temporal relations, whereas there is less overlap between gestures in
Russian, implying a larger window. Turkish is similar to Russian in this respect rather than to
English. According to the results of Zsiga (2000), the release percent of each cluster in
English is below 40%, whereas in Russian there are clusters which have 90-100%
release, similar to Turkish. So, for Turkish, based on the findings of the experiment, a
window of temporal relations (from T2 < R1 to O2 = R1) and a specific temporal relation
(O2 > R1) are suggested to capture the relation between overlap and release. With this
kind of coordination-sensitive approach, the behaviour of consonant clusters with respect to
release between and within languages is reduced to a difference resulting from possible
temporal relations, and since this difference seems to differentiate one language from
another, it should be part of linguistic grammars.
In this experiment, acoustic data are also used to examine the effect of boundary on closure
duration overlap. Although there are studies in the literature that examine the relationship
between boundary and overlap, no acoustic study has been done for that purpose.
Contrary to the findings of previous studies, the results of this study indicate the opposite
boundary effect. This might result from the effect of boundary on
individual units. Previous studies on the effect of boundary on the production of
individual units show that articulations are temporally longer at higher prosodic
boundaries (Byrd 2000; Keating et al. 2004; Cho 2006; Tabain 2003; Tabain and Perrier
2005). In this context, the closure durations of the three consonants at the two boundaries
are given below:
Table 9. Mean closure duration of intervocalic consonants
                  Coda                           Onset
          Word         Sentence          Word         Sentence
        /a/    /e/    /a/    /e/       /a/    /e/    /a/    /e/
/p/     .087   .08    .1     .09       .08    .081   .082   .1
/t/     .08    .059   .07    .078      .062   .069   .07    .072
/k/     .072   .042   .08    .76       .06    .06    .096   .092
The mean durations of intervocalic consonants at the different boundaries show that, except
for one case, /t/ in coda position in the /a/ vowel context, the closure duration of intervocalic
consonants is longer at sentence boundary. What this means for calculating the
duration ratio with the formula used in this experiment is that the number in the
denominator is higher at sentence boundary than it is at word boundary. However, the
formula assumes that if the C durations in intervocalic contexts get shorter or longer for
some reason, the C durations in clusters are affected in the same way, and that any difference
between the total duration of a C cluster and the sum of the intervocalic Cs is due to overlap. C
durations in clusters may not be affected in the way that they are in intervocalic contexts;
however, this cannot be determined from acoustic data.
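The dependence of the duration ratio on the denominator can be illustrated with a short computation. This is a minimal sketch that assumes the formula as it is described in this section, namely the closure duration of the cluster divided by the sum of the closure durations of the two consonants in intervocalic position; the function name `duration_ratio` and all numeric values below are hypothetical and are not measurements from the experiment.

```python
def duration_ratio(cluster_closure: float,
                   c1_intervocalic: float,
                   c2_intervocalic: float) -> float:
    """Cluster closure duration over the summed intervocalic closures.

    Values below 1 are read as closure overlap; values of 1 or more
    as no closure overlap.
    """
    denominator = c1_intervocalic + c2_intervocalic
    if denominator <= 0:
        raise ValueError("intervocalic closure durations must sum to a positive value")
    return cluster_closure / denominator


if __name__ == "__main__":
    # Hypothetical values (seconds): the same cluster closure duration,
    # but longer intervocalic closures at the sentence boundary.
    cluster = 0.15
    word = duration_ratio(cluster, 0.08, 0.07)
    sentence = duration_ratio(cluster, 0.10, 0.09)
    print(f"word boundary ratio:     {word:.3f}")
    print(f"sentence boundary ratio: {sentence:.3f}")
```

With an identical cluster closure, the larger denominator alone lowers the ratio at the sentence boundary, which is the confound described above: the lower ratio need not reflect more overlap.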
The fourth main finding of the experiment is the vowel effect on duration ratio at word
boundary. Table (9) also shows that, at word boundary, the closure durations of /t/ and /k/ drop
from .08 to ~.06 and from ~.07 to ~.04, respectively, in the /e/ vowel context. The reason for
this sharp drop is hard to establish from acoustic data, but a possible explanation can be
offered based on relevant findings in previous studies on stops. Previous
work by Löfqvist and Gracco on bilabial stops (1997) and on lingual stops (2002)
shows that the lips and tongue are moving at high velocities at the onset of
closure. These results for the lips and tongue are suggested to be due to the actual target
for closure being beyond the point of actual constriction. So stops have virtual targets
beyond the point of constriction. In the case of lingual stops, this virtual target is beyond
the palate. Löfqvist and Gracco (2002) also find that the velocity of the tongue tip and
tongue body at the onset of closure is affected by the preceding vowel context: it is very
low when the preceding vowel is /i/ compared to /a/ and /u/. This difference is related to the
fact that the distance to travel to make the constriction is longer in the case of /a/ and /u/,
whereas the tongue is already very close to the palate when the preceding vowel
is /i/. What is more, Löfqvist and Gracco (2002) find that, while moving from one vowel
to the next, the tongue moves in a loop-like fashion during the closure rather than
vertically. From this perspective, consider the tongue moving to the palate from a low versus a mid
vowel. The tongue is known to reach the target at the same time in both cases, and
for this to happen it must have a higher velocity in the case of the low
vowel, in which case the force acting upon the palate at the point of contact will be
greater, since force is positively correlated with velocity. In the
tongue-palate interaction, since the palate is rigid (it does not move at all) and
the tongue is elastic, the energy the tongue possesses at the moment it hits the palate
will cause it to start moving in the horizontal direction while still in full contact with the palate.
The magnitude of this horizontal movement might be positively correlated with the
energy that the tongue possesses at the moment of contact, and this might result in a
longer closure duration for consonants that follow a low vowel. Yet again, according
to the formula, the same should hold for C durations in clusters, which is
again hard to tell from acoustic data.
The arguments on the boundary and vowel effects, based on the findings on C durations in
intervocalic context, might be supported or disproved by observing the behaviour of C
durations in clusters in an articulatory study, and support for the arguments would call
the formula into question.
An interesting finding in this experiment is the large difference in the release percent of the /k#t/
cluster between the two vowel contexts at word boundary: while there is release in almost
every case in the /a/ vowel context, the release percent is ~10% in the /e/ vowel context. The
reason for this drop might be
the fact that /k/ in Turkish is palatalized in the context of a non-back vowel, which
means that the constriction location of /k/ in the /e/ vowel context is more anterior than in the /a/
vowel context. In the /ek#t/ cluster, since /k/ is palatalized, the tongue tip is closer to the
palate when the constriction for /k/ is formed than it is
in the /ak#t/ case, and this might make the transition between /k/ and /t/ in /ek#t/
possible without an audible release. This difference might be thought to result from the /k#t/
cluster in the /e/ vowel context having greater closure duration overlap than the one in the /a/
vowel context; however, the mean duration ratio of /k#t/ in the /e/ vowel context is 1.08,
which means that there is no closure duration overlap and yet no release. Based on this argument
and the mean duration ratio of the /k#t/ cluster in the /e/ vowel context, a possible temporal
relation for this cluster in that specific vowel context might be:
(5) [Figure: temporal relation between the /k/ and /t/ gestures in the /e/ vowel context]
This temporal relation suggests a very short duration from O2 to T2. The interval
between R1 and T2 seems to imply release; yet, considering that the actual
targets of stops are beyond the palate, once the TB (tongue body) release gesture is
initiated there will be a very short period of time before the vocal tract is actually
open, and the ‘space’ between R1 and T2 represents that period, not a period of open
vocal tract.
This finding, that the same cluster has different release properties in different
vowel contexts, is important: if the same consonant cluster has different release
properties at the same boundary but in different vowel contexts, then it might be more
adequate to define the presence or absence of release within consonant
clusters as a function of the complex organization of gestures involved in that broader
context, rather than as depending solely on the two consonants involved.
6. Conclusion
This paper has examined the relation between overlap and release. In this context, the degree
of overlap and the presence/absence of release were considered. The results indicated a
significant effect of overlap and cluster type on the presence/absence of release. Based on
the findings of the experiment, a window of temporal relations has been suggested to
capture the relation between overlap, cluster type and release. Specific temporal relations
have been suggested for those clusters whose overlap and release properties do not fall within
the window.
In a broader sense, the aim of the paper has been to explain differences with respect
to release in stop consonant clusters as a function of different temporal relations, and to
argue that these relations should be part of linguistic grammars.
References
Anderson, S. (1974). The Organization of Phonology. Academic Press, New York.
Bombien, L., Mooshammer, C., Hoole, P., Kühnert, B. and
Schneeberg, J. (2006). An EPG study of initial /kl/ clusters in varying prosodic conditions
in German. In Proceedings of the 7th International Seminar on Speech
Production, Ubatuba, Brasil.
Browman, C. and Goldstein, L. (1986). ‘Towards an Articulatory Phonology’,
Phonology 3, 219-252.
Browman, C. P. and Goldstein, L. (1995). ‘Dynamics and Articulatory Phonology’, in Robert
F. Port and Timothy van Gelder (eds.), Mind as Motion, MIT Press,
Cambridge, MA, pp. 175–193.
Byrd, D. (2000) Articulatory vowel lengthening and coordination at phrasal junctures.
Phonetica, 57(1):3-16.
Byrd, D., Kaun, A., Narayanan, S. & Saltzman, E. (2000) Phrasal signatures in
articulation. In Papers in laboratory phonology V, language acquisition and the lexicon
(M. Broe and J. Pierrehumbert, editors). Cambridge: Cambridge University Press
Catford, C. (1977). Fundamental Problems in Phonetics. Edinburgh: Edinburgh
University Press.
Cho, T. (2004). Prosodically-conditioned strengthening and vowel-to-vowel
coarticulation in English. Journal of Phonetics, 32, 141-176.
Cho, T. (2006) Manifestation of prosodic structure in articulatory
variation: Evidence from lip kinematics in English. In: Louis Goldstein, Doug H.
Whalen, and Catherine T. Best (eds.), Laboratory Phonology 8: Varieties of
Phonological Competence, Mouton de Gruyter, Berlin and New York, pp. 519-548.
Gafos, A. (2002). A grammar of gestural coordination. Natural Language and Linguistic
Theory. 20: 269-337.
Hardcastle, W. J. (1985) Some phonetic and syntactic constraints on lingual
coarticulation during /kl/ sequences, Speech Communication, 4, 247-263.
Henderson, J. B. & Repp, B. H. (1982). Is a stop consonant released when followed by
another stop consonant? Phonetica, 39, 71-82.
Ladefoged, P. (1964). A phonetic study of west African languages. Cambridge:
Cambridge University. Reprinted 1968.
Ladefoged, P. and Maddieson, I. (1996). The sounds of the world's languages. Oxford:
Blackwell.
Lisker, L. (1974). On time and timing in speech. In T. A. Sebeok (ed.) Current trends
in linguistics 12. The Hague: Mouton. 2387-2418.
Löfqvist, A. & Gracco, V.L. (1997). Lip and jaw kinematics in bilabial stop
consonant production. Journal of Speech, Language, and Hearing Research, 40, 877-893.
Löfqvist, A. & Gracco, V. L. (2002). Control of oral closure in lingual stop consonant
production. Journal of the Acoustical Society of America, 111.
McClean, M. (1973). “Forward coarticulation of velar movement at marked junctural
boundaries.” J. Speech Lang. Hear. Res. 16, 286-296.
Stevens, K. N. (1989). On the quantal nature of speech. Journal of Phonetics, 17, 3-46.
Tabain, M. (2003). Effects of prosodic boundary on /aC/ sequences:
articulatory results, Journal of the Acoustical Society of America. 113, 2834-2849.
Tabain, M. and Perrier, P. (2005). Articulation and acoustics of /i/ in pre-boundary
position in French. Journal of Phonetics, 33, 77-100.
Zsiga, E. (2000). Phonetic alignment constraints: consonant overlap and palatalization in
English and Russian. Journal of Phonetics, 28, 69-102.