Dynamics and Transparency in Vowel Harmony
by
Stefan Benus
A dissertation submitted in partial fulfillment
of the requirements for the degree of
Doctor of Philosophy
Department of Linguistics
New York University
January, 2005
_________________________
Adamantios I. Gafos
© Stefan Benus
All Rights Reserved, 2005
DEDICATION
To my wife Jana
and to my sons Matej and Samuel
ACKNOWLEDGEMENTS
My deep gratitude goes to all my committee members. First of all, I would
like to thank my advisor, Diamandis Gafos. As a patient teacher and mentor, he
showed me the beauty in pursuing crude intuitions and ideas as well as the
importance of clarity in argumentation once these intuitions are ready to be fleshed
out. His unrelenting willingness to read my drafts, search for ideas and arguments
often deeply obscured in my unclear prose, and provide comprehensive comments,
made a deep impact on my development as a linguist. Diamandis was instrumental in
planting in me the seed of curiosity about vowel harmony and transparency and
remained involved and extremely supportive at every step in the development of this
project. I am grateful to Louis Goldstein who provided invaluable input in
experimental data collection and analysis as well as in shaping my ideas about
articulatory gestures and dynamics. He also made my experience in Haskins
Laboratories a very fruitful and enjoyable one. Lisa Davidson became involved in the
later stages of the project; yet, her willingness to listen to, discuss, and comment on
various aspects of this dissertation significantly improved the final product. I
would also like to thank John Singler and Greg Guy for their objective and refreshing
points of view.
At various stages of this project, I greatly benefited from discussions with and
help from Khalil Iskarous, Marianne Pouplier, Arto Anttila, Mark Tiede, David
Goldberg, Erika Sólyom, Anna Szabolcsi, Zsofia Zvolensky, Maryam Bakht-Rofheart, Jen Nycz, Larissa Chen, and Doug Honorof. This dissertation would not have been
possible without the patience and endurance of my Hungarian subjects as well as those
Hungarian speakers whom I consulted while preparing experimental stimuli. In
addition, the faculty, staff, and my fellow students in the Linguistics department at
NYU created a wonderful and stimulating environment where I never felt that I was
on my own.
It would not have been possible to finish this work without the support from
my wife and sons, who endured extended periods of my absence (sometimes physical, but
more often when I was too preoccupied with my work to be fully present). Despite
this, they continued to provide positive energy, understanding, and trust. In addition
to my own family and friends in Slovakia who provided wonderful times for me and
my sons during our summer visits, I wish to acknowledge the moral and material support
of our extended family from the Holy Trinity Slovak Lutheran church in Manhattan.
In particular, the Havlik and Prazenka families provided invaluable life support in the
most difficult times when we were new to this country. My gratitude also extends to
the Fulbright Commission in Slovakia for sponsoring the first year of my studies at
New York University. This work was supported in part by NIH Grant HD-01994 to
Haskins Laboratories.
ABSTRACT
This dissertation examines the phonological patterning as well as phonetic
characteristics of transparent vowels in Hungarian palatal vowel harmony.
Traditionally, these vowels are assumed to be excluded from participating in harmony
alternations. The experimental data presented in this dissertation run contrary to this
assumption. The data show that transparent vowels in Hungarian are articulated
differently depending on the harmonic domain in which they occur. Based on this
observation, the central claim defended and formalized in this dissertation is that
continuous phonetic details of all stem vowels including the transparent vowels are
relevant for the phonological alternation in suffixes.
The dissertation proposes an integrated model that relates phonetic and
phonological aspects of vowel harmony using the formal language of non-linear
dynamics. The advantage of this approach is its potential to capture both qualitative
and quantitative aspects of the same pattern in a unified way. Crucially, a
dynamic approach allows one to express both phonological and phonetic
generalizations while maintaining the essential distinction between them. Hence, the
dynamic approach provides a feasible research strategy in the quest for understanding
one of the continuing challenges in the study of speech: the relation between
phonology – the mental or symbolic aspects of our speaking competence, and
phonetics – continuous physical manifestations of this competence.
Applied to the particular case of transparency in Hungarian vowel harmony,
the premise of interdependency between the phonetic properties of the stem vowels
and the phonological patterns of suffix selection allows for an explanation of a broad
range of data. Most importantly, it provides a motivation for the cross-linguistic
generalizations related to transparent vowels in palatal vowel harmony systems. In
addition, the effects of tongue body height, lip rounding, and surrounding vocalic
context on the suffix selection in Hungarian receive a natural and lawful explanation.
To summarize, this dissertation presents novel experimental data from the
production of transparent vowels in Hungarian. The proposed integrated model,
relating phonetics and phonology using the formal language of non-linear dynamics,
achieves a unified explanation of both the phonetic and phonological generalizations
observed in the data and the literature.
TABLE OF CONTENTS
DEDICATION
ACKNOWLEDGMENTS
ABSTRACT
LIST OF FIGURES
LIST OF TABLES
LIST OF APPENDICES
CHAPTER 1
Introduction
1.1. Claims
1.2. Organization of the dissertation
CHAPTER 2
Phonology of transparent and opaque vowels: Theoretical background
2.1. Introduction
2.2. Transparency and opacity in Hungarian vowel harmony: phonological description
2.2.1. Vowel harmony
2.2.2. Hungarian palatal harmony
2.3. The challenge of Hungarian vowel harmony
2.3.1. Locality
2.3.1.1. Strict locality
2.3.1.2. Absence of locality
2.3.1.3. Parameterized locality
2.3.2. The nature of transparent vowels
2.3.3. Exceptionality of híd-type stems
2.3.4. Vacillating stems
2.3.5. Multiple transparent vowels
2.4. Conclusion
CHAPTER 3
Hungarian transparent vowels: an experimental study
3.1. Introduction
3.2. Previous experimental studies involving transparent vowels
3.3. Articulatory experiment: Methodology
3.3.1. Magnetometry and ultrasound techniques
3.3.2. Stimuli and subjects
3.3.3. Data collection
3.3.4. Data labeling and extraction
3.3.5. Comparison of the magnetometry and ultrasound techniques
3.4. Results
3.4.1. Disyllabic stems – EMMA results
3.4.1.1. Subject ZZ
3.4.1.2. Subject BU
3.4.1.3. Subject CK (pilot)
3.4.1.4. Summary of disyllabic EMMA data: subjects ZZ, BU, and CK
3.4.2. Disyllabic stems – Ultrasound results of subject ZZ
3.4.3. Monosyllabic stems – Results
3.5. Summary and discussion
3.5.1. Harmonic environment
3.5.2. Vowel type
3.5.3. Lexical pair
3.5.4. Disyllabic vs. monosyllabic stems
3.6. Conclusion
CHAPTER 4
Phonetics meets phonology
4.1. Introduction
4.2. Articulatory retraction is relevant for suffix selection
4.3. Phonetic height and suffix selection
4.4. Perceptual results of coarticulation
4.5. Transparency and existing models of phonetics-phonology interface
4.5.1. Derivational model
4.5.2. Ohala’s perceptually-based model
4.5.3. Exemplar-based model
4.6. Vowel harmony and non-linearity between articulation and perception
4.6.1. Transparency: articulatory retraction without significant perceptual effect
4.6.2. Effect of lip rounding and tongue body lowering
4.7. Conclusion
CHAPTER 5
Dynamic model of vowel harmony in Hungarian
5.1. Introduction
5.2. Modeling and dynamics
5.2.1. Why model?
5.2.2. Why dynamics?
5.2.3. Static vs. dynamic approach: example
5.2.4. Geometric description of a dynamic system
5.3. Gestural representations
5.4. Stem-internal blending
5.4.1. Assumptions
5.4.2. Dynamic model of stem-internal blending
5.4.2.1. Effect of front vowel rounding
5.4.2.2. Effect of front vowel height
5.4.2.3. Effect of front vowel advancement
5.5. Model of suffix selection
5.6. Summary and conclusion
CHAPTER 6
OT formalism of vowel harmony: Integrating OT and dynamics
6.1. Introduction
6.2. Optimality Theory
6.3. Dynamic definition of OT constraints and their evaluation
6.4. OT constraints for vowel harmony and their evaluation
6.4.1. Markedness constraints: Agree
6.4.1.1. Stem-internal harmony
6.4.1.2. Stem-suffix harmony
6.4.2. Faithfulness IDENT constraints
6.4.3. Phonological categories and dynamic OT
6.4.4. Summary of the developed OT tools
6.5. OT analysis of Hungarian vowel harmony
6.5.1. Transparency
6.5.2. Opacity
6.5.3. Vacillation
6.5.4. Monosyllabic stems
6.6. Typological considerations
6.7. Summary of the OT model
CHAPTER 7
Conclusion
7.1. Future research
APPENDICES
BIBLIOGRAPHY
LIST OF FIGURES
Fig. 1 Continuous activation of TBCL = {uvular} in papír-nak
Fig. 2 Non-local relationship between the initial and final /a/
Fig. 3 An illustration of the plastic apparatus with transmitter coils and the placing of three receiver coils on the tongue
Fig. 4 Placements of the ultrasound probe
Fig. 5 Horizontal and vertical trajectories of articulators during the production of zafír-ban
Fig. 6 Tracings of the tongue surface at the extreme front position during the TV /i/ in buli-val
Fig. 7 Comparison of two curves as the difference in the area between them
Fig. 8 Illustration of the pair-wise comparison of the ultrasound curves
Fig. 9 Quantification of the effect of environment from the ultrasound images
Fig. 10 Retraction of /i/ and /é/ in back vs. front harmony
Fig. 11 Illustration of possible effects of retraction on the tongue body
Fig. 12 Tongue shapes of four transparent vowels in the #b_b# context from the ultrasound data of subject ZZ
Fig. 13 A sketch of the lexicon of exemplars of /í/ based on F2 values
Fig. 14 Non-linearity between articulation and perception
Fig. 15 Approximate mid-sagittal vocal tract configurations for non-low unrounded vowels and the three-tube model of these configurations
Fig. 16 Nomograms of the natural (formant) frequencies of the three-tube model as a function of the length of the back cavity
Fig. 17 Energy distribution for the palatal vowels for an English speaker and an Arabic speaker
Fig. 18 Non-linear relationship for front non-low unrounded vowels
Fig. 19 Ultrasound shapes of high front unrounded vowels and high front rounded vowels in Hungarian
Fig. 20 Formant resonances for the unrounded front vowels and the rounded ones
Fig. 21 Formant resonances for the spread, neutral, moderately rounded, and closely rounded front vowels
Fig. 22 Illustration of the quantal differences between /i/ and /ü/
Fig. 23 Quantal differences between unrounded and rounded front vowels
Fig. 24 Illustration of the quantal differences between [i] and [E]
Fig. 25 Static model of balance between demand and supply
Fig. 26 Dynamic model of balance between demand and supply
Fig. 27 Phase flow showing the velocity field of f(x) = αx
Fig. 28 Velocity fields, flows, and potentials for a linear function f(x) = ax + b
Fig. 29 Velocity fields, flows, and potentials for a non-linear function f(x) = –x³ + x
Fig. 30 Measure of stability as the width of the probability distribution
Fig. 31 Loss of stable fixed point due to continuous change in a control parameter
Fig. 32 Effect of changing parameters a, b on the shape of the potential function V(x) = ax⁴ + bx²
Fig. 33 Non-linearity of the dynamical system characterized with the potential V(x) = ax⁴ + bx²
Fig. 34 Tract variables of Articulatory Phonology
Fig. 35 Attractor dynamics for constriction location
Fig. 36 Potentials and probability distributions of the dynamic system V(x) = α(x – 2)² + F(t) as a variation of the weight α
Fig. 37 Kinematic trajectories of the lips and the tongue body movement and dynamic specifications that underlie them
Fig. 38 Dynamic formalism of [±back] vocalic feature
Fig. 39 Gestural descriptors with activation intervals and generated movement of the articulators
Fig. 40 Spatio-temporal evolution of two adjacent vowel gestures
Fig. 41 Blending of two gestures
Fig. 42 Model of blending of a transparent and an opaque vowel with a preceding back vowel
Fig. 43 Blending of a back and a front vowel gestures
Fig. 44 Modeling different retraction degree as a result of increased frontness of the front vowel
Fig. 45 Suffix form as a function of retraction degree
Fig. 46 Stem-internal blending in the BTT stems
Fig. 47 Potential for the CL value of the suffix vowel when retraction degree is 0.6
Fig. 48 Effect of preceding front suffix (x₀ = 2) on the target suffix vowel
Fig. 49 Illustration of a dynamic system defining the constraint *VOICE
Fig. 50 AGREEA-Suff (CL) – dynamic formalism and evaluation
Fig. 51 AGREEI-Suff(R) – dynamic formalism.
Fig. 52 Non-linearity between the horizontal position of the tongue (CL) and perceptual frontness
Fig. 53 IDENT(front) – dynamic formalism and evaluation
Fig. 54 Formalization of non-linearity between the horizontal position of the tongue (CL) and perceptual frontness
Fig. 55 Evaluation of candidates with CL = 1 by IDENT(front)[–round] and IDENT(front)[+round]
Fig. 56 Quantal properties of /i/, /ü/, and /e/ as differences in respective potentials defining the IDENT(front) constraint
Fig. 57 Transparency as an integrated system of phonetics and phonology
LIST OF TABLES
Table 1 Suffix selection for BT and BBT nouns
Table 2 Results from a 2-way ANOVA for environment and vowel type for subject ZZ
Table 3 Direction of the effect of environment for subject ZZ
Table 4 Effect of the type of the transparent vowel on the position of the receivers in the front and back environment for subject ZZ
Table 5 Retraction degree of individual transparent vowels
Table 6 Results from a 2-way ANOVA for environment and lexical pair for subject ZZ
Table 7 Effect of environment in individual lexical items for subject ZZ
Table 8 Results from a 2-way ANOVA for environment and vowel type for subject BU
Table 9 Direction of the effect of environment for subject BU
Table 10 Retraction degree of individual transparent vowels
Table 11 Results from a 2-way ANOVA for environment and lexical pair for subject BU
Table 12 Effect of environment in individual lexical items for subject BU
Table 13 Results from a two-way ANOVA for environment and vowel type for subject CK
Table 14 Direction of the effect of environment for subject CK
Table 15 Retraction degree of individual transparent vowels for subject CK
Table 16 Results from a 2-way ANOVA for environment and lexical pair for subject CK
Table 17 Effect of environment based on the area measure of difference between curves
Table 18 Mean area between the curves from the same environments and those from different environments
Table 19 Effect of environment in individual lexical items for subject BU
Table 20 Main effects of the environment, vowel type, and line on the D value
Table 21 Average difference in mm between the tongue shapes from the front and back environment
Table 22 Advancement of the transparent vowels in the front environment
Table 23 ANOVA results for the effect of environment in monosyllabic stems
Table 24 Effect of environment in individual lexical items for subjects ZZ and BU
Table 25 Effect of environment in individual lexical items for subject CK
Table 26 Effect of environment in monosyllabic stems based on the area measure of difference between curves
Table 27 Effect of environment measured on the five lines described in Fig. 9
Table 28 Comparison of retraction degree between disyllabic and monosyllabic stems
Table 29 Degree of retraction as a function of variation in quantal features (q) and input value of Constriction Location (CL) of the front gesture
LIST OF APPENDICES
APPENDIX A List of stimuli for subjects ZZ and BU
APPENDIX B List of stimuli for subject CK (pilot)
APPENDIX C Post-hoc Tukey test: effect of vowel type on the tongue position in the front and back environments
CHAPTER 1
1.1 Introduction
In palatal vowel harmony systems such as those of Finnish, Hungarian, or Turkish,
the [±back] quality of the suffix vowel is determined by the [±back] quality of the
stem-vowel. For example, the dative suffix in Hungarian appears either with a front
vowel /e/ or a back vowel /a/ depending on the stem vowel: ház-nak ‘house-Dative’
but kéz-nek ‘hand-Dative’. The stem vowel is thus considered a trigger and the suffix
vowel a target of the phonological harmony process. The feature [±back] is called the
harmonizing feature.
Polysyllabic stems in which vowels have opposite specifications for the
harmonizing feature are called disharmonic stems. A particularly interesting question
is what determines the form of the suffix following disharmonic stems or, in other
words, which stem vowel is the trigger of the harmony process. To answer this
question, vowels in disharmonic stems have been traditionally divided into two
categories. Transparent vowels are those vowels that may intervene between the
trigger and the target of harmony even when they bear the opposite value for the
harmonizing feature. For example, the dative suffix following disyllabic stems such
as papír ‘paper’ takes on the [+back] value of the initial vowel despite the [–back]
quality of the intervening /í/: papír-nak ‘paper-Dative’. Opaque vowels, in contrast,
require a local agreement relationship between the trigger and the target, i.e. there can
be no intervening vowel. For example, the dative suffix following a disyllabic stem in
which a back vowel precedes a front rounded vowel, such as parfüm ‘perfume’, must
bear the [–back] quality of the immediately adjacent preceding vowel: parfüm-nek
‘perfume-Dative’. Hence, transparent vowels allow a non-local relationship between
the trigger and the target whereas the opaque vowels ban such a relationship. In
Hungarian, the transparent vowels consist of the front unrounded vowels {/i/, /í/, /é/,
/e/}, and the opaque vowels include all back vowels and the front rounded vowels
{/ü/, /ő/, /ö/, /ű/}.
A traditional analysis of this widespread phenomenon is that the ([±back])
form of the suffix is determined by the ([±back]) form of the rightmost non-transparent vowel of the stem. In the case of stems like papír, the harmonizing feature
[+back] of the initial vowel triggers the [+back] value of the target vowel in the suffix
while the intervening [–back] vowel /í/ is disregarded in this process.
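The traditional rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an analysis proposed in this dissertation: the function names are hypothetical, vowels are identified by their orthographic symbols, the long back vowels are added to the back set for completeness, and stems containing only transparent vowels are assumed to default to front suffixes.

```python
import unicodedata


def _vowels(symbols: str) -> set:
    # Normalize so that accented letters are single code points.
    return set(unicodedata.normalize("NFC", symbols))


# Vowel classes in Hungarian orthography. The front unrounded vowels
# (i, í, e, é) are transparent and therefore appear in neither set.
BACK = _vowels("aáoóuú")          # assumption: long back vowels included
FRONT_ROUNDED = _vowels("üűöő")   # opaque front vowels


def suffix_backness(stem: str) -> str:
    """Return "back" or "front" from the rightmost non-transparent vowel."""
    for ch in reversed(unicodedata.normalize("NFC", stem.lower())):
        if ch in BACK:
            return "back"
        if ch in FRONT_ROUNDED:
            return "front"
        # Transparent vowels and consonants are disregarded.
    # Assumption: no non-transparent vowel found, so default to front.
    return "front"


def dative(stem: str) -> str:
    """Attach the dative suffix in its back (-nak) or front (-nek) form."""
    return stem + ("-nak" if suffix_backness(stem) == "back" else "-nek")


for stem in ["ház", "kéz", "papír", "parfüm"]:
    print(dative(stem))
```

The sketch reproduces ház-nak, kéz-nek, papír-nak, and parfüm-nek. Under the stated default it returns a front suffix for a stem such as híd, whereas the attested form cited later in this chapter is híd-nak; stems of this type are not captured by the rule as sketched here.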
1.2 Claims
The approach taken in this dissertation follows from a belief that we can better
understand cognitive processes related to speech if we carefully study both phonetic
and phonological aspects of it. Following this approach, this dissertation adds to the
large body of work on vowel harmony a proposal that both phonetic and phonological
properties of vowels are relevant in determining the output of the harmony process.
Specifically, there are three major claims in this dissertation:
1. Hungarian transparent vowels are not excluded from participating in palatal
vowel harmony. Rather, the [±back] harmonizing feature is manifested on the
transparent vowels by systematic phonetic differences in the horizontal
position of the tongue body.
2. The phonological process of determining the discrete ([±back]) form of the
suffix depends on the fine degree of articulatory backness in the vowel
preceding the suffix vowel. Therefore, the form of the suffix is always
determined by the backness of the rightmost stem vowel; in some cases,
however, this backness is non-contrastive.
3. The relationship between continuous details of the tongue body horizontal
position in stem-final vowels and the [±back] quality of the suffix vowel(s)
can be coherently modeled using the mathematics of nonlinear dynamics
operating over the parameters of gestural representations.
Evidence for the first claim is drawn from the experimental investigation of
the articulatory characteristics of Hungarian transparent vowels. The combination of
two techniques used in this dissertation (magnetometry and ultrasound) provides a
comprehensive picture of the articulatory characteristics of these vowels. The
findings show that the transparent vowels in stems triggering back harmony are
articulated slightly, but significantly, further back than the transparent vowels in front
harmony roots. For example, /i/ in Tomi-hoz ‘Tom-Diminutive-Allative’ is more
retracted than the second /i/ in Imi-hez ‘Imre-Diminutive-Allative’.
The transparent vowel /i/ in these two examples is surrounded by either front
vowels (Imihez) or back vowels (Tomihoz) from both sides. Therefore, it is plausible
that the surrounding vowels influence the production of transparent vowels via
coarticulation. In order to show that the observed pattern of retraction is not simply
coarticulation, the behavior of transparent vowels in monosyllabic stems was
investigated. The majority of these stems select front suffixes (cím-nek ‘address-Dative’, szél-nél ‘wind-Adessive’). This is expected since the stem vowels are
phonemically front. However, a limited number of these stems select back suffixes
(híd-nak ‘bridge-Dative’, cél-nál ‘aim-Adessive’). Articulatory investigation of
phonemically identical transparent vowels in monosyllabic stems produced in
isolation (with no overt suffix) shows that the tongue body is more retracted in the
stems selecting back suffixes than in the stems selecting front suffixes. Therefore,
Hungarian speakers produce systematic differences in retraction correlating with the
form of the suffix even in the absence of the potential source of coarticulation in the
form of adjacent vowel(s).
The overarching generalization, then, is that non-distinctive retraction in
transparent vowels is linked to the phonological alternation in suffix form. The
advanced or retracted version of a transparent vowel correlates with the front or back
suffix respectively.
In addition to the established correlation between tongue body retraction and
suffix backness, there are several generalizations about the stem-final front vowels in
disharmonic stems and the suffixes that follow them. As argued in the second claim
of this dissertation, all of these correlations receive a unified explanation if the model
for selecting the discrete ([±back]) form of the suffix can access fine phonetic details
of articulatory backness in the vowels preceding the suffix vowel.
The first generalization, as already pointed out, is that lip rounding plays a
crucial role in the phonological behavior of front vowels in Hungarian: the unrounded
vowels behave transparently whereas the rounded ones behave opaquely. For
example, the stems papír ‘paper’ or híd ‘bridge’ select suffixes with back vowels,
whereas parfüm ‘perfume’ or tök ‘pumpkin’ select suffixes with front vowels. The
second generalization is that the vowel height of front unrounded vowels also affects
the suffix selection: the lower the vowel, the more likely it is to be followed by suffixes
with front vowels. Finally, the number of transparent vowels has an effect on the
form of the suffix: stems with a single transparent vowel following a back vowel
select back suffixes (kabin-ban ‘cabin-Inessive’), whereas stems with two transparent
vowels following a back vowel can select either front or back suffixes (aszpirin-ban,
aszpirin-ben ‘aspirin-Inessive’ are both possible).
These generalizations can be linked to the relationship between articulation
and perception of front vowels. Assuming that the participation of front vowels in
palatal vowel harmony affects the horizontal position of the tongue (Claim #1),
articulatory retraction stemming from this participation has potentially different
acoustic consequences for different vowels. Following the experimental studies of Stevens (1989) and
Wood (1979, 1986), it is argued that high front unrounded vowels are phonologically
transparent because the observed tongue body retraction in the back harmonic
environment does not correspond to significant changes in the acoustic output for
these vowels. In contrast, acoustic insensitivity to tongue body retraction is limited
for both rounded and low front vowels.
In addition, a model assuming participation of transparent vowels in vowel
harmony allows us to link the different patterns of suffix selection for kabin and
aszpirin to differences in tongue body retraction of the stem-final vowel. In kabin, the
backness of the initial /a/ causes tongue body retraction of the following /i/, as
observed experimentally. This retraction is assumed to be greater than the retraction
of the stem-final vowel in aszpirin. This is because the stem-final /i/ in aszpirin is
adjacent to a retracted, yet still front, vowel /i/ whereas /i/ in kabin is adjacent to a
back vowel /a/.
With respect to the third claim, the data presented in this dissertation bear on a
fundamental question in the study of human speech: How are the quantitative aspects
of phonetic data related to the qualitative aspects of phonological competence (e.g.
Gafos to appear, Beckman & Kingston 1990)? Traditionally, phonetics and
phonology diverge in terms of their approach to speech events as well as in terms of
the formal tools used in explaining the relevant facts related to these events.
Phoneticians study physical properties of speech sounds. They measure the
movements of the tongue, jaw, and other articulators in real (continuous) time, and
study the acoustic features resulting from those movements. Formal models
developed to explain the observed data rely on the continuous mathematics of
calculus. On the other hand, the goal of phonology is to describe what the mind does.
It assumes a small inventory of abstract symbols (representing time-invariant states of
the phonetic system), and studies the combinatorial patterns in the distribution of
these symbols. Formal models developed to explain these patterns rely on the discrete
mathematics of algebra and logic.
The proposal that sub-phonemic¹ backness plays a role in determining the
phonological form of the suffix adds to the increasing body of evidence accumulated
over the past twenty years indicating that the assumption of the strict division of labor
between phonetics and phonology should be reconsidered (see Pierrehumbert et al.
2001 for review). Research shows that speakers of different languages can produce
and perceive minute phonetic details and use them in their phonology. Furthermore,
some of the phonologically meaningful details have a temporal, i.e. inevitably
continuous, basis (Browman & Goldstein 1992).
¹ The adjectives sub-phonemic, phonetic, continuous are used interchangeably in this
dissertation.
In attempts to make phonetic information available to phonology, models with
a closer relationship between phonetics and phonology have been proposed (e.g.
Steriade 1997, Flemming 1995, Boersma 1998). One of the most developed and
tested models in the area of phonetics-phonology interface is presented in the work of
Browman & Goldstein (1986 et seq.). Their model allows phonology to access
phonetic details via the parameters of dynamically defined gestural representations,
and consequently provides for a potentially coherent integration of continuous
phonetics and qualitative phonology (e.g. Browman & Goldstein 1995, Gafos 2002,
to appear). In this view, phonology and phonetics are construed as two dimensions of
a single complex dynamic system. The mathematical tools of nonlinear dynamics
provide the formal language for building models of such complex systems.
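As a generic illustration of how a continuous dynamical system can yield discrete outcomes, the following Python sketch integrates dx/dt = f(x) with the non-linear function f(x) = –x³ + x that appears in the caption of Fig. 29. It is not the model of vowel harmony developed in this dissertation; the function names, step size, and number of steps are assumptions chosen only for the example. The system has stable fixed points at x = 1 and x = –1 and an unstable fixed point at x = 0, so a small continuous difference in the starting value decides which of two qualitatively distinct states is reached.

```python
def velocity(x: float) -> float:
    """Velocity field f(x) = -x^3 + x (zeros at -1, 0, and 1)."""
    return -x**3 + x


def settle(x0: float, dt: float = 0.01, steps: int = 5000) -> float:
    """Integrate dx/dt = f(x) with the forward Euler method from x0.

    dt and steps are illustrative choices, not values from the text.
    """
    x = x0
    for _ in range(steps):
        x += dt * velocity(x)
    return x


# Two nearby starting points on opposite sides of the unstable point at 0
# end up at different attractors.
print(settle(0.1))    # approaches 1
print(settle(-0.1))   # approaches -1
```

The design point of the sketch is that the state variable is continuous throughout, while the long-run behavior falls into one of two categories.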
Despite its success as an alternative to symbol manipulations, the dynamic
approach has focused on a limited set of phonological patterns such as feature
neutralization or allophonic assimilation. These processes are most closely connected
to phonetics, and, although systematic, they represent only a subset of phonological
patterns due to the following characteristics. The majority of these processes are
typically automatic because they apply without exception once their environment is
satisfied. They are usually allophonic because they affect only non-contrastive
features of sounds, and they are local because the trigger and the target of the process
are adjacent. Moreover, they typically depend on the position of the affected sounds
in the prosodic structure, affecting, for example, sounds at the edges of higher-level
categories like prosodic words or phrases. Finally, these processes usually depend on
a particular style or rate of speech.
Processes that operate independently of prosodic structure, style, or rate of
speech, exhibit exceptional behavior, involve non-adjacent sounds, or affect
contrastive features of sounds are considered ‘core’ phonological processes. These
processes are typically modeled as purely symbolic alternations unrelated to the
continuous domain of phonetics. A prime example of such a process is vowel
harmony. This dissertation argues that even the ‘core’ phonological processes such as
vowel harmony can be better understood and receive an explanatory account if
phonetic and phonological facts related to vowel harmony are construed as interdependent and modeled using the tools of nonlinear dynamics.
To summarize, this dissertation presents results from a phonetic and
phonological investigation of transparent vowels in Hungarian. Based on these
results, it is argued that transparent vowels are in fact integral parts of harmonic
domains. A theory strongly rooted in phonetic facts is proposed and the argument that
phonetic details play a crucial role in explaining abstract phonological patterns is
developed. This argument is followed by the construction of a dynamic model that
can account for both a) the phonological process of suffix selection in the vowel
harmony system, and b) the phonetic characteristics of transparent vowels. Hungarian
vowel harmony is thus construed within a unified cognitive system where phonetics
and phonology represent distinct but inter-dependent levels of different granularity.
1.3 Organization of the dissertation
Chapter 2 lays the theoretical foundation for the investigation of Hungarian palatal
harmony. It provides a description of the complex patterns of suffix selection in
Hungarian and reviews the representative proposals for explaining these patterns. The
issues that are at the center of attention in this chapter include general questions of
locality and the nature of transparent and opaque vowels, as well as particular
generalizations observed in the Hungarian data such as the division of stems with
only transparent vowels into those that trigger front suffixes and those that trigger
back suffixes, vacillation in the form of the suffix for individual stems, and the
phonological relevance of the number of intervening transparent vowels between the
trigger and the target of harmony.
Chapter 3 reviews the handful of available acoustic studies involving
transparent vowels, and presents the results of the articulatory investigation of
transparency in Hungarian vowel harmony. The major finding reported in this chapter
is that transparent vowels in stems with phonologically back harmony are slightly,
but significantly, more retracted than transparent vowels in front harmony stems.
Hence, an articulatorily retracted stem-final transparent vowel is followed by a
[+back] suffix whereas a less retracted transparent vowel is followed by a [–back]
suffix. The upshot is that minor phonetic differences in the retraction of the stem-final
vowel correlate with a phonological alternation in suffixes.
Chapter 4 provides a synthesis of the patterns reviewed in Chapters 2 and 3. It
starts by describing the correlations between four articulatory characteristics of front
vowels and the form of the suffix that follows these front vowels. These articulatory
characteristics are tongue body horizontal position, tongue body height, lip rounding,
and perceptual resistance to coarticulation. Drawing on the correlations between these
four phonetic features and the form of the suffix, this chapter argues that the phonetic
and phonological properties of transparent vowels are inter-related. Moreover, it is
argued that existing models of the phonetics-phonology relationship do not fully
capture the generalizations observed in the Hungarian data. The chapter then goes on
to develop an argument that the observed regularities in the form of the suffix
following the front vowels in disharmonic stems are reducible to the independent
quantal relationship between articulation and perception of these front vowels.
Chapter 5 presents a formal model that links small differences in the details of
articulation to the categorical alternation in the form of the suffix. The model is
couched in the framework of nonlinear dynamics (e.g. Percival & Richards 1982,
Kelso et al. 1993, Gafos, to appear) operating over the parameters of dynamic
gestural representations (Browman & Goldstein 1989, 1995, Gafos 2002). The model
consists of two components. In the first component, the relationship between stem-final retraction and other vocalic dimensions (height, rounding, and adjacent vowel
context) is modeled as stem-internal blending of vowel gestures. The current model
of gestural blending (Saltzman and Munhall 1989) is extended by proposing that
vowel gestures blend only to the extent that this blending allows the perceptual
recoverability of the original gestures.
The second component of the model develops an account of the stem-suffix
harmony, i.e. the selection of the [±back] quality of the suffix vowel, and crucially
relies on the degree of articulatory retraction of the stem-final vowel gesture. It is
proposed that the suffix selection follows from the nonlinear dynamic relationship
between the stable order parameter ([±back] of the suffix vowel) and the continuous
control parameter of retraction degree. Effectively, non-contrastive differences in the
degree of tongue body retraction of the transparent vowels result in categorical
alternations in the suffixes that follow them. This is due to the fundamental property
of nonlinearity that allows us to link changes along continuous dimensions to
categorical alternations.
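The way a nonlinear system can turn a small continuous difference into a categorical outcome can be illustrated with a generic tilted double-well system. The sketch below is an illustration only. The potential V(x) = −kx − x²/2 + x⁴/4, the parameter values, the mapping of the two attractors to [±back], and the function name `settle` are all assumptions made for exposition; they are not the model developed in Chapter 5.

```python
def settle(k, x0=0.0, dt=0.01, steps=5000):
    """Integrate dx/dt = k + x - x**3 with forward Euler and return the final state.

    The equation is gradient descent on the assumed potential
    V(x) = -k*x - x**2/2 + x**4/4; k plays the role of a control parameter
    and x the role of an order parameter.
    """
    x = x0
    for _ in range(steps):
        x += dt * (k + x - x**3)
    return x

# Hypothetical control-parameter values on either side of the tipping point k = 0.
for k in (-0.10, -0.05, 0.05, 0.10):
    x = settle(k)
    # Assumed labeling: positive attractor ~ [+back], negative attractor ~ [-back].
    label = "[+back]-like" if x > 0 else "[-back]-like"
    print(f"k = {k:+.2f} -> x = {x:+.3f} ({label})")
```

In this toy system, values of k that differ only slightly but lie on opposite sides of zero settle in different attractors, whereas values on the same side settle close to each other.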
Chapter 6 uses the insights gained from the dynamic modeling of vowel
harmony to develop an integrated model of phonetics and phonology of Hungarian
vowel harmony that is couched in the Optimality Theoretic (OT) framework (Prince
& Smolensky 1993). This framework has proven successful in accounting for cross-linguistic patterns and making typological predictions. The crucial extension of OT
proposed in this chapter is that constraints may be defined as dynamic systems that
control phonetic parameters. In this way, the cognitive system that models the
transparency in vowel harmony is phonetically grounded because constraints are
defined over phonetic dimensions. Crucially, such a system is also phonologically
stable because it does not ascribe relevance to all phonetic details. Rather, the
proposed OT model transforms high-dimensional phonetic variation into a low-dimensional system of (dis)preferred phonetic values.
CHAPTER 2
Phonology of transparent and opaque vowels: Theoretical background
2.1. Introduction
The goal of this chapter is to provide an overview of the phonological patterns that
are at the center of this dissertation: vowel harmony, transparency, and opacity. These
patterns are exemplified in a description of the phonological system of Hungarian
palatal vowel harmony. This chapter thus lays the foundation for the subsequent
chapters that deal with the phonetic description of, and theoretical models for, the
pattern of vowel harmony in Hungarian. After the descriptions of the major patterns,
generalizations arising from the Hungarian data will be discussed and representative
proposals that deal with these generalizations will be critically reviewed.
Section 2.2 describes the phonology of transparent and opaque vowels in
Hungarian palatal vowel harmony. The Hungarian pattern is well documented in the
literature and has been analyzed in several theoretical studies. Yet, the complexity of
the pattern still provides multiple challenges for phonological research, and
constitutes a rich testing ground for the models of vowel harmony.
Section 2.3 summarizes the key features of transparency and opacity that
Hungarian shares with other harmony systems. Representative analyses of these
features are reviewed. It is argued, however, that despite the considerable attention
these patterns have received, a satisfactory account has been elusive and several
issues remain open. Emphasis is given to the nature of transparent vowels and locality
requirements in vowel harmony. In addition, the section discusses the systematicities
in the Hungarian data that have received comparably little attention in the mainstream
literature but also warrant a principled analysis. Section 2.4 summarizes the
discussion and concludes.
2.2. Transparency and opacity in Hungarian vowel harmony: phonological description
2.2.1. Vowel Harmony
Vowel harmony is a phonological regularity where vowels in a word agree in one or
more feature(s). It is a widespread pattern attested in many genetically unrelated
languages. The vocalic features that participate most typically in vowel harmony
systems are the features related to the backness of the tongue body, the rounding of
the lips and the retraction of the tongue root. Agreement in the horizontal position of
the tongue body, called [±back] or palatal harmony, is found in Finnish and Uyghur.
Agreement in the position of the lips, called [±round] harmony, is found in Turkish
and Mongolian. Agreement in the position of the tongue root, called [±ATR]
harmony, is found in Wolof, Akan, Granada Spanish, and other languages.2,3
An important characteristic of vowel harmony is its unboundedness. When
harmony applies, all vowels in the relevant domain typically bear the harmonizing
feature. This domain of application is usually restricted either in morphological or in
prosodic terms. In most common cases, harmony applies within words, i.e. both
stems and affixes participate (see S. Anderson 1980 for discussion on what is
considered a harmony process and van der Hulst & van de Weijer 1995 and
references therein on the advantages and disadvantages of defining the harmony
domain in prosodic or morphological terms).4 Because of its unboundedness,
vowel harmony has a function similar to that of stress in delimiting a word (Trubetskoy
1939), and plays an active role in word segmentation (Suomi et al. 1997, Vroomen et
al. 1998). In the same way that there is only one syllable with primary stress per
word, there is also one value of the harmonizing feature per word.
2 Finnish is a Finno-Ugric language spoken in Finland; Uyghur, Turkish, and
Khalkha Mongolian are Altaic languages spoken in China, Turkey, and Eastern
Mongolia respectively; Wolof is an Atlantic language spoken in Senegal; and Akan is
a Kwa language spoken primarily in Ghana.
3 There are also less common vowel harmony systems where the harmonizing feature
is height or nasality.
4 There is one reported case of harmony spanning a domain greater than a word.
Saeed (1999) claims that ATR harmony in Somali extends to the whole sentence. In
contrast, some harmony alternations are claimed to be restricted to domains smaller
than words such as prosodic feet in Chamorro (Topping 1968).
However, a closer examination of almost any vowel harmony system reveals
that the alternations are hardly regular. There are various restrictions, limiting factors,
and systematic exceptions in this process that provide a rich testing ground for
phonological research. The next section describes a typical example of such a
phonological system: palatal vowel harmony in Hungarian.
2.2.2. Hungarian palatal harmony
Hungarian is a Finno-Ugric language spoken by approximately 10 million people in
Hungary. There are substantial minorities of Hungarian speakers in Romania and
Slovakia. This section presents the phonological patterning of Hungarian vowels in
the process of palatal vowel harmony. Hungarian also exhibits rounding harmony
(Vago 1980, Kornai 1991) that will not be discussed in this thesis.
The Hungarian vowel inventory is shown in (1). The orthographic symbols
are followed by their respective phonetic values in square brackets. The relationship
between orthography and sound is mostly regular. The acute accent denotes length,
the umlaut denotes rounding of front vowels, and the ‘long umlaut’ denotes front
round long vowels. There are two deviations from this regularity. The vowels /e/ and
/é/ are different not only in length but also in height. The vowels /a/ and /á/ differ in
rounding. The standard (Budapest) dialect has a low short [ɛ] whereas some other
dialects have a mid [ë] instead, indicated in parentheses below.5 In the following, I
will follow the tradition in the literature and use orthographic symbols in discussing
Hungarian data.
(1) Hungarian vowel inventory (Ringen & Vago 1998: 394)

            [–back]                           [+back]
            [–round]         [+round]         [–round]    [+round]
    high    i [i]  í [i:]    ü [y]  ű [y:]                u [u]  ú [u:]
    mid     (e [ë]) é [e:]   ö [ø]  ő [ø:]                o [o]  ó [o:]
    low     e [ɛ]                             á [a:]      a [ɔ]
As mentioned above, words in a vowel harmony language typically contain
only vowels that agree in one or more feature(s). In Hungarian palatal harmony, the
harmonizing feature is the position of the tongue body. Hence, all vowels in a word
are drawn either from the ‘front’ set {i í e é ö ő ü ű}, articulated with a frontward
movement of the tongue body, or from the ‘back’ set {u ú o ó a á}, articulated with a
backward movement of the tongue body. An example of the former is vidék
‘countryside’ and of the latter város ‘town’.
The majority of suffixes have two forms: one with a front vowel and another
with a back vowel. Therefore, the [±back] feature of the vowel(s) in harmonic stems
dictates the quality of the vowel(s) in suffixes. This is shown in (2) below.
5 For example, the Transdanubian dialect of Western Hungary (Szilárd Szentgyörgyi,
p.c.).
(2) Regular vowel harmony

    Front (–back)                        Back (+back)
    vidék-től     countryside-Abl.       város-tól     town-Abl.
    öröm-nek      joy-Dat.               mókus-nak     squirrel-Dat.
    hegedű-nél    violin-Adess.          harang-nál    bell-Adess.
    víz-ben       water-Iness.           ház-ban       house-Iness.6
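The regular pattern in (2) can be sketched as a small procedure: classify a harmonic stem as front or back from its vowels, then pick the matching suffix alternant. This is a minimal illustration, not an analysis proposed in this dissertation. The function names `stem_backness` and `attach` are hypothetical, and the sketch deliberately rejects stems that mix front and back vowels.

```python
import unicodedata

# Vowel sets as given in the text (orthographic symbols).
FRONT = set("iíeéöőüű")
BACK = set("uúoóaá")

def stem_backness(stem):
    """Return 'front' or 'back' for a harmonic stem, or None if the stem
    has no vowels or mixes front and back vowels."""
    letters = unicodedata.normalize("NFC", stem.lower())
    vowels = [ch for ch in letters if ch in FRONT or ch in BACK]
    if not vowels:
        return None
    if all(v in FRONT for v in vowels):
        return "front"
    if all(v in BACK for v in vowels):
        return "back"
    return None

def attach(stem, front_form, back_form):
    """Attach the suffix alternant that agrees in backness with a harmonic stem."""
    kind = stem_backness(stem)
    if kind is None:
        raise ValueError(f"{stem!r} is not a harmonic stem")
    return f"{stem}-{front_form if kind == 'front' else back_form}"

for stem, front_form, back_form in [
    ("vidék", "től", "tól"), ("város", "től", "tól"),
    ("öröm", "nek", "nak"), ("mókus", "nek", "nak"),
    ("víz", "ben", "ban"), ("ház", "ben", "ban"),
]:
    print(attach(stem, front_form, back_form))
```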
A particularly interesting fact is that there exist disharmonic stems where
vowels have opposite specifications for backness, as shown in (3) and (4). For
example, papír ‘paper’ has a back vowel followed by a front vowel. The comparison
of data in (3) and (4) reveals that disharmonic stems do not behave uniformly with
respect to suffix selection. The stems in (3) represent the behavior where the suffix
vowel agrees in backness with the stem-initial vowel. On the other hand, the suffix
vowel in (4) agrees with the stem-final vowel.
(3) Disharmonic stems with transparent vowels

    papír-nak     paper-Dat.
    kábít-om      daze-1st sg. def.
    gumi-nak      rubber-Dat.
    Tomi-nak      Tom, dim.-Dat.
    kávé-nak      coffee-Dat.
    bódé-tól      hut-Ablat.
6 Hungarian has a rich system of morphological case marking. Sources assign
between 16 and 24 different case endings. In addition to the standard Nominative,
Accusative, Dative, Instrumental, there are cases denoting spatial and kinetic
conditions. These are expressed with prepositions in English. For example, Ablative
means ‘from (nearby)’, Adessive means ‘at’ and Inessive ‘in’.
(4) Disharmonic stems with opaque vowels

    sofőr-nek     driver-Dat.
    parfüm-nek    perfume-Dat.
    büró-nak      bureau-Dat.
    béka-nak      frog-Dat.
The data in (3) and (4) show that the characteristics of the stem-final vowels
affect the choice of the suffix that follows them. The vowels that allow harmony
between the initial stem vowel and the suffix vowel are called transparent. This is
shown in (3) where the initial and suffix vowels in each word are back despite the
fact that the vowel in the second syllable is front. Since these front vowels do not
prevent the agreement between the two back vowels, they are considered
phonologically transparent.
Opaque vowels are those vowels that impose their own [±back] feature on the
suffix vowel, thereby blocking the agreement between the first and the suffix vowels.
This is shown in (4). As a result, each suffix vowel in (4) agrees with its stem-final
vowel and not with the stem-initial vowel, as is the case in (3).
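The contrast between (3) and (4) can be sketched as a right-to-left scan that skips transparent vowels and stops at the first harmonic vowel. The sketch is deliberately idealized. It treats transparency as categorical, it lets stems containing only transparent vowels default to front suffixes, and the function name `suffix_backness` is hypothetical. It therefore does not capture vacillation or the finer distinctions among transparent vowels described in the remainder of this section.

```python
import unicodedata

TRANSPARENT = set("iíeé")     # front unrounded vowels, skipped by the scan
FRONT_ROUNDED = set("öőüű")   # front harmonic vowels
BACK = set("uúoóaá")          # back harmonic vowels

def suffix_backness(stem):
    """Scan the stem from right to left; the first harmonic (non-transparent)
    vowel decides the suffix. Transparent vowels match neither test and are
    skipped. Stems with only transparent vowels default to 'front'
    (an idealization)."""
    letters = unicodedata.normalize("NFC", stem.lower())
    for ch in reversed(letters):
        if ch in BACK:
            return "back"
        if ch in FRONT_ROUNDED:
            return "front"
    return "front"

for stem in ["papír", "kávé", "gumi", "sofőr", "parfüm", "büró", "béka", "cím"]:
    print(f"{stem}: {suffix_backness(stem)} suffix")
```

Under these assumptions the stems in (3) come out with back suffixes because the scan passes over the final front unrounded vowel, while the stems in (4) are decided by their stem-final vowel.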
A review of the facts that are at the center of this dissertation will be
presented below by listing five salient properties in the behavior of Hungarian
transparent vowels. The set of transparent vowels consists of the front unrounded
vowels {/i/, /í/, /é/, and /e/}. For ease of reference, the following terms will be
adopted: monosyllabic stems with transparent vowels, such as cím ‘address’ or szél
‘wind’, will be referred to as T stems, disyllabic stems where a [+back] vowel
precedes a transparent vowel, such as papír ‘paper’ or kávé ‘coffee’, as BT stems,
and finally, trisyllabic stems where a back vowel precedes two transparent vowels,
such as aszpirin ‘aspirin’ or oxigén ‘oxygen’, as BTT stems.7
The first property of transparent vowels is that they allow phonological
agreement between vowels that are not in consecutive syllables. This is exemplified
with the data in (3), where the initial back vowel in BT stems agrees with the back
suffix vowel although there is an intervening front vowel.
The second property of transparent vowels in Hungarian is their ability to
trigger both front and back suffixes. The majority of T stems trigger front suffixes, as
shown in (5).
(5) Transparent vowels in T stems trigger front suffixes

    ív-em         bow-1st sg. poss.
    cím-hez       address-Allat.
    ing-em        shirt-1st sg. poss.
    hisz-ük       believe-1st pl. poss.
    szél-lel      wind-Instr.
    éj-nek        night-Dat.
7 With respect to the etymology of the T, BT, and BTT stems, most of the T stems
have Finno-Ugric origin. However, most of the BT and BTT stems are borrowings.
Throughout its history, Hungarian borrowed extensively from Slavic languages,
German, Turkish, and Latin. Papp (1982) observed that of a few thousand roots, there
are roughly as many Finno-Ugric (614) as Slavic (569) words, and about half that
number of Germanic words (330). To the best of my knowledge, these borrowings
are integrated in the Hungarian phonological system and do not constitute a separate
group with special phonological patterns as is the case for example in Japanese.
However, there is a set of approximately sixty, mostly monosyllabic, T stems
that trigger back suffixes. A subset of these exceptional stems is presented in (6).
(6) Transparent vowels in T stems trigger back suffixes (N≈60)

    vív-om        inhale-1st sg. poss.
    síp-hoz       whistle-Allat.
    fing-om       fart-1st sg. poss.
    nyit-uk       open-1st pl. poss.
    cél-lal       aim-Instr.
    héj-nak       crust-Dat.
The third property of transparent vowels is their ability to trigger vacillation in
suffixes. Vacillation is variation both across speakers and within an individual
speaker. Some BT stems triggering vacillation are shown in (7). For example,
Hungarian speakers consider both hotel-ben and hotel-ban as acceptable renderings
of ‘hotel-Iness.’.
(7) Some BT stems are vacillating

    hotel-ban/hotel-ben       hotel-Iness.
    Ágnes-ban/Ágnes-ben       Agnes-Iness.
    koffer-ban/koffer-ben     suitcase-Iness.
    affér-ban/affér-ben       affair-Iness.
Vacillation in BT stems is mostly triggered when a back vowel is followed by
the low vowel /e/. This is a salient property of this group of stems (e.g. Vago 1980,
Ringen 1975). Vacillation is also possible for BT stems with long, mid /é/. However,
the preferred suffix following Bé stems is [+back], as shown in (3), and only a
handful of these stems vacillate.8
The fourth property of transparent vowels is that their height correlates with
their degree of phonological transparency. The following three observations can be
drawn from the Hungarian data. First, T stems with {/i/, /í/, /é/} can trigger back
suffixes, as seen in (6), whereas there are no such stems with /e/. Second, BT stems
with the vowels {/i/, /í/, /é/} tend to trigger back suffixes, as seen in (3). But BT
stems with low /e/ show the vacillation pattern, as seen in (7). Third, some BT stems
where /e/ follows a back vowel trigger front suffixes only, e.g. Józef-nek, *Józef-nak.
Hence, [–back] vowels {/i/, /í/, /é/} may trigger [+back] suffixes in T stems, and
trigger mainly [+back] suffixes when preceded by a [+back] vowel in BT stems.
Neither of these properties applies to [–back] /e/, which is followed by [–back]
suffixes in T stems exclusively, and typically triggers vacillation or selects [–back]
suffixes in BT stems. Therefore, the [–low] vowels {/i/, /í/, /é/} differ phonologically
from [+low] /e/ in that the former vowels are more transparent than the latter.
A finer difference can be observed between {/i/, /í/} and /é/. BT stems
with the vowels {/i/, /í/} always trigger back suffixes whereas some such stems with
/é/ may trigger vacillation, e.g. affér-ban/affér-ben ‘affair-Iness.’. Moreover, there are
only two T stems where /é/ triggers back suffixes (cél, héj) whereas all other
members of this approximately 60-member set of T stems are those with the vowels
{/i/, /í/}.
8 The electronic database of Hungarian words (Füredi et al. 2004) lists 96 BT nouns
with /é/ specified for suffix selection. Out of these, 85 select back suffixes, 9 are
listed as vacillating, and 2 as selecting only front suffixes. To compare, the same
database has 125 BT nouns with /e/ specified for suffix selection, out of which 56
select front suffixes, 60 vacillate, and 9 select back suffixes.
Based on these observations, the original binary division of Hungarian vowels
into transparent and opaque should be revised. In this dissertation, phonological
transparency is defined as the potential of stem-final vowels specified as [α back] to
allow [–α back] suffixes. In contrast, vowels exhibit harmonic behavior when they
are specified as [α back] and trigger only [α back] suffixes. It is proposed that
transparency is a scalar property, and that {/i/, /í/} are maximally transparent, /é/ is
slightly less transparent, and /e/ is minimally transparent. Hence, the lower the vowel,
the less transparent it is.
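The scalar view can be made concrete with the counts for BT nouns given in footnote 8 (Füredi et al. 2004). The short calculation below only converts those reported counts into percentages; the variable and function names are illustrative, and no comparable counts for {/i/, /í/} are given in the text, so they are left out.

```python
# Counts of BT nouns by stem-final vowel and suffix behavior, as reported in footnote 8.
counts = {
    "é": {"back": 85, "vacillating": 9, "front": 2},
    "e": {"back": 9, "vacillating": 60, "front": 56},
}

def shares(row):
    """Convert raw counts into percentages of the row total."""
    total = sum(row.values())
    return {k: 100 * v / total for k, v in row.items()}

percentages = {vowel: shares(row) for vowel, row in counts.items()}
for vowel, row in percentages.items():
    summary = ", ".join(f"{k} {p:.1f}%" for k, p in row.items())
    print(f"B{vowel} stems (N = {sum(counts[vowel].values())}): {summary}")
```

On these counts, back suffixes are the clear majority choice after /é/ and a small minority after /e/, in line with the ordering of /é/ above /e/ on the transparency scale.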
This correlation is also supported in those dialects where the short /e/ has two
allophones: low [ɛ] and mid [ë]. Stems with a back vowel followed by low [ɛ]
behave harmonically and select front suffixes whereas stems with a back vowel
followed by the mid vowel [ë] behave transparently (Sz. Szentgyörgyi, p.c.).
The final property of transparent vowels is their cumulativity. Stems with two
transparent vowels following a back vowel (BTT stems) behave differently in terms
of suffix selection from BT stems (Ringen and Kontra 1989, van der Hulst 1985,
Kaun 1995). Some representative examples are listed in (8) and (9) where all stems
are followed by the Inessive suffix -ban/ben.
(8) BTT                                   BT
    kabinet-ben/*ban    administration    hotel-ben/ban        hotel
    november-ben/*ban   November          stopper-ben/ban      stopwatch
    oxigén-ben/*ban     oxygen            szomszéd-ban/*ben    neighbor
    alpesi-ben/*ban     alpine            hasi-ban/*ben        abdominal

(9) BTT                                   BT
    aszpirin-ban/ben    aspirin           kabin-ban/*ben       cabin
    kolibri-ban/ben     humming bird      zokni-ban/*ben       sock
    kombiné-ban/ben     slip              gané-ban/*ben        manure
    agresszív-ban/ben   aggressive        masszív-ban/*ben     massive
In (8), BTT stems select front suffixes whereas similar BT stems all accept
back suffixes either exclusively or as vacillation. In (9), all BTT stems vacillate
whereas BT stems with similar vocalic profiles trigger back suffixes. The stem-final
transparent vowels in BTT stems are identical to the stem-final vowels in the paired
BT stems. Hence, the difference in suffix selection arises from the presence of an
additional transparent vowel in BTT stems.
To support the difference between BT and BTT stems in suffix selection,
consider data reported by Farkas & Beddor (1987) shown in the left column in (10).
The data show that addition of a transparent vowel to a BT stem affects the choice of
the suffix: BT stems are followed by [+back] suffixes whereas BTT stems vacillate.
The second and third columns show the results of an empirical study by Ringen &
Kontra (1989), in which subjects were asked to provide a suffix for these BT and
BTT stems. The numbers correspond to the percent values of the subjects selecting
front (%F) or back (%B) suffixes. The data in (10) confirm that addition of a
transparent vowel to a BT stem results in increased probability of selecting a front
suffix.
(10) BT stems vs. BT+T stems

                                 %F      %B
    mam + i - nak                0.0     100.0   ‘mother+diminutive1-Dat.’
    mam + csi - nak                              ‘mother+diminutive2-Dat.’
    mam + i + csi - nak/nek      45.7    54.3    ‘mother+dim1+dim2-Dat.’
    Acél                                         ‘Acél (family name)’
    Acél-nak                     2.9     97.1    ‘Acél-Dat.’
    Acél + ék - nek/nak          25.0    75.0    ‘Acél+collective+Dat.’
The difference between BT and BTT stems can also be observed by a
quantitative analysis of the corpus of Hungarian nouns specified for suffix selection
(Füredi et al. 2004). This analysis confirms that BTT stems prefer front suffixes
whereas BT stems prefer back suffixes. Of the stems in the corpus that do not
vacillate (74% for BT and 78% for BTT), 80% of BTT stems select front suffixes and
20% select back suffixes. In comparison, only 22% of BT stems select front suffixes
and as many as 78% select back suffixes.9
9 This calculation includes Be stems, hence the relatively high number of front
suffixes.
The data from BTT stems also corroborate the correlation between height and
transparency discussed above as the fourth property of transparent vowels. The vowel
/e/ behaves less transparently than {/i/, /í/, /é/} also in BTT stems. The following
generalizations can be drawn from the study of 119 BTT nouns and adjectives in the
corpus.10 If both transparent vowels are /e/, front suffixes follow (1 exception). Of
those BTT stems that select front suffixes (N = 71), all but 8 have at least one /e/. Of
those that select back suffixes (N = 23), only 3 have /e/. Therefore, in the BTT stems
that do not vacillate, 63 out of 66 stems (95%) where one transparent vowel is /e/
select front suffixes. In contrast, 20 out of 28 (71%) BTT stems with transparent
vowels other than /e/ select back suffixes. In short, the presence of /e/ in BTT stems
implies that front suffixes follow. On the other hand, there is no clear tendency if {/i/,
/í/, /é/} are present, and each of the three options (a front suffix, a back suffix, or
vacillation) is possible. This means that /e/ behaves less transparently in BTT stems
than {/i/, /í/, /é/}.
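The proportions cited above follow directly from the reported counts, as the short check below shows. The variable names are illustrative. The number of vacillating stems is not stated in the text; it is derived here on the assumption that every one of the 119 stems either selects front suffixes, selects back suffixes, or vacillates.

```python
total_btt = 119                        # BTT nouns and adjectives in the corpus
front_total, front_without_e = 71, 8   # front-selecting stems; those lacking /e/
back_total, back_with_e = 23, 3        # back-selecting stems; those containing /e/

front_with_e = front_total - front_without_e      # front-selecting stems with /e/
back_without_e = back_total - back_with_e         # back-selecting stems without /e/

with_e = front_with_e + back_with_e               # non-vacillating stems with /e/
without_e = front_without_e + back_without_e      # non-vacillating stems without /e/

# Derived under the stated assumption, not reported in the text.
vacillating = total_btt - (front_total + back_total)

print(f"/e/ present: {front_with_e}/{with_e} select front = {front_with_e / with_e:.0%}")
print(f"/e/ absent:  {back_without_e}/{without_e} select back = {back_without_e / without_e:.0%}")
print(f"vacillating stems (derived): {vacillating}")
```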
The BTT data call for a re-analysis of the term ‘transparent vowel’ with
respect to the {/i/, /í/, /é/, /e/} set in Hungarian. Traditionally, {/i/, /í/, /é/} were
considered transparent and the status of /e/ was controversial as some argued for its
transparency (Vago 1980) and some for its opacity (Ringen 1975). BTT data show
that under some circumstances even {/i/, /í/, /é/} behave opaquely. The context in
which they occur (BT vs. BTT) has an effect on their transparency. This supports the
hypothesis that phonological transparency is not a binary but a scalar quality (see also
Ringen & Kontra 1989, Kaun 1995).
10 BTT verbs are not included because almost all of them are morphologically
decomposable into stem+ik. Therefore, the two transparent vowels in these BTT
stems do not have identical morphological categories; one is a stem vowel and one is
a suffix vowel. This contrasts with BTT nouns and adjectives where both transparent
vowels belong to the stem. Moreover, including the ‘-ik’ BTT verbs would create a
bias because all have /i/ as one of the stem transparent vowels.
Additionally, the effect of the number of transparent vowels on suffix
selection is not random. Increasing the number of the transparent vowels that follow a
back vowel results in a lower degree of phonological transparency. This means that,
in general, a single transparent vowel from the set {/i/, /í/, /é/} becomes less
transparent when joined with another transparent vowel, and minimally transparent
/e/ ceases to be transparent in the same situation.
Finally, the differences among the transparent vowels observed for T and BT
stems carry over to BTT stems. The vowel /e/ differs from the rest of the transparent
vowels in that it is most likely to be followed by front suffixes and is thus the least
transparent. This effect can be observed to a lesser degree with /é/. Therefore, internal
differences among the four transparent vowels based on their height are real and
should be given a principled explanation.
2.3. The challenge of Hungarian vowel harmony
Vowel harmony, transparency, and opacity have been at the center of phonological
research; some major advances in phonology such as non-linear representations were
motivated by vowel harmony data. Sub-sections 2.3.1 and 2.3.2 discuss general
features of transparency and opacity that Hungarian shares with other harmony
systems. Sub-sections 2.3.3 to 2.3.5 discuss patterns that are specific to Hungarian
palatal harmony.
2.3.1. Locality
Transparent vowels in Hungarian are phonologically [–back], yet they allow
agreement (harmony) between non-adjacent [+back] vowels. In this sense, vowel
harmony appears to be a non-local process because the trigger and the target of this
process are not adjacent: both consonants and vowels intervene.
However, it has been argued that locality plays an important role in
understanding phonological processes (Clements & Hume 1995, McCarthy 1989,
Odden 1994, Steriade 1995). For instance, the process of nasal place assimilation is
very common when a nasal and an obstruent are adjacent. But the same process is not
attested when a vowel intervenes between the nasal and the obstruent (Clements &
Hume 1995). Locality construed as adjacency at some level of representation carries
a lot of explanatory power in phonology.11 Therefore, an analysis of transparency in
vowel harmony that respects locality is desirable.
There are three general types of approaches to the issues of transparency and
locality. The first proposes that the transparent vowel is a full member of the
harmonic domain throughout the phonological derivation (e.g. Gafos 1999, Ní
Chiosáin & Padgett 2001). The second explicitly excludes transparent vowels (and
intervening consonants) from the harmony domain and construes vowel harmony as a
non-local process (e.g. Ringen 1975, Vago 1980, S. Anderson 1980). The third and
the most common option is to parameterize locality at some level of representation or
derivation (e.g. McCarthy 1989, Archangeli & Pulleyblank 1994, Piggott 1996). In
the following, I will briefly review these three approaches.
2.3.1.1. Strict locality
In this approach to issues of locality and transparency, the harmonizing feature
affects all locally adjacent members in the harmony domain (consonants and vowels
alike). Hence, this approach predicts that transparent vowels do undergo harmony and
should differ depending on the harmony context. This approach, which I will refer to
as the Strict Locality hypothesis (SL), is characterized with the following statement:
“… segments are either blockers or participants in spreading: there is no
transparency” (Ní Chiosáin & Padgett 2001: 2).
11 The notion of locality is not restricted to phonology. It also plays a crucial role in
constraining movement in syntax.
As an example of SL, consider the proposal in Gafos (1999: 43-46). In
Gafos’s model, vowel harmony is construed as a continuous activation of a
harmonizing gesture. The consonantal gestures are assumed to be super-imposed on
the vowel gestures, which yields adjacency of vowel gestures across intervening
consonantal gestures (e.g. Öhman 1966, Fowler 1983). In contrast, consonantal
gestures are not adjacent across an intervening vowel gesture (Gafos 1999, Ní
Chiosáin & Padgett 2001).12
Gafos proposed that the presence of the global harmony gesture yields a local
relationship between the trigger and the target of vowel harmony including the
intervening transparent vowel. His idea is schematically illustrated in Fig. 1 with the
example of transparency in Hungarian.
[Figure 1: gestural score for papír-nak. Vowel gestures: a = { , wide}; í = {palatal, narrow}; a = { , wide}. Shaded box spanning the word: Uvular.]
Fig. 1 – Continuous activation of Tongue Body Constriction Location
(TBCL) = {uvular} in papír-nak
12 See Gafos (1999) for the implications of this asymmetry between CVC and VCV
sequences in the patterning of consonantal and vowel harmony systems.
As discussed before, /í/ is transparent in Hungarian words such as papír-nak ‘paper-Dat.’. It is [–back] while the trigger and the target of harmony are [+back]. The
harmonizing feature [+back] is formalized by Gafos as a global harmony gesture
representing this feature. In the case illustrated in Fig. 1, the harmony feature [+back]
corresponds to a gesture specified for the ‘uvular’ constriction location of the tongue
body. Fig. 1 shows this global harmony gesture as the shaded box. As can be seen, both
initial and final vowel gestures have their constriction location ‘borrowed’ from the
specification of the harmony gesture. Hence, the global harmony gesture acts as a
‘bridge’ that is contiguous over the whole domain of the word. This is another
formalization of the idea that in languages with vowel harmony, a harmonizing
feature occupies a separate tier and is linked to all vowels in a word.13
Fig. 2 represents the relationship of vowel gestures in a language without
vowel harmony. The figure shows that the absence of the global harmonizing gesture
results in such a constellation of vocalic gestures where the two back vowel gestures
are not locally adjacent.
[Figure 2: vowel gestures a = {uvular, wide}; í = {palatal, narrow}; a = {uvular, wide}, with no global harmony gesture.]
Fig. 2 – Non-local relationship between the initial and final /a/.
13 The specification in Fig. 1 is assumed to apply to all words in a vowel harmony
language. Hence, even in words with no suffixes such as papír ‘paper-Nom.’, the
initial /a/ would borrow its uvular specification from the harmonizing gesture.
The crucial proposal that differentiates the strict locality approach from other
approaches is the suggestion that the global harmony gesture (uvular) coexists with
the local gesture for the transparent vowel (palatal).14 Gestural overlap is one of the
core properties of speech and the situation in Fig. 1 is an instantiation of such overlap
where a single articulator (tongue body) is required to be at two different locations
(uvular and palatal). Gafos (1996: 44) hypothesized that “[t]he result of overlap is
that the actual constriction location of [SB: a transparent vowel] is effectively
retracted”. Therefore, the prediction of the Strict Locality hypothesis is that [–back]
transparent vowels in a [+back] harmony domain are articulated with more tongue
body retraction than in a [–back] harmony domain.
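The direction of this prediction can be illustrated with a simple weighted-average view of two overlapping gestures that compete for the tongue body. Everything in the sketch is hypothetical: the one-dimensional front-back scale, the target values, the weights, and the assumption that a [–back] domain contributes a palatal harmony target. Weighted averaging is used only as a convenient stand-in for gestural overlap, not as the formulation adopted by Gafos.

```python
def blended_location(local_target, harmony_target, local_weight, harmony_weight):
    """Weighted average of two competing constriction-location targets."""
    total = local_weight + harmony_weight
    return (local_weight * local_target + harmony_weight * harmony_target) / total

# Hypothetical scale: 0.0 = palatal, 1.0 = uvular (larger = more retracted).
PALATAL, UVULAR = 0.0, 1.0

# Hypothetical weights: the vowel's own palatal gesture dominates the blend.
LOCAL_WEIGHT, HARMONY_WEIGHT = 0.9, 0.1

in_front_domain = blended_location(PALATAL, PALATAL, LOCAL_WEIGHT, HARMONY_WEIGHT)
in_back_domain = blended_location(PALATAL, UVULAR, LOCAL_WEIGHT, HARMONY_WEIGHT)

print(f"transparent vowel in a [-back] domain: {in_front_domain:.2f}")
print(f"transparent vowel in a [+back] domain: {in_back_domain:.2f}")
```

With these illustrative values the vowel remains close to its palatal target in both domains, but it is slightly more retracted in the back domain, which is the kind of small, non-contrastive difference the hypothesis predicts.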
Support for Gafos’s proposal may be found in experimental work by Boyce
(1988). Boyce showed that in Turkish, the rounding harmonic gesture in a V1CV2
sequence, where both V1 and V2 are [+round], is present during the consonant as
well. Furthermore, continuous rounding was present even when three consonants
intervened, as in V1CCCV2. This pattern was notably different from coarticulatory
rounding observed on consonants in the same V1CCCV2 sequence in languages
without vowel harmony such as English.
14
As will be discussed below, several approaches assume a similar process of
spreading the harmony feature throughout the whole domain, but at the same time
include a rule or constraint that deletes the specification [+back] (TBCL = uvular)
from the transparent vowel /í/.
Boyce’s findings suggest that rounding in Turkish is not a property of vowels
only. Rather, it is a property of the whole word and affects all segments within this
domain. Following the extension of this idea to palatal vowel harmony by Gafos
(1999), transparent vowels are predicted to be subject to vowel harmony. Just as
Turkish consonants in a [+round] harmony domain surface with non-contrastive
rounding, Hungarian front vowels in a [+back] harmony domain should surface with
non-contrastive retraction.
2.3.1.2. Absence of locality
The second option for dealing with transparency in vowel harmony is to explicitly
exclude transparent vowels (and intervening consonants) from the harmony domain
and construe vowel harmony as a non-local process. This approach is exemplified
with analyses where the phonological harmony between the trigger and the target
vowels skips transparent vowels and consonants. For example, to account for the
presence of transparent vowels and the irrelevance of consonants, Ringen’s (1975)
harmony rule in (11) ‘skips over’ these vowels and consonants. This is shown with
the bracket notation in the structural description of this rule. The brackets indicate
that zero or more consonants (C₀) and, most relevantly, non-low unrounded vowels, if
any, can intervene between the trigger vowel specified as [β back] and the target
vowel marked with the underline. A similar strategy of skipping the transparent
vowel(s) and other consonants in the application of the vowel harmony rule was
advocated in Vago (1980), S. Anderson (1980), and others.
(11)
Vowel harmony rule for Finnish (Ringen 1975: 24)
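The effect of a skipping rule of this kind can be sketched procedurally. The Python fragment below is an illustration only and is not Ringen's formalism: the function names, the vowel sets, and the orthography-based classification of Finnish vowels are assumptions introduced here. The suffix vowel copies its backness from the nearest preceding harmonic vowel, while consonants and neutral vowels are passed over.

```python
# Illustrative sketch only: the vowel classes are a simplified, assumed
# classification of Finnish vowels by orthography.
BACK = set("aou")      # harmonic back vowels
FRONT = set("äöy")     # harmonic front vowels
NEUTRAL = set("ie")    # non-low unrounded vowels, skipped by the rule


def suffix_backness(stem: str) -> str:
    """Return '+back' or '-back' for a suffix attached to `stem`.

    Scans leftward from the stem edge, skipping consonants and neutral
    vowels, and copies the backness of the first harmonic vowel found.
    Stems without a harmonic vowel receive the default '-back'.
    """
    for segment in reversed(stem.lower()):
        if segment in BACK:
            return "+back"
        if segment in FRONT:
            return "-back"
        # consonants and neutral vowels fall through and are skipped
    return "-back"


def essive(stem: str) -> str:
    """Attach the essive suffix -nA with the harmonically selected vowel."""
    return stem + ("-na" if suffix_backness(stem) == "+back" else "-nä")
```

Under these assumptions, `essive("koti")` returns `koti-na`: the scan passes over /i/ and /t/ and finds the back vowel /o/. A stem containing only neutral vowels falls through to the default [–back] suffix.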
One of the most compelling arguments for the non-local nature of some vowel
harmony processes was developed in S. Anderson (1980). Anderson pointed out that
the phonological transparency of vowels typically correlated with the absence of the
harmonic counterparts for these vowels in the inventory. This is the case of
Hungarian described in Section 2.2.2 because the [+back] counterparts of the
transparent /i/ and /e/ vowels are missing from the vowel inventory. However,
Anderson noted that this is not always the case. For example, /i/ is transparent in the
rounding harmony of Khalkha Mongolian because in forms such as döči-ööd ‘by
forties’ /i/ is [–round] in the [+round] domain. But, the language inventory of
Khalkha Mongolian includes [+round] /ü/. Also, in addition to the transparency of /i/
and /e/ in the palatal harmony of Finnish, high front rounded /y/ sometimes behaves
transparently because it is [–back] in the [+back] domain in forms such as marttyyrius ‘martyrdom’. However, the [+back] counterpart of /y/ is /u/, which is common in
Finnish. In both Finnish and Khalkha then, the transparent vowels do have their
harmonic counterparts and, as Anderson observed, it was not clear why these vowels
were not subject to harmony (*döčü-ööd, *marttuurius). Anderson concluded:
“…this shows that neutral vowels cannot actually be integral parts of harmonic
domains … but must rather be skipped over … in assigning harmonic features” (S.
Anderson 1980: 32).
However, this argument was weakened by the observations in van der Hulst
(1985) and Steriade (1995). With respect to Khalkha Mongolian, van der Hulst cites
Chinchor (1979) who noted that the high rounded vowels do not behave as regular
harmonic vowels because they block spreading of the [+round] feature. This was
subsequently supported by Svantesson (1985). Steriade argued that /i/ is not subject
to harmony due to the privative nature of the [round] feature. Van der Hulst
suggested that the special behavior of /ü/ in Khalkha’s rounding harmony allowed for
lexical specification to prevent /i/ from acquiring [+round] by a phonological
process.15
15
Moreover, /ü/ is analyzed as the [+back] [+ATR] vowel [u] while the traditional /u/
is analyzed as [–ATR] [U] (Svantesson 1985). It is then possible that the special status
of /ü/ in Khalkha Mongolian arises from constraints on the ATR rather than the
[round] harmony pattern.
With respect to Finnish, as will be discussed in more detail in Chapter 4, the
transparency of /y/ proposed by Anderson may be only apparent (Campbell 1980,
Wiik 1995, Välimaa-Blum 1999). Campbell (1980) refers to the transparent behavior
of /y/ as indicating a more prestigious and learned style while the harmonic behavior
marks a more colloquial style. Based on this, van der Hulst (1985) claims that the
patterns found in colloquial styles are more phonologically natural, and the
prestigious style may arise from the application of a non-phonological rule. In
addition, Anderson’s argument is weakened by the observation that a front rounded
[y] is often perceptibly retracted in the back harmony domain (Wiik 1995,
Välimaa-Blum 1999). Hence, [y] in these cases changes to [u] or [u] and is thus subject to
[+back] harmony, contra Anderson’s claim.
A major weakness of the non-local approaches to transparency is that they
over-generate and predict that any vowel in a particular harmony system may behave
transparently. This prediction is not supported: cross-linguistic studies reveal
systematic implicational generalizations (e.g. L. Anderson 1980). At the same time,
the non-local approach is very appealing due to the straightforward account of both
transparency and opacity. The studies reviewed in the next section address this
weakness by parameterizing locality.
2.3.1.3. Parameterized locality
The third, and the most common, option for dealing with transparency while
maintaining locality is to parameterize locality at some level of representation or
derivation. These approaches use various technical devices to exclude the transparent
vowel (and other consonants) from the surface harmonic domain while preserving
some, typically abstract, notion of locality. Hence, locality is respected at some
(non-surface) level of derivation, at a representational level (using geometrical hierarchical
representations and/or underspecification), or as a violable OT constraint. I will start
by discussing proposals for preserving adjacency of vowels across intervening
consonants, and then move to the proposals that parameterize the locality hypothesis
to account for the existence of transparent vowels.
One of the problems for the analyses of harmony that used linear
representations of the SPE style (Chomsky & Halle 1968) was how to limit the
application of harmony rules to vowels and some secondary consonantal features.
With the advances of non-linear phonology, two types of proposals for defining
vowels as the units bearing the harmony feature for vowel harmony processes were
put forth. The first one uses hierarchical representations of features where all
segments are assumed to have a Place node that subsumes features such as [labial],
[coronal], and [dorsal] (Sagey 1986, Clements & Hume 1995). In addition, vowels
are assumed to have a V-Place node. Vowel harmony is then construed as spreading
of a harmonizing feature [±F] on the tier occupied by the V-Place node. This is
shown schematically in (12).
(12)
Vowel harmony as autosegmental spreading on the V-Place node
    C  –  V  –  C  –  V  –  C
    |     |     |     |     |
   Pl.   Pl.   Pl.   Pl.   Pl.
          |           |
        V-Pl.       V-Pl.
           \         /
             [±F]
Sagey (1986) and others have convincingly argued that the hierarchical
non-linear representation is superior to the SPE tradition of representing a sound as an
unordered set of feature values. For example, the hierarchical representations are
more restrictive than the SPE ones because they do not allow arbitrary pairing of
features in phonological rules. In addition, McCarthy (1989) argued that the division
of consonants and vowels into separate tiers, also called C/V planar segregation, is
necessary in the phonological analysis of non-concatenative languages such as
Arabic.
However, the evidence for such a rich representational structure as the
V-Place tier is very limited. Gafos (1998) argued that C/V planar segregation was not
needed and proposed to re-analyze McCarthy’s data as reduplication. Vowel harmony
thus remained the sole argument for the V-Pl representation.
The second possibility for restricting the target of harmony to vowels is to
formalize harmony as a relationship between adjacent prosodic nodes (van der Hulst
& van de Weijer 1995 for a review, Piggott 1996). Krämer (2002) developed a
recent version of this proposal where he characterized harmony as a Syntagmatic
Correspondence between adjacent prosodic nodes, which would be syllables in the
Hungarian case.16 Since vowels are considered syllable heads, the harmonizing
feature percolates from the syllable nodes down to the vowel as its head. In this
approach, consonants are assumed unaffected by the harmonizing feature because
they are not syllable heads. This proposal is schematically illustrated in (13), where each
eligible vowel in the harmony domain is specified for the harmonizing feature.
(13)
Vowel harmony as autosegmental spreading on the syllable node
        [±F]
       /    \
      σ      σ
    C V  .  C V C
16
The arguments for the prosodic parameterization come from Yucatec Maya vowel
harmony (Krämer 2001). Krämer observed that onset consonants in Yucatec Maya are
transparent to the vowel harmony and coda consonants are opaque. This led Krämer
to suggest that harmony in Yucatec Maya operated on the moraic level; coda
consonants are moraic whereas the onset ones are not.
After discussing proposals for limiting the target of a harmony process to
vowels, I now review representative proposals for parameterization of the locality
hypothesis directly related to the analysis of transparent vowels. These proposals can
be divided into two categories: those assuming that some version of the strict locality
hypothesis is correct, but that it cannot be maintained at all derivational levels (e.g.
Clements 1976, Ní Chiosáin & Padgett 2001), and those excluding the transparent
vowels from vowel harmony due to their markedness characteristics but maintaining
an abstract (representational or operational) notion of locality (e.g. Archangeli &
Pulleyblank 1994, Smolensky 1993, Kaun 1995, Walker 2003, Baković & Wilson
2000).
The approaches that belong to the first category are based on the insights of
the root-marker (e.g. Lightner 1965) and early autosegmental (e.g. Clements 1976)
analyses of vowel harmony where the harmonizing feature was assumed to affect all
vowels in a harmony domain. For example, in the autosegmental approach, spreading
of the harmonizing feature to all vowels in a word was due to the Universal
Association Convention (UAC, Goldsmith 1976), which prohibits both gapped
association and line crossing. This is shown in (14).
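(14)
Gapped configurations are prohibited
[Diagram: two starred (ill-formed) configurations over a sequence of three anchors X X X, one involving a single +F autosegment and one involving both +F and –F.]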
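The two prohibitions can be made concrete with a small well-formedness check. The following Python sketch is only an illustration under assumptions of my own: association lines are encoded as (feature position, anchor position) pairs, the function names are hypothetical, and the set of anchors that count as eligible is passed in explicitly.

```python
from typing import List, Tuple

# Hypothetical encoding: each association line is a pair
# (feature_position, anchor_position), both 0-based indices on their tiers.
Link = Tuple[int, int]


def has_line_crossing(links: List[Link]) -> bool:
    """True if a feature further left is linked to an anchor further right
    than some anchor of a feature to its right."""
    return any(f1 < f2 and s1 > s2 for f1, s1 in links for f2, s2 in links)


def has_gap(links: List[Link], eligible: List[int]) -> bool:
    """True if a feature skips an eligible anchor lying between two
    anchors to which it is linked."""
    for feature in {f for f, _ in links}:
        anchors = sorted(s for f, s in links if f == feature)
        first, last = anchors[0], anchors[-1]
        if any(first < s < last and s not in anchors for s in eligible):
            return True
    return False


gapped = [(0, 0), (0, 2)]             # +F linked to X1 and X3 only
crossed = [(0, 0), (0, 2), (1, 1)]    # +F on X1 and X3, -F on X2
spread = [(0, 0), (0, 1), (0, 2)]     # +F linked to every anchor
```

With all three anchors eligible, `gapped` is ruled out by `has_gap`, `crossed` by `has_line_crossing`, and `spread` passes both checks. Making eligibility a parameter is a design choice of the sketch: it lets the same check be run with a reduced set of anchors.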
Due to these constraints, the most common approach to transparency in
autosegmental theory was to weaken the purely representational character of harmony
(spreading to satisfy UAC) and invoke a hybrid approach that used both
representations and rules. In such an approach, transparent vowels underwent
harmony locally and received the harmonic feature. Then, a late neutralization rule
de-linked this feature and provided a default value for the transparent vowel (e.g.
Clements 1976).
A similar idea was developed in more recent work by Ní Chiosáin & Padgett
(2001). Ní Chiosáin & Padgett assumed that the transparent vowel respects locality at
one level, but the vowel quality is ‘repaired’ at another stage. Therefore, the local
relationship in this theory holds in a representation that never surfaces. The case
study used by Ní Chiosáin & Padgett is the transparency of [e] in a limited tense/lax
harmony in Pasiego (Penny 1969, 1970). In their account, vowel [e] changed to [E] as
a result of laxing harmony. However, the vowel space for lax vowels is limited and
[E] would not be sufficiently contrastive from other lax vowels.17 Hence, [E] is
considered more marked than /e/. As a result, [E] is not allowed on the surface and
changes back to [e].
17
In their formalism of this proposal, Ní Chiosáin & Padgett assume the theory of
auditory contrast in Flemming (1995).
In the second category of proposals, the difference between transparent and
opaque vowels in terms of their participation in vowel harmony is ascribed to the
differences in their markedness. As noted in the discussion of S. Anderson (1980)
above, the major insight was that in many vowel harmony systems the presence of
transparent vowels usually corresponds with the absence of their harmonic
counterparts in the inventory. For example, the [–back] vowels /i/ and /e/ behave
transparently in the palatal vowel harmony of Finnish. Their [+back] unrounded
non-low counterparts are missing in the Finnish vowel inventory. Hence, the contrast in
terms of [±back] does not exist for the non-low unrounded vowels and the default
value for these vowels in Finnish is [–back]. In contrast, the Turkish vowel inventory has
the high back unrounded vowel /ɯ/ and the front /i/ does not behave transparently.
The idea of building theories of phonology that are further constrained by
minimizing the information present in the lexicon prompted the development of
Underspecification Theory (e.g. Archangeli 1988, Steriade 1987). The rationale
behind this approach is that there is no need to specify in the lexicon that which is
predictable on independent grounds.18 The central idea of underspecification that is
relevant for vowel harmony is that elements with unmarked predictable feature values
typically do not participate in processes involving marked features: “… redundant
phonological features in language are inert, neither triggering phonological rules nor
interfering with the workings of contrastive features.” (Itô et al. 1995: 571). In the
case of palatal vowel harmony, the predictable and thus unmarked specification of the
transparent vowels is [–back]. The absence of [–back] in the underlying specification
of the transparent vowels allows them to be unaffected by the [+back] spreading. I
will briefly review several formalizations of this idea.19
18
Certain types of underspecification were commonly assumed before
Underspecification Theory such as in the archisegmental (e.g. Ringen 1975) and
autosegmental (e.g. Clements 1976) analyses of harmony. For example, the spreading
of a harmonizing feature to suffix vowels takes place only if these vowels are not
already specified for this feature; i.e. they have to be underspecified.
An underspecification analysis of transparency assuming non-linear
representations can be illustrated with the treatment of Finnish vowel harmony where
the transparent front non-low unrounded vowels /i/ and /e/ do not have their [+back]
counterpart. In an analysis assuming underspecification, there is no need to specify
the feature [back] in the underlying representation of /i/ and /e/ because its [–back]
value is predictable. The underspecification analysis is sketched in (15) with the
example of koti-na ‘home-Essive’.
(15)
Underspecification treatment of transparency
    kO.tI-nA    [+back] present but not yet linked to any vowel
    ko.tI-nA    [+back] linked to the first vowel
    ko.tI-na    [+back] also linked to the suffix vowel; I is passed over
    ko.ti-na    [–back] supplied to I
19
There are two types of underspecification in the literature: contrastive and radical.
In the first, lexical representations contain only those specifications that are necessary
to distinguish among phonemes. In the second, only the specifications that are
considered marked, either in general, or in a language-specific sense, are allowed.
The other value (non-contrastive or unmarked) is assumed not to be active until late
in the derivation when it is supplied by a (default) rule. For the transparent [–back]
vowels in palatal vowel harmony, both Radical and Contrastive Underspecification
make the same claim. A comparison of these two approaches is beyond the scope of
this dissertation. For a relevant summary and critical discussion, see Steriade (1995).
A disharmonic stem with a transparent vowel is specified for the harmonizing
feature [+back]. Following the Universal Association Convention, [+back] is linked
to the first vowel. The transparent vowel is not specified for the [back] feature,
effectively making it invisible to the harmony tier of [back]. The suffix vowel is then
linked to the [+back] autosegment without causing line crossing. The derivation is
completed with the rule filling the predictable value [–back] for the transparent
vowel. Opacity is accounted for by underlyingly specifying opaque vowels.20
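The steps of this derivation can be sketched in a few lines of Python. The sketch is illustrative only and rests on assumptions introduced here: capital letters encode vowels unspecified for [back], as in the transcription kO.tI-nA, and the mapping table, the class names, and the function name are hypothetical. Association and spreading target only harmonic vowels, so the transparent vowel plays no part until the default rule applies.

```python
# Illustrative sketch only. The encoding is an assumption made for this
# example: capital letters stand for vowels unspecified for [back].
SURFACE = {
    ("O", "+"): "o", ("O", "-"): "ö",
    ("A", "+"): "a", ("A", "-"): "ä",
    ("I", "-"): "i", ("E", "-"): "e",
}
HARMONIC = {"O", "A"}      # targets of [back] association and spreading
TRANSPARENT = {"I", "E"}   # carry no [back] specification during spreading


def derive(underlying: str, floating: str = "+") -> str:
    """Spell out a form after association, spreading, and default fill."""
    back = {}
    # Association and spreading: the floating value links to every
    # harmonic vowel; transparent vowels are not anchors on this tier.
    for pos, seg in enumerate(underlying):
        if seg in HARMONIC:
            back[pos] = floating
    # Late default rule: transparent vowels receive [-back].
    for pos, seg in enumerate(underlying):
        if seg in TRANSPARENT:
            back[pos] = "-"
    return "".join(
        SURFACE[(seg, back[pos])] if pos in back else seg
        for pos, seg in enumerate(underlying)
    )
```

Calling `derive("kO.tI-nA")` yields `ko.ti-na`. Because the transparent vowel never receives [+back], no later rule is needed to remove it.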
In a variation of this approach, van der Hulst & Smith (1986) proposed that
the harmonizing feature of the transparent vowels is specified underlyingly, rather
than supplied at later stages of derivation. The inactivity of this vowel with respect to
the harmony process was formalized as a morpheme-structure condition that
protected the underlyingly specified feature value from the effect of a phonological
harmony rule.21 This approach is illustrated in (16) with the same Finnish word as
before. The transparent [i] is linked with the [–back] feature that cannot spread. The
remaining two vowels receive a default specification, in this case [+back].
(16)
    kO.ti.nA    [–back] linked underlyingly to i
    ko.ti.na    [+back] [–back] [+back]
20
See Kenstowicz (1994) for a discussion of technical problems involved in splitting
the [+back] domain in order to insert a [–back] feature.
21
The necessity of morpheme-structure conditions in phonology has been challenged
in more recent work (e.g. McCarthy 1998, Gafos 2003).
The authors claim that this type of analysis achieves a straightforward account
of transparency. However, the simple solution for the transparent vowels makes the
account for opaque vowels more complicated. It is not possible to assume that opaque
vowels are underlyingly specified because they would not differ from transparent
vowels.
Instead, van der Hulst & Smith (1986) invoked the prosodic hierarchy where
not only syllables and moras but also individual segments are assumed to occupy a
separate tier. In addition, parameters of accessibility and association for each vowel
are introduced. The markedness of the harmonizing feature in the opaque vowels is
analyzed as their inaccessibility because they are bound on the segmental tier. Hence,
opaque vowels are not subject to harmony. Yet, their feature is allowed to spread
outside the boundary to the suffix vowels. Transparent vowels are accessible but
underlyingly linked to the harmonizing feature whereas other harmonic vowels are
also accessible but received their harmonic specification via spreading and not via
underlying association. Therefore, avoiding problems connected to the analysis of
transparent vowels requires additional machinery and assumptions for dealing with
opaque vowels.
A slightly different approach was proposed in Kiparsky’s analysis of Finnish
palatal harmony (1973). Kiparsky argued that all suffixes in Finnish are underlyingly
specified as [+back] (cf. Vago’s (1980) argument for underlyingly [–back] feature for
some suffixes in Hungarian). Hence, in disharmonic Finnish stems such as koti
‘home’ the suffix would be [+back] underlyingly. Because the stem-final vowel in
koti is stipulated as transparent, therefore not spreading its [–back] feature, the suffix
remains [+back]: koti-na ‘home-Essive’. In this analysis, however, the agreement
between the [+back] stem-initial vowel /o/ and the [+back] suffix vowel /a/ is not
explained because these two vowels do not interact in any way. This leads to
problems in accounting for the [±back] form of the suffixes following the stems with
only transparent vowels. In both Finnish and Hungarian, the default form of the suffix
in these stems is [–back].
Locality is thus maintained in the underspecification approach by making use
of a) the predictability of the unmarked [–back] feature of the transparent vowels, and
b) hierarchical non-linear representations. Based on the theory of markedness, the
underspecification approach assumes a kind of representation for the transparent
vowels that allows preserving a representational notion of locality following from the
theory of non-linear representations.
Analyses in the framework of Government Phonology stem from the advances
of autosegmental theory (Kaye et al. 1985, van der Hulst 1988, Ritter 1995, Dienes
1997). The primitives of this theory are monovalent (single-valued) units U, I, and A
(Kaye et al. 1985, Goldsmith 1985) that are arranged on three independent tiers. A
major innovation of this approach is the establishment of a (governing) relationship
among the primitives of a segment. Hence, each segment has one head and one or
two operators. For example, [i] is represented with the head I (I), [ü] with the head U
and the operator I (I.U), and [ö] with the head U and operators A and I (AI.U).
Vowel harmony in this approach is construed as local spreading of an F
element, F ∈ {I, U, A}. Due to the added structure of the governing relationships, the
nature of feature spreading is local. However, various constraints on underlying
representation and type of the spreading responsible for harmony prevent ‘docking’
of the spreading feature on the transparent vowel. This effectively results in
transparent vowels being excluded from vowel harmony. In this way, the analyses in
Government Phonology share the predictions and problems of the other
parameterized locality approaches discussed in this section.
The type of locality parameterization used in the analyses within
Optimality Theory (Prince & Smolensky 1993) is typically based on the contrastive
underspecification that follows from differences in markedness of individual
sounds.22 In OT, phonological regularities derive from the resolution of the conflict
between two or more parallel requirements of a phonological system formulated as
OT constraints.
22
Section 6.2 in Chapter 6 includes a brief introduction to the main concepts of OT.
In the case of transparency and locality, one requirement is that vowel
harmony applies locally. This requirement is typically formalized with the ALIGN
family of constraints (McCarthy & Prince 1993, Kirchner 1993) or the AGREE family
of constraints (e.g. Baković 2000, Kiparsky & Pajusalu 2003). In the former, a
phonological feature F (e.g. [+back]) aligns with a morphological category M (e.g.
‘word’) and keeps the insight of the root-marker and autosegmental approaches that
the harmonizing feature spreads from the trigger (auto)segment to the whole word. In
the latter, harmony is viewed as an agreement between adjacent vowels, which is a
parallel re-analysis of the iterative application of the vowel harmony rule in the SPE
tradition.23
The other requirement is that the marked back unrounded non-low vowels are
prohibited. In other frameworks, the exclusion of transparent vowels from harmony
was achieved with a skipping condition in a vowel harmony rule, a late neutralization
rule, a morpheme structure condition, or underspecification. In OT, the same is
achieved by a constraint that prevents linking of transparent vowels to the
harmonizing feature [+F]. This constraint states that the featural specification that
combines the features of a transparent vowel and [+F] is disallowed: *TV[+F].
23
These two approaches (Align vs. Agree) differ in several important areas. See the
critique of the Align approach in Baković (2000) and Krämer (2002). However, the
differences between them are orthogonal to the reliance on the parameterized notion
of locality formalized as a violable ALIGN or AGREE constraint.
The two requirements, formalized as the constraints ALIGN[+F] (or
AGREE[+F]) and *TV[+F], present a conflict. In disharmonic stems with a [–F]
transparent vowel and [+F] harmony, there is no surface form that satisfies both
constraints. Either [+F] extends over the whole domain of the word, which violates
*TV[+F], or the transparent vowel surfaces as [–F], which violates ALIGN[+F]. In
OT, a conflict between constraints is resolved by their ranking. Hence, the ranking of
ALIGN[+F] and *TV[+F] determines the output of vowel harmony. If ALIGN[+F]
dominates *TV[+F], a case like Turkish with no transparent vowels emerges. In the
opposite case, *TV[+F] dominates ALIGN[+F] and the pattern of transparency as in
Hungarian or Finnish is derived.
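The effect of re-ranking can be illustrated with a minimal evaluation sketch in Python. Everything in it beyond the two constraint names is an assumption made for the illustration: candidates are reduced to the [F] values of three vowels, the second vowel is stipulated to be the transparent one, ALIGN[+F] is counted as one violation per vowel not reached by [+F], and strict domination is modelled as lexicographic comparison of violation profiles.

```python
from typing import Callable, List, Sequence, Tuple

Candidate = Tuple[str, ...]             # [F] value of each vowel: '+' or '-'
Constraint = Callable[[Candidate], int]

TRANSPARENT_POSITIONS = {1}             # assumed: the second vowel is transparent


def align_f(cand: Candidate) -> int:
    """ALIGN[+F]: one violation per vowel that is not [+F] (simplification)."""
    return sum(1 for value in cand if value != "+")


def no_tv_plus_f(cand: Candidate) -> int:
    """*TV[+F]: one violation per transparent vowel surfacing as [+F]."""
    return sum(1 for pos, value in enumerate(cand)
               if pos in TRANSPARENT_POSITIONS and value == "+")


def optimal(cands: Sequence[Candidate], ranking: List[Constraint]) -> Candidate:
    """Strict domination: compare violation profiles lexicographically."""
    return min(cands, key=lambda c: tuple(con(c) for con in ranking))


CANDIDATES = [
    ("+", "+", "+"),   # [+F] extends over the whole word
    ("+", "-", "+"),   # the transparent vowel surfaces as [-F]
]
```

With the ranking `[align_f, no_tv_plus_f]` the fully harmonic candidate wins, which corresponds to the Turkish-type pattern. With `[no_tv_plus_f, align_f]` the candidate with the [–F] medial vowel wins, which corresponds to the transparency pattern.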
Below I discuss two proposals based on the resolution of the conflict between
a *TV[+F] constraint and a harmony (AGREE or ALIGN) constraint that leads to the
parameterization of locality.
Baković & Wilson (2000) showed that a simple markedness constraint
preventing a [–F] transparent vowel from being affected by a [+F] harmony feature
yields unwelcome results when harmony is analyzed with the strictly local AGREE
constraints. Applied to the Hungarian data, this analysis predicts the opaque behavior
of /í/: *papír-nek. This is because the candidate papír-nak violates the AGREE
constraint more (both ‘a-i’ and ‘i-a’ disagree in backness) than the candidate
papír-nek (only a-i disagree). To repair this problem, Baković & Wilson proposed a
‘targeted’ markedness constraint that ensures the transparent behavior of the vowel
that is minimally deviant from the vowel affected by the local application of
harmony: papír-nak is minimally different from the fully harmonic, but disallowed
*pap[ɯ:]r-nak.
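The violation counts behind this prediction can be checked mechanically. The short Python sketch below is an illustration, not part of Baković & Wilson's analysis: the function names are hypothetical, and vowel backness is read off Hungarian orthography under the assumption that a, á, o, ó, u, ú are back and all other vowel letters are front.

```python
# Illustrative sketch; backness is read off the orthography (an assumption).
BACK_VOWELS = set("aáoóuú")
FRONT_VOWELS = set("eéiíöőüű")


def vowel_backness(word: str) -> list:
    """Return the [back] value ('+' or '-') of each vowel, left to right."""
    return ["+" if ch in BACK_VOWELS else "-"
            for ch in word if ch in BACK_VOWELS | FRONT_VOWELS]


def agree_violations(word: str) -> int:
    """Count pairs of adjacent vowels that disagree in backness."""
    values = vowel_backness(word)
    return sum(1 for left, right in zip(values, values[1:]) if left != right)
```

Under these assumptions papír-nak incurs two violations and papír-nek one, so a strictly local AGREE constraint on its own prefers the unattested form.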
Crucially, the local harmony constraint AGREE is not respected in the words
with transparent vowels. The authors, however, maintained that locality is preserved.
They argued that locality is respected through their formalization of constraints: “…
there is no constraint in our analysis that evaluates candidates by comparing the
feature specifications of non-adjacent segments …” (Baković & Wilson 2000: 54).
Therefore, in this approach strict locality is also parameterized: all primitives of the
analysis respect locality, but the surface forms violate it due to constraint ranking.24
24
From a technical point of view, Baković & Wilson had to introduce powerful
machinery to the OT evaluation. Specifically, their analysis requires that candidates
are evaluated in a pair-wise fashion. Despite their effort to constrain the machinery
for selecting the pair of special candidates, the pair-wise evaluation weakens one of
the main advantages of OT: its parallel nature of evaluation. See also a similar
account couched in Sympathy Theory (McCarthy 1999) by Walker (1998), and
discussion of this proposal in Krämer (2001: 165).
The second approach to transparency in the OT framework that I will discuss
is proposed in Kiparsky & Pajusalu (2003). The basic idea of underspecification, that
less-marked (less-contrastive) vowels impose less restriction on their environment, is
re-analyzed as a degree to which vowels tolerate disharmony. In Kiparsky & Pajusalu
(2003), the markedness of a vowel determines whether it agrees or disagrees with
adjacent vowels. Because the transparent vowels are assumed to be unmarked, they
are allowed to disagree with adjacent vowels, as is the case in Hungarian’s papír-nak
‘paper-Dat.’. The opaque vowels are assumed to be marked, and hence are required to
agree with at least one adjacent vowel, as is the case in parfüm-nek ‘perfume-Dat.’.
It may seem that locality is respected in Kiparsky & Pajusalu’s analysis
because it assumes that there is no interaction between the first and the third vowel of
papír-nak. Hence, the [+back] suffixes following BT stems do not arise from
agreement with the initial back vowels. Rather, both initial and suffix vowels interact
with the transparent vowel separately and disharmony of /í/ with these vowels is
tolerated due to its unmarked quality. However, a closer examination of this proposal
reveals a crucial need for a non-local mechanism.
To account for transparency, Kiparsky & Pajusalu must extend the domain of
the originally local AGREE constraint:
“… we need a constraint that somehow forbids disharmony across intervening
neutral vowels [SB: such as disharmony between the initial back vowel and
the suffix vowel in *papír-nek]. When that constraint dominates the stricter
local harmony constraint AGR[EE](Back), transparency results” (Kiparsky &
Pajusalu 2003: 9).
This is achieved by generalizing the domain of evaluation of a conjoined
constraint that militates against disharmony of marked vowels. Then, for the case of
*papír-nek, the initial vowel /a/, although unmarked, disagrees with the following /i/.
In contrast, the suffix vowel /e/, although in agreement with /i/, is a marked vowel. A
high-ranking constraint against this disharmony eliminates *papír-nek but requires
the type of evaluation that simultaneously refers to the non-local relationships
(a-í, í-e) by one constraint.
To summarize the proposals assuming a parameterized notion of locality,
underspecification does away with the need to account for transparency with
powerful neutralization rules or problematic abstract stages. This is because
transparent vowels never undergo harmony in the underspecification approach. In
addition, the underspecification approach captures the intuition that the vowels with
unmarked or noncontrastive feature values typically behave transparently whereas
those with marked or contrastive ones behave opaquely. Hence, the
underspecification approach captures the increased flexibility of transparent vowels
compared to the opaque vowels. Due to the predictability of their [–F] value in the
harmony process, the transparent vowels display compatibility with both [+F] and
[–F] adjacent vowels. At the same time, the approaches subscribing to a parameterized
notion of locality assume that transparent vowels do not participate in vowel
harmony. This prediction is shared with the non-local approaches, and contradicts the
prediction of the strict locality hypothesis. These predictions will be tested in the
experiments described in Chapter 3.
2.3.2. The nature of transparent vowels
In Hungarian, front unrounded vowels {/í/, /i/, /é/, /e/} behave transparently whereas
front rounded vowels {/ü/, /ű/, /ö/, /ő/} behave opaquely. This distribution of front
vowels is representative of many palatal vowel harmony systems, and to my
knowledge there is no language in which this distribution is reversed (e.g. L.
Anderson 1980, Kiparsky & Pajusalu 2003). Hence, an analysis that provides an
explanation for this cross-linguistic regularity is highly desirable. The main question
related to this distribution is whether the correlation between rounding and opacity in
palatal harmony is accidental. In general, it is desirable to derive the phonological
behavior related to transparency, opacity, and the mentioned correlations from
independent facts, rather than stipulate them.
In early phonological analyses, the division of front vowels into transparent
and opaque was stipulated without independent motivation. Therefore, irrespective of
the technical means for differentiating the transparent and opaque vowels, the reason
for this particular division of vowel inventories in palatal vowel harmony systems
was not provided. In addition, the proliferation of rules and restrictions to the
application of rules to account for the ‘special character’ of transparent and opaque
vowels made the understanding of the nature of these vowels even more elusive. For
example, Kiparsky (1982) observed that for a basic analysis of Finnish facts the
group of transparent vowels had to be mentioned as many as nine times in five
different rules while powerful devices such as non-local phonological rules and
morpheme conditions had to be introduced into phonological theory.
Advances in underspecification and markedness theories made it possible to
capture the observation that transparent behavior is characteristic of vowels
specified for the unmarked or non-contrastive value of the harmonizing feature,
whereas opaque vowels are typically specified for the marked or contrastive
value. A very productive line of research used the combination of cross-linguistic
markedness scales with language-specific ranking of other phonological demands
within the OT framework to derive the difference between transparent and opaque
behavior of vowels (e.g. Smolensky 1993, Kirchner 1993). The motivation for the
division into transparent and opaque vowels was a structural one: it was determined
by general and language-particular asymmetries in the symbolic inventory of a
particular harmony system.25
One of many recent examples of such an approach can be found in Krämer
(2002). Krämer argued that the motivation for the transparent behavior of Finnish
front, non-low, unrounded vowels lies in the absence of a harmonic counterpart to
25
A similar case can be made for the asymmetry among consonants in terms of their
participation vs. non-participation in harmony. Usually the dorsal (and uvular)
consonants participate in palatal vowel harmony, and coronal and labial consonants
do not. A structural motivation for this phonological asymmetry may be attributed to
the featural geometry where the vocalic feature [back] is dominated by the dorsal
node.
these vowels in the Finnish inventory: “… the [SB: transparent] behavior of the
vowels i and e can be seen as a marking or compensation strategy triggered by the
imbalance they cause in another dimension of the phonology, the vowel inventory.”
(Krämer 2002: 43).26
It is plausible that there is indeed a close relationship between the asymmetry
in the inventory and the division of vowels into transparent and opaque. However,
this relationship does not necessarily have to be a causal one. It also might be the case
that both the gap in the inventory as well as the differences in their phonological
behavior can be reduced to some independently required underlying property.
For example, Kaun (1995), Baković & Wilson (2000), and Gafos (1999)
suggest that phonetic characteristics of transparent vowels play a role in the way they
pattern phonologically. In this way, transparency and opacity derive from constraints
on articulatory-acoustic relationships in vowels.
Kaun (1995) construed vowel harmony as a process that extends the temporal
span of the harmonic features in order to facilitate better perception of these features.
In her view, transparent vowels are those vowels “… [whose] occurrence does not
26
In Krämer’s analysis, transparent vowels are underlyingly specified and actively
enforce either harmony or disharmony with adjacent vowels. Hence, transparent
vowels impose significant restrictions on their environment because they have to be
either flanked from both sides by front vowels or by back vowels. In this sense,
Krämer’s approach is a reversal of the underspecification approach where transparent
vowels are analyzed as tolerating disharmony due to their unmarked quality.
constitute a substantial interruption of the signal associated with the extended [SB:
harmonic] feature” (Kaun 1995: 142). In this approach, harmony is a perceptually
driven mechanism where transparent vowels are more perceptually compatible with
the harmonizing feature than opaque vowels are.
In addition, Kaun extended the binary division of underspecified and specified
and, by extension, transparent and opaque segments by proposing a ‘transparency
continuum’. This universal hierarchy is based on the perceptual compatibility of
various segments with the harmonizing feature, hence more levels of transparency
were possible to formalize. This is also a welcome result because, as argued in
Section 2.2.2 of this dissertation, the phonological behavior of Hungarian front
vowels in vowel harmony is better characterized as a continuum between opacity and
transparency rather than a binary division between them.
As will be discussed in Chapter 4, motivating transparency and opacity on
perceptual grounds achieves some success in explaining the differences between
transparency and opacity. It is, however, not sufficient to explain all observed
patterns. In addition, Kaun’s model predicts that the perceptual features of /i/ as a
transparent vowel should be compatible with the harmonizing feature [+back].
Several studies that will be reviewed in Chapter 4 do not support this prediction. It
will be proposed that considering both articulatory and perceptual patterns provides a
better understanding of the phonological patterns of transparency and opacity.
Considerations for both articulatory and acoustic properties in motivating
transparency and opacity in vowel harmony are employed in Baković & Wilson
(2000) and Gafos (1999). Baković & Wilson (2000) analyzed data from the ATR
harmony pattern of Wolof in which [+high, +ATR] vowels (i, u) are transparent and
their [–ATR] counterparts are missing in the Wolof vowel inventory. The crucial OT
constraint that is responsible for the transparency of /i/ and /u/ is defined as a
conjunction of articulatory and acoustic requirements on these vowels. Articulatorily,
a [+high, –ATR] specification imposes antagonistic requirements on tongue
movement (Archangeli & Pulleyblank 1994): it requires elevation of the tongue body
and simultaneous retraction of the tongue root. These competing requirements,
according to Baković & Wilson, make [+high, –ATR] vowels articulatorily marked.
In addition, the competing articulatory requirements result in less than perfect
acoustic contrast with other vowels. Hence, [+high, –ATR] vowels are also
perceptually marked.
For Baković & Wilson (2000), the nature of the transparent vowels is then in
the relationship between articulation and acoustics. More specifically, their targeted
constraint allows only those articulatory changes that result in a vowel perceptually
very similar to a target vowel.
In Gafos (1999), the articulatory-acoustic relationship of vowels plays an
important role in the argument that transparent vowels are not excluded from vowel
harmony alternations. Under this approach, the nature of transparent vowels lies in
the non-linear relationship between the articulatory and acoustic properties of these
vowels. Considering palatal vowel harmony systems such as Hungarian, Gafos
pointed out that articulatory variation of non-low, front, unrounded vowels in the
front-back dimension does not result in comparable acoustic variation (Stevens 1989,
Wood 1979). Hence, it is possible that the assumed tongue retraction, caused by the
overlap of the gesture for /i/ with the global harmony [+back] gesture (0 in Section
2.3.1.1), falls within an acoustically stable region with minimal acoustic
consequences. In this view, the transparency of /i/ and /e/ in palatal harmony arises
from independent quantal features of these vowels because they have limited acoustic
sensitivity to certain articulatory changes.27
In short, the difference between transparent and opaque vowels in terms of
their phonological behavior may arise from independently motivated constraints on
articulatory-acoustic relationships.
In addition to the major division between rounded and unrounded front
vowels, there are also differences within the set of transparent vowels. Most
strikingly, the data in Section 2.2.2 showed that the phonological behavior of the low
/e/ in the harmony pattern differs from the behavior of non-low {/i/, /í/, /é/}. The
27
The studies of Stevens (1989) and Wood (1979) will be reviewed in more detail in
Chapter 4.
status of /e/ has been controversial in the literature on Hungarian palatal harmony.
Ringen (1975) argued that /e/ is a harmonic vowel in Hungarian. Vago (1980), on the
other hand, argued that /e/ is transparent due to the back suffixes following the word
maszek ‘self-employed’. This claim is refuted in van der Hulst (1985) and Ringen
(1975).28
However, analyzing /e/ in these accounts as having a special status does not
explain the relationship between /e/ and the rest of the transparent vowels. Vago
(1980) considered /e/ transparent, but subject to two different vowel harmony rules
while {/i/, /í/, /é/} were subject to a single rule. Ringen (1975) argued that /e/ is
non-transparent and hence triggers front suffixes in BT stems. In her account, vacillation
results from the optional application of a suffix-backing rule for Be stems. Van der
Hulst (1985) assumed that vacillation in Be stems arises because surface /e/ has two
distinct underlying representations: “… a root with an /e/ in the final syllable cannot
be mapped onto a unique underlying representation…” (van der Hulst 1985: 282).
Can articulatory-acoustic differences between /e/ and {/i/, /í/, /é/} be linked in
a principled way to the differences in phonological transparency in palatal harmony?
If the answer is positive, then the mentioned differences between /e/ and {/i/, /í/, /é/}
in their phonological behavior might be explained without losing the explanation for
28
Van der Hulst questions its morphology since maszek is an acronym made from the
first syllables of magán szektor ‘private sector’. Ringen claims that maszek is a
vacillating stem, which also conforms to the intuitions of my informants.
their common characteristics. I will argue that the proposal developed in Chapters 4
and 5 offers a plausible approach along these lines.
2.3.3. Exceptionality of híd-type stems
A particularly challenging generalization in the Hungarian data presented in
Section 2.2.2 is that stems with only transparent vowels (T stems) trigger both front
and back suffixes. The default and productive pattern for T stems is to trigger front
suffixes, but a limited set of stems trigger back suffixes. This pattern is not unique to
Hungarian; a similar division of T stems into two groups is observed in Uyghur
(Lindblad 1990). However, the default and productive pattern in Uyghur is that T
stems trigger back suffixes and only exceptionally front ones.
In any analysis, the transparent vowels in T stems that select back suffixes
(e.g. híd-nak ‘bridge-Dat.’) have to contrast underlyingly with the transparent vowels
in those T stems that select front suffixes (e.g. víz-nek ‘water-Dat.’). This is because
the choice of the suffix is not predictable from the phonological environment.29 At the
same time, traditional analyses assumed that the mechanism responsible for the
selection of back suffixes in híd-type stems is not identical to the mechanism for
29
In Finnish, some monosyllabic stems also trigger back suffixes. However, this
pattern is limited to vowel-initial derivational suffixes. Therefore, this pattern can be
derived from the difference between C-initial and V-initial suffixes. For example,
Kiparsky & Pajusalu propose to formalize it as mis-alignment between the syllable
and the morpheme boundaries (Kiparsky & Pajusalu 2003: 6, fn7).
selecting back suffixes in papír-type stems (e.g. Ringen 1975, Vago 1980). This is
because in papír-nak, the [+back] value of the suffix vowel is typically derived from
the [+back] value of the initial vowel. In contrast, there is no such trigger present in
the híd-type stems. Therefore, it appears that three types of transparent vowels should
be considered phonologically different: those in víz-type stems, those in híd-type
stems, and those in papír-type stems. Informally, /í/ in the víz-type stems triggers
front suffixes, /í/ in the híd-type stems triggers back suffixes, and /í/ in the papír-type
stems appears transparent.
Under this assumption, various formal means have been proposed for the
definition of the híd-type stems as different from both víz-type and papír-type stems.
For example, Vago (1980) argued that the stem vowels in the híd-type stems are
specified underlyingly as [+back], trigger the [+back] form of the suffixes, and then
change to [–back] by a late neutralization rule. Ringen (1975) argued that híd-type
stems are marked with a diacritic that triggers a minor rule of suffix backing
(disharmony). Finally, van der Hulst (1985) proposed a floating autosegment [+back]
that is restricted to surface on the suffix vowel and not on the stem vowel.
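The logic of a rule-ordering account such as Vago's can be made concrete with a small sketch. The code below is only an illustration under simplifying assumptions: the feature dictionaries, the function names, and the exact formulation of the neutralization rule are hypothetical and are not taken from Vago (1980). It shows that harmony has to apply before neutralization if híd is to surface with a front vowel while still selecting a back suffix.

```python
# Illustrative sketch of an ordered-rule derivation in the spirit of
# Vago (1980). Feature dictionaries and rule formulations are hypothetical.

def harmonize_suffix(stem_vowels, suffix_vowel):
    """Step 1: the suffix vowel takes the [back] value of the last stem vowel."""
    result = dict(suffix_vowel)
    result["back"] = stem_vowels[-1]["back"]
    return result


def neutralize(stem_vowels):
    """Step 2 (late rule): non-low unrounded vowels surface as [-back]."""
    surface = []
    for vowel in stem_vowels:
        if not vowel["round"] and not vowel["low"]:
            surface.append(dict(vowel, back=False))
        else:
            surface.append(dict(vowel))
    return surface


def derive(stem_vowels, suffix_vowel):
    """Apply harmony first, then neutralization; return (stem, suffix)."""
    suffix = harmonize_suffix(stem_vowels, suffix_vowel)
    return neutralize(stem_vowels), suffix


# Assumed underlying forms: the vowel of híd is abstractly [+back],
# the vowel of víz is [-back]; the dative suffix vowel is unspecified.
HID = [{"back": True, "round": False, "low": False}]
VIZ = [{"back": False, "round": False, "low": False}]
DATIVE = {"back": None, "round": False, "low": True}

for name, stem in [("híd", HID), ("víz", VIZ)]:
    surface_stem, suffix = derive(stem, DATIVE)
    form = "-nak" if suffix["back"] else "-nek"
    print(name, "stem [back]:", surface_stem[0]["back"], "suffix:", form)
```

Reversing the two steps would leave both stems with a [–back] vowel at the point where harmony applies, so the contrast between híd-nak and víz-nek would be lost; this is why the neutralization rule has to be late in such an account.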
However, none of these solutions provide a principled explanation for the
correlation valid in Hungarian data and observed in Section 2.2.2: the relation
between height and transparency of the vowels in the T stems is just like that for the
vowels in the BT stems. In other words, the group of T stems that trigger back
suffixes is a phonologically arbitrary collection because the form of the suffix cannot
be predicted from either the vowels or the consonants of these stems. Hence, listing
the T stems that select back suffixes as exceptions would not explain the
systematicity in the data related to the correlation between height and transparency.30
Kiparsky’s (1973) approach to Finnish vowel harmony involves a proposal
that seems to avoid the problem mentioned above. In his analysis, the transparent
vowels in both híd-type stems and papír-type stems are unified because both types of
stems are assumed to be excluded from vowel harmony by a diacritic [–Vowel
Harmony]. As a result, under Kiparsky’s assumption that the suffix vowels are
specified as [+back], the default back forms of the suffix are generated. However, the
diacritic solution for the papír-type stems does not explain why precisely the stems
with an initial back vowel pair with the exceptional híd-type stems in selecting back
suffixes. Additionally, the assumed [+back] underlying value for Hungarian suffixes
is controversial. Vago (1980) argued that at least some suffixes are underlyingly
[–back], because when they form an independent harmony domain and surface as
clitics, they are [–back]. On the other hand, van der Hulst (1985) questioned the
assumption that suffixes and independent stems are derived from the same synchronic
underlying representation even if they may be related diachronically.
30
Vago (1980) used a similar type of argument to support the underlying [+back]
specification of híd-type stems.
In short, a satisfactory analysis of suffix selection in T stems should explain
common characteristics of transparent vowels in T and BT stems, as well as maintain
the distinction between the T stems that select front suffixes and those that select
back suffixes.
2.3.4. Vacillating stems
The vacillation pattern in the Hungarian data reviewed in Section 2.2.2 constitutes
another challenge for the theoretical models of harmony. Recall that stems such as
hotel or aszpirin may be followed by either front or back suffixes. Hungarian
speakers readily accept this duality and it can be observed both within and across
individual speakers. Hence, I assume in this dissertation that vacillation is a
free-variation pattern.31 The source of this variation is thus likely to be in the phonology of
Hungarian vowel harmony.32
31
Although the literature known to me does not mention any known sociolinguistic
factors influencing which suffix is chosen at a given moment, it is interesting that
there seems to be a difference based on age. Ringen and Kontra (1989) observed a
statistical tendency for the adult speakers to prefer front suffixes in the disharmonic,
mostly BTT type stems. On the other hand, Gósy (2000), using the same stimuli
stems as Ringen and Kontra, found the opposite tendency for children around the age
of 10. Moreover, the quality of short /e/, which seems to interact with the pattern of
suffix selection, is likely influenced by sociolinguistic factors (E. Sólyom, p.c.)
32
Ringen and Kontra (1986) observed that the choice of the suffix in vacillating
stems can be affected by the front/back vowel of a clitic several syllables before the
suffix vowel. Hence, vacillation may be context dependent.
To complicate the matter, the vacillation pattern cannot be linked to any
single vowel (e.g. vacillating stems hotel, affér, aszpirin). This is because vowels
triggering vacillation in one context do not have this quality when they appear in a
different context. For example, /i/ in buli ‘party’ does not trigger vacillation whereas
/i/ in aszpirin ‘aspirin’ does. Therefore, all transparent vowels seem to be able to
trigger vacillation under certain conditions and the patterns cannot be analyzed by
invoking exceptionality of a particular vowel, as was the case for /e/ reviewed in
Section 2.3.2.
Moreover, vacillation in terms of suffix selection is skewed: the distribution
of suffixes in vacillating stems is hardly ever 50% front and 50% back suffix. Rather,
empirical studies reported statistical tendencies in the vacillation pattern (e.g. Ringen
& Kontra 1989, Gósy 2001, Hayes 2004). For example, Be stems are more likely to
trigger front suffixes than Bé stems, which are in turn more likely to trigger front
suffixes than Bi stems. Due to the assumption that the source of these tendencies is in
the phonological grammar, there is a problem of reconciling the symbolic discrete
nature of grammar with the continuous nature of the observed tendencies. Hence, a
satisfactory model of transparency in Hungarian vowel harmony has to be flexible
enough to model both discrete alternations in the form of the suffix (in each
instantiation every suffix is either [+back] or [–back]) as well as continuous statistical
tendencies in these alternations (Be stem is more likely to be followed by a front
suffix than a Bé stem is).
2.3.5. Multiple transparent vowels
The last challenge posed by the Hungarian data to be discussed in this chapter is the
effect of the number of transparent vowels on the form of the suffix. It was observed
in Section 2.2.2 that the number of consecutive transparent vowels that follow a back
vowel affects transparency: the more transparent vowels, the less transparent the
behavior they exhibit. The problem of capturing the complexity of BT and BTT data
can be expressed as follows. On the one hand, a desirable analysis has to allow
seemingly non-local agreement across a transparent vowel in the BT stems. On the
other hand, though, such an analysis should also explain the deterioration of this
agreement with increasing distance between the trigger and the target.
Although successful in accounting for the BT pattern, analyses in which
transparent vowels do not participate in vowel harmony fail to account for
BTT stems, and the BTT data show that such analyses miss an important
generalization. In these analyses, the
number of transparent vowels is not expected to affect suffix selection since
participation in vowel harmony is a categorical property. This is a problem in
accounts where the transparent vowels are skipped over in the application of the
vowel harmony rules (Anderson 1980, Vago 1980, Ringen 1975), as well as in many
approaches couched in the Autosegmental, Government Phonology, or OT
frameworks that use underspecification or late neutralization rules (Clements 1976,
Van der Hulst & Smith 1986, Ritter 1995, etc.). In all of these analyses, a vowel is
either transparent or harmonic (opaque). Whether there are one or two transparent
vowels following a back vowel should not make any difference in suffix selection
because neither one of the transparent vowels participates in vowel harmony.
However, as the difference between BT and BTT data shows, the number of
transparent vowels does affect the suffix selection.
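The categorical prediction criticized here can be illustrated with a minimal sketch of a skip-over rule. The function name and the default-to-front clause for T stems are assumptions made for illustration; the stems are taken from the examples discussed in this chapter. Because transparent vowels are simply skipped, the rule returns the same value for a BT stem (buli) and a BTT stem (aszpirin), regardless of how many transparent vowels intervene.

```python
# Illustrative sketch only: a categorical "skip-over" harmony rule.
# The vowel classes follow the Hungarian inventory; the function name
# and the string labels are hypothetical.

BACK = set("aáoóuú")
TRANSPARENT = set("iíeé")
FRONT_ROUNDED = set("üűöő")


def predicted_suffix(stem: str) -> str:
    """Return 'back' or 'front' for a stem under a skip-over analysis."""
    for ch in reversed(stem.lower()):
        if ch in TRANSPARENT:
            continue  # transparent vowels are invisible to the rule
        if ch in BACK:
            return "back"
        if ch in FRONT_ROUNDED:
            return "front"
    # No harmonic vowel found (T stems): assume the default front suffix.
    return "front"


for stem in ["buli", "aszpirin", "víz"]:
    print(stem, predicted_suffix(stem))
```

Such a rule has no way to express that aszpirin vacillates while buli does not; it also assigns front suffixes to every T stem, so híd-type stems would need separate lexical marking.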
There are two types of proposals for dealing with BT and BTT stems in a
unified way. In the first one, Kaun (1995) derives the difference between BT and
BTT stems from a proposed ‘transparency continuum’. She argues that “… two
syllable peaks … constitute an interruption of the harmonic span that is excessively
substantive” (Kaun 1995: 149). This approach is promising; however, it also seems
insufficient because Kaun does not offer any precise expression for the relationship
between a temporal span of a feature and the perception of this feature. In addition,
Kaun’s analysis is able to derive the opaque behavior of BTT stems but not the
salient pattern of vacillation seen with such stems.
In the second type of proposals, it is argued that the difference in suffix
selection is related to differences in the prosodic structure of these stems (Ringen &
Kontra 1989, Ringen & Heinämäki 1999). In Hungarian, primary stress falls on the
leftmost syllable. The issue of secondary stress is more controversial: some authors claim
its absence (e.g. Siptár & Törkenczy 2000: 22, Kálmán & Nádásdy 1994), and others
assign it to each successive odd-numbered syllable (e.g. Hayes 1995: 330).
Assuming the presence of secondary stress, the prosodic difference between BT and
BTT stems is that the stem-final transparent vowel in BT stems is in the unstressed
position whereas the stem-final transparent vowel in BTT stems is the head of the
foot and receives secondary stress.33 Thus, the differences in suffix selection may be
linked to this prosodic difference.
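The prosodic prediction can be sketched as follows, assuming the pattern in which primary stress falls on the first syllable and secondary stress on every later odd-numbered syllable. The function name and the encoding of a stem shape as one letter per syllable are hypothetical conveniences, not part of the cited analyses.

```python
# Illustrative sketch: stress assignment under the assumption that
# secondary stress falls on each odd-numbered syllable after the first.

def stress_pattern(n_syllables: int) -> list:
    """Return a stress label for each syllable of an n-syllable stem."""
    pattern = []
    for position in range(1, n_syllables + 1):
        if position == 1:
            pattern.append("primary")
        elif position % 2 == 1:
            pattern.append("secondary")
        else:
            pattern.append("unstressed")
    return pattern


# Stem shapes are written with one letter per syllable (B = back vowel,
# T = transparent vowel); the last label is the stem-final vowel's stress.
for shape in ["BT", "BTT"]:
    print(shape, stress_pattern(len(shape))[-1])
```

Under this assumption the stem-final vowel of any three-syllable stem receives secondary stress, whatever the quality of the preceding vowels.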
Support for this correlation between stress and suffix choice comes from the
results presented in Ringen & Kontra (1989). In their experiment, subjects were asked
to provide a form of alternating suffix in various polysyllabic disharmonic stems.
Ringen & Kontra observed that a transparent vowel in BT stems is usually followed
by a back suffix, and a transparent vowel in the third or fourth syllable tends to be
followed by a front suffix. For example, subjects preferred front suffixes in words
like hidrogén or bronchitis. The transparent vowel in the third syllable in these words
presumably receives secondary stress, and is therefore more likely to trigger front
33
Evidence for metrical foot structure does not necessarily have to come from the
stress assignment but from other phonological processes. However, the literature
available to us does not mention any such process in Hungarian, and Siptár &
Törkenczy (2000: 22) claim that rhythmic alternation does not interact in any way with
the rest of the phonology.
suffixes than a transparent vowel in a BT stem where it is unstressed.34 Therefore,
Ringen & Kontra hypothesized that the presence of stress influences the choice of the
suffixes. In addition, Ringen & Heinämäki (1999) observed that in the similar vowel
harmony system of Finnish, a stressed transparent vowel is more likely to be followed
by front suffixes than an unstressed one.
However, this prosodic difference does not seem to play as strong a role in
suffix selection as Ringen & Kontra suggest. This can be seen by comparing the
phonological behavior of BT stems and BBT stems. Table 1 describes the
quantitative pattern of suffix selection obtained by analyzing the electronic database
tagged for suffix selection (Füredi et al. 2004). It can be seen that the BBT stems
pattern similarly to the BT stems. The prevalent choice of suffix is [+back] for both
of them. Therefore, this result shows that the putative secondary stress on the
transparent vowel in the BBT stems does not translate into a greater potential for
selecting front suffixes.
Stem type   Back suffix    Front suffix   Vacillation    Total
            N      %       N      %       N      %       N
BT          201    58      57     16      89     26      347
BBT         206    88      23     10      6      2       235
Table 1 – Suffix selection for BT and BBT nouns.
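The percentages in Table 1 can be recomputed from the raw counts; the short sketch below does so. The dictionary layout and the function name are illustrative choices, and only the counts themselves come from the table.

```python
# Recompute the shares in Table 1 from the raw counts.
# Only the counts come from the table; the names are illustrative.

COUNTS = {
    "BT": {"back": 201, "front": 57, "vacillation": 89},
    "BBT": {"back": 206, "front": 23, "vacillation": 6},
}


def shares(row):
    """Return (total, {category: percentage}) for one stem type."""
    total = sum(row.values())
    return total, {category: 100 * n / total for category, n in row.items()}


for stem_type, row in COUNTS.items():
    total, pct = shares(row)
    rounded = {category: round(value, 1) for category, value in pct.items()}
    print(stem_type, "N =", total, rounded)
```

The unrounded shares are printed to one decimal place so that they can be compared with the whole-number percentages given in the table.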
34
Ringen & Kontra’s suggestion is slightly different in that they propose that the
suffix choice correlates with the presence of primary stress on the back vowel
preceding the transparent vowel. The prediction discussed in the next paragraph and
tested in Table 1 still holds: BBT stems should trigger more front suffixes than BT
stems.
Ringen & Kontra’s hypothesis is further weakened by the results reported in
Gósy (2000). Gósy used the same stimuli and methodology as Ringen & Kontra in
elicitation experiments with a population of 10-11-year-old children. She observed
that, in contrast to the adult speakers in Ringen & Kontra’s study, the children tend to
prefer back suffixes in the stems where the transparent vowels receive secondary
stress.
To sum up, the hypothesis that prosodic differences between BT and BTT
stems provide motivation for differences in the patterns of suffix selection in these
stems seems incomplete. A more promising option, which does not preclude the
incorporation of prosodic effects, is to investigate the difference between BT and
BTT stems in terms of phonological context. The stem-final vowel in the BT stems is
preceded by a back vowel. In contrast, the stem-final vowel in BTT stems is preceded
by a front vowel. Hence, if transparent vowels are allowed to participate in vowel
harmony, the contrast between BT and BTT stems may be derived as the effect of
environment on stem-final vowels.
2.4. Conclusion
The aim of this chapter was to review the state of the art in understanding of
transparent and opaque behavior in vowel harmony, and set the stage for further
investigation and modeling of these phenomena. This aim was approached in two
steps: first the data representing the case study of this thesis were described; then the
challenging and unresolved issues in this data were discussed, with the relevant
aspects of representative analyses evaluated.
The key issues related to the presented Hungarian data can be divided into
two categories. The first one represents the questions of locality and the nature of
transparent and opaque vowels. These issues apply to almost any vowel harmony
system. The second category represents particular generalizations observed in
Hungarian data: the division of stems with only transparent vowels into those that
trigger front and those that trigger back suffixes, vacillation in the form of the suffix
for particular stems, and the phonological relevance of the number of intervening
transparent vowels between the trigger and the target of harmony.
With respect to the first category, the approaches can be summarized in the
following way. There were two general issues: locality requirement (strict,
parameterized, none) and motivation for transparent-opaque division
(acoustics-articulation, general markedness, stipulated). The review of the literature
centered on four approaches to these issues. The first assumed that
vowel harmony is a strictly local process, which predicts that all segments in its
domain are affected, some phonemically and some sub-phonemically. If both
phonetics and phonology are relevant in vowel harmony, the explanation for the
division between transparent and opaque vowels might be found in their
acoustic-articulatory characteristics. The second approach assumes parameterized locality
where acoustic-articulatory considerations motivate the exclusion of transparent
vowels from participating in vowel harmony. The third approach excludes transparent
vowels from harmony based on general markedness considerations and sub-phonemic
differences are irrelevant for vowel harmony. The last, fourth approach assumes that
vowel harmony is a non-local process with no role for phonetics and no external
motivation for the transparent-opaque division.
In the category of generalizations particular to Hungarian, a desired analysis
should provide a unified treatment of transparent vowels in BT and T stems while
maintaining the difference between the T stems that select front suffixes and those
that select back suffixes. The analysis of the vacillating pattern should be unified with
the analyses of the transparent and opaque patterns because vacillation was argued to
be just one pattern on the continuum between transparency and opacity. Moreover,
the analysis should be flexible enough to account for the statistical tendencies in
suffix selection observed in Hungarian stems. Finally, the generalization that BTT
stems are more likely to select front suffixes than similar BT stems should be
explained. As argued, an explanation for this pattern requires abandoning the premise
that transparent vowels are irrelevant for the suffix selection.
What seems to be at stake and bears on both general issues and those
particular to Hungarian is the relationship between phonetics and phonology in the
pattern of vowel harmony.
An approach where phonetic patterning is relevant for the phonological
system of vowel harmony is an attractive one because it respects locality. At the same
time, it might offer a principled explanation for the differences in phonological
behavior of vowels. The spirit of such an integrated approach to the relationship
between phonetics and phonology is expressed, for example, in Pierrehumbert’s
statement that “… [t]here is no particular point on the continuum from the external
world to cognitive representations at which it is sensible to say that phonetics stops
and phonology begins” (Pierrehumbert 2000: 14). The question under this approach is
what cognitive model that combines the information from phonetics and
phonology has the most explanatory power.
In contrast, the approaches assuming the absence, or the parameterization, of
locality predict that transparent vowels do not participate in the vowel harmony
pattern on the surface. The role of acoustic-articulatory considerations is at most to
motivate the exclusion of transparent vowels from participation in the vowel harmony
pattern. But the only way to achieve this wholesale exclusion is with a discrete
computational mechanism that treats transparent vowels as symbol-like entities
independent of their phonetic characteristics. Consequently, finer differences among
the transparent vowels as well as their cumulative effect are problematic for this
approach.
Based on the above discussion, the information about the participation of
transparent vowels in vowel harmony bears crucially on these two approaches: if
phonetics does play a role in the phonology of vowel harmony, the assumptions of
the first approach are supported, and those of the second approach are weakened.
Chapters 3 and 4 discuss experimental and other evidence pertaining to this issue.
CHAPTER 3
Hungarian transparent vowels: an experimental study
3.1. Introduction
Vowel harmony in general, and transparency more specifically, have been among the
core areas of phonological research. The complex pattern of suffix selection
described in the previous chapter provides a rich testing ground for hypotheses
related to the nature of phonological representations as well as cognitive processes in
speech. However, a surprising fact is that relatively little attention has been devoted
to the phonetics of transparency. Section 3.2 reviews the results of the handful of
available studies.
Two conclusions emerge from this review. First, there is a lack of articulatory
data in the limited literature concerning the production of transparent vowels because
all available studies investigate only the acoustics of these vowels. Second, there is
agreement in the literature that transparent vowels are subject to phonetic
coarticulation from adjacent vowels but not subject to the phonological pattern of
harmony. The rest of this chapter reports the results of an articulatory investigation of
Hungarian transparent vowels that addresses both these factors. On the one hand, the
combination of two techniques used in the experiment to be described in this chapter
(magnetometry and ultrasound) provides a comprehensive picture of the articulatory
characteristics of transparent vowels. On the other hand, the construction of the
stimuli allows for testing the hypothesis that the participation of transparent vowels in
vowel harmony is limited to phonetic coarticulation.
Section 3.3 discusses the methodological issues related to the techniques of
data collection, extraction, and analysis, together with the description of the subjects
and the stimuli. Section 3.4 presents the results of the experiments. The major finding
reported in this section is that transparent vowels in stems that select back suffixes are
articulated slightly, but significantly, further back than transparent vowels in stems
that select front suffixes. Hence, when a stem-final transparent vowel is followed by a
[+back] suffix, it is articulatorily retracted. In contrast, when a transparent vowel is
followed by a [–back] suffix, it is less retracted. The upshot is that minor phonetic
differences in the retraction of the stem-final vowel correlate with a phonological
alternation in suffixes. Moreover, as will be shown, this result cannot be attributed
solely to phonetic coarticulation. Section 3.5 concludes by summarizing and
discussing the main findings.
3.2. Previous experimental studies involving transparent vowels
The only phonetic investigation of Hungarian transparent vowels in a back/front
context that is available to me is reported in Vago (1980). Vago gives a brief account
of a study by Fónagy (1966), who compared the acoustic properties of transparent
vowels occurring with back vowels (e.g. iga ‘yoke’) and with front vowels (e.g. ige
‘verb’). Fónagy found that the transparent vowels in words with back vowels are “…
pronounced somewhat further back…” (Vago 1980: 17) than in the words with front
vowels. Fónagy attributed this difference to phonetic assimilation.
Gordon (1999) reported the results of an acoustic investigation of Finnish
transparent vowels in front and back harmonic contexts. Two subjects (male and
female) read a list of di- and trisyllabic words where a single transparent vowel /i/ or
/e/ was adjacent to a back or front vowel in the following way. In disyllabic words, a
transparent vowel followed a front or back vowel, as in tätti ‘aunt’ vs. ase ‘weapon’,
or a transparent vowel preceded a front or back vowel, as in iho ‘skin’ or hely
‘trinket’. In trisyllabic words, the transparent vowel was surrounded by back or front
vowels, as in tättihan ‘aunt-emphatic’ or asehan ‘weapon-emphatic’. The F2 values
were measured using the LPC algorithm at the temporal mid-point of each transparent
vowel to minimize the effect of surrounding consonants.
When the values from all conditions were pooled (the number of syllables and
the position of the harmonic vowel as preceding or following the transparent vowel),
the effect of harmonic environment on the F2 value of the transparent vowels was not
significant. However, when the disyllabic words with the transparent vowels
preceding the harmonic vowels such as iho ‘skin’ and hely ‘trinket’ were excluded,
the environment had a significant effect on the F2 values. More specifically, /i/ for
both subjects, and /e/ for the male subject had significantly higher F2 values in the
front context than in the back context. Assuming that F2 measures tongue body
retraction during vowel production, Gordon’s data suggested that the Finnish
transparent vowels are significantly more retracted when preceded by back vowels
than when preceded by front vowels. Gordon interpreted his data as suggesting that
vowel harmony in Finnish affects the transparent vowels “at a low phonetic level”
(Gordon 1999: 20).
Välimaa-Blum (1999) reported the results of another acoustic study
concerning Finnish transparent vowels. Välimaa-Blum hypothesized that the lack of
back unrounded non-low vowels in the Finnish inventory could be explained by the
following allophonic variation of the front non-low unrounded vowels: “maybe they
[SB: /i/, /e/] have front allophones in words with front vowels and back allophones in
words with back vowels” (Välimaa-Blum 1999: 259). To test this hypothesis,
Välimaa-Blum designed an experiment where three subjects (one female and two
males) read a short passage in two speaking rates (normal and hyper-articulated). The
acoustic data (F1 and F2) were collected from the transparent vowel /i/ when it
occurred in words with back vowels and in words with front (both harmonic and
transparent) vowels. The first and the second formants were measured to test the
effect of environment (back vs. front).
The analysis of the effect of harmonic environment revealed that the
difference between the front and the back environments was significant for both F1
and F2 values in the normal speaking rate for two out of three speakers. Välimaa-Blum
also mentioned that a typical acoustic measure of vowel backness, F2 – F1
(Ladefoged 1975: 179), showed that the vowels in the back context were more
retracted than in the front context. Unfortunately, she did not include a statistical
analysis of the F2 – F1 measure. In hyper-articulated speech, the effect of
environment was not significant, although the direction of the effect agreed with the
normal rate results for all three subjects.
Välimaa-Blum interpreted these data as resulting from a style-dependent late
phonetic assimilation rule that applies in less-formal speech contexts. Unfortunately,
there was no control of the possible effects of consonants in her experiment, although
the measurements were taken at a temporal mid-point of the vowel (Välimaa-Blum,
p.c.). Also, the number of tokens was relatively small (fewer than 25 in each cell),
which might have contributed to the non-significance in the hyper-articulation mode.
With respect to the articulatory investigation of transparency, Archangeli et al.
(2004) presented limited ultrasound data from the production of the transparent
vowels /i/ and /u/ in the ATR harmony of Wolof, a language of the Atlantic branch of
the Niger-Congo family. They compared tongue positions during the transparent
vowels in pairs such as mandikat, where the [+ATR] transparent /i/ is surrounded by
[–ATR] vowels /a/, with /i/ in dindiku where it is surrounded by [+ATR] vowels /i/
and /u/. The preliminary results showed that Wolof transparent vowels alternate
articulatorily depending on the ATR feature of the surrounding vowels.
To sum up, all available phonetic studies of transparency in vowel harmony
suggest that transparent vowels are affected by harmony phonetically. Unfortunately,
each of these studies is limited in its scope and number of tokens and, in the
case of the only articulatory study, also in its statistical analysis. More importantly, all three
acoustic studies of palatal harmony assumed that the effect of environment is purely
phonetic without rigorously testing this assumption.
3.3. Articulatory experiment: Methodology
3.3.1. Magnetometry and ultrasound techniques
The core of this experiment is to study the spatial characteristics of Hungarian
transparent vowels. In many respects, the most important articulator for the
production of vowels and a major determinant of their acoustic output is the tongue
body. In general, there are two available experimental techniques for the observation
of tongue body movements (Stone 1997 and references therein). The first is imaging
of particular points on the tongue using either an electro-magnetic field or X-ray
imaging. The second is imaging of the global tongue surface using either Ultrasound
or MRI. In this experiment, both point imaging as well as global tongue imaging
were used.
The EMMA technique (Electro-magnetic Midsagittal Articulometer, Perkell
et al. 1992, Stone 1997) is based on tracking the movements of small receivers that
are attached to articulators, in an electro-magnetic field. The setup of an EMMA
experiment is shown in Fig. 3. Three equidistant transmitter coils (T) are fixed on a
plastic apparatus that is attached to the subject’s head with a headband. The coils
produce alternating magnetic fields at three frequencies in the range 50-75 kHz.
Small receivers (1-1.5 mm in diameter) are attached to various articulators using
special adhesive. In this experiment, eight such receivers were placed in the
midsagittal plane on the nose, maxilla, upper lip, lower lip, jaw, tongue body (2), and
tongue dorsum.35
[Figure 3: image not reproduced. Labels in the figure: Headband, Plastic apparatus, Transmitter coils, TB1, TB2, TD.]
Fig. 3 – An illustration of the plastic apparatus with transmitter coils (T) on
the left and the approximate placing of three receiver coils on the tongue on
the right. TB1 and TB2 are glued on the tongue body, TD on the tongue
dorsum.
35
Standard calibration and cleaning procedures for each of these receivers were
completed before each experiment (Perkell et al. 1992, Kaburagi & Honda 1997).
The electromagnetic field from the transmitter coils passes through the
receiver coils and generates an electric signal. The voltage of this signal is inversely
related to the distance of the receiver from the transmitter coil. This relationship is
then used to calculate the position of the receivers in the two-dimensional coordinate
plane as a function of time. The voltages in the receivers are captured with the use of
‘Maggie’ software (Tiede et al. 1999) at a sampling rate of 500 Hz. Audio data were
also collected with a Sennheiser shotgun microphone at a sampling rate of 20 kHz.
The position of the receivers relative to the transmitter coils is relatively fixed
with the use of a head-stabilizing structure shown in Fig. 3. Nevertheless, minor head
movement within the plastic restraining apparatus is unavoidable. To correct for this
movement, data from the receivers placed on the nose and maxilla (the gum
above the upper front teeth) are used. The movement of these two receivers does not
result from articulation but from the minor head movements within the helmet.
Hence, the time-varying data on the position of the receivers placed on the active
articulators (e.g. the tongue) were corrected for head movement using the movement
data from the nose and the maxilla receivers. This head correction was performed by
an automatic procedure that is part of the voltage-to-distance data extraction.36
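The idea behind this kind of head correction can be sketched in a few lines. The following is only a simplified two-dimensional illustration, not the procedure actually used in the voltage-to-distance extraction: it assumes that a single reference posture of the nose and maxilla receivers is available, and the function name `head_correct` and its arguments are hypothetical.

```python
import math

def head_correct(point, nose, maxilla, nose_ref, maxilla_ref):
    """Map one (x, y) sample of an articulator into the reference head frame.

    Hypothetical sketch: rotate about the current maxilla position so that the
    maxilla-to-nose direction matches the reference direction, then translate
    the maxilla onto its reference position.
    """
    # Angle that aligns the current maxilla->nose vector with the reference one.
    cur = math.atan2(nose[1] - maxilla[1], nose[0] - maxilla[0])
    ref = math.atan2(nose_ref[1] - maxilla_ref[1], nose_ref[0] - maxilla_ref[0])
    theta = ref - cur
    c, s = math.cos(theta), math.sin(theta)
    # Express the point relative to the current maxilla, rotate, then translate.
    dx, dy = point[0] - maxilla[0], point[1] - maxilla[1]
    return (c * dx - s * dy + maxilla_ref[0],
            s * dx + c * dy + maxilla_ref[1])
```

Applied sample by sample, such a transformation removes any rigid movement shared by the reference receivers and the tongue receivers, leaving only movement due to articulation.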
36
The head-correction involves rotating and translating the movement data with
respect to reference receivers as oriented on the occlusal plane. The occlusal plane is
obtained when data from four receivers are collected: two receivers are positioned on
a biteplane (a spatula-like plastic object) that the subjects hold between their teeth so
that one receiver is outside and one inside the subject’s mouth; the other two
receivers are attached to the nose and maxilla.
The Ultrasound technique allows the imaging of the surface of the tongue. A
subject places the ultrasound probe below his/her chin, in the soft area surrounded by
the jawbone. Fig. 4 shows the two most common placements of the probe: sagittal on
the left and coronal on the right. The sagittal placement provides the image of
approximately the mid-line of the tongue from the tongue blade to the tongue root.
The coronal placement provides a cross-sectional image of the tongue (from side to
side). In this experiment the sagittal placement was used.
Fig. 4 – Placements of the ultrasound probe: sagittal on the left and coronal
on the right (reproduced from Stone (to appear)).
An ALOKA SSD-1000 ultrasound system at Haskins Laboratories with a 3-5
MHz convex-curved probe was used. A piezoelectric crystal in the probe emits
ultra-high frequency waves and receives the reflected echo. The emitted waves travel
through the soft tissue of the tongue and reflect back when they reach an interface
with matter of a different density such as bone or air (Stone 1997). This reflected
echo is used to construct a bright white line that shows the boundary between the
tongue surface and the air above it. Ultrasound images of the tongue were collected at
a 30Hz rate, recorded on an S-VHS video-recorder, and then digitized into sequences
of movies in the ‘jpeg’ format.
During data collection, the movement of the probe with respect to the
subject’s head must be minimized to allow for the comparison of the data across
tokens. The subjects were instructed not to move their heads during data collection.
This prevented large-scale movements but did not prevent involuntary minor shifting
of the head. To limit the subjects’ head movement in relation to the ultrasound probe,
two elastic self-adhesive bands were attached to the probe and then wrapped around
the subject’s head. In addition, the sides of the probe were taped to the subject’s skin
with tape routinely used for medical purposes. In this way, the probe was
fixed to the subject’s jaw sufficiently so that it moved together with the subject’s
head. Since both the head and the probe moved together, the position of one with
respect to the other was fixed. Due to the elasticity of the adhesive bands, the
movement of the jaw was not substantially restricted. See Stone (to appear) and Gick
(2002) for other methods of head stabilization while collecting ultrasound data.
3.3.2. Stimuli and subjects
This experiment investigated the effect of the [±back] phonological environment,
determined by the form of the suffix following the stem, on the production of
transparent vowels. Hence, the aim was to compare the tongue body position during
transparent vowels in the front environment, e.g. bili-vel ‘pot-Instr.’, and the back
environment, e.g. buli-val ‘party-Instr.’. For this purpose, the stimuli consisted of
lexical pairs where the transparent vowels were placed in either the front or back
harmonic environment.
Significant effort was made to control for the consonantal environment
surrounding the transparent vowels within the pairs so that the effect of harmonic
environment on the production of transparent vowels was not confounded by
differences in the surrounding consonants. The most important consideration was to
keep the consonants surrounding the transparent vowel in the two members of a
lexical pair as similar as possible. Ideally, both preceding and following consonants
were identical, as in bulival vs. bilivel. Alternatively, they agreed in the general place
of articulation, as in tömítő vs. tompító.
Wherever possible, the transparent vowel occurred in an open syllable. The
dorsal and lateral consonants were dispreferred in the position following the
transparent vowels whereas labials, labiodentals, the glottal /h/, and the coronals were
preferred (in that order). The assumption underlying this approach to stimuli
construction was that coda consonants interfere with the preceding vowel more than
the following onset consonant of another syllable. Additionally, labials, labiodentals,
and glottals have a much smaller effect on the tongue body and tongue dorsum
position than velars or the lateral /l/. Attention was also paid to the prosodic structure
so that the distribution of long and short consonants between the lexical pairs was
identical. Finally, morphological structure had to be considered as well. Hungarian
has a significant number of compounds and verbal prefixes that were avoided because
a boundary between two compounds or between a prefix and a verb blocks harmony
in Hungarian (e.g. Vago 1980: 27). Although vowel harmony is pervasive in verbs,
nouns and adjectives alike, effort was made to keep the part of speech identical
for both members of a lexical pair.
It was impossible to create a list of stimuli that simultaneously a) satisfied all
the above considerations, and b) provided a sufficiently large number of lexical pairs
to represent the general pattern of vowel harmony. However, both conditions were
respected as far as possible, with the aim of creating a balanced stimulus list. An
example of an ideal pair is Tomi-hoz ‘Tom.Dim.-Allative’ vs. Imi-hez
‘Imre.Dim.-Allative’. The complete list of stimuli is in Appendix A.
Two sets of stimuli were constructed. In the first set, transparent vowels
occurred in disyllabic stems such as buli vs. bili, and were followed by a
monosyllabic suffix. This approach yielded tri-syllabic words, a sample of which is
given in (17).
(17)  Example of the first set of stimuli – disyllabic stems with suffixes

      Back                              Front                             Suffix
      kábít-om [kɑːbiːtom] ‘daze’       repít-em [rɛpiːtɛm] ‘send’        1st sg. def.
      buli-val [bulivɑl] ‘party’        bili-vel [bilivɛl] ‘pot’          Instr.
      Tomi-hoz [tomihoz] ‘Tom. Dim.’    Imi-hez [imihɛz] ‘Imre. Dim.’     Allative
      bódé-tól [boːdeːtoːl] ‘hut’       bidé-től [bideːtøːl] ‘bidet’      Ablative

Notice that the transparent vowels in this set are surrounded by either front
vowels (in the front context) or back vowels (in the back context) from both sides.
Therefore, it is plausible that the surrounding vowels influence the production of
transparent vowels via coarticulation. As a result, one might expect that the
transparent vowels surrounded by back vowels will be produced more retracted
compared to the same transparent vowels surrounded by front vowels. In other words,
the different phonetic realizations of transparent vowels in the front and back
phonological domains might result from low-level coarticulation.
In order to determine if this is the case, a second set of stimuli was
constructed. Recall from the phonological description of Hungarian harmony patterns
in Chapter 2 that some monosyllabic stems with a transparent vowel (T stems) select
back suffixes, and some select front suffixes. This means that T stems can be found in
the back and front harmony environments respectively. Yet, when these stems do not
require overt morphology, there are no adjacent vowels to influence the production of
transparent vowels. In Hungarian, there are certain morphological categories marked
with phonologically zero suffixes, e.g. the nominative singular for nouns, or the third
person singular for verbs. This property of Hungarian was used in constructing the
second set of stimuli, a sample of which is shown in (18).
(18)  Example of the second set of stimuli – monosyllabic stems without suffixes

      Stems selecting back suffixes     Stems selecting front suffixes
      vív [viːv] ‘fence’                ív [iːv] ‘bow’
      ír [iːr] ‘write’                  cím [t͡siːm] ‘rumor’
      cél [t͡seːl] ‘aim’                 szél [seːl] ‘wind’
      héj [heːj] ‘crust’                éj [eːj] ‘night’

Both sets of stimuli were used in all experimental sessions reported in this
dissertation. However, there is a difference between the number of tokens in the two
stimuli sets. There were 22 pairs and 8 repetitions for disyllables, but only 8 pairs and
4 repetitions for monosyllables. The decreased number of pairs in the second set was
caused by the limited set of monosyllabic stems that select back suffixes, and the
effort to maintain minimal consonantal differences between the members of each
pair.
All stimulus words were embedded in the frame sentence: Azt mondom, hogy
“____” és elismétlem azt, hogy “____” mégegyszer ‘I say ____ and I repeat ____
once again’. This generated two renditions of the token in each sentence.37 The
37
A statistical analysis of position in the sentence revealed no significant effect on
the position of the tongue. Hence, in the discussion of results, the values from the two
positions were pooled.
sentences were randomized and presented to the subjects visually on a computer
screen.
The results from three subjects are presented. All of them are young adults in
their twenties and speak the Budapest dialect of Hungarian. ZZ (male) and BU
(female) were presented with the most complete set of stimuli that was described
above (Appendix A). CK (female) was the pilot subject with whom a slightly
different set of stimuli and one additional frame sentence were used (Appendix B).
3.3.3. Data collection
Using EMMA with subjects ZZ and BU, 8 repetitions of 44 lexical items (22 pairs) of
disyllabic stems in 2 sentence positions were collected, which yielded a total of 704
tokens. The distribution of transparent vowels in the 22 lexical pairs was balanced: 7
pairs with /í/, 8 pairs with /i/, and 7 pairs with /é/. Due to corruption in one lexical
token with the transparent vowel /i/ in ZZ’s data, data from that lexical pair (N = 32)
and 8 other corrupted tokens had to be discarded. This exclusion gave a total of 664
tokens for ZZ. For BU, 22 (randomly distributed) corrupted tokens were excluded,
which resulted in a total of 658 analyzed tokens. For monosyllabic stems, 4
repetitions of 16 lexical items (8 pairs) in 2 sentence positions were collected. This
gave a total of 128 tokens. For ZZ and BU, three tongue receivers (TD, TB2, and
TB1) were used.
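The token counts reported for ZZ and for the monosyllabic set follow directly from the design. As a quick arithmetic check (the function name `planned_tokens` is only illustrative):

```python
def planned_tokens(repetitions, lexical_items, sentence_positions):
    """Number of tokens produced by a fully crossed recording design."""
    return repetitions * lexical_items * sentence_positions

# Disyllabic stems: 8 repetitions x 44 lexical items x 2 sentence positions.
disyllabic = planned_tokens(8, 44, 2)
# Monosyllabic stems: 4 repetitions x 16 lexical items x 2 sentence positions.
monosyllabic = planned_tokens(4, 16, 2)
# One lexical pair (2 items) contributes 8 x 2 x 2 tokens.
one_pair = planned_tokens(8, 2, 2)
# ZZ: one whole pair and 8 further corrupted tokens were discarded.
zz_analyzed = disyllabic - one_pair - 8

print(disyllabic, monosyllabic, one_pair, zz_analyzed)
```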
For the pilot subject CK, 4 repetitions of 64 lexical items in 2 environments
generated a total of 512 collected tokens of disyllabic stems. There were 18
tokens with missing data, which resulted in 494 tokens as the input for the analysis.
Additionally, 4 repetitions of 3 lexical pairs, giving a total of 24 tokens of
monosyllabic stems were also collected. For CK, two tongue receivers (TD and TB)
were used.
Using Ultrasound, data from one subject (ZZ) were collected using the same
sets of stimuli as for the EMMA data for this subject. The stimuli were divided into
two blocks. Each block consisted of 4 repetitions of 22 lexical pairs of disyllabic
stems in 2 positions and in 2 environments, for a total of 352 tokens. Two such blocks
were collected; however, the data from these blocks are not collapsed.38 This is
because the elastic bands that ensure the fixed placement of the probe with respect
to the subject’s head were taken off after each block. Although an effort was made to
re-attach the probe at the same position under the subject’s chin as before, there is
no objective means to evaluate our success. Therefore, the data are divided into two
blocks because the position of the probe could have moved between the blocks,
which prevents the conflation of the data.
38
Due to missing data, there were only 350 tokens in the second block.
3.3.4. Data labeling and extraction
This experiment investigates the effect of harmonic environment on the position of
the tongue during the production of transparent vowels in a palatal vowel harmony
system that involves the horizontal positioning of the tongue body. It is then
reasonable to assume that the participation of transparent vowels in palatal vowel
harmony is at least in part correlated with the degree of horizontal tongue body
retraction. In order to quantify this participation, the following labeling and extraction
procedures were used. The rationale behind these procedures was that the spatial
target of the horizontal tongue body movement for the front vowels is achieved at the
extreme front position of the tongue body.
For EMMA data, a Matlab software package called MAVIS (Tiede et al.
1999) was used for initial observation and labeling. Fig. 5 shows the time functions of
the receivers, as well as the audio signal during the production of zafírban
‘sapphire-Inessive’ from the pilot study.
First, the transparent vowel in each token was identified manually using both
auditory and articulatory information. Then, the procedure continued with the
identification of the time point when a particular tongue body receiver achieved its
most front position during the production of a transparent vowel. This was done using
an automatic procedure in the MAVIS package that determines peaks of the time
functions representing the motions of the receivers.39 The output of this ‘peak-picking’
procedure is shown as the ‘max’ labels for the TB and TD receivers in Fig.
5. Finally, the horizontal value of the receiver at the labeled point was extracted.
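The extraction step can be illustrated with a minimal sketch. It is not the MAVIS peak-picking routine itself. It simply returns the most anterior sample inside a manually labeled vowel interval, on the assumption that larger (less negative) horizontal values correspond to more front positions; the function name and arguments are hypothetical.

```python
def most_front_sample(times, x_positions, vowel_start, vowel_end):
    """Return (time, x) of the most anterior sample within the vowel interval.

    Assumes larger horizontal values are more front; times and positions are
    parallel sequences sampled at the same instants.
    """
    inside = [(t, x) for t, x in zip(times, x_positions)
              if vowel_start <= t <= vowel_end]
    if not inside:
        raise ValueError("no samples fall inside the vowel interval")
    # Pick the sample with the greatest horizontal value (first one on ties).
    return max(inside, key=lambda sample: sample[1])
```

For example, with samples taken every 2 ms, `most_front_sample([0, 2, 4, 6, 8], [-25.0, -24.0, -22.5, -23.0, -26.0], 2, 8)` picks the sample at 4 ms with a horizontal value of -22.5 mm.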
Fig. 5 – Horizontal and vertical trajectories of articulators during the
production of zafír-ban. The top panel represents the audio signal for the
complete sentence whereas the second panel from the top presents only the
portion with the target word. The remaining panels show the horizontal
(dashed) and vertical (solid) movement of the receivers; (from the top down)
Tongue Tip (TT), Tongue Body (TB), Tongue Dorsum (TD), Upper Lip (UL),
and Lower Lip (LL).
39
In some cases, the horizontal movement of the tongue was smooth during the
acoustic portion of the transparent vowel without any peak. In these cases, the ‘max’ labels were
placed at the point of maximal front position within the acoustic portion of the vowel.
Usually, this point was around the release of the preceding consonant.
To determine the effect of harmonic environment on the position of the
receivers, the formula in (19) was used.
(19)  DIFF(R, TV) = MAX(R, TV)_{back} – MAX(R, TV)_{front}
      e.g. DIFF(TD, /i/) = MAX(TD, /i/)_{bulival} – MAX(TD, /i/)_{bilivel}
                         = –23.0 – (–22.5) = –0.5 mm
This formula calculates the difference DIFF between the extreme frontward position
MAX of the receiver R during the transparent vowel TV in the front and back
environments. All the MAX values are negative due to a convention in calculating
EMMA output. In this convention, the zero of the horizontal axis is on the gum above
the front incisors (the maxilla receiver). The receivers toward the outside of the vocal
tract, such as those on the lips, have positive values whereas those inside the vocal
tract have negative values. The further inside the mouth the receiver is, the more
negative the value that is extracted.
A statistically significant difference between the MAX values from the front
and back environments indicates a significant effect of the harmonic environment on
the position of the tongue receivers, and by assumption on the position of the tongue
body. The DIFF value measures the size of the effect but also its direction: if the
DIFF value is negative, the relevant receiver is more retracted in the back
environment than in the front environment. If the DIFF value is positive, the receiver
is more retracted in the front environment than in the back environment.
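The computation in (19) and the interpretation of its sign can be written out as a short sketch. The function names are illustrative only, and the values are those of the bulival vs. bilivel example.

```python
def diff_max(max_back, max_front):
    """DIFF as in (19): MAX in the back environment minus MAX in the front one (mm)."""
    return max_back - max_front

def direction(diff):
    """Interpret the sign of DIFF under the EMMA convention (more negative = further back)."""
    if diff < 0:
        return "more retracted in the back environment"
    if diff > 0:
        return "more retracted in the front environment"
    return "no difference"

d = diff_max(-23.0, -22.5)
print(d, direction(d))
```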
For Ultrasound data, the vowel gesture spans across several individual frames.
During data labeling, first the frame with the most advanced position of the tongue
was determined as the target frame. This was done manually, using both visual and
acoustic information. Then, the tongue edge in this target frame was traced using the
semi-automatic procedure described in Iskarous (2004). Fig. 6 illustrates the edge
tracing from the target frame for /i/ in buli-val.
Fig. 6 – Tracings of the tongue surface at the extreme front position during
the TV /i/ in buli-val. Tongue edge before the tracing (left) and after (right).
All the curves representing the tongue edge were normalized to 100 points.
Given effective measures for preventing head movement with respect to the probe,
the two-dimensional coordinates of these points represent the position of the midsagittal portion of the tongue in an arbitrary, but fixed, coordinate system.
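One way to normalize a traced contour to a fixed number of points is to resample it at equal steps of arc length. The sketch below assumes this particular scheme, which is not necessarily the one used for the data reported here; the function name `resample_curve` is hypothetical.

```python
import math

def resample_curve(points, n=100):
    """Resample a polyline of (x, y) points to n points equally spaced in arc length."""
    if len(points) < 2 or n < 2:
        raise ValueError("need at least two input points and n >= 2")
    # Cumulative arc length at each original vertex.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    out, seg = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        frac = 0.0 if span == 0 else (target - cum[seg]) / span
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        out.append((x0 + frac * (x1 - x0), y0 + frac * (y1 - y0)))
    return out
```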
In order to quantify and determine the significance of the effect of harmonic
environment, two types of data were extracted. First, a pair-wise comparison of the
curves was performed by calculating the difference as the area between the curves.40
Second, a fixed system of reference points was established and the distance between
the points on the curve and these reference points was calculated.41 The procedures to
obtain these two types of data are described below.
To quantify the difference between two curves, the area between the two
curves was calculated. The smaller the area, the more similar the two curves are.
However, the length of the curves potentially affects the area between them. This is
because two very similar long curves can have a greater area than two short but less
similar curves. In order to control for the effect of the length, the endpoints of the
shortest curve in the set of curves under investigation were determined and then used
to define a wedge sketched in Fig. 7. The origin O is a fixed point defined by the
ultrasound machine, and the points A and B are the endpoints that produce the
smallest value of the angle γ. The area between the two curves within the wedge was
then calculated.42
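The area measure can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used here: each curve is expressed in polar coordinates about the origin O, the wedge is taken to be the largest angular range covered by all curves, and the area is integrated with the trapezoid rule. All function names are hypothetical.

```python
import math

def to_polar(curve, origin):
    """Convert (x, y) points to (angle, radius) pairs about origin, sorted by angle.

    Assumes the curve does not cross the +/-pi branch cut of atan2, which holds
    when the whole contour lies on one side of the origin.
    """
    ox, oy = origin
    return sorted((math.atan2(y - oy, x - ox), math.hypot(x - ox, y - oy))
                  for x, y in curve)

def radius_at(polar, angle):
    """Linearly interpolate the radius of a polar curve at a given angle."""
    for (a0, r0), (a1, r1) in zip(polar, polar[1:]):
        if a0 <= angle <= a1:
            if a1 == a0:
                return r0
            return r0 + (angle - a0) / (a1 - a0) * (r1 - r0)
    raise ValueError("angle lies outside the angular span of the curve")

def common_wedge(curves, origin):
    """Largest angular range covered by every curve (assumed choice of wedge)."""
    polars = [to_polar(c, origin) for c in curves]
    return max(p[0][0] for p in polars), min(p[-1][0] for p in polars)

def area_between(curve_a, curve_b, origin, angle_lo, angle_hi, steps=500):
    """Area between two curves inside the wedge [angle_lo, angle_hi].

    Uses the polar area element 0.5 * r**2 * d(angle) and the trapezoid rule.
    """
    pa, pb = to_polar(curve_a, origin), to_polar(curve_b, origin)
    h = (angle_hi - angle_lo) / steps
    # Build the grid so that the last angle is exactly angle_hi.
    angles = [angle_lo + i * h for i in range(steps)] + [angle_hi]
    vals = [0.5 * abs(radius_at(pa, a) ** 2 - radius_at(pb, a) ** 2)
            for a in angles]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
```

Restricting every comparison to the same wedge is what keeps the measure from being inflated by curve length: only the portion of the tongue surface visible in all curves contributes to the area.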
40
I am especially grateful to David Goldberg and Lisa Davidson for help and
discussions on this issue.
41
I am grateful to Khalil Iskarous for help and discussion on this issue.
42
See Davidson (2003) and Davidson & Stone (2004) for the use of L2-norms instead
of area.
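[Figure 7: image not reproduced. Labels in the figure: origin O, endpoints A and B, angle γ; area A = x mm² in one panel and A = y mm², y > x, in the other.]
Fig. 7 – Comparison of two curves as the difference in the area between
them.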
The rationale for the application of the area measure is as follows. The effect of
harmonic environment on tongue shape is significant if the
differences among the curves from the same environment (front-front, back-back) are
significantly smaller than the differences among the curves from the opposing
environments (front-back). In other words, harmonic environment has a significant
effect on the tongue shape if the shapes compared within the environment are more
similar than the shapes compared across the two environments.
In order to illustrate this idea, consider the data from the disyllabic stems.
Edge detection of the target frames provided 8 shapes of each lexical item. Because
the environment condition is binary (front vs. back), the available data consists of 8
curves from the front environment and 8 from the back environment. A pair-wise
comparison where one curve is from the front environment and the other from the
back environment yields 64 combinations. A pair-wise comparison where the two
curves come from the same environment (front-front or back-back) yields 56
combinations. This is illustrated in Fig. 8.
[Figure 8: image not reproduced. Labels in the figure: Front, Back.]
Fig. 8 – Illustration of the pair-wise comparison of the ultrasound curves.
The squares represent the curves from the front environment and the circles
the curves from the back environment. The solid lines represent the (subset
of) comparisons between the environments and the dashed lines the (subset
of) comparison within the environments.
Each line in Fig. 8 corresponds to one comparison between two curves, i.e. to
a value of the area between them. If the area values of the 64 comparisons across the
environments are significantly greater than the area values of the 56
comparisons within the environments, the effect of environment is significant.
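The numbers of comparisons follow from simple counting: every front curve is paired with every back curve, and within each environment every unordered pair of distinct curves is compared once. A small sketch that enumerates the pairs (the function name is illustrative):

```python
from itertools import combinations, product

def comparison_pairs(n_front, n_back):
    """Enumerate across-environment and within-environment curve pairs."""
    front = [("front", i) for i in range(n_front)]
    back = [("back", j) for j in range(n_back)]
    # Across: each front curve with each back curve.
    across = list(product(front, back))
    # Within: unordered pairs of distinct curves from the same environment.
    within = list(combinations(front, 2)) + list(combinations(back, 2))
    return across, within

across, within = comparison_pairs(8, 8)
print(len(across), len(within))  # 64 56
```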
Note that the area measure determines if there is a significant effect of the
harmonic environment on the global shape of the tongue. It does not, however,
determine the direction of this effect. In order to find out if the tongue is more
retracted in the back environment than in the front environment, the position of
several points on the tongue surface was compared to a number of fixed reference
points. Five reference points were selected as the endpoints of five line segments. To
maximize the information obtained by the EMMA and Ultrasound techniques, the
five arbitrary lines were placed in the posterior area of the tongue that is not
accessible with EMMA. Keeping the lines constant across all tokens, the distance D
in millimeters between the fixed point on the line and the point where the line
intersects the tongue surface was computed. This computation, illustrated in Fig. 9,
was performed for all five lines in each target frame of the data.
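The distance D can be obtained by intersecting each fixed line with the traced tongue contour. The sketch below assumes that the contour is a polyline of (x, y) points and that each fixed line is a segment starting at its fixed reference point; the function names are hypothetical.

```python
import math

def segment_intersection(p, q, a, b):
    """Return the intersection point of segments p-q and a-b, or None."""
    rx, ry = q[0] - p[0], q[1] - p[1]
    sx, sy = b[0] - a[0], b[1] - a[1]
    denom = rx * sy - ry * sx
    if denom == 0:
        return None  # parallel or collinear segments are treated as non-intersecting
    t = ((a[0] - p[0]) * sy - (a[1] - p[1]) * sx) / denom
    u = ((a[0] - p[0]) * ry - (a[1] - p[1]) * rx) / denom
    if 0 <= t <= 1 and 0 <= u <= 1:
        return (p[0] + t * rx, p[1] + t * ry)
    return None

def distance_to_surface(fixed_point, line_end, tongue_curve):
    """Distance D from the fixed point to where the line meets the tongue curve."""
    for a, b in zip(tongue_curve, tongue_curve[1:]):
        hit = segment_intersection(fixed_point, line_end, a, b)
        if hit is not None:
            return math.hypot(hit[0] - fixed_point[0], hit[1] - fixed_point[1])
    return None  # the line does not reach the traced surface
```

Because the fixed points and line orientations are held constant across tokens, the resulting D values are directly comparable between the front and back environments.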
[Figure 9: image not reproduced. Labels in the figure: fixed reference points, intersection points, lines numbered 1–5, distance D, Tongue Tip, scale bar 10 mm.]
Fig. 9 – Quantification of the effect of environment from the ultrasound
images. The white bi-directional arrow shows the distance D between the
fixed point of the line, marked with white asterisks, and the intersection point,
marked with black asterisks.
To determine the effect of environment on the position of the tongue surface, the
formula in (20) was used. For each line L, the formula calculates DIFF, the difference
between the front and back environments in the distance D from the fixed point on L
to the point where L intersects the tongue surface in the target frame of the
transparent vowel TV.
(20)  DIFF(L, TV) = D(L, TV)_{front} – D(L, TV)_{back}
      e.g. DIFF(L1, /i/) = D(L1, /i/)_{bilivel} – D(L1, /i/)_{bulival}
In addition to the size of the effect, the DIFF value indicates also the direction
of the effect. Because the fixed points for measuring the D values are placed behind
the tongue, a greater D value corresponds to advancement, and a smaller value
corresponds to retraction of the tongue body. Therefore, based on the formula in
(20), the tongue is more retracted in the back environment than in the front
environment if the DIFF values are greater than zero.
3.3.5. Comparison of the magnetometry and ultrasound techniques
The EMMA and Ultrasound techniques provide complementary information, which
together gives a comprehensive picture of the articulatory realization of
transparent vowels. EMMA’s advantage is that it offers highly precise temporal and
spatial information about the movements of particular points on the tongue. Its
disadvantage is the limited number of these points. The tongue is a complex organ
and the two-dimensional information about the movement of 3-4 flesh points
provides only a crude picture of constriction formation during vowel production.
Furthermore, the EMMA technique is unable to provide information about the
action of the tongue behind the dorsal area. This is due to the gag reflex, which
prevents subjects from tolerating objects placed on the back of the tongue.
Consequently, it is difficult to place receivers behind the tongue dorsum.
Ultrasound compensates for EMMA’s weakness by providing global images
of (almost) the complete surface of the tongue. However, compared to EMMA’s high
precision, the temporal information in Ultrasound is limited: its sampling rate is 30
Hz compared with EMMA’s sampling rate of 500 Hz. This means that the most
extreme horizontal position of the tongue is more reliably measured by EMMA than
Ultrasound. This is because this extreme position might be achieved at the temporal
point at which Ultrasound does not provide any image. In other words, actions of the
tongue are captured approximately every 33 milliseconds and it is not possible to
observe the motion of the tongue in between these temporal points. In contrast,
EMMA provides information approximately every 2 milliseconds.
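The intervals quoted above follow directly from the two sampling rates. The short sketch below reproduces the arithmetic; the function names are hypothetical, and the half-interval bound is simply the largest possible gap between an arbitrary moment and the nearest sample.

```python
def frame_interval_ms(sampling_rate_hz):
    """Time between two consecutive samples, in milliseconds."""
    return 1000.0 / sampling_rate_hz

def max_offset_ms(sampling_rate_hz):
    """Largest possible gap between any moment and the nearest sample."""
    return frame_interval_ms(sampling_rate_hz) / 2.0

for name, rate in (("Ultrasound", 30), ("EMMA", 500)):
    print(name, round(frame_interval_ms(rate), 1), round(max_offset_ms(rate), 1))
```

Under this reading, the most extreme tongue position may lie up to about 17 ms away from the nearest ultrasound frame, but only 1 ms away from the nearest EMMA sample.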
Similarly, the spatial measurements by EMMA are more accurate than the
ultrasound measurements. EMMA’s measurement error is within 0.5 mm (Perkell et
al. 1992) but the actual error in the measurements such as the one reported in this
experiment is even less than 0.5 mm (Hoole & Nguyen 1997). Ultrasound accuracy,
due in part to the artifacts in the image, approaches 1 mm (Stone, to appear).
In addition, the quantification of the effect of environment on the tongue body
position differs for the two techniques. EMMA measures the horizontal position of
the tongue receivers. As a result, the effect of environment is captured as a difference
in retraction of the receivers in the horizontal dimension. Ultrasound data is used to
calculate retraction on planes that are not strictly horizontal. As can be seen in Fig. 9,
the lines used for measuring the distances are not at 180°, but approximately between
130° and 150°. This means that the distance measured on the line has both a
horizontal and a vertical component.
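To make the last point concrete, a distance measured along an oblique line can be decomposed into its horizontal and vertical parts. The sketch below assumes that the line angle is measured from the positive horizontal axis, so that 180° is a purely horizontal line; this convention, the function name, and the 2 mm example value are assumptions made for illustration.

```python
import math

def components(distance_mm, line_angle_deg):
    """Split a distance along a line into horizontal and vertical components."""
    theta = math.radians(line_angle_deg)
    return distance_mm * math.cos(theta), distance_mm * math.sin(theta)

# A hypothetical 2 mm difference measured on lines at 130, 140, 150 and 180 degrees.
for angle in (130.0, 140.0, 150.0, 180.0):
    horizontal, vertical = components(2.0, angle)
    print(angle, round(abs(horizontal), 2), round(abs(vertical), 2))
```

On a line at 140°, for instance, roughly three quarters of the measured distance is horizontal, which is why the ultrasound DIFF values and the EMMA horizontal differences are not directly comparable in size.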
It may be argued that the ultrasound measurement better reflects the action of
the tongue muscles in creating specific tongue shapes. In our case, tongue body
retraction results from the action of the styloglossus that raises the tongue dorsum
toward the velum, and the posterior verticalis muscles that constrict to flatten the
tongue body in the back (e.g. MacKay 1987).
Finally, the three described measurements – horizontal position of three flesh
points with EMMA, position of five points on the tongue surface with Ultrasound,
and the area measure with Ultrasound – provide a continuum between local and
global information about the tongue position. EMMA is the most local because the
three flesh points on the tongue are always fixed, hence, only the information about
the movements of these three points is obtained. The Ultrasound measure of the
intersection points is less local because, instead of an actual point on the tongue, what
is fixed is the position of the line. Hence, the actual intersection of the tongue edge
with this line corresponds to potentially different points on the tongue for every target
frame. As a result, this second measure provides information about the position of a
number of points within a fixed range between the five lines. Finally, the Ultrasound
area measure is the most global measure since it considers almost all reconstructed
tongue shapes with respect to other tongue shapes and there is no fixed point or line.
Overall, then, the combination of the two techniques offers highly informative
phonetic data on the retraction of transparent vowels.
3.4. Results
3.4.1. Disyllabic stems – EMMA results
The data from the subjects ZZ and BU will be presented first followed by the data
from the pilot subject CK. To preview, EMMA data from the disyllabic stems show
that the transparent vowels are retracted significantly more in the back environment
than in the front environment.
For the purposes of the statistical analysis, the data was structured in the
following way. For ZZ and BU there were three dependent variables: TD, TB2, and
TB1. These represent the MAX values (the most front position of the receiver)
measured with the receivers placed at the tongue dorsum, posterior tongue body, and
anterior tongue body respectively. For subject CK, there were only two dependent
variables: TD and TB. Additionally, there were three factors (independent variables)
for each subject: ENV representing front or back harmonic environment, VOWEL
representing the three transparent vowels {/i/, /í/, /é/}, and L.PAIR representing the
lexical pairs out of which the stimuli set was constructed.43 It should be noted that
VOWEL is not independent from L.PAIR because a particular lexical pair contained
only one transparent vowel. Consequently, the interaction between VOWEL and
L.PAIR could not be analyzed and the combined effect of each of these factors with
environment was tested with two separate two-way ANOVAs. The subjects are
analyzed separately due to the fact that the placement of the EMMA receivers
(dependent variables) is individual for each subject. The software package SPSS was
used for statistical analysis.
43 VOWEL includes the low /e/ for the pilot subject CK.
3.4.1.1. Subject ZZ
The results from the analysis of the harmonic environment and the vowel type for
subject ZZ are summarized in Table 2. It can be seen that both factors significantly
affect the position of the receivers. Moreover, the table also shows that there is a
significant interaction between the harmonic environment and the type of the vowel.
The main hypothesis tested in this experiment was that transparent vowels are
produced with a more retracted tongue position in the back harmonic environment
than in the front harmonic environment. After determining that the effect of the
environment is significant, the direction of the effect was obtained from the analysis
of the MAX values shown in Table 3. The values in the columns labeled MAX(F)
and MAX(B) are the means of the horizontal maxima of the receivers in the front and
back environments respectively, shown in millimeters. The values in the MD column
represent the difference (DIFF) between the mean values for the position of the
receivers in the two environments.
Subject  Source       Receiver  Type III SS   df  Mean Square       F  Sig.
ZZ       ENV          TD            148.931    1      148.931  66.357  .000
                      TB2           314.096    1      314.096  62.684  .000
                      TB1           282.889    1      282.889  49.038  .000
         VOWEL        TD            106.821    2       53.411  23.797  .000
                      TB2           367.899    2      183.950  36.711  .000
                      TB1           293.996    2      146.998  25.482  .000
         ENV * VOWEL  TD             14.832    2        7.416   3.304  .037
                      TB2            58.382    2       29.191   5.826  .003
                      TB1           101.428    2       50.714   8.791  .000
         Error        TD           1476.806  658        2.244
                      TB2          3297.090  658        5.011
                      TB1          3795.844  658        5.769
Table 2 – Results from a 2-way ANOVA for environment and vowel type for
subject ZZ.
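As a reading aid for the ANOVA tables, the F values can be recovered from the sums of squares and degrees of freedom: each mean square is the sum of squares divided by its degrees of freedom, and F is the ratio of the effect mean square to the error mean square. The sketch below checks this for two rows of Table 2; the function name is hypothetical and the numbers are copied from the table.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F statistic computed from sums of squares and degrees of freedom."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect / ms_error

# Rows for the TD receiver in Table 2 (subject ZZ); the error term is shared.
f_env_td = f_ratio(148.931, 1, 1476.806, 658)
f_vowel_td = f_ratio(106.821, 2, 1476.806, 658)
print(round(f_env_td, 3), round(f_vowel_td, 3))
```

The same relation holds for the other two-way ANOVA tables in this section, up to rounding of the printed values.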
Receiver  MAX(F)  MAX(B)  MD (DIFF)
TD        –48.02  –48.97     –0.95*
TB2       –38.65  –40.05     –1.40*
TB1       –23.41  –24.73     –1.32*
Table 3 – Direction of the effect of environment for subject ZZ.
All MD values in the fourth column are negative. Following the discussion in Section
3.3.4, this means that the transparent vowels were more retracted in the back
environment than in the front environment, which applies to all three receivers. Given
the significance of the environment in Table 2 for each receiver, it can be concluded
that ZZ produced the transparent vowels in the back environment with significantly
greater retraction than in the front environment. This is marked in the table with an
asterisk.
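The sign convention for the EMMA data can be illustrated with the values in Table 3. The sketch assumes that MD is computed as MAX(B) minus MAX(F), which is consistent with the printed values; since more negative horizontal values are more retracted, a negative MD indicates greater retraction in the back environment. The dictionary and function names are hypothetical.

```python
# MAX(F) and MAX(B) in mm for subject ZZ, copied from Table 3.
table3 = {
    "TD": (-48.02, -48.97),
    "TB2": (-38.65, -40.05),
    "TB1": (-23.41, -24.73),
}

def mean_difference(max_front, max_back):
    """MD assumed to be MAX(B) - MAX(F), rounded to two decimals."""
    return round(max_back - max_front, 2)

for receiver, (max_front, max_back) in table3.items():
    print(receiver, mean_difference(max_front, max_back))
```

Because the tabulated means are themselves rounded, MD values recomputed in this way may differ from the printed ones by 0.01 or 0.02 mm in some of the later tables.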
In addition to this main effect, the interaction of environment and the type of
transparent vowel was also significant. To examine the nature of this interaction, the
effect of vowel type was tested in the two environments separately; and the effect of
environment was tested for each vowel separately. One-way ANOVA was used for
both tests. The results are summarized in Table 4. It can be observed that the
production of the three vowels is significantly different in each environment
separately, and that the harmonic environment significantly affects the production of
each transparent vowel. This applies to all receivers with the exception of the effect
of environment on the /í/ production measured with the TB1 receiver.
Source  Grouping   Receiver  df  df(within)  Mean Square       F  Sig.
VOWEL   FRONT ENV  TD         2         329       47.069  23.360  .000
                   TB2        2         329      137.299  28.867  .000
                   TB1        2         329      100.638  19.172  .000
        BACK ENV   TD         2         329       13.912   5.609  .004
                   TB2        2         329       77.233  14.699  .000
                   TB1        2         329       97.416  15.478  .000
ENV     /i/        TD         1         218       53.651  20.016  .000
                   TB2        1         218      128.910  20.885  .000
                   TB1        1         218      105.787  12.648  .000
        /í/        TD         1         220       17.598  11.805  .001
                   TB2        1         220       19.004   4.987  .027
                   TB1        1         220        5.382   1.725  .190
        /é/        TD         1         220       92.478  36.040  .000
                   TB2        1         220      224.350  44.340  .000
                   TB1        1         220      273.043  46.714  .000
Table 4 – Effect of the type of the transparent vowel on the position of the
receivers in the front and back environment for subject ZZ.
In terms of the direction and size of the effect, Table 5 presents the results
from the calculation of the MD values separately for each vowel. The values in the
columns labeled F and B are the means of the horizontal maxima (MAX values) of
the receivers in the front and back environments respectively, shown in millimeters.
      /i/                     /í/                     /é/
Rec.  F       B       MD      F       B       MD      F       B       MD
TD    –48.27  –49.26  –0.99   –48.50  –49.07  –0.56   –47.28  –48.57  –1.29
TB2   –39.45  –40.98  –1.53   –39.16  –39.73  –0.59   –37.39  –39.41  –2.01
TB1   –24.31  –25.70  –1.39   –23.53  –23.84  –0.31   –22.42  –24.63  –2.22
Table 5 – Retraction degree of individual transparent vowels, the difference
(MD) between the position of the receivers in the front (F) and back (B)
environments.
Similarly to the combined MD values shown in Table 3, the MD values for
the individual transparent vowels are all negative. This means that each transparent
vowel was retracted more in the back environment than in the front environment.
Furthermore, individual vowels were affected by harmonic environment to a different
degree: back harmonic environment results in greatest retraction of the vowel /é/,
followed by /i/, and then the smallest retraction is observed for the vowel /í/. The
significance of this difference can be inferred from the combination of the overall
significance of the VOWEL*ENV interaction in Table 2 and the significance of
VOWEL in BACK and FRONT environments reported in Table 4.
However, the analyses reported in Table 2 and Table 4 do not reveal if the
differences among the three individual vowels in terms of the degree of retraction
were significant for all three possible options (/í/ vs. /i/, /í/ vs. /é/, and /i/ vs /é/) or
some subset of the three. Following the significance of VOWEL in both front and
back environments reported in Table 4, two post-hoc Tukey HSD tests (α = 0.05)
were conducted to examine how individual vowels contributed to the overall
significance of VOWEL. These tests revealed that in the front environment /i/ and /í/
were not significantly different for the TD and TB2 receivers and that in the back
environment /é/ was not significantly different from /í/ for all receivers, while all
other comparisons revealed significant differences. Combining these observations with
the MAX values in
Table 5 leads to the conclusion that the back harmonic environment causes
significantly greater degree of retraction for /é/ than for /í/. This is inferred from the
fact that /é/ is significantly more advanced than /í/ in the front environment but not so
in the back environment. In addition, /i/ is retracted to a significantly greater degree
than /í/. This is inferred from the fact that /i/ and /í/ are not significantly different in
the front environment, but /i/ is significantly more retracted than /í/ in the back
environment. Hence, the overall significance of the interaction between environment
and vowel type is due to significantly greater retraction for /é/ than /í/, and
significantly greater retraction of /i/ than /í/. The results of the two post-hoc tests are
summarized in Appendix C.
Finally, the data related to the effects of environment and the type of
transparent vowel reveal the following correlation: the more front the transparent
vowel in the front environment, the more retracted it is in the back environment. For
subject ZZ, /é/ is the vowel with the greatest degree of retraction and simultaneously
the most advanced position in the front environment, as seen in Table 5.
After discussing the interaction between vowel type and environment, I turn
to the effect of the lexical pair variable. Due to the mentioned dependence of the
VOWEL and L.PAIR variables, a two-way ANOVA testing the effects of ENV and
L.PAIR was conducted; the results are reported in Table 6.
Subject  Source        Receiver  Type III SS   df  Mean Square        F  Sig.
ZZ       ENV           TD            146.379    1      146.379  208.434  .000
                       TB2           316.663    1      316.663  186.002  .000
                       TB1           287.906    1      287.906  294.360  .000
         L.PAIR        TD            908.750   20       45.438   64.700  .000
                       TB2          2068.784   20      103.439   60.758  .000
                       TB1          2853.530   20      142.677  145.875  .000
         ENV * L.PAIR  TD            252.182   20       12.609   17.955  .000
                       TB2           594.982   20       29.749   17.474  .000
                       TB1           725.891   20       36.295   37.108  .000
         Error         TD            436.817  622         .702
                       TB2          1058.937  622        1.702
                       TB1           608.362  622         .978
Table 6 – Results from a 2-way ANOVA for environment and lexical pair
for subject ZZ.
The results show that lexical pair had a significant effect on the horizontal
position of the receivers, and that this effect significantly interacted with harmonic
environment. To confirm that this result is not confounded by the mentioned
dependence of VOWEL and L.PAIR factors, separate two-way ANOVAs were
conducted for each of the three vowels. In all three tests and for all three receivers,
the interaction of ENV and L.PAIR was significant (p < 0.001). Hence, L.PAIR had a
significant effect on the degree of retraction, which means that individual lexical
pairs did not behave uniformly with respect to the effect of environment.
In order to further examine this interaction between lexical pair and
environment, the average values of each receiver for each lexical item were
calculated. This produced the mean MAX value from all 8 repetitions of each lexical
item. The stimuli contained 21 lexical pairs for ZZ. Because the effect of position in
the frame sentence was not significant, the mean MAX values from the two positions
were pooled, thus giving a total of 42 paired data points in each environment.
The effect of environment was analyzed with the SPSS version of the
Wilcoxon signed ranks test and is summarized in Table 7.44 The first column lists the
dependent variables (TD, TB2, and TB1) split into the paired values (N = 42) from
the front (F) and back (B) environments. The second column reports how many of the
42 data points had a negative MD value; these are the lexical pairs where the
transparent vowel was retracted more in the back environment than in the front
environment. The third column reports the number of cases where the direction of the
effect was opposite.
44 The Wilcoxon signed ranks test is a non-parametric test that compares two related
samples based on their medians.
Receiver          Neg. ranks  Pos. ranks       Z  Sig.
TD(B) – TD(F)             33           9  -3.795  .000
TB2(B) – TB2(F)           31          11  -3.782  .000
TB1(B) – TB1(F)           32          10  -3.907  .000
Table 7 – Effect of environment in individual lexical items for subject ZZ.
The test confirms that, despite variation in individual lexical pairs, the effect
of environment is significant. This means that there are significantly more pairs with
a negative MD value than those with a positive one.
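For readers unfamiliar with the test, the following is a minimal sketch of a Wilcoxon signed ranks computation on paired values, using the normal approximation without a tie correction. It is not the SPSS implementation, whose details may differ, and the five paired values are hypothetical and serve only to show the mechanics.

```python
import math

def wilcoxon_signed_rank(front, back):
    """Return (sum of negative ranks, sum of positive ranks, Z) for back - front."""
    diffs = [b - f for f, b in zip(front, back) if b != f]  # zero differences dropped
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4.0
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)  # no tie correction
    z = (min(w_minus, w_plus) - mean_w) / sd_w
    return w_minus, w_plus, z

# Hypothetical mean MAX values (mm) for five lexical pairs.
front = [-48.0, -39.0, -24.0, -47.0, -38.0]
back = [-49.0, -41.0, -27.0, -46.5, -42.0]
print(wilcoxon_signed_rank(front, back))
```

In this toy data set four of the five pairs are more retracted in the back environment, so the negative ranks dominate, which is the pattern reported in Table 7.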
3.4.1.2. Subject BU
The structure of BU’s data was identical to ZZ’s data. Thus, the same statistical
methods described in the previous section were used in the analysis of BU data. The
results from the first two-way ANOVA are reported in Table 8.
Similarly to ZZ, the harmonic environment and the vowel type were
significant factors also for BU. Following the procedure described for ZZ, the
direction of the effect was confirmed with the analysis of the MAX values shown in
Table 9.
Subject  Source       Receiver  Type III SS   df  Mean Square       F  Sig.
BU       ENV          TD             26.144    1       26.144  16.403  .000
                      TB2            59.069    1       59.069  31.181  .000
                      TB1            26.208    1       26.208  15.030  .000
         VOWEL        TD             66.128    2       33.064  20.745  .000
                      TB2            90.447    2       45.223  23.872  .000
                      TB1           134.429    2       67.215  38.546  .000
         ENV * VOWEL  TD               .384    2         .192    .121  .886
                      TB2             7.430    2        3.715   1.961  .142
                      TB1             2.324    2        1.162    .667  .514
         Error        TD           1077.452  676        1.594
                      TB2          1280.615  676        1.894
                      TB1          1178.778  676        1.744
Table 8 – Results from a 2-way ANOVA for environment and vowel type for
subject BU.
Receiver  MAX(F)  MAX(B)  MD (DIFF)
TD        –43.12  –43.51     –0.39*
TB2       –30.89  –31.48     –0.59*
TB1       –21.68  –22.07     –0.39*
Table 9 – Direction of the effect of environment for subject BU.
All MD values in the fourth column are negative, which means that the transparent
vowels are more retracted in the back environment than in the front environment.
This effect was found in all three receivers. Therefore, given the significance of the
environment for each receiver reported in Table 8, it can be concluded that BU,
similarly to ZZ, produced the transparent vowels in the back environment with
significantly greater retraction than in the front environment.
In contrast to ZZ, the interaction of the effect of environment with the type of
the transparent vowels was not significant for BU although the effect of VOWEL was
significant. Therefore, the differences among transparent vowels in terms of their
retraction degree, reported in Table 10, were not significant. It is interesting,
however, that the degree of retraction follows ZZ’s pattern: the effect of environment
is greatest for /é/ and smallest for /í/.
               /i/                     /í/                     /é/
Subject  Rec.  F       B       MD      F       B       MD      F       B       MD
BU       TD    –43.53  –43.89  –0.37   –42.80  –43.15  –0.36   –42.99  –43.45  –0.46
         TB2   –31.40  –31.91  –0.51   –30.64  –31.03  –0.38   –30.55  –31.43  –0.88
         TB1   –22.20  –22.53  –0.34   –21.15  –21.43  –0.28   –21.62  –22.19  –0.56
Table 10 – Retraction degree of individual transparent vowels: the difference
(MD) between the mean position of the receivers in the front (F) and back (B)
environments.
With respect to the effect of the lexical pair variable, a two-way ANOVA
testing the effects of ENV and L.PAIR was conducted. The results are reported in
Table 11.
Subject  Source        Receiver  Type III SS   df  Mean Square       F  Sig.
BU       ENV           TD             24.442    1       24.442  48.873  .000
                       TB2            60.393    1       60.393  88.217  .000
                       TB1            28.290    1       28.290  67.428  .000
         L.PAIR        TD            631.123   21       30.053  60.093  .000
                       TB2           679.711   21       32.367  47.279  .000
                       TB1           840.015   21       40.001  95.341  .000
         ENV * L.PAIR  TD            188.724   21        8.987  17.969  .000
                       TB2           260.603   21       12.410  18.127  .000
                       TB1           211.054   21       10.050  23.955  .000
         Error         TD            319.076  638         .500
                       TB2           436.772  638         .685
                       TB1           267.674  638         .420
Table 11 – Results from a two-way ANOVA for environment and lexical
pair for subject BU.
The results are similar to those reported for ZZ. The factor L.PAIR had a
significant effect on the horizontal position of the receivers, and the interaction
ENV*L.PAIR was also significant. To exclude the possibility that the significance of
L.PAIR stems from the VOWEL factor, separate two-way ANOVAs were conducted
for each of the three vowels. L.PAIR had a significant effect on the degree of
retraction because the interaction of ENV and L.PAIR was significant at least at p <
0.05 for each receiver in each vowel separately.
Subsequent examination of the interaction between lexical pair and
environment was carried out using the Wilcoxon signed ranks test. In BU’s data,
the stimuli contained 22 lexical pairs, which, after pooling the mean MAX values
from the two positions, resulted in a total of 44 paired data points in each
environment. The results are reported in Table 12. The distribution of positive and
negative ranks is similar to ZZ’s result; the number of pairs where the retraction of
the transparent vowel in the back environment is greater than in the front
environment is significantly higher than the number of the pairs with opposite
direction of the effect.
Receiver          Neg. ranks  Pos. ranks       Z  Sig.
TD(B) – TD(F)             28          16  -2.381  .017
TB2(B) – TB2(F)           29          15  -3.151  .002
TB1(B) – TB1(F)           29          15  -2.262  .024
Table 12 – Effect of environment in individual lexical items for subject BU.
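The Sig. column in Table 12 is consistent with a two-sided p-value obtained from the Z statistic under the standard normal distribution. The sketch below shows that conversion; that SPSS derives the reported values in exactly this way is an assumption, and the function name is hypothetical.

```python
import math

def two_sided_p(z):
    """Two-sided p-value of a standard normal Z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Z values for subject BU, copied from Table 12.
for receiver, z in (("TD", -2.381), ("TB2", -3.151), ("TB1", -2.262)):
    print(receiver, round(two_sided_p(z), 3))
```

Rounded to three decimals, the resulting values agree with the Sig. column of the table.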
3.4.1.3. Subject CK (pilot)
Although the structure of the stimuli for CK differed from the one for ZZ and BU
(Section 3.3.3), CK’s data are reported using the same statistical methods as in the
previous two sections.45 The results of the two-way ANOVA testing the effects of
environment and vowel type are summarized in Table 13. Recall that data from only
two receivers (TD and TB) were collected for this subject, and that there were four
vowels tested (/í/, /i/, /é/, and /e/).
45 For the subject CK, there were 4 repetitions of each token and each repetition was
unique because there were 2 frame sentences and 2 positions. Therefore, each token
from the front environment could be paired with the corresponding token from the
back environment, which allows for the analysis using paired samples t-test. Overall,
the results from this analysis are consistent with the ones using ANOVA and
presented in the text.
Subject  Source       Receiver  Type III SS   df  Mean Square        F  Sig.
CK       ENV          TD             55.980    1       55.980    9.308  .002
                      TB             13.583    1       13.583    2.776  .096
         VOWEL        TD           3581.598    3     1193.866  198.513  .000
                      TB           2009.716    3      669.905  136.904  .000
         ENV * VOWEL  TD             22.425    3        7.475    1.243  .294
                      TB             36.281    3       12.094    2.472  .061
         Error        TD           2796.535  465        6.014
                      TB           2275.354  465        4.893
Table 13 – Results from a two-way ANOVA for environment and vowel type
for subject CK.
The results concerning the main effect of the harmonic environment are consistent
with the other two subjects: the production of the transparent vowels was
significantly affected by harmonic environment. However, this effect was observed
only for the receiver placed on the tongue dorsum and not for the one placed on the
tongue body where only a tendency is reported (p < 0.1). The size and direction of the
effect is summarized in Table 14. Similarly to ZZ and BU, the transparent vowels in
the back environment were more retracted than in the front environment as confirmed
by the negative MD values.
Receiver  MAX(F)  MAX(B)  MD (DIFF)
TD        –24.59  –25.58     –0.99*
TB        –21.83  –22.08     –0.23
Table 14 – Direction of the effect of environment for subject CK.
The differences in retraction among the four transparent vowels were not
significant, as shown in Table 13, although they approach significance for the TB
receiver (p = 0.061). However, there is a main effect of VOWEL.
Table 15 shows the retraction degrees of individual vowels. The mean
difference (MD) values are all negative with the exception of the vowel /í/ and the TB
receiver. Among the transparent vowels, the vowel /é/ was retracted the most for the
TD receiver, which corroborates the results from subjects ZZ and BU. In addition to
the three transparent vowels /í/, /i/, and /é/, the stimuli for this subject also included a
limited number of tokens where the transparent vowel was the short low /e/. The
stimuli set contained two lexical pairs: hárem ‘harem’ vs. érem ‘medal’, and totem
‘totem’ vs. tetem ‘corpse’. The vacillating stems hárem and totem were presented
with back suffixes whereas the harmonic stems érem and tetem were presented with
the matching front suffixes.46 The result in terms of the effect of environment on the
retraction of /e/ when it behaves transparently (a back suffix follows) is consistent
with the rest of the data: /e/ is more retracted in the back environment (e.g. hárem–
nak, totem–nak) than when in the front one (e.g. érem-nek, tetem-nek).47
      /í/                     /i/                     /é/                     /e/
Rec.  F       B       MD      F       B       MD      F       B       MD      F       B       MD
TD    -22.41  -22.93  -0.52   -25.63  -25.91  -0.28   -23.24  -24.56  -1.32   -31.87  -32.76  -0.89
TB    -20.12  -19.74   0.38   -22.74  -22.93  -0.19   -20.50  -20.94  -0.44   -26.53  -27.35  -0.82
Table 15 – Retraction degree of individual transparent vowels for subject
CK.
46 Recall that stems where a back vowel is followed by /e/ can be followed by front or
back suffixes.
47 This result, however, should be taken with caution due to the disproportion of the
/e/ tokens compared with other transparent vowels and also the relatively high number
of missing data points. The /e/ data made up only 12% of all data compared to more
than 27% for each of the other three vowels. Moreover, 15% of /e/ data points were
missing (mostly from the TD receiver) compared to fewer than 4% missing for the
other three vowels.
Finally, the analysis of the combined effect of the harmonic environment and
the lexical pair factors provides results consistent with the results from the other two
subjects: harmonic environment affects transparent vowels in different lexical pairs
differently. CK’s data are summarized in Table 16 below. The Wilcoxon test was not
performed due to the difference in the structure of the stimuli for CK (4 repetitions of
64 pairs) compared to ZZ and BU (8 repetitions of 22 pairs).
Subject  Source        Receiver  Type III SS   df  Mean Square       F  Sig.
CK       ENV           TD             42.142    1       42.142  25.787  .000
                       TB              3.904    1        3.904   2.306  .130
         L.PAIR        TD           5340.537   63       84.770  51.872  .000
                       TB           3264.410   63       51.816  30.615  .000
         ENV * L.PAIR  TD            376.160   61        6.167   3.773  .000
                       TB            407.584   61        6.682   3.948  .000
         Error         TD            567.076  347        1.634
                       TB            587.294  347        1.692
Table 16 – Results from a 2-way ANOVA for environment and lexical pair
for subject CK.
3.4.1.4. Summary of disyllabic EMMA data: subjects ZZ, BU, and CK
The main reported finding was that the transparent vowels were significantly more
retracted in the back environment than in the front environment. This effect was
robustly present in all three subjects. While this effect was found in all the transparent
vowels for all subjects and receivers (with the exception of TB /í/ data for subject
CK), the harmonic environment affected the production of individual vowels to a
different degree. This difference, however, was significant only for ZZ, where the
frontness of the transparent vowels in the front environment correlated with their
degree of retraction in the back environment so that the most advanced vowel in the
front environment was also the most retracted in the back environment. A tendency
common to all three subjects was that the vowel /é/ was affected by the
environment the most. Finally, the effect of harmonic environment on a transparent
vowel was significantly affected by the lexical item in which that vowel occurred.
3.4.2. Disyllabic stems – Ultrasound results of subject ZZ
Extraction of the tongue edge with the ultrasound techniques allows the
comparison of these edges by plotting them on top of each other. Fig. 10 shows two
examples of such comparison where the tongue edges were extracted from the target
frames of the transparent vowels /i/ and /é/ in the front and back environments. Visual
inspection of comparisons such as those in Fig. 10 supports the main result reported
from the EMMA study. Transparent vowels in the back environment are more
retracted than in the front environment. The greatest difference between the shapes
from the two environments is observed in the posterior area of the tongue.
Fig. 10 – Retraction of /i/ in back vs. front harmony. Left: Tomi-hoz with
bolder, dotted lines vs. Imi-hez with lighter, solid lines. Right: kadet-tól vs.
bidet-től. There are eight tokens for each word; the tongue tip is on the right.
As described in Section 3.3.4, the analysis of the ultrasound data started with
testing the null hypothesis that the shapes from both environments are not
significantly different. For each lexical pair, the 64 area values from the pairs across
environments and 56 area values from the pairs within environments were calculated.
The data from all 22 lexical pairs were pooled and served as an input to the ANOVA
tests with the area measurements as the dependent variable and environment (same
vs. different) as the factor. The results are summarized in Table 17. The effects of
environment and vowel type were analyzed separately for each block because the
factor of block significantly interacted with both environment and vowel type. It can
be seen that the null hypothesis is rejected: the area between curves from the same
environment differs significantly from the area between curves from different
environments.
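The numbers of area values per lexical pair follow from the eight tokens recorded for each word: every front token can be paired with every back token, and within each environment every token can be paired with every other token. The sketch below reproduces these counts and adds a simple trapezoidal area between two curves. The area function is only an illustration that assumes both curves are sampled at the same horizontal positions; it is not the measure defined in Section 3.3.4, and all names are hypothetical.

```python
from itertools import combinations, product

def pair_counts(n_front, n_back):
    """Number of across-environment and within-environment curve pairs."""
    across = len(list(product(range(n_front), range(n_back))))
    within = (len(list(combinations(range(n_front), 2)))
              + len(list(combinations(range(n_back), 2))))
    return across, within

def area_between(xs, ys_a, ys_b):
    """Trapezoidal area between two curves sampled at the same x positions.

    The absolute gap is integrated, so the result is approximate wherever
    the two curves cross between sample points.
    """
    total = 0.0
    for i in range(len(xs) - 1):
        gap_left = abs(ys_a[i] - ys_b[i])
        gap_right = abs(ys_a[i + 1] - ys_b[i + 1])
        total += 0.5 * (gap_left + gap_right) * (xs[i + 1] - xs[i])
    return total

print(pair_counts(8, 8))
print(area_between([0.0, 1.0, 2.0], [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]))
```

With eight tokens per word, the counts come out as 64 across-environment pairs and 56 within-environment pairs, matching the figures given in the text.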
Dep. var.  Source       Block     Type III SS    df     Mean Square        F  Sig.
Area       ENV          1        43015113.655     1    43015113.655  279.302  .000
                        2       111508402.089     1   111508402.089  582.995  .000
           VOWEL        1         4332055.976     2     2166027.988   14.064  .000
                        2         5384985.481     2     2692492.741   14.077  .000
           ENV * VOWEL  1         1578705.726     2      789352.863    5.125  .006
                        2         5760042.769     2     2880021.384   15.057  .000
           ERROR        1       405660036.963  2634      154009.126
                        2       480848412.053  2514      191268.263
Table 17 – Effect of environment based on the area measure of difference
between curves.
Table 18 shows that the area measure was always greater for the curves
extracted from different environments (‘Diff.’ column) than for the curves extracted
from the same environments (‘Same’ column). Table 17 also shows a significant
interaction of environment and vowel type for both blocks. Individual ANOVA tests
revealed a significant main effect of environment for each vowel in each block
(p < .001).
       Mean Area
Vowel  Same      Diff.
/í/    734.350   1022.780
/i/    700.031    994.286
/é/    685.982    872.406
All    706.480    964.572
Table 18 – Mean area between the curves from the same environments and
those from different environments.
Similarly to the EMMA data, the area measure from the Ultrasound data
showed a significant effect of the lexical pair, and its interaction with environment.
The results of the Wilcoxon signed ranks test revealed that out of 22 lexical pairs, at
least 20 showed greater mean area in the ‘Diff.’ condition than in the ‘Same’
condition. Hence, the Wilcoxon signed rank test confirmed that there were
significantly more lexical pairs where the curves within an environment were more
similar than the curves across the two environments.
Block  Variable                  Neg. ranks  Pos. ranks       Z  Sig.
1      Area(Diff.) – Area(Same)           2          20  -3.997  .000
2      Area(Diff.) – Area(Same)           1          21  -3.980  .000
Table 19 – Effect of environment in individual lexical items for subject ZZ.
After rejecting the null hypothesis and confirming that environment
significantly affected the global shapes of the tongue, the second step was to
determine the direction of this effect. In other words: is it the case that the tongue is
more retracted in the back environment than in the front environment? A statistical
analysis was carried out with the ultrasound data quantified according to the
description in Section 3.3.4. This quantification involved calculating the constriction
degree of the tongue along five fixed lines in a two-dimensional reconstruction of the
tongue edge.48 The distance D between the fixed point and the intersection point was
the dependent variable. In addition to the factors used with EMMA (harmonic
environment, vowel type, and lexical pair), the factor LINE (1-5) was included. Table
20 summarizes the main effects obtained by a three-way ANOVA (excluding the
factor of lexical pair) conducted for the two blocks of the data separately.
48 Because the palate trace was not obtained in this experiment, the measure does not
calculate the actual constriction degree of the tongue from the palate but from
abstract, fixed points placed at locations approximately corresponding to points on
the velum and upper pharyngeal wall.
Dep. var.  Source          Block  Type III SS  df    Mean Square  F        Sig.
D          ENV             1      8.863        1     8.863        223.818  .000
D          ENV             2      17.907       1     17.907       496.303  .000
D          VOWEL           1      1.075        2     .537         13.573   .000
D          VOWEL           2      .999         2     .499         13.839   .000
D          LINE            1      9.511        4     2.378        60.047   .000
D          LINE            2      5.753        4     1.438        39.860   .000
D          ENV*VOWEL       1      1.814        2     .907         22.902   .000
D          ENV*VOWEL       2      1.092        2     .546         15.126   .000
D          ENV*LINE        1      .808         4     .202         5.104    .000
D          ENV*LINE        2      .638         4     .159         4.419    .001
D          VOWEL*LINE      1      2.292        8     .287         7.236    .000
D          VOWEL*LINE      2      2.828        8     .354         9.798    .000
D          ENV*VOWEL*LINE  1      .194         8     .024         .614     .767
D          ENV*VOWEL*LINE  2      .175         8     .022         .605     .774
D          Error           1      68.506       1730  .040
D          Error           2      62.060       1720  .036
Table 20 – Main effects of the environment, vowel type, and line on the D
value
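The entries of Table 20 are related by the usual ANOVA identity: each F value is the mean square of the effect divided by the mean square of the error term. The following minimal sketch recomputes the Block 1 F values from the sums of squares and degrees of freedom in the table. The helper name f_ratio is an assumption introduced for illustration, and small discrepancies in the last digit are expected because the tabulated values are rounded.

```python
# Sums of squares and degrees of freedom copied from Table 20, Block 1.
effects = {
    "ENV": (8.863, 1),
    "VOWEL": (1.075, 2),
    "LINE": (9.511, 4),
    "ENV*VOWEL": (1.814, 2),
    "ENV*LINE": (0.808, 4),
    "VOWEL*LINE": (2.292, 8),
    "ENV*VOWEL*LINE": (0.194, 8),
}
error_ss, error_df = 68.506, 1730

def f_ratio(ss, df, err_ss, err_df):
    """F = (SS_effect / df_effect) / (SS_error / df_error)."""
    return (ss / df) / (err_ss / err_df)

f_values = {name: f_ratio(ss, df, error_ss, error_df)
            for name, (ss, df) in effects.items()}
for name, f in f_values.items():
    print(f"{name:15s} F = {f:8.3f}")
```

The same computation with the Block 2 rows (error SS 62.060, error df 1720) recovers the Block 2 column.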
To determine the size and direction of these effects, the difference values
(DIFF) were computed from the mean D values for each vowel and line using the
formula in (20) in Section 3.3.4. Table 21 shows the results with the DIFF values in
millimeters.
DIFF (Front vs. Back environment)
Block  Vowel  Line-1  Line-2  Line-3  Line-4  Line-5  Lines combined
1      /í/    1.4     2.34    2.79    2.35    1.24    2.03
1      /i/    1.34    1.97    2.19    2.12    1.03    1.73
1      /é/    0.69    0.87    0.74    0.36    -0.09   0.51
1      All    1.15    1.74    1.92    1.63    0.74    1.44
2      /í/    1.92    2.68    3.05    2.94    2.07    2.53
2      /i/    1.48    2.27    2.92    2.7     1.69    2.21
2      /é/    1.31    1.54    1.5     1.34    0.97    1.33
2      All    1.57    2.17    2.5     2.34    1.58    2.03
Table 21 – Average difference in mm between the tongue shapes from the
front and back environment computed on the five lines shown in Fig. 9 in
Section 3.3.4.
It can be seen that all the DIFF values except one (/é/, Line-5, Block 1) are
positive, which shows that the transparent vowels in the back environment were more
retracted than in the front environment. This main effect of harmonic environment on
the production of transparent vowels was significant, as confirmed in Table 20.
With respect to individual vowels, the effect of environment was significant
for each transparent vowel in both blocks (p < .05 for /é/ in Block 1, p < .001 for all
others). Retraction resulting from the back environment was greatest for /í/, where it
reached up to 2.5 mm on average in Block 2. The smallest effect can be observed with
/é/.49 The data summary in the table supports a scale of retraction degree from the
most to the least retracted vowel: /í/ > /i/ > /é/. This difference in retraction among the
vowels was also significant, as shown by the significant interaction of ENV and VOWEL
reported in Table 20. Additionally, this scale was respected for all five lines and in
both blocks.
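The two descriptive claims about Table 21 (a single non-positive DIFF value, and the scale /í/ > /i/ > /é/ on every line in both blocks) can be checked mechanically. The sketch below copies the per-line DIFF values from the table. The variable names are illustrative assumptions, and the recomputed row means may differ from the "Lines combined" column in the last digit because the tabulated values are rounded.

```python
# DIFF values in mm for Line-1 to Line-5, copied from Table 21.
diff = {
    1: {"í": [1.40, 2.34, 2.79, 2.35, 1.24],
        "i": [1.34, 1.97, 2.19, 2.12, 1.03],
        "é": [0.69, 0.87, 0.74, 0.36, -0.09]},
    2: {"í": [1.92, 2.68, 3.05, 2.94, 2.07],
        "i": [1.48, 2.27, 2.92, 2.70, 1.69],
        "é": [1.31, 1.54, 1.50, 1.34, 0.97]},
}

def mean(values):
    return sum(values) / len(values)

# Mean DIFF per block and vowel across the five lines.
combined = {block: {vowel: mean(vals) for vowel, vals in rows.items()}
            for block, rows in diff.items()}

# Cells (block, vowel, line) where DIFF is not positive.
non_positive = [(block, vowel, line + 1)
                for block, rows in diff.items()
                for vowel, vals in rows.items()
                for line, value in enumerate(vals) if value <= 0]

# Does /í/ > /i/ > /é/ hold on every line in both blocks?
scale_holds = all(rows["í"][k] > rows["i"][k] > rows["é"][k]
                  for rows in diff.values() for k in range(5))

print(non_positive, scale_holds)
```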
With respect to the effect of LINE, environment had the greatest measured
effect on the middle Line-3 and the effect was gradually scaled down towards the more
peripheral Lines 1 and 5. As confirmed by the significance of the ENV*LINE
interaction, these differences among the lines were also significant.
Finally, there was a significant difference between the two blocks in the
overall size of the effect.50 The DIFF values were higher in the second block. This
applied to all values reported in Table 21. At the same time, the relative differences
among individual vowels were approximately the same in both blocks.
49 In fact, the effect of environment for /é/ in Block 1, despite being significant for
Lines 1-3 in separate ANOVA tests, is smaller than the measurement error, and should
thus be treated as a tendency only.
50 When BLOCK was added to the three factors, BLOCK, BLOCK*ENV, and
BLOCK*LINE were significant and BLOCK*VOWEL was not. Hence, the overall
difference between the blocks (1.44 vs. 2.03) was significant but differences in terms
of individual vowels were not. It seems that the difference between the two blocks in terms
of the ultrasound probe placement significantly affects the magnitude of the effect of
environment.
Thus, measurements of tongue position with ultrasound support the result
obtained from EMMA measurements that transparent vowels in the back environment
were more retracted than the ones in the front environment.
Recall the second correlation observed in the EMMA data that the most front
vowel in the front context is also the most retracted in the back context. Table 22
summarizes the D values of the transparent vowels in the front context. Recall from
Fig. 9 of Section 3.3.4 that D is the distance between the endpoint of a fixed line and
the point where that line intersects with the tongue surface.
Front environment: D
Block  Vowel  Line-1  Line-2  Line-3  Line-4  Line-5  Average
1      /í/    18.8    19.6    19.2    20      19.4    19.4
1      /i/    17.6    18.2    18      19.1    19.2    18.4
1      /é/    16.3    17.1    17.5    18.8    19.1    17.8
2      /í/    21.3    21.8    21.8    22.6    21.3    21.7
2      /i/    20.2    20.9    21      22      21.2    21.1
2      /é/    19.2    19.9    20.5    21.9    21.6    20.6
Table 22 – Advancement of the transparent vowels in the front environment.
Each column shows average distances in millimeters between the fixed points
and the intersections of the lines with the tongue shapes.
A scale from the most to the least front can be inferred from the table: /í/ > /i/
> /é/. This is because more front positions are signaled by higher values of the
distance from the fixed points (that are positioned in the back area of the vocal tract).
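To make the D measure concrete, the following minimal sketch computes the distance from a fixed point to the place where a fixed line crosses a tongue contour represented as a polyline. Everything specific here is an assumption for illustration: the function names, the coordinates, the orientation of the axis (larger x taken as more front), and the sign convention DIFF = D(front) minus D(back), which is chosen only so that positive values correspond to retraction in the back environment as described above. The actual formula is given in (20) in Section 3.3.4.

```python
from math import hypot

def segment_intersection(p, q, a, b):
    """Return the point where segment p-q crosses segment a-b, or None."""
    rx, ry = q[0] - p[0], q[1] - p[1]
    sx, sy = b[0] - a[0], b[1] - a[1]
    denom = rx * sy - ry * sx
    if denom == 0:  # parallel segments: no single crossing point
        return None
    t = ((a[0] - p[0]) * sy - (a[1] - p[1]) * sx) / denom
    u = ((a[0] - p[0]) * ry - (a[1] - p[1]) * rx) / denom
    if 0 <= t <= 1 and 0 <= u <= 1:
        return (p[0] + t * rx, p[1] + t * ry)
    return None

def d_value(fixed_point, line_end, contour):
    """Distance from fixed_point to the first contour segment (in contour
    order) that the fixed line crosses; None if the line misses the contour."""
    for a, b in zip(contour, contour[1:]):
        hit = segment_intersection(fixed_point, line_end, a, b)
        if hit is not None:
            return hypot(hit[0] - fixed_point[0], hit[1] - fixed_point[1])
    return None

# Hypothetical contours (mm): the fixed point sits behind the tongue.
fixed, end = (0.0, 0.0), (30.0, 0.0)
front_contour = [(20.0, -5.0), (20.0, 5.0)]   # tongue surface further from the point
back_contour = [(18.0, -5.0), (18.0, 5.0)]    # retracted: closer to the point
d_front = d_value(fixed, end, front_contour)
d_back = d_value(fixed, end, back_contour)
print(d_front - d_back)  # positive when the back-environment contour is retracted
```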
A one-way ANOVA revealed a significant difference among the three vowels in the
front environment for both blocks, F(2, 877) = 5.194, p = 0.006 for Block 1, and
F(2, 877) = 21.094, p < 0.001 for Block 2. Post-hoc comparisons using the Tukey HSD
test revealed that /í/ was significantly more advanced than /i/, and
that /i/ was significantly more advanced than /é/.
A comparison of the values in Table 21 and Table 22 reveals that a more
advanced position of a transparent vowel in the front environment correlates with
a more retracted position in the back environment. For example, the vowel /í/ was
produced as the most front of the three in the front environment but it was also the
most retracted in the back environment. Hence, the correlation observed in the
EMMA data also shows up in the Ultrasound data. The only difference is that the
most front vowel based on Ultrasound was /í/ whereas it was /é/ based on EMMA. I
will return to this issue in Section 3.5.2.
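The correspondence between Table 21 and Table 22 can be stated as an ordering check: within each block, the vowels ranked by average D in the front environment should come out in the same order as the vowels ranked by combined DIFF. The sketch below uses the averages from the two tables. With only three vowels this is a check on rank order and not a statistical test, and the names front_d, diff, and ranking are assumptions introduced for illustration.

```python
# Averages copied from Table 22 (D in the front environment, mm)
# and from the "Lines combined" column of Table 21 (DIFF, mm).
front_d = {1: {"í": 19.4, "i": 18.4, "é": 17.8},
           2: {"í": 21.7, "i": 21.1, "é": 20.6}}
diff = {1: {"í": 2.03, "i": 1.73, "é": 0.51},
        2: {"í": 2.53, "i": 2.21, "é": 1.33}}

def ranking(values):
    """Vowels ordered from the largest to the smallest value."""
    return sorted(values, key=values.get, reverse=True)

for block in (1, 2):
    same_order = ranking(front_d[block]) == ranking(diff[block])
    print(block, ranking(front_d[block]), ranking(diff[block]), same_order)
```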
3.4.3. Monosyllabic stems – Results
This part of the experiment analyzed the production of transparent vowels in bare
monosyllabic stems that trigger either front suffixes or back suffixes. The set of
stimuli used for data collection was described in detail in Section 3.3.2. In this
condition, the influence of adjacent vowels on the production of the transparent
vowels was controlled: the transparent vowel was the only vowel in the token word,
and the vowels of the frame sentence were always constant. Therefore, the null
hypothesis for this subset of the data is that the position of the tongue during the
production of stems with exclusively transparent vowels is not affected by the choice
of the suffix that normally follows these stems.
In this stimulus set, the type of the transparent vowel was not balanced: there
was only one pair with the vowel /i/, two pairs with the vowel /é/, and the remaining five
pairs contained the vowel /í/. Therefore, the effect of environment was first analyzed
with a one-way ANOVA. The results for subjects ZZ and BU are reported in Table
23. The values in the MD column represent the mean of the differences between the
values from the receivers in the front and back environments.
Subject  Source  Rec.  df (between)  df (within)  Mean Square  F       Sig.  MD
ZZ       ENV     TD    1             124          4.061        1.524   .219  –0.36
ZZ       ENV     TB2   1             124          19.339       4.005   .048  –0.79
ZZ       ENV     TB1   1             124          4.754        .782    .378  –0.37
BU       ENV     TD    1             116          8.118        6.940   .010  –0.51
BU       ENV     TB2   1             116          20.662       11.403  .001  –0.80
BU       ENV     TB1   1             116          7.680        7.453   .007  –0.54
Table 23 – ANOVA results for the effect of environment in monosyllabic
stems. MD is the mean difference between the position of the receivers in the
front and back environments.
It can be seen that the effect of environment was significant for four out of six
conditions: all three receivers of subject BU and one (TB2) of subject ZZ. All MD
values in the last column are negative, which means that the degree of retraction in
the back environment was greater than in the front environment. This was the case for
all three receivers and both subjects.
As with disyllabic stems, the variable of lexical pair as well as its interaction
with the factor of environment was significant for both subjects. In order to further
examine this interaction, the average values of each receiver and each lexical item
were calculated. This produced the mean MAX value for each lexical item from 4
repetitions. Because the effect of position in the sentence was not significant, there
were 16 data points for each environment (8 lexical items and 2 positions). Following
the statistical analysis for the disyllabic stems, a Wilcoxon signed ranks test was
performed. The results are summarized in Table 24.
Subject  Receiver          Neg. ranks  Pos. ranks  Z       Sig.
ZZ       TD(B) – TD(F)     12          4           -1.034  .301
ZZ       TB2(B) – TB2(F)   12          4           -1.655  .098
ZZ       TB1(B) – TB1(F)   11          5           -1.138  .255
BU       TD(B) – TD(F)     12          4           -1.551  .121
BU       TB2(B) – TB2(F)   12          4           -1.758  .079
BU       TB1(B) – TB1(F)   10          6           -.879   .379
Table 24 – Effect of environment in individual lexical items for subjects ZZ
and BU.
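For readers who want to see how the quantities in Table 24 arise, the following is a minimal sketch of the Wilcoxon signed ranks test using the normal approximation with a tie correction and no continuity correction. It is not the software used for the reported analysis. The per-item differences in example are hypothetical values invented for illustration (the per-item means are not listed in this chapter), and the choice of the smaller rank sum, which makes Z non-positive as in the tables, is an assumption about how the statistic was reported.

```python
from math import sqrt, erfc

def wilcoxon_signed_rank(diffs):
    """Return (n_neg, n_pos, z, p) for paired differences (normal approximation)."""
    d = [x for x in diffs if x != 0]            # zero differences are dropped
    n = len(d)
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1                    # average rank for a run of ties
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    w_pos = sum(r for r, x in zip(ranks, d) if x > 0)
    w_neg = sum(r for r, x in zip(ranks, d) if x < 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (min(w_pos, w_neg) - mean) / sqrt(var)
    p = erfc(abs(z) / sqrt(2))                   # two-sided p-value
    n_neg = sum(x < 0 for x in d)
    n_pos = sum(x > 0 for x in d)
    return n_neg, n_pos, z, p

# Hypothetical back-minus-front differences (mm) for 16 data points.
example = [-1.2, -0.8, -0.5, -0.9, -0.3, -1.5, -0.7, -0.4,
           -1.1, -0.6, -0.2, -1.0, 0.35, 0.55, 0.75, 0.1]
print(wilcoxon_signed_rank(example))
```

Because the test ranks the magnitudes of the differences, two data sets with the same 12-to-4 split of negative and positive ranks can yield quite different Z values, which is why the rows of Table 24 differ in significance despite similar counts.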
The results in the table show that in the majority of the lexical pairs, the
transparent vowels were on average more retracted in the back environment than in
the front environment. In the rest of the pairs, the transparent vowels in the front
environment were more retracted than in the back environment. The effect of
environment calculated from the average values for each lexical pair showed a
tendency in one receiver of each subject (TB2); the other measures were not
significant.
Finally, the monosyllabic stimuli for the pilot subject CK contained only 12
tokens (4 repetitions of three lexical pairs) in each environment (front vs. back). In 9
out of 12 pairs in the TD receiver, the transparent vowel in the stems selecting back
suffixes was more retracted than in the stems selecting front suffixes. Table 25 shows
that, according to a Wilcoxon signed ranks test, the effect in this limited data set was
not significant.
Subject  Receiver        Neg. ranks  Pos. ranks  Z       Sig.
CK       TD(B) – TD(F)   9           3           -1.569  .117
Table 25 – Effect of environment in individual lexical items for subject CK.
EMMA data from the monosyllabic stems show that for ZZ and BU at least
one measure supported the rejection of the null hypothesis. Therefore, the effect of
environment was significant and the transparent vowels in those stems that select
back suffixes were more retracted than the transparent vowels in the stems that select
front suffixes.
The area measure with the ultrasound data from ZZ’s monosyllabic stems also
supported the rejection of the null hypothesis: environment had a significant effect on
the global shape of the tongue. Table 26 summarizes the results of a one-way ANOVA.
The area value is consistently greater for the across-environment condition (Diff.)
than for the within-environment condition (Same). This effect was significant for
Block 2 as well as for the entire data set.
Block  df (between)  df (within)  Mean Square  F      Sig.  Mean Area (Same)  Mean Area (Diff.)
1      1             222          275144.893   1.617  .205  919.8459          990.6672
2      1             222          1240603.846  9.316  .003  777.5336          927.9170
1+2    1             446          1342122      8.729  .003  848.690           959.292
Table 26 – Effect of environment in monosyllabic stems based on the area
measure of difference between curves.
Despite this, the calculation of the effect of environment using five lines (Fig. 9) did
not yield a significant effect. However, a tendency was reported in Block 2. Although
significance was not achieved, the direction of the effect corroborated previous
findings: on average, the transparent vowels in words selecting a back suffix were
more retracted than in the words selecting front suffixes. However, the differences
were within the measurement error.
Dep. var.  Source  Block  df (between)  df (within)  Mean Square  F      Sig.
D          ENV     1      1             318          .001         .018   .894
D          ENV     2      1             318          .190         2.915  .089
D          ENV     1+2    1             638          .109         1.427  .233
Table 27 – Effect of environment measured on the five lines shown in Fig. 9.
3.5. Summary and discussion
3.5.1. Harmonic environment
The production of Hungarian transparent vowels {/i/, /í/, and /é/} was investigated
with the use of magnetometry (EMMA) and Ultrasound techniques. In the first part,
the stimuli consisted of disyllabic stems where a back or a front initial vowel was
followed by a transparent vowel that was in turn followed by a suffix with a back or a
front vowel (e.g. bili-vel vs. buli-val). The major finding was that transparent vowels
in a front harmony context were less retracted than in a back harmony context. This
effect was robust and highly significant for all three subjects and both methodologies.
In the second part, the stimuli consisted of monosyllabic stems where the
same transparent vowel triggers either front suffixes (e.g. hír, éj), or back suffixes
(e.g. ír, héj). The harmonic environment affected the position of the tongue in that the
transparent vowels that trigger back suffixes were more retracted than the transparent
vowels that trigger front suffixes. This effect achieved or approached significance in
at least some measurements for each subject. Therefore, the transparent vowels in
monosyllabic stems behave comparably with the transparent vowels in disyllabic
stems. The weaker effect of environment in monosyllabic stems will be discussed in
Section 3.5.4.
3.5.2. Vowel type
The data from the two techniques and all three subjects showed that all
vowels are subject to retraction caused by the back harmonic environment. On
average, all transparent vowels were more retracted in the back environment than in
the front environment. However, this effect of harmonic environment was realized to
a different degree on individual transparent vowels. In the palatal area of the tongue
body, measured with EMMA, the difference in retraction between the two
environments was greater for /é/ than for the two /i/ vowels. In the dorsal-pharyngeal
area, measured with Ultrasound, the scale of retraction was reversed: /í/ was the most
retracted and /é/ was the least retracted.
This apparent inconsistency can be resolved by the fact that the two
techniques focus on two different portions of the tongue. The EMMA receivers were
placed mostly in the front part (below the hard palate) whereas the ultrasound
measurement was based on the lines placed in the back part of the tongue (below the
velum and opposite to the pharynx). Since the tongue is a complex muscle structure
that does not behave uniformly (e.g. Stone, to appear), it is not surprising to find that
the actions of one part of the tongue differ from the actions in the other part. More
specifically, the expansion of the tongue in one part effectively results in contraction
of another part of the tongue because the volume of the tongue “… can be
redistributed but not increased or decreased” (Stone & Lundberg 1999: 2858).
To apply this fact to the observed differences among the transparent vowels,
the retraction in the back part of the tongue may result in slight raising in the front
part of the tongue. To illustrate this idea, it is assumed that the effect of the back
environment on transparent vowels is realized mainly as a retraction in the dorso-pharyngeal
area, shown with an arrow in Fig. 11. For a high vowel such as /í/, shown
in the left panel of Fig. 11, this retraction causes a slight flattening of the tongue body
in the palatal area, which prevents a large-scale retraction in the palatal portion of the
tongue. For a lower vowel such as /é/, the palatal portion of the tongue is already
flatter with the tongue mass redistributed in the dorso-pharyngeal and palatal areas.
Retraction stemming from the back harmonic environment thus affects the dorso-pharyngeal
area to a smaller extent than it affects the palatal area.51 Hence, the
seemingly contradictory results obtained from the investigation of transparent vowels
using EMMA and Ultrasound might receive an explanation motivated by such
independent facts as stability of tongue volume.
51 Experimental investigation of the differences in height between /í/ and /é/ showed
that /é/ is significantly lower than /í/ for all three receivers (Beller 2004).
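[Fig. 11: two tongue-contour panels, /í/ (left) and /é/ (right), each labeled with the dorso-pharyngeal area (dors-phar.), the palatal area (pal.), and the tongue tip (Tip).]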
Fig. 11 – Illustration of possible effects of retraction on the tongue body.
Tongue tip is on the right; the tongue shape in the back environment is in
bold.
With respect to the differences among the set of the transparent vowels {/i/,
/í/, /é/}, the data reveal an implication between the position of the transparent vowel
in the front environment and its retraction in the back environment: the most
advanced vowel in the front environment is the most retracted in the back
environment. Both the EMMA and the Ultrasound data reveal that this correlation is
significant for subject ZZ, and it is also observable in the EMMA data of the other
two subjects.
3.5.3. Lexical pair
The effect of lexical pair was significant for all subjects and interacted significantly
with the effect of environment. Therefore, the harmonic environment affected
transparent vowels in various lexical items differently. Although minor differences
among the lexical items can be found in morphological and prosodic structure or part
of speech, presumably the most relevant differences are in the segments surrounding
the transparent vowel. Both consonants and vowels were different (e.g. tomít-ó vs.
zafír-ban). Unfortunately, the experiment reported in this chapter was not sufficiently
controlled for these influences to provide more information about the nature of the
strong interaction between harmonic environment and lexical pair.
Hence, the effect of environment was not uniform across lexical pairs.
Moreover, there were cases that were at odds with the general pattern of greater
retraction in the back environment than in the front environment. Even more puzzling
is the fact that the differences among the lexical pairs cannot be attributed to the
idiosyncrasy of particular lexical stem(s) because the two subjects were presented
with identical stimuli yet differed as to which pairs followed the majority pattern and
which behaved exceptionally.
It is well established that systems involving real-life phenomena do not
behave uniformly. Hence, both the direction and the size of the effect vary with
different lexical pairs. Nevertheless, the statistical analysis confirmed the validity of
the main effects discussed above. Thus a model of vowel harmony in Hungarian
should account primarily for the statistically significant observations. At the same
time, though, the system should be flexible enough to be able to deal with the
exceptions. For example, if a model underlying the behavior of transparent vowels is
construed in dynamic terms, the observed irregularities in the system could be
attributed to random fluctuations arising from the small effect of the harmonic
environment on the tongue position.
3.5.4. Disyllabic vs. monosyllabic stems
The data reported in the previous sections showed that harmonic environment affects
transparent vowels both when adjacent to other vowels in a word as well as when a
transparent vowel is the only vowel of a word. This result does not support the often-made
assumption that the variation of the transparent vowels depending on harmonic
context is caused by phonetic coarticulation from adjacent vowels. In the
monosyllabic condition, there were no vowels adjacent to the transparent vowels.
Furthermore, if coarticulation were solely responsible for the retraction
pattern in disyllabic condition, the effect of environment would be expected to be
stronger on the short vowels and weaker on the long vowels. This is because long
vowels have more time to achieve their target and are thus less prone to coarticulatory
influence from adjacent material than short vowels. Hungarian has phonemic length
distinction and long vowels are about twice as long as short vowels (Magdics 1969:
16).52 In the set of transparent vowels used in the experiment, there were two long
vowels, /í/ and /é/, and one short vowel /i/. In the EMMA data, the long /é/ was
affected by environment to the greatest degree. In the ultrasound data, it was the other
long vowel, /í/, that was affected the most. Therefore, long vowels were more
retracted than short vowels, which supports the hypothesis that retraction of
transparent vowels in the back environment is, at least in part, independent of
phonetic coarticulation.
Finally, let me discuss the comparison of the results from the disyllabic and
monosyllabic stimuli. It seems that the strength of the influence that environment
exerts on the production of transparent vowels varies in the two conditions. Table 28
presents the mean values of retraction for disyllabic and monosyllabic stems (adapted
from Table 3 and Table 9).
52 It is assumed that this was the case in our data. Although duration of vowels was
not measured in the reported experiment, several randomly selected vowels did
follow this pattern.
Subject  Receiver  Disyllables MD  Monosyllables MD
ZZ       TD        –0.95           –0.36
ZZ       TB2       –1.39           –0.79
ZZ       TB1       –1.32           –0.37
BU       TD        –0.39           –0.51
BU       TB2       –0.59           –0.80
BU       TB1       –0.39           –0.54
Table 28 – Comparison of retraction degree between disyllabic and
monosyllabic stems.
It can be seen that the degree of retraction for ZZ is smaller for the
monosyllabic stems (yet still significant at least for the TB2 receiver). This points to
some effect of coarticulation for this subject: the degree of retraction in the back
environment is increased if the transparent vowel is surrounded by back vowels
within the word. However, for BU, the opposite effect can be seen: the degree of
retraction for the monosyllabic stems is slightly greater than for the disyllabic stems.
Therefore, it seems that coarticulation from adjacent vowels has a different effect on
the degree of retraction in the back environment in the two subjects.
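The opposite patterns for the two subjects can be summarized by expressing each monosyllabic MD as a proportion of the corresponding disyllabic MD. The sketch below uses the values from Table 28. The helper name mono_to_di_ratio is an assumption introduced for illustration, and the ratio is purely descriptive; it carries no information about significance.

```python
# Mean differences (MD) copied from Table 28: (disyllabic, monosyllabic).
md = {
    "ZZ": {"TD": (-0.95, -0.36), "TB2": (-1.39, -0.79), "TB1": (-1.32, -0.37)},
    "BU": {"TD": (-0.39, -0.51), "TB2": (-0.59, -0.80), "TB1": (-0.39, -0.54)},
}

def mono_to_di_ratio(di, mono):
    """Monosyllabic MD as a proportion of the disyllabic MD."""
    return mono / di

ratios = {subject: {rec: mono_to_di_ratio(*pair) for rec, pair in recs.items()}
          for subject, recs in md.items()}

for subject, recs in ratios.items():
    print(subject, {rec: round(value, 2) for rec, value in recs.items()})
```

Ratios below 1 indicate less retraction in monosyllabic than in disyllabic stems, and ratios above 1 indicate more.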
However, at least two considerations question the validity of the comparison
shown in Table 28. First, the disyllabic stimuli were balanced in terms of the type of
the transparent vowel, but the monosyllabic stimuli were not. Out of 22 disyllabic
stems, 7 contained /í/, 8 contained /i/, and 8 contained /é/. In contrast, out of 8
monosyllabic stems, 5 contained /í/, 1 contained /i/, and 2 contained /é/. The decrease
in the MD values in the monosyllabic stems for ZZ could be due to the decreased ratio
of the stems with /é/ and increased ratio of the stems with /í/ in the stimuli. This is
because in the disyllabic stems, /é/ was affected by environment significantly more
than /í/. For BU, for whom the type of the vowel was not significant, the imbalance in
the stimuli for the monosyllabic stems did not affect the overall MD values.
Second, due to the small number of tokens, the overall MD values in the
monosyllabic condition are more likely to be affected by individual lexical pairs. In
both ZZ’s and BU’s monosyllabic data, there is a single lexical pair that is considered
an outlier. For ZZ, it was the pair hisz vs. nyit where the mean difference was over
2.0 mm for the TD receiver and over 3.0 for the two TB receivers. For BU, it was the
pair cím vs. síp where the mean difference was less than –2.0 mm for the TD receiver
and less than –3.0 for the two TB receivers. Therefore, the outliers have the effect of
increasing the MD value for ZZ but decreasing it for BU. If these outliers were
excluded from the comparisons, ZZ’s MD values from monosyllables would still be
smaller than, but comparable with, his disyllabic results. On the other hand, BU’s MD
would now be smaller than for the disyllabic stems.
These considerations, however, do not undermine the importance of the
significant effect of environment in the monosyllabic stems. They only question the
validity of comparing the overall degree of retraction between the mono- and disyllabic data.
3.6. Conclusion
The experimental findings reported in this chapter raise several important points
relevant to the discussion of phonetic and phonological aspects of transparency. First,
the stable and statistically significant articulatory patterns of tongue body position
support the hypothesis that vowel harmony is, at least in part, an articulatory process.
As will be argued in Chapter 5, the observed variation of transparent vowels can be
captured if vowel harmony is construed as a pattern of interaction between
articulatory gestures corresponding to vowels.
Another important conclusion from the experiment is that the assumption in
most of the theoretical approaches that transparent vowels do not participate in vowel
harmony is not supported by the data. This assumption is based on IPA style
transcriptions since the transparent vowels are always transcribed identically
irrespective of the context in which they appear. However, the investigation of
dynamic articulatory actions of the tongue reported in this chapter showed that
Hungarian speakers systematically produce the transparent vowels differently
depending on the harmonic context (front vs. back). Hence, the transparent vowels do
participate in vowel harmony, which is observable in their articulatory patterns.
The systematic pattern of tongue body retraction, however, does not create a
phonemic but a sub-phonemic contrast: retracted realizations of the transparent
vowels are not considered separate phonemes. Hence, it may be argued that this
effect is phonetic. In other words, a plausible explanation of this result is that a
phonological module determines the form of the suffix. Then, a low-level phonetic-implementation
module generates sub-phonemic retraction via coarticulation. In this
derivational view then, the observed pattern of tongue body retraction during the
production of transparent vowels must be seen as the result of low-level phonetic
coarticulation from adjacent vowels. This approach was in fact suggested by the
limited studies that reported the effect of environment using acoustic analyses
(Fónagy 1966, Gordon 1999, Välimaa-Blum 1999).
This proposal, however, is weakened by the significant and systematic pattern
of retraction observed on the transparent vowels in monosyllabic stems. It was found
that the transparent vowels in the T stems selecting back suffixes were more retracted
than the transparent vowels in the T stems selecting front suffixes. Crucially, the T
stems in this condition were presented in un-suffixed forms; hence, adjacent vowels
could not cause the coarticulatory retraction. Therefore, the similarity and
systematicity of retraction in both disyllabic and monosyllabic condition do not
support the explanation that relies solely on coarticulation in the phonetic component.
Hence, the predictions of the approaches based on the absence or parameterization of
locality requirement discussed in Chapter 2 are not supported by the experimental
results presented in this chapter.
The difficulty of subsuming the retraction pattern under the domain of
phonetics suggests that the strict division between phonetics and phonology, assumed
for such processes as vowel harmony, is not tenable. In other words, systematic
phonetic patterns are not necessarily uni-directionally dependent on phonological
alternations. It is proposed that the phonetic and phonological patterns are
interdependent and some phonetic parameters play an active role in determining the
outcome of phonological computation. The relevance of the data presented in this
chapter for the proposals dealing with the interface between phonetics and phonology
is discussed in detail in the next chapter.
CHAPTER 4
Phonetics meets phonology
4.1. Introduction
This chapter links the phonological behavior of the transparent vowels with their
phonetic characteristics. It is proposed that phonetic details of the transparent vowels
are relevant for the phonological alternation of vowel harmony in Hungarian. More
specifically, the chapter develops an argument that transparent vowels are not
excluded from participating in vowel harmony. Rather, when in stem-final position,
their backness is relevant for determining the value of the suffix. This proposal is
supported by three correlations between the phonetics and phonology of the
transparent vowels. These three correlations are discussed in Sections 4.2 through 4.4
respectively.
Section 4.2 discusses the correlation between sub-phonemic retraction in
transparent vowels and the phonological alternation in the suffix form. The
experimental results presented in Chapter 3 showed that back suffixes follow those
transparent vowels that are phonetically retracted. This correlation was observed
across different types of data in Hungarian, namely in monosyllabic stems with a
transparent vowel (T stems) and disyllabic stems where a back vowel precedes a
transparent vowel (BT stems).
Section 4.3 develops an argument for linking vowel height and transparent
behavior in palatal vowel harmony. It is noted that in disharmonic stems, all stem-final
transparent vowels can be followed by back suffixes. Hence, all front unrounded
vowels behave transparently to a certain degree. However, it is also observed that as
the transparent vowel lowers in height, the likelihood that it is followed by back
suffixes decreases. This section thus argues that transparent vowels display a
correlation between scalar differences in height and gradiency with which they
combine with back suffixes.
In Section 4.4, a link between perceptual stability of transparent vowels and
their phonological transparency is introduced and the findings related to perceptual
characteristics of vowel-to-vowel coarticulation are reviewed. Crucially for the
discussion in this chapter, the perception of the vowels /i/ and /e/ is minimally
affected by adjacent vowels. In contrast, the perception of other vowels (e.g. /u/, /o/,
/a/) is greatly affected by adjacent vowels. This difference between {/i/, /e/} on the
one hand and {/u/, /o/, /a/} on the other hand correlates with the phonological
behavior of these vowels in palatal vowel harmony systems. The former vowels (/i/,
/e/) are transparent whereas the latter vowels (/u/, /o/, /a/) do not behave
transparently.
Section 4.5 evaluates existing models of the phonetics-phonology
relationship. It argues that, although these models are successful in explaining some
of the links between the phonetics and phonology of transparent vowels, they do not
offer a unified and principled treatment of all the mentioned correlations. Therefore, a
new model that explains these correlations is needed.
Section 4.6 builds a foundation for such a new model by investigating the
relationship between the articulatory and acoustic properties of transparent vowels.
Experimental studies by Stevens (1989) and Wood (1979, 1986) are reviewed and the
notion of non-linearity between articulation and acoustics is introduced. The section
then proposes that the differences in non-linear qualities among front vowels provide
independent motivation for the correlations discussed in Sections 4.2 through 4.4.
More specifically, the likelihood that a front vowel is followed by a back suffix
correlates with the potential of the tongue body to be retracted during the production
of the front vowel without significantly affecting the acoustic ‘frontness’ of this
vowel. Section 4.7 concludes the chapter.
4.2. Articulatory retraction is relevant for suffix selection
The main experimental result reported in Chapter 3 was that the retraction degree of
the stem-final transparent vowels correlates with the selection of the suffix. More
retracted transparent vowels are followed by the suffixes with back vowels whereas
less retracted transparent vowels are followed by the suffixes with front vowels. The
effect of harmonic environment on the production of transparent vowels was most
robustly observed in the BT stems where the transparent vowels were flanked by
either back vowels (buli-val) or front vowels (bili-vel).
As discussed in Section 3.6, the explanation that coarticulation is responsible
for the observed retraction effects in disyllabic BT stems is not consistent with the
results obtained from the investigation of monosyllabic T stems. The similarity and
systematicity of the retraction patterns observed on the transparent vowels in
disyllabic stems followed by a suffix as well as in bare monosyllabic stems do not
support the derivational explanation that relies solely on coarticulation in the phonetic
component.53
These results, however, do support a hypothesis that the phonetic details of
tongue body retraction play a role in the phonology of suffix selection. This
hypothesis is an extension of the traditional view where the phonemic [±back] feature
of the last non-transparent stem vowel determines the value of the suffix vowel (e.g.
Ringen 1975, Vago 1980). This thesis proposes that transparent vowels are not
excluded from participating in vowel harmony. When in stem-final position, the
horizontal tongue body position of these vowels is relevant for determining the value
of the suffix. The crucial innovation of this proposal is that this horizontal tongue
body position of the transparent vowels that is relevant for suffix selection is sub-phonemic.
53 As will be discussed in Section 4.5, it is possible to trace the origin of this
retraction pattern in unsuffixed forms to speakers’ retention of the coarticulation
present in the suffixed forms (e.g. híd-nak vs. víz-nek). However, Section 4.5 also
argues that a proposal based on this idea is not sufficient and does not generalize to
productive suffix selection in other data (e.g. trisyllabic stems).
Additional support for the hypothesized relevance of phonetic retraction in
suffix selection comes from the data on BT and BTT stems. As noted in Chapter 2,
the BT stems select back suffixes almost exclusively and vacillate when the
transparent vowel is /e/. The BTT stems, on the contrary, select mostly front suffixes
and sometimes vacillate. The difference between BT and BTT stems is problematic
for the traditional analyses of transparency. In these analyses, the [±back] value of the
rightmost harmonic vowel determines the [±back] value of the suffix. Hence, in both
cases suffix selection should be identical irrespective of whether there is one or two
transparent vowels separating the trigger and the target of the harmony.
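The prediction of the traditional analyses can be made concrete with a minimal sketch. It is not a formalization from the literature cited here: the function name is hypothetical, the vowel classes follow standard Hungarian orthography, and the default of front suffixes for stems containing only transparent vowels is an assumption adopted from the default pattern described in Section 4.5.2.

```python
# Illustrative sketch of the traditional rule, not the dissertation's own model.
BACK = set("aáoóuú")          # back harmonic vowels
FRONT_ROUNDED = set("öőüű")   # front harmonic vowels
TRANSPARENT = set("iíeé")     # transparent vowels, skipped by the rule

def traditional_suffix_backness(stem: str) -> str:
    """Return 'back' or 'front' based on the rightmost harmonic vowel."""
    for ch in reversed(stem.lower()):
        if ch in BACK:
            return "back"
        if ch in FRONT_ROUNDED:
            return "front"
        # Transparent vowels and consonants are skipped.
    # Assumed default for stems with transparent vowels only.
    return "front"

for stem in ["radír", "aszpirin", "parfüm", "víz"]:
    print(stem, traditional_suffix_backness(stem))
```

Because the rule skips every transparent vowel, it returns the same value for the BT stem radír and the BTT-shaped stem aszpirin. The number of intervening transparent vowels cannot affect its output, which is the problem noted above.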
Chapter 2 discussed two proposals for deriving the difference in suffix
selection patterns between BT and BTT stems. One was based on differences in
perceptual compatibility of the harmonizing [+back] feature with the [–back] feature
of the transparent vowels (Kaun 1995). In this approach, Kaun argued that the
harmonizing feature extended over two syllables is less compatible with the feature of
the transparent vowels than the harmonizing feature extended over only one syllable.
The other approach was based on prosodic differences between the BT and BTT
stems in terms of secondary stress and foot structure (Ringen & Kontra 1989, Ringen
& Heinämäki 1999). In this approach, BT and BTT stems differ in that the stem-final
transparent vowel in BT stems is in the unstressed position whereas the stem-final
transparent vowel in BTT stems is the head of the foot and receives secondary stress.
Both mentioned approaches provide a better alternative than arbitrary
statements that stipulate the number of transparent vowels between the trigger and the
target of a phonological process. However, as discussed in Chapter 2, neither of the
two approaches can explain a broader range of BTT-data including both the
vacillating pattern and differences in suffix selection between BTT and BBT stems
that have presumably identical metrical structures. It was concluded that a more
promising hypothesis is that the relevance of the number of transparent vowels may
be attributed to the differences between BT and BTT stems in the phonetic
manifestation of their stem-final transparent vowel.
To explore this hypothesis, consider the difference between BT and BTT
stems. It is plausible to assume that the retraction of the stem-final transparent vowel
in the BT stems is greater than the retraction of the stem-final transparent vowel in
the BTT stems.54 The transparent vowel in the BT stems is more retracted because it
is directly preceded by a back vowel. In contrast, the stem-final transparent vowel in
BTT stems is less retracted because it is adjacent locally to a front vowel and only
non-locally to a back vowel.
54 Experimental data from the EMMA experiment designed to test this assumption
could not be analyzed due to problems with one of the tongue receivers and
subsequent data extraction. See also the section on future work in Chapter 7.
Note that the experimental data already established the correlation between
the differences in stem-final vowel retraction and suffix selection for disyllabic and
monosyllabic stems. It was observed that more retraction correlates with back
suffixes and less retraction with the front suffixes. The difference between BT and
BTT stems in terms of their suffix selection patterns can be explained in the same
fashion. The stem-final vowel in the BTT stems is less retracted than the one in BT
stems due to the backness of the immediately adjacent vowel, and it is the stem-final
vowel in the BTT stems that is more likely to be followed by a front suffix than the
stem-final vowel in the BT stems. Therefore, phonetic retraction of stem-final
transparent vowels is context-dependent, and serves as a good predictor for the
quality of the suffixes in disyllabic and trisyllabic stems.
Considering the patterns of suffix selection in monosyllabic (T), disyllabic
(BT), and trisyllabic (BTT) stems, the proposal that tongue body retraction of the
stem-final vowel is relevant for suffix selection provides a uniform treatment for all
three types of stems.55 This is because the correlation that more retraction implies
higher chances for a back suffix applies to all three stem types. This correlation
between sub-phonemic differences in the articulatory properties of stem-final vowels
and the quality of the following suffixes was either observed experimentally (T, BT
stems) or derived from a plausible assumption that the articulation of a transparent
vowel is affected by the quality of an adjacent preceding vowel (BTT stems).
55 In a traditional derivational account, this uniformity is not achieved. For example,
Vago (1980) treats the transparent vowels that select back suffixes in monosyllabic
stems as underlyingly back (abstract) and undergoing a rule of abstract vowel
neutralization after his rule of vowel harmony applies. In contrast, the transparent
vowels in BT stems are underlyingly front and are skipped over for the purposes of
the harmony rule. Finally, as discussed before, the pattern of differences between
suffix selection of the BTT and BT words is problematic for traditional derivational
accounts where phonetics and phonology are strictly divided.
In sum, this section argued that sub-phonemic differences in tongue body
retraction of stem vowels serve as a good predictor of the phonological form of the
suffix that follows the stem-final vowel. In addition to the support for this position
from the result of the experiment reported in Chapter 3, the importance of phonetic
retraction for suffix selection was supported with the difference in the pattern of
suffix selection in BT and BTT stems.
4.3 Phonetic height and suffix selection
This section develops the second argument for the relevance of phonetic detail in the
phonology of suffix selection that is based on the correlation between phonological
transparency and phonetic height. Based on the review of phonological patterns
related to the suffix selection in a broad range of data in Chapter 2, transparency was
established as a scalar phonological property of front vowels in palatal harmony
systems. The basis for the transparency scale is the likelihood of selecting back vowel
suffixes when the stem-final vowel comes from the set {/i/, /í/, /é/, /e/}. It was
concluded that /i/ and /í/ are the most transparent, hence most likely to select back
suffixes, followed by /é/. The low /e/ is the least transparent and most likely to select
front suffixes. Compared with /i/, /í/, we observed a slight decrease in transparency
for /é/ and a significant difference between /e/ and the rest of the transparent vowels.
Hence, the lower the transparent vowel, the more likely it is to select suffixes with
front vowels.
Consistent with the Hungarian facts, in a cross-linguistic study of
transparency, L. Anderson (1980) observed an implicational generalization related to
vowel height: if /e/ is transparent, /i/ must be also but not vice versa. As already
mentioned in Chapter 2, additional evidence for the correlation between the height of
the transparent vowel and its phonological transparency comes from several dialects
of the Transdanubian region in western Hungary. In these dialects the short /e/ has
two allophones: high mid [ë] and low mid [ɛ]. The higher [ë] behaves transparently,
similar to its long counterpart /é/, whereas the lower [ɛ] behaves harmonically and
selects front suffixes only (Sz. Szentgyörgyi, p.c.).
Hence, phonological evidence from Hungarian as well as other languages
suggests that /i/ is the most transparent, /é/ is medially transparent, and /e/ is the least
transparent. These differences among the three vowels can be illustrated with a scale
of phonological transparency as shown in (21) below.
(21) Scale of phonological transparency for front unrounded vowels in Hungarian
More Transparent                              Less Transparent
/i/, /í/        /é/                           /e/
Another way to describe the relationship between these four vowels is to look
at their height. Fig. 12 shows the tongue shapes of the four Hungarian transparent
vowels produced in fixed context and captured with Ultrasound. These tongue shapes
were reconstructed from the (target) frames with the most advanced position of the
tongue body. Visual inspection reveals that /e/ is significantly lower (and more
retracted) than the other three transparent vowels. In contrast, /é/ is only slightly
lower than the two /i/ vowels.56
56 The investigation of the vertical tongue position captured by the tongue receivers in
the subset of ZZ’s disyllabic stem EMMA data revealed that the height difference between
/í/ and /é/ was significant (Beller 2004).
Fig. 12 – Tongue shapes of four transparent vowels in the #b_b# context from
the ultrasound data of subject ZZ. Four repetitions were extracted from the
first block of ultrasound data from subject ZZ.57 Legend: – /í/ [i:], -- /i/ [i], –
/é/ [e:], and ... /e/ [ɛ].
The observations about phonetic height of transparent vowels from the above
figure may be illustrated as in (22).
(22) Scale of phonetic height of front unrounded vowels in Hungarian
Higher                                        Lower
/i/, /í/        /é/                           /e/
57 The tongue shapes from the second block display the same pattern.
A comparison of the scales in (21) and (22) reveals that the pair-wise
distances between the individual vowels on the scale of phonological transparency
correlate well with the distances on the scale of phonetic height. In this sense, the
gradient height of the front unrounded vowels has a direct relationship to their
gradient phonological transparency.
In contrast to the scalar phonetic expression of height, the traditional division
of the transparent vowels using binary features does not capture two key properties of
these vowels. First, such a division does not provide any explanation for the robust
cross-linguistic generalization that higher vowels are more transparent than lower
vowels. In traditional models with segregated phonetics and phonology, vowel height
is expressed by the division of the front unrounded vowels into [+high, –low] /i/, /í/,
[–high, –low] /é/, and [–high, +low] /e/. These models can stipulate that [–high,
+low] vowels are less transparent than [–high, –low] ones that are in turn less
transparent than [+high, –low] vowels. However, these models also allow for
statements that are not attested in palatal harmony systems. An example of such a
statement is that [+low] vowels are more transparent than [–low] ones.
Although a scalar re-description of height differences does not in itself
provide a principled explanation for the correlation between height and transparency,
there is a principled link between the continuous properties of tongue body height and
its horizontal retraction. Section 4.6 of this chapter reviews evidence showing that
vowel height affects the degree of retraction in such a way that lower vowels can be
retracted less than higher vowels. A model where the form of the suffix depends in
part on the phonetic retraction of the stem-final front vowel thus provides a
connection between height and transparency: the height of the transparent vowel
affects its retraction, and the retraction affects the form of the suffix that follows the
transparent vowel. Chapter 5 will argue for such a model.
In addition to the correlation between height and transparency, a model where
transparency is grounded in phonetic characteristics of vowels also captures the
gradient nature of phonological transparency. As we discussed above and illustrated
on the scale in (21), the steps on the transparency scale are not equal. For example, /é/
in Hungarian is only a little less transparent than /í/, but /e/ is much less transparent
than /é/. This asymmetry is problematic for the traditional accounts because the
division of vowel height with binary features creates categories that are equally
distanced from each other. Therefore, the traditional division of phonetic height using
binary features does not align well with the observed phonological behavior of
transparent vowels.
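The claim that binary features create equally spaced categories can be illustrated with a small sketch. The feature values are those given above; measuring distance as the number of feature values on which two vowels disagree (a Hamming distance) is an assumption made here for illustration, and the function name is hypothetical.

```python
# Binary height features as given in the text.
# The distance measure is an illustrative assumption, not part of the source.
FEATURES = {
    "i": {"high": True, "low": False},
    "é": {"high": False, "low": False},
    "e": {"high": False, "low": True},
}

def feature_distance(v1: str, v2: str) -> int:
    """Count the height features on which two vowels disagree."""
    return sum(FEATURES[v1][f] != FEATURES[v2][f] for f in ("high", "low"))

print(feature_distance("i", "é"))  # 1
print(feature_distance("é", "e"))  # 1
print(feature_distance("i", "e"))  # 2
```

Under this measure, the step from /i/ to /é/ is the same size as the step from /é/ to /e/. The featural representation therefore offers no basis for the observation that /é/ is only slightly less transparent than /í/ while /e/ is much less transparent than /é/.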
A model where the continuous properties of vowels can be linked to discrete
properties of the suffix form allows for a principled explanation of gradiency in
vowel harmony as well. In such a model, smaller differences in height correspond to
smaller differences in transparency whereas greater differences in height correspond
to greater differences in transparency. Therefore, the asymmetry in the phonological
behavior among the transparent vowels follows from the asymmetry in their phonetic
properties such as height and horizontal retraction.58
In short, the traditional featural representation of phonological contrast is too
weak in that it cannot deal with asymmetries in transparent behavior and it is too
strong in that it permits unattested generalizations. A more satisfactory account can
be achieved if the phonological pattern of suffix selection is grounded in the scalar
phonetic properties of height and retraction.
4.4 Perceptual results of coarticulation
The third argument supporting the relevance of phonetic detail in the phonology of
suffix selection comes from findings related to the perceptual stability of vowels
subject to coarticulation. In an environment where two vowel gestures are
contiguous, for example as nuclei of adjacent syllables, these gestures influence each
other. Crucial for the discussion in this chapter is the finding that the degree of
perceptual change caused by this coarticulation varies with the type of the vowel.
More specifically, front high unrounded vowels have been found to change minimally
compared to other vowels.
58 A careful examination of the tongue shapes in Fig. 12 shows that lower vowels are
more retracted even without any influence of surrounding vowels. Hence, this
observation further supports a principled relationship between height and retraction
that will be discussed in Section 4.6.2 and formalized in Chapter 5.
There is a large body of experimental work on the acoustic effects of
coarticulation in VCV sequences. The resistance to coarticulation for the front high
vowels (especially /i/) both from adjacent consonants as well as vowels has been
established for a variety of languages (see Recasens 1999 for a review). In addition,
Beddor et al. (2001) investigated both acoustic as well as perceptual effects of
carry-over coarticulation in CV1CV2 sequences with all combinations of the five vowels
{a, e, i, o, u} in all positions. Despite extensive carry-over effects from V1 on V2 in other
pairs, coarticulation in the pairs ‘a-i’, ‘o-i’, ‘u-i’, and ‘a-e’ was not observed. In all
these pairs, a back vowel is followed by a front non-low unrounded vowel. The
remaining two pairs of the relevant group (‘o-e’ and ‘u-e’) showed some
coarticulation effect. However, listeners were able to perceive these effects minimally
when compared to the perception of coarticulation in other vowel sequences.59
59 Stimuli for the perceptual experiment consisted of pairs of ‘bV1bV2’ sequences
where the bV2 syllables were cross-spliced in the following way. Substituting the
second syllables from original ‘baba’, ‘biba’ sequences produced ‘babᵢa’, ‘bibₐa’
sequences where the subscript denotes the original coarticulatory context. Listeners
were then presented with two pairs of these sequences, for example bibᵢa-babₐa vs.
bibₐa-babₐa, where the second vowels of the first pair were acoustically different but
in appropriate contexts whereas the second vowels in the second pair were
acoustically identical but one of them (here bibₐa) was in inappropriate context.
Listeners were then asked to choose the pair in which the second vowels sounded
more different. The prediction (confirmed by the results) was that listeners would
compensate for the coarticulatory differences in the acoustic signal and judge the
similarity of the second vowels based on appropriateness of the coarticulatory context
rather than their acoustic differences.
Therefore, these studies showed that listeners are not sensitive to the
coarticulatory effects of back vowels on the following front unrounded vowels /i/ and
/e/. Moreover, /i/ and /e/ were found to be different in that the former is perceptually
more resistant to coarticulation than the latter. Hence, phonological transparency is
associated with those vowels that resist perceptual coarticulation from adjacent
vowels. Furthermore, the differences in the perceptibility of carry-over coarticulation
between /i/ and /e/ parallel the phonological differences between these two vowels in
terms of their transparency.
This is an important result for two reasons. First, it serves as another piece of
evidence for the interdependence between the phonological and phonetic properties
of transparent vowels. In addition to articulatory features of the tongue body
horizontal and vertical position, we can include perceptual sensitivity to
coarticulation among the phonetic dimensions that are closely linked to the
phonological notion of transparency. Second, this result points to the fact that
phonological transparency in terms of suffix selection is connected to the phonetics of
the transparent vowels in a way that includes both articulatory as well as perceptual
properties of these vowels.
4.5 Transparency and existing models of phonetics-phonology interface
The three preceding sections argued that a) the observed degree of retraction of the
transparent vowels is relevant for the phonological pattern of harmony, b) the
phonetic scale of vowel height correlates with the scale of phonological transparency,
and c) perceptual stability of vowels in coarticulation is linked to their phonological
transparency in palatal vowel harmony. These arguments support the hypothesis that
the phonetic and phonological behaviors of the transparent vowels are closely related.
However, the claim that phonetic detail and phonological patterning are interrelated leads to a puzzle: phonetic detail is continuous, whereas phonological
patterning is discrete. On the one hand, all three mentioned correlates of transparency
– tongue body retraction, tongue body height, and resistance to perceptual
coarticulation – are quantitative phonetic features. On the other hand, the [±back]
quality of Hungarian suffixes is a discrete property. This section will review the
proposals in the literature for relating discrete and continuous dimensions of speech.
It will be concluded that although some proposals are partially successful in
explaining some aspects of transparency in Hungarian, none of them captures all the
discussed phonetic and phonological observations.
4.5.1 Derivational model
In the traditional derivational model, a cognitive phonological module is assumed to
be separated from an implementational phonetic module. Therefore, observed
systematic phonetic differences can be derived from two sources. Either they are
transferred from the phonemic differences stated in the phonological grammar, or
they stem from low-level processes of the phonetic-implementation module.
However, experimentally observed systematic differences in tongue body
retraction of Hungarian transparent vowels cannot be attributed to either source.
Phonemically, all transparent vowels are specified as [–back] in the output of
traditional phonological accounts. As discussed in Chapter 2, the transparent vowels
in derivational accounts may be excluded from the vowel harmony process by
segment-skipping rules, underspecification, or other devices. Alternatively, the
[–back] underlying specification of the transparent vowels may be changed to
[+back] by the application of vowel harmony, which creates an environment for a late
neutralization rule that changes their feature back to [–back]. In this model then, the
observed differences in tongue body retraction of Hungarian transparent vowels
cannot be attributed to the phonological component.
The remaining option is to argue that the systematic retraction reported in
Chapter 3 stems from mechanical coarticulation from adjacent vowels. This,
however, is also problematic given the observed systematic retraction in the
monosyllabic T stems. Recall that these stems were presented in bare form; hence,
there were no adjacent vowels that could be claimed to be responsible for the
difference in retraction.
Therefore, traditional derivational models are not suitable to account for the
systematic correlations between the phonetics and phonology of the transparent
vowels that were observed in Hungarian data. The crucial problem is the strict
separation of the phonetic and phonological components in these models. In
contrast, as argued in Sections 4.2 through 4.4 of this chapter, the phonology
of suffix selection is tightly connected to the details of tongue body retraction and
other phonetic factors.
4.5.2 Ohala’s perceptually-based model
In a perceptually-based model of transparency (Ohala 1994a,b, Beddor et al. 2001),
the phonological features of transparent vowels are proposed to have a phonetic basis
in coarticulatory properties of these vowels. This model proposes to explain two
phonological properties of transparent vowels. The first is the fact that they do not
undergo harmony, and the second that they do not trigger harmony. In Hungarian for
example, /í/ in radír does not undergo harmony because it does not change to a back
[ɯ] in the back harmonic context. Moreover, /í/ does not trigger harmony because at
least in BT stems it does not agree with the following suffix (radír-nak, *radír-nek).
In this model, the fact that the transparent vowels do not undergo harmony
phonologically is hypothesized to derive from the observation that the preceding back
vowel has minimal perceptual coarticulatory influence on the transparent vowels. In
other words, /í/ in radír does not change to [ɯ] because the coarticulation with the
initial /a/ does not result in perceptible acoustic retraction (F2 lowering) of the
transparent vowel. This is based on the experimental findings concerning the
resistance to coarticulation for transparent vowels that were discussed in Section 4.4.
If true, this hypothesis would provide an interesting example of a phonological
generalization derived from an established phonetic fact.
Compared to the perceptual basis for the fact that the transparent vowels do
not undergo harmony, the support for the phonetically based explanation of the
second fact, that transparent vowels do not trigger harmony, is less convincing.
Experiments show that front unrounded vowels greatly influence the following
vowels perceptually. For example, the results reported in Beddor et al. (2001)
confirm that both /i/ and /e/, when in the V1 position of a V1CV2 sequence, have a
strong coarticulatory effect on the vowels in the V2 position. Hence, due to their
extreme F2 values, these vowels should be ideal triggers for the left-to-right harmony
in Hungarian, but this is not borne out (*radír-nek).
To explain this paradox, Ohala (1994b) suggests that listeners are aware of the
strong coarticulatory effect of /i/ and /e/ on the following vowels and thus are able to
parse this effect out of the signal. Hence, the suffix vowel in radír-nak is significantly
affected by coarticulation ‘spilled over’ from the preceding front vowel. However,
the listener is aware that /í/ has this effect on /a/ and can thus disregard it. Therefore,
the perceptual features essential for the explanation of the fact that the transparent
vowels do not undergo harmony must be ignored to explain the fact that the
transparent vowels do not trigger harmony.
In this model, two phonological generalizations related to transparent vowels’
behavior receive radically different phonetically based explanations. On the one hand,
the fact that transparent vowels do not undergo harmony derives from their perceptual
resistance to coarticulation. Hence, the phonological pattern where a front [i:] does
not change to a back [ɯ:] in radír is phonetically natural because [i:] is not affected
phonetically by backness from the initial [a]. On the other hand, the fact that
transparent vowels do not trigger harmony derives from the ability of listeners to
disregard a readily perceptible coarticulatory effect that transparent vowels have on
adjacent vowels. Hence, the phonologization of suffix selection is based on the active
avoidance of a natural phonetic process of coarticulation.
In this model, then, phonologization is based on the adherence to and
avoidance of natural phonetic patterns. Crucially, some of the facts related to
transparency are still not fully captured by this model. There is no explanation in this
model for the third pattern whereby the flanking vowels agree in backness across the
transparent vowels. In other words, it is not clear how the backness of the suffix
vowel in radír-nak can be ascribed to (a phonologization of) V-to-V coarticulation
from the first stem vowel, given that the two vowels are not adjacent. Although
long-distance coarticulation across medial schwa was found in English (Magen 1997),
studies also showed that the front vowel /i/ is resistant to coarticulation from the
preceding vowel(s) in terms of perception (Recasens (1987) for Spanish and Catalan,
Farnetani et al. (1985) for Italian, Magen (1984) for Japanese). Assuming that the
resistance to perceptual coarticulation of high front vowels applies to Hungarian as
well, the propagation of the acoustic backness on the suffix vowel from the first
vowel across a transparent vowel such as /i/ is even less rooted in the phonetic
pattern.
Hence, the model of perceptually-based transparency offers a paradox. On the
one hand, the model is rooted in phonetics as it proposes that the phonological
behavior of transparent vowels follows from their acoustic-perceptual characteristics.
This is, moreover, consistent with much related work on the phonetic grounding of
phonological phenomena (e.g. Ohala 1990, Steriade 1997, Pierrehumbert 2000). For
example, Ohala (1990) has argued convincingly that the phonological phenomenon of
nasal place assimilation is rooted in the weak acoustic place cues of the nasals when
followed by stops.
On the other hand, the model must resort to fundamentally different
explanations for the fact that the robust and readily perceptible phonetic pattern of
coarticulation is not ‘phonologized’ as harmony whereas a weak or non-existent
phonetic coarticulation between non-adjacent vowels across a transparent vowel does
result in harmony.
One could try to improve the model with an assumption that the suffixes in
Hungarian are underlyingly [+back] (as Kiparsky (1973) argues for Finnish, but cf.
Vago’s (1980) argument for an underlyingly [–back] feature for some suffixes in
Hungarian). Under this assumption, the [+back] form of the suffix following stems
like papír does not originate from the [+back] form of the initial vowel. Despite the
fact that coarticulation from the stem-final /í/ affects the production of the suffix
vowel to a great degree, listeners filter out these coarticulation cues and perceive the
suffix as [+back]. However, under this approach, one would then also expect that the
stems with only transparent vowels should invariably select back suffixes. Although
there is an important group of such stems, the default pattern in Hungarian is to select
front suffixes for these stems.60 Therefore, even this alteration would not save the
analysis from a non-principled treatment because the productive and phonetically
motivated pattern of front suffixes following stems with transparent vowels would
have to be analyzed as exceptional.
60 The fact that this is a productive pattern is best seen in diminutives of proper names
such as Tib-i-nek, *Tib-i-nak (Vago 1980, van der Hulst 1985).
A variation of Ohala’s perceptually-motivated approach outlined above was
proposed in Kaun (1995) and Hayes (2004). In Kaun’s model, transparent vowels do
not participate in vowel harmony because their [–back] feature is assumed to be
perceptually compatible with the [+back] harmonizing feature. Perceptual
compatibility occurs when the [–back] feature of the transparent vowel does not
interrupt the acoustic signal of the [+back] harmonic feature. Additionally, the
transparent vowels do not trigger harmony due to their high perceptual strength. In
Kaun’s model, harmony is construed as temporal extension of certain marked features
in order to facilitate their perception. Kaun hypothesized that transparent vowels do
not have to extend their features because they are highly salient themselves.
One of the advantages of this approach is that it can be extended to explain
the correlation between vowel height and degree of transparency in palatal vowel
harmony. For example, Hayes (2004) assumes that lower vowels, due to their
perceptual inferiority, allow extension of their [–back] feature beyond the syllable in
which they reside (i.e. behave harmonically), whereas higher vowels, due to their
perceptual strength, do not resort to this strategy.
These analyses are more satisfactory than a traditional analysis of
transparency where “… quantity and quality of intervening vowels should be
irrelevant” (Kaun 1995: 142). This is in part because perceptually-based models
provide independent (phonetic) motivation for the phonological behavior of
transparent vowels. However, they still assume that transparent vowels do not
participate in vowel harmony. As discussed above, this assumption is challenged by
the experimental data reported in this dissertation because that data support the
hypothesis that transparent vowels form an integral part of the harmony domain.
Moreover, the perceptual data reviewed in Section 4.4 weaken the assumption that
the non-participation of transparent vowels in vowel harmony derives from the
perceptual compatibility of their [–back] feature with the [+back] harmonizing
feature.61
4.5.3 Exemplar-based model
Another family of models that deals with the relationship between phonetics and
phonology is the exemplar-based models (Johnson 1996, Kirchner 1999,
Pierrehumbert 2001, 2003). In these models, continuous phonetic details are
potentially relevant for phonological alternations as they are stored in the lexicon as
exemplars. For example, the difference in retraction between the two /i/s in buli-val
61 Additionally, Kaun’s model does not account for vacillation in Hungarian.
Moreover, the assumed universal transparency continuum suggests that a sequence of
two vowels is always less transparent than a single vowel. However, a single rounded
vowel is opaque in Hungarian (e.g. parfüm-nek ‘perfume-Dat.’), whereas a sequence of
two unrounded vowels may be transparent (e.g. aszpirin-nek ‘aspirin-Dat.’).
and bili-vel that we experimentally observed would be stored in the lexicon as
different exemplars of the category /i/. These differences arise from the input set of
Hungarian words where the transparent vowels sometimes occur adjacent to front
vowels, as in bili-vel, or víz-nek, and sometimes to back vowels, as in buli-val, or híd-nak. Using the extension of the model proposed in Kirchner (1999), this lexically
stored perceptual difference between the two groups of exemplars for the vowel /i/
could then be transformed into an actual articulatory difference in production, as
observed in our experiments.
The workings of the model can be illustrated with the data from monosyllabic
stems such as híd ‘bridge’ and víz ‘water’. As pointed out in Gósy (2000), children
often hear stems together with various suffixes. Hence, a stem such as híd would be
heard inflected as híd-nak, híd-hoz, híd-nál, …, and híd-∅. In contrast, a stem such
as víz would be heard as víz-nek, víz-hez, víz-nél, …, and víz-∅. As a result, the input
data to the model would consist of multiple exemplars of /í/, some extracted from the
environment where /í/ is followed by a back vowel suffix, some from the
environment where /í/ is followed by a front vowel suffix, and some from /í/ not
followed by any suffix.
Let us assume that children are able to perceive the difference between two
exemplars of /í/ originating from the inputs such as híd-nak and víz-nek. The
exemplars of /í/ associated with híd would then be on average more retracted (with
lowered F2) because they are affected by coarticulation from the back suffix vowels.
The exemplars for /í/ in víz will be less retracted since all its suffixes contain a front
vowel. This is illustrated in Fig. 13, based on Pierrehumbert (2001: 141).
[Figure: activation strength (y-axis) plotted against F2 (x-axis) for the exemplars of /í/; solid line: híd, dashed line: víz.]
Fig. 13 – A sketch of the lexicon of exemplars of /í/ based on F2 values
derived from the input set {híd-nak, híd-hoz, híd-nál, híd-ba, híd-ban, híd-∅,
víz-nek, víz-hez, víz-nél, víz-be, víz-ben, víz-∅}.
The bias for more retracted exemplars associated with híd in the lexicon
mirrors the input data and can be transformed into production given Kirchner’s
extension of the model. Crucially, even the realizations of bare híd and víz would
show the difference in retraction. This is because the value of F2 for the production of
/í/ in bare híd would be supplied by the exemplar that best represents the cloud of
exemplars associated with híd. According to this model, then, Hungarian speakers
detect and store differences among transparent vowels caused by coarticulation from
the adjacent vowels. In addition, they can reproduce these differences in a non-coarticulatory environment of bare stems. Moreover, the exemplar-based model is
also able to account for the statistical tendencies rather than categorical differences
observed in the Hungarian data (Pierrehumbert 2003). Hence, the exemplar-based
model is more successful than the derivational model and can be extended to account
for the observed output data on tongue body retraction given some plausible
assumptions about the input data.
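The mechanism just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an implementation of any of the cited models: the F2 values are invented, each heard form contributes exactly one exemplar, and production is approximated by the mean of the stored cloud as a stand-in for the exemplar that best represents the cloud. The names `hear` and `produce` are hypothetical.

```python
from statistics import mean

# Invented F2 values in Hz (assumptions for illustration, not measured data).
F2_BY_CONTEXT = {
    "back": 2250.0,   # /í/ before a back-vowel suffix (coarticulatory retraction)
    "front": 2350.0,  # /í/ before a front-vowel suffix
    "bare": 2300.0,   # /í/ in an unsuffixed stem
}

# Input set from Fig. 13, coded by the backness of the suffix vowel.
INPUT = {
    "híd": ["back"] * 5 + ["bare"],   # híd-nak, -hoz, -nál, -ba, -ban, híd-∅
    "víz": ["front"] * 5 + ["bare"],  # víz-nek, -hez, -nél, -be, -ben, víz-∅
}


def hear(input_set):
    """Store one F2 exemplar of /í/ per heard form, grouped by stem."""
    return {stem: [F2_BY_CONTEXT[context] for context in contexts]
            for stem, contexts in input_set.items()}


def produce(lexicon, stem):
    """Return the F2 used for the bare stem: here, the mean of its cloud."""
    return mean(lexicon[stem])


lexicon = hear(INPUT)
for stem in ("híd", "víz"):
    print(stem, round(produce(lexicon, stem), 1))
```

Under these assumptions, bare híd is produced with a lower F2 than bare víz even though neither is followed by a suffix vowel, which is the pattern the exemplar account is meant to capture.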
Despite these advantages, a closer examination of the exemplar-based model
reveals certain deficiencies in providing a principled understanding of transparency in
vowel harmony. In order to explore this issue, consider two major reasons for
building a model of any natural phenomenon (Kosslyn 1978). The first one is that a
model formalized mathematically provides explicit and testable predictions about the
behavior of the variables and the system as a whole. This aim of modeling
corresponds to the notion of descriptive adequacy (Chomsky 1965) where a
grammatical model of a language generates only the forms present in a given corpus
and predicts the forms that do not belong to the original corpus but constitute possible
forms in that language. The second motivation for a model is that it helps us better
understand the phenomenon for which the model was constructed and thus enables us
to extend the underlying principles of the model to cover a broader range of data.
This aim corresponds to Chomsky’s notion of explanatory adequacy. A model
characterized as explanatorily adequate should inform us about possible grammars and
about principles underlying human speech.
In the following, I argue that an exemplar model of Hungarian data, while
descriptively adequate, is deficient when explanatory adequacy is considered. In other
words, an exemplar model does not inform us about the underlying principles of
transparency. Rather, it relies on the input data to extract the observed patterns
without explaining why we find these patterns in the input data in the first place.
Previous sections of this chapter reported several sub-generalizations in the
Hungarian phonetic and phonological data: the phonetic retraction of the transparent
vowels, their height, and their number, all correlate with the form of the suffix that
follows them. Therefore, a good model for these data should account for the behavior
of variables that have both a continuous phonetic nature, such as tongue body height,
as well as a categorical phonological nature, such as the form of the suffix vowel.
Furthermore, an explanatorily adequate model should also provide us with an
understanding of the correlations between the phonetic and phonological variables.
An exemplar-based model is able to describe the behavior of the continuous
phonetic variables. However, this is crucially dependent on prior information about
the categorical variables. In other words, the model can provide the degree of
retraction of the transparent vowels given that some other component (or input)
provides the form of the suffixes. Therefore, the value of the suffix vowels must be
determined by some other component that does not use any phonetic information.
Hence, an exemplar-based model only partially satisfies the goals in modeling
described above because it does not predict the behavior of the categorical variables
of the system.
Additionally, individual generalizations within the Hungarian data do not
receive a unified explanation. Previous sections argued that similar correlations
between the form of the suffix and final vowel retraction could be observed in
monosyllabic, disyllabic, and trisyllabic stems. However, an exemplar-based model
treats these sub-generalizations as independent from one another. The model can
extract these generalizations from the input data; however, the fact that all these sub-generalizations are reducible to a common underlying principle remains a
coincidence under this model.
Another weakness of the exemplar model is in its limited applicability to a
broader range of data. The model, in its present form, cannot deal with the fact that
the correlations between phonetic details and phonological patterns are not language-specific but systematic. For example, lower and more retracted /e/ is less transparent
than higher and more front /i/. The exemplar model can derive generalizations such as
this from the input Hungarian data. However, a similar generalization is true for
Finnish and other palatal vowel harmony systems (L. Anderson 1980). Therefore,
there is a principled correlation between height and transparency, and the opposite
generalization that lower vowels are more transparent than higher vowels is not
attested. The problem is that an exemplar-based model could encode such unattested
generalizations if provided with appropriate input data. Hence, the absence of the
unattested pattern itself in the input data is not explained.
To summarize, an exemplar-based model presents an important development
from traditional accounts in that it recognizes the role of sub-phonemic differences in
phonology. In this model, phonology can access phonetic details due to the capacity
of the cognitive phonological system to store and reproduce fine-grained phonetic
details. In addition, this model is inherently stochastic and allows for the explanation
of frequency effects using probability theory (Pierrehumbert 2003).
However, the role of phonetic details in the current form of the exemplar model
prevents a principled account of the correlations between phonetics and phonology
that were described in previous sections of this chapter.
The following section develops an argument that an underlying principle for
the phonological behavior of front vowels in palatal vowel harmony lies in the non-linear relationship between articulatory and acoustic characteristics of these vowels.
A model mathematically formalizing the system based on this principle is developed
in Chapter 5. In addition, the model develops another underlying common principle
of explanation, according to which the sub-phonemic retraction of the stem-final
vowel correlates with the form of the suffix. Hence, the phonology of suffix selection
not only stores but also actively utilizes the phonetic details of tongue body
retraction. In this sense, the information on sub-phonemic retraction of the
transparent vowels is included in the cognitive system that calculates suffix forms.
The model thus predicts the behavior of both continuous as well as categorical
variables relevant in palatal vowel harmony, and provides for explanatory adequacy
by developing the principles of non-linearity between acoustics and articulation, and
by claiming the relevance of phonetic information for a phonological alternation.
4.6 Vowel harmony and non-linearity between articulation and perception
Perceptual and articulatory properties of transparent vowels were discussed in the
preceding sections of this chapter in an effort to establish a link between the phonetic
and phonological characteristics of these vowels. On the one hand, the experimental
results showed that the participation of transparent vowels in vowel harmony could
be observed in the articulatory patterns of these vowels. It was argued that seemingly
different phonological patterns of suffix selection in monosyllabic, disyllabic, and
trisyllabic stems have all in common the correlation between the [±back] form of the
suffix and sub-phonemic degree of tongue body retraction in the stem-final
transparent vowel. On the other hand, perceptual qualities of the transparent vowels
were used in efforts to explain their phonological behavior (Kaun 1995, Ohala 1994).
However, these perceptual properties were taken to motivate the non-participation of
these vowels in the vowel harmony process. This section argues that a better
understanding of the link between the phonetics and phonology of transparent vowels
is achieved if the articulatory and acoustic properties of transparent vowels are not
separated. Rather, it is argued that transparency in palatal vowel harmony is grounded
in the quantal nature of the relation between articulation and perception of the
transparent vowels.
4.6.1 Transparency: articulatory retraction without significant perceptual effect
The main proposal of the quantal theory of speech (Stevens 1972, 1989) is that
articulatory-acoustic coupling during speech is not accidental. In other words, while
both acoustic and articulatory properties separately represent a continuum, their
relationship displays discontinuous characteristics. This non-linear relationship can
be illustrated as in Fig. 14 that is adapted from Stevens (1989: 4).62 The S-curve
divides the abstract phonetic space into three regions. For regions I and III,
articulatory changes along the x-axis do not result in significant changes in perception
along the y-axis. Region II, however, is fundamentally different in that even small
changes in the x parameter cause significant differences in the y attribute.
62 The adjectives quantal, non-linear, and discontinuous are used interchangeably in
this section.
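[Figure: S-shaped curve relating an articulatory parameter x (x-axis) to a perceptual parameter (y-axis), with regions I, II, and III marked along the curve.]
Fig. 14 – Non-linearity between articulation and perception.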
Stevens argued that Universal Grammar utilizes the stable regions I and III to
encode contrast in phonological systems. This, according to Stevens, explains why
the abundance of coarticulation in natural speech does not hinder perception.
Articulatory targets might be achieved imprecisely or with a great degree of variation
as a result of coarticulation. Yet, the desired perceptual effect is still achieved due to
the presence of regions of perceptual-articulatory stability.
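The contrast between the stable and unstable regions can be made concrete with a small numerical sketch. The logistic function below is merely an assumed stand-in for the S-curve in Fig. 14, and the steepness, midpoint, and sample points are arbitrary illustrative choices; the names `perceptual_value` and `sensitivity` are hypothetical.

```python
import math


def perceptual_value(x, steepness=20.0, midpoint=0.5):
    """Logistic S-curve: perceptual parameter as a function of articulation x."""
    return 1.0 / (1.0 + math.exp(-steepness * (x - midpoint)))


def sensitivity(x, steepness=20.0, midpoint=0.5):
    """Slope dy/dx of the S-curve: perceptual change per unit of articulation."""
    y = perceptual_value(x, steepness, midpoint)
    return steepness * y * (1.0 - y)


# One sample point per region of the abstract phonetic space.
for label, x in (("I", 0.1), ("II", 0.5), ("III", 0.9)):
    print(f"region {label}: x = {x}, slope = {sensitivity(x):.4f}")
```

With these settings the slope in region II is several hundred times larger than in regions I and III, so the same articulatory perturbation is nearly inaudible in the plateaus but perceptually substantial in the transition zone.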
Stevens (1989) and Wood (1979) provide evidence for the type of non-linear
relationship described in Fig. 14 for non-low front unrounded vowels. Stevens (1989)
illustrates the basic relationship between articulation and acoustics with the tube
model of the vocal tract (e.g. Fant 1970). In this model, the vocal tract is represented
as a single tube that is closed at one end. This end represents the larynx as the source
of acoustic energy. Local narrowings of the tube then represent the deformations
caused by the tongue and the lips. An example of this model for front non-low vowels
is shown in Fig. 15.
Fig. 15 – Approximate mid-sagittal vocal tract configurations for non-low
unrounded vowels (on the top) and the three-tube model of these
configurations on the bottom (adapted from Stevens 1989: 11). The two wider
tubes correspond to the back and front cavities, and the narrow tube in the
middle models the constriction between the tongue and the palate. A1 and A2
are cross-sectional area values of the back and front cavities respectively, and
Ac is the area of the constriction. The length of the back cavity is marked as l1.
Assuming that Ac = 0 and that l2 is the length of the front cavity, the
frequencies of the back and front cavities can be computed from l1 and l2 values
respectively (e.g. Fant 1970, Stevens 1989). If Ac ≠ 0, these frequencies are slightly
shifted due to the coupling between the resonators. The natural frequencies of this
coupled system with two cavities then have local maxima or minima at the l1 values
corresponding to the intersections of the uncoupled frequencies. The relationship
between the constriction location and the acoustic output can be seen in a nomogram
that plots the predicted natural frequencies of the coupled system as the constricted
portion of the tube moves along the vocal tract. Fig. 16 shows two such nomograms
for the lengths of the constriction lc = 5 cm on the left, and lc = 6 cm on the right.
Fig. 16 – Nomograms of the natural (formant) frequencies of the three-tube
model in Fig. 15 as a function of the length of the back cavity.
It can be seen that F2 reaches its maximum and F3 reaches its minimum in the
region of l1 values between 6.5 and 9 cm. Stevens points out that the natural
frequencies in the vicinity of their maximum or minimum values are relatively
insensitive to small changes in l1 values. In contrast, when the constriction becomes
more retracted (the l1 decreases) to the left of the F2 maximum, “there is a substantial
decrease in F2, and F2 becomes quite sensitive to changes in l1” (Stevens 1989: 11).
Stevens concludes that “… when the tongue body is in a fronted position such that a
constriction is formed in the anterior part of the vocal tract, there is a range of
position for which F2, together with F3, and possibly F4, are relatively insensitive to
anterior-posterior perturbations in tongue-body position” (Stevens 1989: 12).
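The uncoupled cavity frequencies that underlie such nomograms can be sketched as follows. This is a simplified illustration rather than a reconstruction of the calculations in Stevens (1989): it assumes the textbook resonance formulas for uniform tubes (half-wavelength for a back cavity closed at both ends, quarter-wavelength for a front cavity closed at the constriction and open at the lips), a total vocal tract length of 16 cm, and a speed of sound of 35,400 cm/s. It treats the constriction as closed (Ac = 0) and ignores any Helmholtz resonance. All function names are hypothetical.

```python
SPEED_OF_SOUND = 35400.0  # cm/s (assumed value)
TOTAL_LENGTH = 16.0       # cm, glottis to lips (assumed value)


def back_cavity_resonance(l1, n=1):
    """n-th resonance (Hz) of a tube of length l1 closed at both ends."""
    return n * SPEED_OF_SOUND / (2.0 * l1)


def front_cavity_resonance(l2, n=1):
    """n-th resonance (Hz) of a tube of length l2 closed at one end, open at the other."""
    return (2 * n - 1) * SPEED_OF_SOUND / (4.0 * l2)


def front_length(l1, lc, total=TOTAL_LENGTH):
    """Front-cavity length l2 left over after the back cavity and the constriction."""
    return total - l1 - lc


def crossing_l1(lc, total=TOTAL_LENGTH):
    """Back-cavity length where the lowest back and front resonances intersect.

    Solves c / (2 * l1) = c / (4 * l2) with l2 = total - lc - l1.
    """
    return 2.0 * (total - lc) / 3.0


lc = 5.0  # constriction length in cm, as in the left panel of Fig. 16
for l1 in (6.0, 7.0, 8.0, 9.0):
    l2 = front_length(l1, lc)
    print(f"l1 = {l1:.1f} cm: back {back_cavity_resonance(l1):.0f} Hz, "
          f"front {front_cavity_resonance(l2):.0f} Hz")
print(f"uncoupled resonances intersect at l1 = {crossing_l1(lc):.2f} cm")
```

In the coupled system the two resonances do not actually cross at this point; they repel each other, which is what produces the local F2 maximum and F3 minimum, and with them the region of reduced sensitivity to changes in l1.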
Applied to the Hungarian case, the non-low front unrounded vowels may be
retracted to some degree without causing significant changes in their acoustic outputs.
Moreover, cross-linguistic studies of the horizontal tongue position in palatal vowels
show that the tongue body during unrounded vowels is more advanced in languages
that contrast rounded and unrounded vowels than in languages where this contrast is
absent (Wood 1986). Because the Hungarian inventory exhibits the contrast in
rounding for the front vowels, it is plausible to assume that the canonical front
unrounded vowels in Hungarian have a relatively anterior position and can be
potentially retracted with minimal effect on their acoustics.
Wood’s (1979) data from a study of the quantal properties of vowels using
natural human vocal tract profiles corroborate the results from the Stevens model.
Wood estimated vocal tract area functions from the X-ray data of English and Arabic
speech as well as the sources from 13 other languages available in the literature. The
study includes the reports of the kinetic and potential energy distribution calculated
from the vocal tract area functions. These energy distributions (together with volume
velocity and sound pressure parameters) show that certain formants are insensitive to
slight shifts in particular locations of the vocal tract constriction. This is because “the
sound pressure and volume velocity maxima are not narrowly localized but range
over extended zones, consequently, resonance mode sensitivity to local narrowing or
expansion does not alter appreciably through these zones” (Wood 1979: 31). The
figure below shows that such a zone is present around the hard palate area (HP)
where the constriction results in a maximum in the energy distribution for F1 and a
minimum for F2.63
Wood’s study also includes a discussion of the lingual muscle activity in the
production of vowels. Wood observed that front non-low vowels such as /i/ and /e/ in
different languages vary in terms of the horizontal position of the palatal constriction.
This is possible, according to Wood, because these vowels are mostly formed by the
action of the genioglossus muscle alone. In contrast, the production of low and back
vowels, primarily by the styloglossus, also involves the activity of the palatoglossus
and the pharyngeal constrictors, whose “sphincteral function […] leaves little
opportunity to vary the location” of the constriction for these vowels (Wood 1979:
41).
63 A local maximum in this energy distribution corresponds to high kinetic energy and
low potential energy, which translates into a fall in the formant resonance. A local
minimum has the effect of increasing the formant resonance.
Fig. 17 – Energy distribution for the palatal vowels for an English speaker (on
the left) and an Arabic speaker (on the right), adapted from Wood 1979: 28–29.
It can be concluded from Stevens’s and Wood’s studies that calculations
using both simple tubes and natural human vocal tract profiles show the
acoustic outputs for non-low front vowels to be insensitive to a limited amount of
variation in the horizontal position of the tongue body. Therefore, /í/ may be retracted
to some degree Rí without losing its perceptual identity. Following the concept of
non-linearity described in Fig. 14, this situation is illustrated in Fig. 18 below.
[Figure: S-shaped curve relating tongue body position (x-axis, front to back) to perceptual quality (y-axis, front to back), with regions I, II, and III marked along the curve.]
Fig. 18 – Non-linear relationship for front non-low unrounded vowels. Tongue
body retraction is shown as the difference between the two balls on the x-axis
while the minimal perceptual effect of this retraction is shown on the y-axis.
The black ball represents the placement of the palatal constriction formed by
the tongue body. A slightly retracted tongue body position, illustrated with the gray
ball, still falls in the region of perceptual stability and a vowel with this constriction
location is still considered a front vowel.
Therefore, it is argued in this dissertation that transparency in vowel harmony
has a phonetic basis in that it arises from the non-linear relationship between
articulatory and acoustic domains. More specifically, transparent vowels in palatal
vowel harmony are those vowels that can be articulatorily retracted to a certain
degree and still maintain their acoustic quality of frontness.
4.6.2 Effects of lip rounding and tongue body lowering
Two cross-linguistic generalizations were observed about transparency in palatal
vowel harmony. The first one concerns lip rounding of front vowels and states that
unrounded vowels are transparent whereas the rounded ones are opaque. The second
generalization concerns the tongue body height and states that high vowels are more
likely to be transparent than low vowels. In the following, I argue that both of these
generalizations can also be linked to the non-linearity between articulation and
acoustics (and by extension also audition and perception).
With respect to the first generalization, both Stevens (1989) and Wood (1986)
showed that the quantal properties of front rounded vowels differ from those of the
front unrounded vowels. This difference between /i/ and /ü/ stems from
a difference in the position of the constriction relative to the vocal tract.
In terms of the actual position of the tongue, the difference between the front
rounded and unrounded vowels in languages that have both vowels is minimal. Both
/i/ and /ü/ are slightly advanced in languages with a rounding contrast compared to /i/
in languages without this contrast (Wood 1986: 392). A selection of /i/ and /ü/
profiles collected in the literature shows that the tongue is not more retracted for /ü/
but it is slightly lower than for /i/ (Wood 1986: 393). This is supported in Hungarian
by the ultrasound images of front vowels taken from the environment /b__b/. Fig. 19
shows the vowels /í/, /i/ with solid black and /ő/, /ü/ with dashed gray lines. We see
that the position of the tongue body is almost identical in the two conditions; the only
difference is that the rounded vowels are on average slightly retracted and
lowered compared to the unrounded vowels.
Fig. 19 – Ultrasound shapes of high front unrounded vowels (solid black) and
high front rounded vowels (dashed gray) in Hungarian. There are 7 tokens in
each condition; all are in a /b_b/ context.
In terms of the relative position of the constriction, however, /i/ and /ü/ are
different. The major difference between /i/ and /ü/ comes from rounding that extends
the length of the vocal tract. This extension of the vocal tract results in the
advancement of the region of decreased acoustic sensitivity to articulatory
perturbation in the horizontal dimension. This is evident from the nomograms
reported in Stevens (1989: 17), reproduced here in Fig. 20. While the region of
decreased acoustic sensitivity is around 7.5 cm for the unrounded vowels, it is around
9 cm for the rounded vowels. The x-axis corresponds to the length of the back cavity.
Hence, a higher x-value corresponds to a more advanced position in the vocal tract.
Fig. 20 – Formant resonances for the unrounded front vowels (solid lines) and
the rounded ones (dashed lines) when rounding is modeled as the extension of
the vocal tract by 1 cm in the front (Stevens 1989: 17).
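The direction of this shift can be checked with a back-of-the-envelope calculation. The sketch below is a simplification under assumed values, not a reproduction of the nomograms: it locates the point where the lowest uncoupled back-cavity resonance c/(2·l1) equals the lowest uncoupled front-cavity resonance c/(4·l2), assuming an unrounded tract length of 16 cm and a constriction length of 5 cm, and models rounding as a 1 cm extension at the lips. The function name `crossing_l1` is hypothetical.

```python
def crossing_l1(total_length, constriction_length):
    """Back-cavity length (cm) at which c/(2*l1) equals c/(4*l2),
    where l2 = total_length - constriction_length - l1."""
    return 2.0 * (total_length - constriction_length) / 3.0


UNROUNDED_LENGTH = 16.0  # cm (assumed value)
LIP_EXTENSION = 1.0      # cm, rounding modeled as in Fig. 20
LC = 5.0                 # cm, constriction length (assumed value)

unrounded = crossing_l1(UNROUNDED_LENGTH, LC)
rounded = crossing_l1(UNROUNDED_LENGTH + LIP_EXTENSION, LC)
print(f"unrounded: l1 = {unrounded:.2f} cm, rounded: l1 = {rounded:.2f} cm")
```

The simplified calculation is not expected to match the values of about 7.5 cm and 9 cm read off the coupled nomograms, but it shows the same direction of change: lengthening the tract at the lips moves the intersection to a larger l1, that is, to a more advanced constriction location.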
The vocal tract is longer for /ü/ than for /i/, which effectively shifts the point
of vocal tract constriction toward a more retracted position (measured from the lips)
for /ü/ (Wood 1986). In addition to the effect of rounding, Wood investigated other
articulatory differences between rounded and unrounded vowels. Relevant for this
discussion is the height of the larynx. Wood observed that the larynx is slightly
depressed in the production of front rounded vowels compared to the unrounded
ones. He concluded that this depression is essential to compensate for the lip
rounding: the area of F2 insensitivity to articulatory perturbation for rounded vowels
thus remains in the pre-palatal region. Crucially, however, this area is still more
anterior than for unrounded vowels (Wood 1986: 400).
Fig. 21 shows the nomograms reported by Wood (1986). In addition to
Fig. 20, these plots show continuous advancement of the region of acoustic
insensitivity (proximity of F2 and F3) from the mid-palatal area when the lips are spread
to the pre-palatal area when the lips are rounded. The comparison of the plots on the right
and on the left reveals that lowering the larynx helps in keeping the insensitivity
region for the rounded vowels in the pre-palatal area.
Fig. 21 – Formant resonances for the spread (solid), neutral (dashed),
moderately rounded (dashed) and closely rounded (dotted). The plot on the
right shows the resonances when the larynx is depressed. (Wood 1986: 396).
The relative difference in the location of the region with decreased acoustic
sensitivity for /i/ and /ü/ can be illustrated in Fig. 22 below. The white tube represents
the vocal tract, the gray block represents the area of decreased acoustic sensitivity to
articulatory variation, and the black block represents the canonical location of the
palatal constriction within the (gray) region of decreased acoustic sensitivity. In the
top panel, the tongue body constriction has certain flexibility and may be slightly
retracted while still remaining within the insensitivity region. In contrast, the
extension of the vocal tract due to lip rounding in the bottom panel results in the
advancement of the insensitivity region despite the compensation at the larynx. As a
result of this advancement, the potential retraction of the tongue body within that
region is minimal. Due to these factors, the tongue body for /ü/ can be retracted
less than the tongue body for /i/ before the vowel leaves the insensitivity region.
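[Figure: two schematic vocal tracts between the lips and the larynx, each marking the tongue body (TB) constriction; top panel: /i/, bottom panel: /ü/.]
Fig. 22 – Illustration of the quantal differences between /i/ and /ü/.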
More support for differences in perceptual consequences of articulatory
retraction comes from several observations in the literature on Finnish palatal
harmony.64 As discussed in Section 3.2, there are a limited number of studies dealing
with the phonetics of transparent vowels. At least two studies report acoustic
differences between the transparent vowels in the front and back context (with
varying significance levels) (Gordon 1999, Välimaa-Blum 1999). The mean F2 or F2-F1 value was lower for the transparent vowels in the back context. Interestingly, I am
not aware of any reports where these retracted vowels would be perceived differently
64 I assume that Finnish palatal harmony is similar to Hungarian in the crucial aspects
of transparency.
from non-retracted variants found in the front context. But there are reports that front
rounded vowels in a back harmonic context are perceived either as back vowels
(Campbell 1980), or as vowels with an intermediate value of backness [u] (Wiik
1995, Välimaa-Blum 1999). For example, front rounded [y] in “… olympialaset
‘Olympics’ is often perceived as /u/ … olumpialaset, …, Söul is seldom Söul but
Soul, or Söyl” (Välimaa-Blum 1999: 248). These data can be explained by assuming
that the front rounded vowels in a back harmonic context are retracted to some degree
but, due to the advancement of the region of acoustic insensitivity for these vowels
described in Fig. 22, this retraction falls outside this region and results in vowels that
are not perceived as front anymore.
Fig. 23 expresses the quantal difference between front unrounded and rounded
vowels, following the notion of non-linearity described in Fig. 14 and Fig. 18 above.
The solid curve corresponds to the quantal properties of the front unrounded vowels
whereas the dashed one corresponds to the properties of front rounded vowels.
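[Figure: two S-shaped curves relating tongue body CL (x-axis, front to back) to acoustic quality (y-axis, front to back), with regions I, II, and III; dashed curve: rounded, solid curve: unrounded.]
Fig. 23 – Quantal differences between unrounded and rounded front vowels.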
Suppose that the horizontal tongue body location for both rounded and unrounded
front vowels is illustrated with the filled black ball. Following the quantal curve for
the unrounded vowels, we see that they may be retracted to some degree (gray filled
ball) without significantly affecting their perception as front. However, the behavior
of front rounded vowels is different. We see that if the ball follows the curve for the
rounded vowels, these vowels can be retracted less than the unrounded vowels
(empty black ball). If the rounded vowel is retracted to the same degree as the
unrounded vowel (empty gray ball), it loses its perceptual identity.
The second generalization observed in Chapter 2 was that higher front
unrounded vowels are more transparent than lower vowels. There is converging
evidence that the size of the region of acoustic insensitivity to the horizontal
perturbation of the tongue body decreases as the tongue lowers. Firstly, the studies by
Stevens (1989) and Wood (1979) showed that increased acoustic stability related to
insensitivity to articulatory retraction of the tongue in the palatal region only applies
to front non-low unrounded vowels. Moreover, the vowel space becomes increasingly
crowded in the horizontal dimension as the vowel height decreases, which is
traditionally depicted by the trapezoid shape of the vowel quadrilateral.
Secondly, the study by Välimaa-Blum (1999) on transparency in Finnish
vowel harmony claims that there are three possible low unrounded vowels in suffixes
following loanword disharmonic stems. Apart from the usual front /æ/ and back /ɑ/,
there is a third, ‘intermediate’ vowel of a quality close to /a/. For example, dynamiitti
can take one of the three suffixes [la], [læ], or [lɑ]. This difference is readily
perceptible and supported by an acoustic analysis. The acoustic data show that the
difference between [a] and the ‘regular’ front and back vowels [æ] and [ɑ] is
significant at least for some formant measurements. The author claims that neither
rounding nor vowel height was responsible for the differences. Hence, the acoustic
differences must result from the variation in tongue body backness. The ‘medial’ [a]
is perceived as neither a clear front nor a clear back vowel and never occurs in
suffixes following harmonic stems.65
65 Välimaa-Blum speculated that Finnish speakers fail to reach the intended target for
the suffix vowel because disharmonic stems violate basic native phonotactics.
The relevant observation for this discussion is that minor deviations from the
intended horizontal articulatory target for the low vowels are transformed to readily
perceivable differences in vowel quality. Hence, the acoustic stability of low front
vowels in terms of the horizontal tongue position is decreased compared to the non-low vowels.
Thirdly, some Hungarian speakers intuitively perceive that low /e/ in
vacillating stems sounds different depending on the quality of the suffix vowel (E.
Sólyom, p.c.). For example, /e/ in hotel-ban ‘sounds different’ from /e/ in hotel-ben.
This difference in perception is not reported for other transparent vowels, for example
/i/ in bili-vel is not reported as sounding different from /i/ in buli-val.66
To sum up the converging evidence, the difference between high and low
front vowels in terms of their quantal properties can be characterized as in Fig. 24.
The gray area of decreased acoustic sensitivity to articulatory perturbations in the
horizontal dimension is shrunk for the low front vowels compared to high vowels. As
a result, the potential tongue body retraction is more limited in the low vowels than in
the high vowels.
66 Speakers who report this intuitive difference have experience with the dialects
where short /e/ has two allophones: high-mid [ë] and low-mid [ɛ]. Hence, this
difference in perception between the two /e/s in hotel-ban and hotel-ben might arise only
from variation in height and not in horizontal retraction. However, height and
retraction are closely linked, as I discussed in this chapter.
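[Figure: two schematic vocal tracts between the lips and the larynx, each marking the tongue body (TB) constriction; one panel for [i], one for [ɛ].]
Fig. 24 – Illustration of the quantal differences between [i] and [ɛ].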
To conclude this section, I argued that the differences in the phonological
behavior of front vowels in palatal harmony can be linked to their quantal properties.
High front unrounded vowels are thus assumed to be transparent because the
observed tongue body retraction in the back harmonic environment does not result in
significant changes in the acoustic output. In contrast, the availability of this
retraction for both rounded and low front vowels was shown to be limited. This is due
to the advancement or shrinking of the region of decreased acoustic sensitivity to
horizontal tongue body perturbations for these vowels.
4.7
Conclusion
This chapter argued that the phonetic and phonological properties of transparent
vowels are intertwined and none of the existing models of the phonetics-phonology
relationship fully capture the generalizations observed in the Hungarian data. The last
section developed an argument that these generalizations might be reducible to the
independent quantal relationship between articulation and perception.
The next chapter proposes a model where phonetic retraction of the tongue
body and suffix selection are inter-dependent. It will be argued that it is possible to
model the relation between the discreteness of phonological form and the continuity
of phonetic substance in which that form is embedded when the language of
nonlinear dynamics is used.
CHAPTER 5
Dynamic Model of Transparency
5.1.
Introduction
This chapter presents a model that links small differences in details of articulation
observed in Chapter 3 to the categorical alternation in suffixes described in Chapter 2.
To formally express this correlation, the model uses the mathematics of non-linear
dynamics (e.g. Percival & Richards 1982, Kelso et al. 1993, Gafos, to appear)
operating over the parameters of dynamic gestural representations (Browman &
Goldstein 1989, 1995, Gafos 2002).
The model consists of two components. The first one models stem-internal
harmony and produces the differences in vowel retraction discussed in Chapter 4.
Building on work by Öhman (1966), Fowler (1983), Ní Chiosáin & Padgett (1997),
and Gafos (1999), it is assumed that vowel gestures across syllables are contiguous
and the consonantal gestures are superimposed on the vowel gestures. Harmony
between stem vowels is construed as blending of adjacent vowel gestures. The
current model of gestural blending (Saltzman & Munhall 1989) is extended by
proposing that the vowel gestures blend only to the extent that this blending allows
perceptual recoverability of the original gestures. Certain front vowel gestures resist
blending with back vowel gestures more than other front vowel gestures do. This is
due to differences in the quantal configuration of individual vowels. More
specifically, the outcome of the blending depends on the degree of acoustic
insensitivity to articulatory retraction of the tongue in the front-back dimension
(Stevens 1989, Wood 1979), as described in Section 4.6 of the previous chapter.
An account of the stem-suffix harmony, i.e. the selection of the [±back]
quality of the suffix vowel, is developed in the second component of the model. This
component crucially relies on the degree of articulatory retraction of the stem-final
vowel gesture. It is proposed that the suffix selection follows from the non-linear
dynamic relationship between the stable order parameter ([±back] of the suffix
vowel) and the continuous control parameter (degree of retraction). Small changes in
the control parameter result in significant changes in the order parameter. Effectively,
non-contrastive differences in the degree of tongue body retraction of the transparent
vowels result in categorical alternations in the suffixes that follow them. This is due
to the fundamental property of nonlinearity that allows us to link changes along
continuous dimensions to categorical alternations.
The chapter is organized as follows. Section 5.2 provides a brief motivation
for developing a model and using dynamics in this endeavor. The rest of that section
is devoted to the description of basic dynamic tools used in the model. The
presentation of the proposed model itself proceeds in three steps. First, Section 5.3
describes gestural representations that constitute the primitives of the proposed
analysis. Then, Section 5.4 develops the model of stem-internal blending. Finally,
Section 5.5 formalizes the relation between the retraction of stem-final vowel and the
form of the suffix. Section 5.6 summarizes the model and reviews the correspondence
between the predictions of the model and the observable facts related to transparency
in Hungarian vowel harmony.
5.2.
Modeling and dynamics
5.2.1. Why model?
In their manifesto of Laboratory Phonology, Pierrehumbert et al. (2001) acknowledge
the need for explicit mathematical modeling: “…successful scientific communities
recognize the value of mathematical formulation and use mathematics to make
precise theoretical predictions” (Pierrehumbert et al. 2001: 4). A mathematical model
is a set of equations or expressions that describe the behavior of the variables
observed in the data.
In cognitive science in general, and in phonology in particular, the leading
modeling approach is based on discrete mathematics. This approach relies on
sequential logical operations over discrete symbols. Under this view, intelligent
behavior can be sufficiently described by computer-like machines that operate with
symbolic representations. Physical phenomena related to cognitive processes such as
neuronal activity or articulator movements, typically described by the continuous
mathematics of calculus, are largely disregarded. Hence, most of the cognitive
science research in the second half of the twentieth century was guided by the
hypothesis that intelligent behavior can be studied separately from its ‘realization’,
‘implementation’, or ‘instantiation’ in the real world (e.g. Chomsky’s competence-performance dichotomy).
However, there is an increasing body of evidence that argues against omitting
continuous phenomena from cognitive models (e.g. van Gelder & Port 1995 for an
argument based on time in cognition, Pierrehumbert et al. 2001 and Gafos (in press)
for arguments from phonology, Guy (to appear) for a review of arguments from
sociolinguistics). This evidence then supports the view that cognition operates both
with discrete and continuous mathematics, and that “the identification of formalism
with [SB: only] discrete formalism is erroneous and is deeply misleading in its
influence on research strategy” (Pierrehumbert et al. 2001: 3).
The approach taken in this dissertation takes the criticism of Pierrehumbert et
al. seriously by attempting to develop a model that accounts for both discrete and
continuous variables of a particular cognitive system. The variables include the
phonemic [±back] quality of the suffix vowel, and the phonetic degree of tongue
body retraction that is dependent on other vowel qualities such as height and lip
rounding as described in Chapter 4.
The aim of the modeling process is thus to find a mathematical formalism that
exhibits similar broad correlations between variables as those observed between the
suffix form and the tongue body retraction. If this enterprise is successful, then some
insight into the process of Hungarian vowel harmony is gained. More importantly, the
model provides testable predictions relating to the cognitive nature of the relationship
between phonemic and phonetic variables.
In this sense, a model is primarily a research tool. It allows us to test
hypotheses about the relevance of various parameters to the observed behavior.
Moreover, explicit models help uncover underlying principles of the studied
phenomena by forcing us to formulate mathematically coherent hypotheses about
these principles. Once this is done, further areas and new issues become apparent to
the researcher, which guide him or her to more empirical work that leads to
refinements that in turn open further areas for testing (see Kosslyn 1978 for
discussion of modeling, its motivation and scientific value).
5.2.2. Why dynamics?
As argued in Chapter 4, the pattern of suffix selection in Hungarian vowel harmony
involves interaction between high-dimensional phonetic parameters such as tongue
body retraction, and low-dimensional phonological parameters such as [±back]. Note
the crucial concepts of pattern and interaction in the previous sentence. Pattern
formation is the object of scientific inquiry not only in linguistics but also in biology,
ecology, and other disciplines studying complex behavior. Interaction between
various forces is the basic underlying principle of nature, and is studied for example
in chemistry or physics. The formal language capable of describing pattern formation
in real time through interaction of various parameters is dynamics. As pointed out by
van Gelder & Port:
“… dynamics […] happens to be the single most widely used, most powerful,
most successful, most thoroughly developed and understood descriptive
framework in all of natural science. It is used to explain and predict
phenomena as diverse as subatomic motions and solar systems, neurons and
747s, fluid flow and ecosystems. Why not use it to describe cognitive
processes as well?” (van Gelder & Port 1995: 4).
There are several advantages to approaching cognitive processes dynamically.
First, the abstract nature of the dynamic formalism has a potential for broad
application (as noted in the quote from van Gelder & Port), which may provide a
vehicle for the unification of thinking and methodologies across a broad range of
disciplines. Second, dynamics provides us with explicit mathematical tools applicable
to issues that are typically confronted in linguistics and other cognitive sciences. For
example, the notion of the stability of a system is well understood and many
successful models in economics (demand-supply-price model), ecology (predator-prey model), or physics (laser model) are based on the mathematical expression of
dynamic stability. As an example from cognitive science, consider visual and
auditory perception. Color or sound categories are stable over variations in physical
properties such as wavelength or formant frequency. Dynamics provides us with
mathematical tools to account for this stability.
A dynamic model can be generally characterized as quantitative or qualitative.
A quantitative model captures the observed phenomenon by attempting to provide an
accurate match between the numerical data produced by the model and those
observed experimentally. A qualitative model improves our understanding of a
particular phenomenon by providing a mathematically well-defined framework that
exhibits the behavior that is qualitatively similar to that observed in the experimental
data. An exact fitting of the observed data with the numerical output of the model is
not crucial in a qualitative model.
The dynamic model developed in this chapter is a qualitative model whose
aim is to guide the effort to understand the underlying principles of the relationship
between ‘cognitive’ phonology and ‘physical’ phonetics. More specifically in this
dissertation, ‘cognitive’ corresponds to the pattern of selecting either front or back
vowel for suffixes in Hungarian, and ‘physical’ corresponds to the position of the
tongue body when Hungarian speakers produce front vowels. The relationship
between these two aspects of human speech is thus a particular instantiation of the
interdependence of the mind-body problem.67 Dynamics provides a specific paradigm
67
For a comprehensive review of philosophical foundations of the mind-body
problem, see Kim (2000).
for approaching this dichotomy that was aptly described in van Gelder & Port (1995):
“The cognitive system does not interact with other aspects of the world by passing
messages or commands; rather, it continuously coevolves with them” (van Gelder &
Port 1995: 3).
5.2.3. Static vs. dynamic approach: example
Dynamics studies the behavior of variables such as temperature, population size,
radioactive decay, or employment in real time. The difference between a static and a
dynamic approach to a problem together with some basic mathematical concepts used
in dynamics can be illustrated with the following example from economics (Medio &
Lines 2001). Take a basic model of supplied (S) and demanded (D) quantities of a
single product defined as a function of its price (p). It is assumed that the demand
function D(p) decreases with price (p) because people wish to buy less if the price is
high. In contrast, the supply function S(p) increases because producers wish to supply
more if the price is high. The simplest way to formalize D(p) and S(p) as linear
functions is in (23) where a, b, m, and s are positive constants, and only non-negative
values of D, S, and p are considered.
(23)
D(p) = a – bp
S(p) = –m + sp
The economic equilibrium condition requires that demand equals supply, shown in
(24).
(24)
D(p) = S(p)
a – bp = –m + sp
The static solution to this problem involves solving the equation in (24) for pe,
equilibrium price, shown in (25).
(25)
pe = (a + m)/(b + s)
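As a minimal numerical sketch of the static solution in (23) to (25), the following Python fragment computes the equilibrium price and confirms that demand and supply coincide there. The function names and the constant values are illustrative assumptions and are not taken from Medio & Lines (2001).

```python
def demand(p, a, b):
    """Linear demand D(p) = a - b*p from (23)."""
    return a - b * p

def supply(p, m, s):
    """Linear supply S(p) = -m + s*p from (23)."""
    return -m + s * p

def equilibrium_price(a, b, m, s):
    """Static solution (25): the price at which D(p) = S(p)."""
    return (a + m) / (b + s)

# Hypothetical positive constants, chosen only for illustration.
a, b, m, s = 10.0, 1.0, 2.0, 2.0
p_e = equilibrium_price(a, b, m, s)
print(p_e, demand(p_e, a, b), supply(p_e, m, s))
```

The sketch returns a single number and, like the static model itself, says nothing about prices away from equilibrium.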
Fig. 25 illustrates the static system of demand and supply as a function of price.
Fig. 25 – Static model of balance between demand and supply (from Medio &
Lines 2001: 2).
Although the static model gives us the equilibrium price (pe), it does not tell us
anything about the behavior of the system when the price is different from the
equilibrium price. More importantly, such a static model does not allow us to make
any prediction as to the future behavior of the model that was just sketched.
Consequently, it does not fulfill one of the basic requirements for a good model, to
generate testable predictions.
In order to see how the system behaves for any arbitrary initial conditions, we
must construct a dynamic model that allows us to express the assumption that, over
time, price increases or decreases in proportion to the excess of demand over supply.
This assumption can be expressed mathematically as in (26) where p(t) is a price at
time t and h is an interval of time.68
(26)
p(t + h) = p(t) + h(D[p(t)] – S[p(t)])
Separating the excess demand term and dividing by h we get (27).
(27)
(p(t + h) – p(t))/h = D[p(t)] – S[p(t)]
To express the continuous nature of the process over time, the interval of time h is
taken to zero (h → 0). The definition of the derivative thus allows us to write (27) as
the differential equation in (28).
(28)
dp(t)/dt = D[p(t)] – S[p(t)]
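Before the limit h → 0 is taken, the update rule in (26) can be iterated directly. The sketch below does this for two arbitrary initial prices; the constants, the step size h, and the function name are assumptions made for illustration only.

```python
def simulate_price(p0, a, b, m, s, h=0.01, steps=2000):
    """Iterate p(t + h) = p(t) + h * (D[p(t)] - S[p(t)]), i.e. equation (26)."""
    p = p0
    path = [p]
    for _ in range(steps):
        excess_demand = (a - b * p) - (-m + s * p)
        p = p + h * excess_demand
        path.append(p)
    return path

# Hypothetical constants; the static equilibrium is (a + m)/(b + s) = 4.
a, b, m, s = 10.0, 1.0, 2.0, 2.0
from_below = simulate_price(1.0, a, b, m, s)
from_above = simulate_price(9.0, a, b, m, s)
print(from_below[-1], from_above[-1])
```

With positive b and s, both runs approach the same price, which is the kind of statement about behavior away from equilibrium that the static model cannot make.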
68
The situation when supply exceeds demand (S – D) would be modeled similarly
and ultimately result in the system qualitatively identical to that described in the text.
In other words, excess demand in (26) can be a positive or negative quantity.
The equation in (28) is an ordinary differential equation (ODE) because p is a
function of a single independent variable, time. The equation is also of the first order
because it contains no derivatives higher than the first. Simplifying this and replacing D and S based
on (23), we get equation (29).
(29)
$dp/dt = \dot{p} = D(p) - S(p) = (a + m) - p(b + s)$
The quantitative solution to this dynamic equation involves finding a function
of time p(t) such that (29) is satisfied for an arbitrary initial condition p(0). Following
standard steps (e.g. Medio & Lines 2001, Tu 1992), the quantitative solution can be
abbreviated as in (30). In order to solve the homogeneous equation, the method of
integration is used. After taking the antilogarithm of both sides and setting $e^{A} = C$, we
get: $p(t) = C e^{-(b + s)t}$. In a particular solution of the non-homogeneous case, the right-hand side, (a + m), is a constant, hence for p = k where k is a constant, $\dot{p} = 0$. The
solution to the static problem (p = const.) is the same equilibrium price as obtained in
(25) above. The complete solution to (30) is a sum of the particular solution and the
solution to the homogeneous equation.
(30)
$\dot{p} = (a + m) - p(b + s)$
$\dot{p} + p(b + s) = a + m$

Homogeneous equation ($a + m = 0$):
$\dot{p} = -p(b + s)$
$\int dp/p = -(b + s) \int dt$
$\ln p(t) - \ln p_0 = -(b + s)t$
$\ln p(t) - A = -(b + s)t$
$e^{\ln p(t)} = e^{-(b + s)t} e^{A}$
$p(t) = C e^{-(b + s)t}$

Non-homogeneous equation ($a + m$ is constant; if $p = k$ (const.), then $\dot{p} = 0$):
$k(b + s) = a + m$
$k = p_e = (a + m)/(b + s)$

Complete solution:
$p(t) = p_e + C e^{-(b + s)t}$
$p(0) = p_0 = p_e + C$
$C = p_0 - p_e$
$p(t) = p_e + (p_0 - p_e) e^{-(b + s)t}$
This solution can be interpreted as the sum of the equilibrium price pe and the
difference between the initial price p0 and pe increased or decreased by the term
$e^{-(b+s)t}$.
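The closed-form solution can be checked numerically. The sketch below, under hypothetical constants, evaluates p(t) and compares a finite-difference estimate of its time derivative with the right-hand side of (29); the function names are illustrative.

```python
import math

def price(t, p0, a, b, m, s):
    """Closed-form solution p(t) = pe + (p0 - pe) * exp(-(b + s) * t)."""
    p_e = (a + m) / (b + s)
    return p_e + (p0 - p_e) * math.exp(-(b + s) * t)

def rhs(p, a, b, m, s):
    """Right-hand side of (29): (a + m) - p * (b + s)."""
    return (a + m) - p * (b + s)

# Hypothetical constants and initial price.
a, b, m, s, p0 = 10.0, 1.0, 2.0, 2.0, 1.0
t, eps = 0.5, 1e-6
slope = (price(t + eps, p0, a, b, m, s) - price(t - eps, p0, a, b, m, s)) / (2 * eps)
print(slope, rhs(price(t, p0, a, b, m, s), a, b, m, s))
```

The two printed numbers agree up to the accuracy of the finite difference, and p(0) reproduces the initial price.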
Depending on the value of (b + s), the final equation in (30) represents two
types of behavior that are sketched in Fig. 26. If (b + s) < 0 as in the left hand plot,
$e^{-(b+s)t}$ becomes increasingly greater with increasing time t, and the deviations from
the equilibrium price become infinitely large. Consequently, the actual price would
explode and become increasingly distant from the equilibrium price. If, on the other
hand, (b + s) > 0 as in the right hand plot, $e^{-(b+s)t}$ becomes increasingly smaller with
increasing time t, which results in deviations from the equilibrium price tending
asymptotically to zero. As a result, the actual price becomes increasingly similar to
the equilibrium price.
Fig. 26 – Dynamic model of balance between demand and supply.
5.2.4. Geometric description of a dynamic system
In addition to the quantitative solution sketched in (30), dynamics offers tools for
qualitative solutions when the primary interest is in the structure and stability of a
dynamic system (Tu 1992, Percival & Richards 1982). The qualitative solution is
obtained by use of phase flow diagrams, also referred to as phase portraits.
Informally, solving a differential equation for a particular initial condition provides a
single trajectory such as the ones in Fig. 26. However, the same differential equation
also provides a more global view of the dynamic system because it describes the
shape of all possible trajectories irrespective of the initial conditions. A phase flow is
a graphical representation of this global view.
The phase flow of the system dx/dt = f(x) is constructed by taking a set of
values xi and drawing an arrow of length proportional to |f(xi)| near the x-axis; the
center of the arrow is at xi and the arrow points in the direction of increasing or
decreasing x depending on the sign of f(xi). If f(xi) is positive the arrow points to
the right; if it is negative it points to the left (Percival & Richards 1982: 3). These
arrows then represent the velocity of the flow that is represented by the differential
equation. This flow could be imagined as a flow of water and a single trajectory as a
path of a massless particle placed in that flow.
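The construction of a phase flow just described can be sketched in a few lines of Python. For each sample point the fragment returns the direction and the length of the arrow; the function name, the sample points, and the value of α are illustrative assumptions.

```python
def phase_flow(f, xs):
    """For each sample x, return (x, arrow direction, arrow length |f(x)|)."""
    arrows = []
    for x in xs:
        v = f(x)
        if v > 0:
            direction = "right"
        elif v < 0:
            direction = "left"
        else:
            direction = "none"  # zero velocity: the point does not move
        arrows.append((x, direction, abs(v)))
    return arrows

alpha = -1.0  # hypothetical value; try alpha = 1.0 for the opposite flow
for arrow in phase_flow(lambda x: alpha * x, [-2, -1, 0, 1, 2]):
    print(arrow)
```

For negative α the arrows on both sides point toward x = 0; for positive α they point away from it.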
The phase flow of an equation d(x)/dt = αx is shown in Fig. 27 below. This is
the same type of equation as described in the previous section and can be obtained by
setting (a + m) = 0, and –(b + s) = α. The phase flow for α < 0 is on the left, and for α
> 0 is on the right.69
[Figure: two phase-flow panels along the p axis, each with its fixed point at 0]
Fig. 27 – Phase flow showing the velocity field of f(x) = αx. If α < 0, the flow
on the left is obtained, and if α > 0, the flow on the right is obtained.
The value of x for which f(x) = 0 is considered a fixed point because its
velocity vector is zero and therefore the system with that initial state is in the state of
69
This can be verified by substituting either 1 or –1 for (b + s) and then calculating $\dot{p}$
for p = {–2, –1, 0, 1, 2}.
equilibrium and does not change (i.e. the particle placed in that point stays there at all
times). Fixed points are of two types: stable and unstable. Stable fixed points, also
called attractors, are the points xi around which f(x) is a decreasing function of x.
Hence, the states in the vicinity of the attractor move toward it, which is marked by
the arrows of the vector field. In contrast, f(x) around unstable fixed points, also
called repellers, is an increasing function of x. Consequently, the states in the vicinity
of a repeller move away from it.
A geometric description of the system shown in Fig. 28 that will be used in
this chapter is based on the property of autonomous first-order systems where the
velocity field f(x) = dx/dt can be expressed as a negative gradient of a potential V(x)
(Percival & Richards 1982: 5, Kelso et al. 1993: 20). This is shown in (31), and the
value of the potential is obtained via integration, which is shown in (32).70
(31)
$f(x) = -dV(x)/dx$
(32)
$V(x) = V_0 - \int dx' \, f(x')$
Therefore, a linear function f(x) = ax + b that is similar to the one used in the
previous economic example has the potential $V(x) = -(\tfrac{1}{2}ax^2 + bx)$. The function
f(x) can be imagined as a force that moves a particle in the flow, and V(x) is the
70
The constant V0 can be disregarded as it does not affect the qualitative behavior of
the system.
potential producing that force. Fig. 28 shows both the flow and the potential function
for a particular linear function f(x). The balls in the potentials of the rightmost panel
represent a particle in the flow produced by the differential equation and the
movement of these balls in the potential thus represents the qualitative description of
the dynamic system.
Fig. 28 – Velocity fields, flows, and potentials for a linear function f(x) = ax +
b. For the values of the parameters a=0.5 and b = 1, the fixed point is x = –2,
and for a=–0.5 and b = 1, it is x = 2.
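The relation between the linear velocity field and its potential can be verified numerically. The sketch below uses the two parameter settings named in the caption of Fig. 28; the function names and the finite-difference check are illustrative assumptions.

```python
def f(x, a, b):
    """Linear velocity field f(x) = a*x + b."""
    return a * x + b

def V(x, a, b):
    """Potential V(x) = -(a*x**2/2 + b*x), with the constant V0 set to zero."""
    return -(0.5 * a * x**2 + b * x)

def fixed_point(a, b):
    """Solve a*x + b = 0 (requires a != 0)."""
    return -b / a

def force_from_potential(x, a, b, eps=1e-6):
    """Finite-difference estimate of -dV/dx, which should equal f(x)."""
    return -(V(x + eps, a, b) - V(x - eps, a, b)) / (2 * eps)

for a, b in ((0.5, 1.0), (-0.5, 1.0)):
    x_star = fixed_point(a, b)
    print(a, b, x_star, f(1.0, a, b), force_from_potential(1.0, a, b))
```

For a = –0.5 the potential has a minimum at the fixed point, so a ball placed nearby rolls toward it; for a = 0.5 the fixed point sits on a maximum and the ball rolls away.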
Returning to the economic example in Section 5.2.3, the parameter ‘a’ represents the
–(b + s) parameter. If the initial price is different from the equilibrium price the price
is adjusted over time so that its value is progressively closer to the equilibrium price.
This is valid for a < 0, that is, for (b + s) > 0. If a is positive, the price will be increasingly different
from the equilibrium price.
The systems discussed so far are linear dynamic systems. However, many
natural phenomena have been described by non-linear dynamics where the function
f(x) of the differential equation dx/dt = f(x) is a polynomial of degree two or more.
For example, non-linear dynamic systems have been used to describe processes in
physics such as the light emitting from a laser or the motion of fluid in a rectangular
vessel under increasing temperature (Haken 1990). In chemistry, the autocatalytic
reaction called Brusselator (Hanon & Ruth 1997: 77) or the well-known Belousov-Zhabotinsky reaction (e.g. Goodwin 1994) have been modeled as non-linear dynamic
systems. Many biological and ecological phenomena such as the exponential growth
of cells, or the effect of worms on spruce and fir trees (Hanon & Ruth 1997: 303)
have been shown to display behavior that is qualitatively similar to that of non-linear
dynamic systems. Following Fig. 28, a geometric qualitative solution to a non-linear
dynamic system can be illustrated in Fig. 29.
Fig. 29 – Velocity fields, flows, and potentials for a non-linear function
$f(x) = x^3 - x$. From Percival & Richards 1982: 2.
It can be seen that the dynamic system shown in Fig. 29 has three fixed points
that correspond to the roots of the cubic polynomial x = {–1, 0, 1}. The fixed point x
= 0 is an attractor because the function f(x) for a set of points in its vicinity is
decreasing. This is shown as arrows pointing toward this fixed point. In contrast, the
other two fixed points are repellers as the arrows point away from them.
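The fixed points of a cubic velocity field and their stability can be recovered with a simple scan. The sketch below assumes the sign convention f(x) = x^3 – x, under which x = 0 is an attractor as described in the text; with the opposite sign the roles of attractors and repellers are reversed. The root finder (grid scan plus bisection) and all names are illustrative choices.

```python
def f(x):
    """Assumed cubic velocity field; its roots are -1, 0 and 1."""
    return x**3 - x

def find_roots(func, lo, hi, n=400, tol=1e-12):
    """Scan [lo, hi) for zeros or sign changes and refine each by bisection."""
    roots = []
    step = (hi - lo) / n
    for i in range(n):
        left, right = lo + i * step, lo + (i + 1) * step
        if func(left) == 0.0:
            roots.append(left)
        elif func(left) * func(right) < 0:
            while right - left > tol:
                mid = 0.5 * (left + right)
                if func(left) * func(mid) <= 0:
                    right = mid
                else:
                    left = mid
            roots.append(0.5 * (left + right))
    return roots

def classify(func, x_star, eps=1e-6):
    """Attractor if f decreases through the fixed point, repeller if it increases."""
    slope = (func(x_star + eps) - func(x_star - eps)) / (2 * eps)
    return "attractor" if slope < 0 else "repeller" if slope > 0 else "undetermined"

roots = find_roots(f, -2.0, 2.0)
for r in roots:
    print(round(r, 6), classify(f, r))
```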
One of the central notions in any dynamic system is the notion of stability.
Kelso et al. (1993) describe dynamic modeling as follows: “implementing a dynamic
theory means mapping the reproducibly observed states of a system (i.e. those that
occur independent of initial condition) onto attractors of a corresponding dynamic
model” (Kelso et al. 1993: 20). The forthcoming discussion of this section is devoted
to the tools of dynamic modeling that allow for formalizing stable recurring patterns
as well as differences in initial conditions.
The division of fixed points into stable and unstable was already discussed
above. It was mentioned that the stability of a fixed point is determined by the
increase or decrease of f(x) for the values xi in the vicinity of a fixed point xf.71
However, in addition to this binary notion of stability, dynamics allows us also to
express stability continuously. The most intuitive measure of stability involves the
effect of noise. Dynamic systems model real life phenomena that contain, and are
affected by many sub-systems. One way of representing these effects is via stochastic
forces that cause local perturbations that in turn produce deviations from the attractor
state(s) (Kelso et al. 1993: 21). The variance or standard deviation of x around the
fixed point xf thus serves as a measure of stability of that fixed point. This idea is
illustrated in Fig. 30 below. It can be inferred that the system on the left of Fig. 30 is
more stable than the one on the right because the interval of system variation caused
by random perturbations is smaller in the left than in the right graph.
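The link between noise and stability can be illustrated with a small simulation. The sketch below assumes a quadratic potential V(x) = a·x², Gaussian noise of fixed strength, and a simple Euler-Maruyama integration scheme; these choices, the parameter values, and the function name are assumptions made for illustration and are not taken from Kelso et al. (1993).

```python
import math
import random

def simulate_spread(a, noise=1.0, dt=0.01, steps=20000, seed=0):
    """Integrate dx = -dV/dx dt + sqrt(noise) dW for V(x) = a*x**2.

    Returns the standard deviation of x around the attractor at x = 0.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    x = 0.0
    samples = []
    for _ in range(steps):
        x += -2.0 * a * x * dt + math.sqrt(noise * dt) * rng.gauss(0.0, 1.0)
        samples.append(x)
    mean = sum(samples) / len(samples)
    return math.sqrt(sum((v - mean) ** 2 for v in samples) / len(samples))

steep = simulate_spread(a=4.0)     # strongly curved potential
shallow = simulate_spread(a=0.5)   # weakly curved potential
print(steep, shallow)
```

Under the same noise, the state wanders less in the more strongly curved potential, which is the sense in which the spread of x serves as a continuous measure of stability.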
71
Mathematically, this can be determined by performing the Taylor expansion of the
function around the fixed point xf.
Fig. 30 – Measure of stability as the width of the probability distribution.
Potential function (––––) and the probability function (-----) (from Kelso et al.
1993: 23).
Hence, the degree of stability of a dynamic system depends on the curvature
of the potential function around the fixed points. For attractors, greater curvature
corresponds to more stability. The curvature, and the shape of the potential V(x) in
general depends on the value of dynamic parameter(s). For example, the difference
between the two plots in Fig. 30 results from smooth variation of the parameter ‘a’ in
the quadratic term of $V(x) = ax^2$. In some cases, continuous manipulation of such
parameters in one direction results in reaching some critical value, beyond which the
stability of an attractor can be lost. This is shown schematically in Fig. 31.
Fig. 31 – Loss of stable fixed point due to continuous change in a control
parameter. Potential function (–––) and the probability function (-----) (from
Kelso et al. 1993: 23).
Finally, consider the four plots in Fig. 32. It can be seen that a dynamic
system described by the differential equation $dx/dt = f(x) = -4ax^3 - 2bx$, which after
integration produces the potential function $V(x) = ax^4 + bx^2$, can display four
qualitatively different types of behavior: a) the system may have a single attractor and
no repeller, b) a single attractor and two repellers, c) two attractors and one repeller,
and d) a single repeller and no attractor.
Fig. 32 – Effect of changing parameters a,b on the shape of the potential
function $V(x) = ax^4 + bx^2$.
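The four qualitative regimes of this potential follow directly from the signs of a and b. The sketch below lists the fixed points of dx/dt = –4ax³ – 2bx and classifies each by the sign of the curvature of V; the function name and the sample parameter values are illustrative assumptions.

```python
import math

def fixed_points(a, b):
    """Fixed points of dx/dt = -4*a*x**3 - 2*b*x with their stability.

    Stability is read off the curvature V''(x) = 12*a*x**2 + 2*b of the
    potential V(x) = a*x**4 + b*x**2: positive curvature marks an attractor.
    """
    def kind(x):
        curvature = 12.0 * a * x**2 + 2.0 * b
        if curvature > 0:
            return "attractor"
        if curvature < 0:
            return "repeller"
        return "degenerate"

    points = [0.0]
    if a != 0 and -b / (2.0 * a) > 0:
        r = math.sqrt(-b / (2.0 * a))  # nonzero roots of 2*a*x**2 + b = 0
        points = [-r, 0.0, r]
    return [(x, kind(x)) for x in points]

for a, b in ((1.0, 1.0), (-1.0, 1.0), (1.0, -2.0), (-1.0, -1.0)):
    print(a, b, fixed_points(a, b))
```

Varying a and b without changing their signs moves the fixed points and changes the curvature, but leaves the list of attractors and repellers qualitatively the same.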
The changes of the parameters a,b within the four categories, shown with the
dashed potentials, affect the degree of (in)stability of the attractors and repellers, but
crucially, they do not affect the qualitative pattern of the fixed points. Therefore,
Fig. 32 illustrates a fundamental property of structurally stable non-linear dynamic
systems: stability for certain intervals of input parameter values, and a qualitative
change when the input parameter reaches some critical value. Fig. 33 exemplifies this
property by showing the effect of variation in the parameter b on the shape of the
potential $V(x) = ax^4 + bx^2$. It can be seen that the critical value of the parameter b is
zero. This is because the potentials in the first three plots are qualitatively identical,
that is they all have one attractor at x = 0 and no repellers, despite continuously
decreasing values of b = 2, b =1, and b = 0.1 respectively. Crucially, another decrease
in the value of the parameter crosses the critical value b = 0, and the dynamic system
is now characterized with three fixed points: a repeller at x = 0, and two attractors at x
= 1, and x = –1.
The relationship between the parameters x and b shown in Fig. 33 is a crucial
concept utilized in the model developed in this chapter. Informally, parameter x
corresponds to some higher-level, low-dimensional property, and the parameter b to
some lower-level, high-dimensional aspect related to this property. The parameter x
is low-dimensional because it is stable over large intervals of b values (x = 0 for
b ∈ ⟨4, 0), and x = {–1, 1} for b ∈ (0, –4⟩). Yet, in the region around the critical value
b = 0, the variation in the parameter b has a significant effect on the parameter x. In
dynamic terms, the parameter x is called an order parameter, and the parameter b is
called a control parameter.
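The dependence of the order parameter on the control parameter can be sketched by letting the system relax for several values of b. The fragment assumes a = 1, a small positive initial state, and plain Euler integration; these choices and the function name are illustrative.

```python
def settle(b, a=1.0, x_init=0.3, dt=0.01, steps=10000):
    """Euler-integrate dx/dt = -4*a*x**3 - 2*b*x and return the final state."""
    x = x_init
    for _ in range(steps):
        x += dt * (-4.0 * a * x**3 - 2.0 * b * x)
    return x

# Sweep the control parameter b across its critical value 0.
for b in (2.0, 1.0, 0.1, -0.1, -0.5, -2.0):
    print(f"b = {b:+.1f}  ->  x settles near {settle(b):.3f}")
```

For every positive b in the sweep the state returns to x = 0, whereas once b is negative it settles at a nonzero value; a negative initial state would settle at the mirror-image attractor.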
Fig. 33 – Non-linearity of the dynamic system characterized with the potential
$V(x) = ax^4 + bx^2$. Parameter a is kept constant (a = 1) and the critical value of
the parameter b is zero.
To sum up, non-linear dynamics provides methodology, concepts and explicit
formalism for building models that account for a) the stability of higher-level order
parameters despite variation in some lower-level control parameters, and at the same
time b) ability of the control parameter, when it reaches some critical values, to affect
the order parameter. In the remainder of this chapter, this last basic concept will be
applied to the relationship between [±back] forms of the suffix vowel and articulatory
retraction of the stem-final vowel in Hungarian vowel harmony.
5.3.
Gestural representations
The basic units of phonological representation in the proposed dynamic model are the
gestures of Articulatory Phonology (Browman & Goldstein 1986 et seq.). A gesture is
a dynamically defined unit of action that has both spatial and temporal dimensions. In
this section, I will explore each of these characteristics and briefly discuss some of
the crucial aspects of Articulatory Phonology. I will start with a brief description of
the dynamic system used for characterizing gestures. Then I will apply this system in
formalizing vowel gestures and their [±back] feature.
Articulatory gestures are task-oriented. This means that the movement of
articulators is driven by an underlying task to form a constriction somewhere in the
vocal tract. For example, a consonantal gesture for /p/ must achieve the closure
between the upper and the lower lips while simultaneously producing an opening of
the vocal cords. A task of producing a vowel gesture for /i/ involves, in part, the
achievement of a narrow constriction between the tongue body and the hard palate.
The formation of constrictions such as these involves a change in the position of one
or more active articulator(s) over time. Therefore, the task of producing constrictions
in the vocal tract can be modeled using the mathematical theory of dynamics
described in the previous section.
I will first describe the spatial parameters with which this dynamics operates,
and then the equations themselves. I will assume the gestural score model of
Articulatory Phonology (Browman & Goldstein 1995). In this module, the spatial
target of a gesture is characterized by two variables: constriction location (CL) and
constriction degree (CD). CL can take one of eight possible values (labial, dental,
alveolar, postalveolar, palatal, velar, uvular, pharyngeal). CD takes one of five
possible values (closed, critical, mid, narrow, wide). Therefore, the movement of
every active articulator (lips, tongue tip, tongue body, etc.) is specified for a CL and a
CD variable. The schematic description of the tract variables in Articulatory
Phonology is shown in Fig. 34 (Browman & Goldstein 1995: 183).
For example, a gesture defining the alveolar stop [t] is specified for the tongue
tip constriction location (TTCL) ‘alveolar’ and TTCD ‘closed’. The model
additionally assumes CD values for velum and glottis. CL in these articulators is
‘wired in’. In other words, there is only one target location possible to achieve when
the velum is lowered. The vocal tract variables are coupled to prosodic and speech-rate effects, which yields a gestural score. This score serves as input to the task
dynamics module that calculates the time-varying response of the vocal tract
articulators to a set of gestural control structures (Saltzman & Kelso 1987).
Fig. 34 – Tract variables of Articulatory Phonology (Browman & Goldstein
1995: 183).
The dynamic system most commonly used to describe task-oriented
movements of articulators is a damped mass-spring system (Saltzman & Kelso 1987).
This dynamic system then describes the behavior of the parameters CD and CL.
Informally, the gestural targets correspond to the stable values of these parameters.
Each gesture is a unit of action characterized by a discrete target. This is a second-order ‘point-attractor’ system described by the equation in (33).
(33)
$m\,\frac{d^2x}{dt^2} + b\,\frac{dx}{dt} + k(x - x_o) = 0$
In the first term of the equation, m is the mass coefficient and $d^2x/dt^2$ is the
second derivative of x with respect to time. In the second term, b is the damping
(friction) coefficient and dx/dt is the first derivative of x with respect to time. In the
third term, k is the stiffness coefficient, xo is the equilibrium (target) position, and x
represents the displacement of a particular articulatory parameter.72
The dynamic systems discussed so far were examples of a first order system
in which the differential equation contained only first derivative terms. In contrast,
the equation in (33) describes oscillatory motion and contains a second derivative.
However, the equation in (33) can be simplified for the purposes of the proposed
model. This is because to model spatial blending of gestures, the information about
the location of the attractor is sufficient. Consequently, the second derivative term
from the equation in (33) is omitted, which makes it an ordinary first-order
differential equation in (34) that is familiar from the discussion in Section 5.2.3. This
equation describes a gesture as a movement toward the target xo of a spatial
parameter x = {CL, CD} over time.
(34)
$\frac{dx}{dt} = -\frac{k}{b}\,(x - x_o)$
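The behavior described by (34) can be illustrated numerically. The following is a minimal sketch, not part of the original model implementation: it integrates the first-order equation with a simple Euler scheme. The function name simulate_gesture, the values k = 4 and b = 1, the step size, and the start position are illustrative assumptions; only the form of the equation is taken from (34).

```python
def simulate_gesture(x_start, x_target, k=4.0, b=1.0, dt=0.001, n_steps=5000):
    """Euler integration of dx/dt = -(k/b) * (x - x_target).

    All parameter values are hypothetical and chosen only for illustration.
    Returns the list of positions, starting with x_start.
    """
    x = x_start
    trajectory = [x]
    for _ in range(n_steps):
        # Velocity is proportional to the remaining distance to the target.
        x += dt * (-(k / b) * (x - x_target))
        trajectory.append(x)
    return trajectory


# A tract variable starting at a 'back' value (-2) and moving to a 'front' target (2).
trajectory = simulate_gesture(x_start=-2.0, x_target=2.0)
print(trajectory[0], trajectory[-1])
```

Under these assumptions the parameter approaches the target monotonically and without overshoot, which is the qualitative behavior of a point attractor. A larger ratio k/b only makes the approach faster; it does not move the target.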
72
Damping is assumed to be critical so that the articulator does not overshoot the
target position. Also, the task-dynamics model of Articulatory Phonology assumes
that the constrictions are massless (Kelso et al. 1986: 35).
Following the discussion in Section 5.2.3, the movement of an articulator
toward a target can be imagined as a ball moving in a potential landscape V(x). The
general equation of motion in (35) provides the relationship between f(x) and V(x),
which for the function in (34) produces the potential in (36). The noise term can be
neglected for now.
(35)
$\frac{dx}{dt} = f(x) + \text{noise} = -\frac{dV(x)}{dx} + \text{noise}$
(36)
$V(x) = \frac{k}{2b}\,(x - x_o)^2$
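The step from (34) to (36) can be written out explicitly. The derivation below is a short sketch that sets the noise term aside and takes the constant of integration to be zero (an assumption; the constant shifts the potential vertically and does not affect the location of the attractor).

```latex
\begin{align*}
f(x) &= -\frac{k}{b}\,(x - x_o) \\
V(x) &= -\int f(x)\,dx = \frac{k}{2b}\,(x - x_o)^2 \\
-\frac{dV(x)}{dx} &= -\frac{k}{b}\,(x - x_o) = f(x)
\end{align*}
```

The last line confirms that the potential in (36) returns the right-hand side of (34) when substituted into (35).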
This potential then represents the dynamic mechanism that underlies a task-defined gesture. The movement of a ball in that potential represents the actual
movement of an articulator toward a target. Fig. 35 below illustrates such a dynamic
system in which an articulator moves toward the target value of the constriction
location xo as a function of time.
Fig. 35 – Attractor dynamics for constriction location.
In any model of a natural phenomenon, noise is introduced by various low-level microscopic systems implementing the essential variables under modeling. In
the case of articulator movement, noise originates from the neuronal and myo-dynamic systems implementing the movement and can be imagined as a force F(t)
that pushes the ball back and forth randomly. As discussed in Section 5.2.4, the
perturbations caused by F(t) add a stochastic component to the dynamic formalism.
The strength of an attractor, i.e. its stability, can be formally expressed as the standard
deviation of x in the vicinity of the attractor. Given this randomness, the two panels
in Fig. 36 show the potentials with different curvature around the attractor, i.e.
different stability or strength. This difference is brought about by the variation of the
parameter α = k/2b from (36). Informally, α expresses the strength with which a given
gesture imposes its control over the given tract variable. Higher stiffness and less
damping result in a stronger attractor. I will return to the activation intervals in more
detail later in this section.
It can be seen that the two potentials differ in terms of their curvature at their
attractors. Importantly, the value of the attractor does not change (x = 2) and the
target spatial value for both gestures is thus the same. In contrast, the α parameter is
different: in the left panel α = 3 but in the right panel α = 5. Hence, changing the
weight parameter α results in narrowing or widening of the potential V(x). I will refer
to these differences as changes in the strength of an attractor dependent on the value
of the weight α. Given this, a gesture is now defined as $V(x) = \alpha(x - x_o)^2 + F(t)$.73
Fig. 36 – Potentials and probability distributions of the dynamic system
$V(x) = \alpha(x - 2)^2 + F(t)$ as a function of the weight α.
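The link between the weight α and the spread of x around the attractor can be sketched with a small stochastic simulation. This is an illustrative sketch only, assuming Gaussian white noise of amplitude sigma = 1 and an Euler-Maruyama discretization; neither the noise model, the function names, nor the numerical settings come from the text. The values α = 3 and α = 5 and the target x = 2 follow the two panels described above.

```python
import math
import random


def simulate_noisy_gesture(alpha, x_target=2.0, sigma=1.0, dt=0.01,
                           n_steps=50000, seed=0):
    """Euler-Maruyama simulation of dx = -dV/dx dt + sigma dW,
    with V(x) = alpha * (x - x_target)**2.

    sigma, dt, n_steps and seed are hypothetical illustration values.
    """
    rng = random.Random(seed)
    x = x_target
    samples = []
    for _ in range(n_steps):
        drift = -2.0 * alpha * (x - x_target)   # -dV/dx
        x += drift * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


def std(values):
    """Population standard deviation."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


std_weak = std(simulate_noisy_gesture(alpha=3.0))     # wider potential
std_strong = std(simulate_noisy_gesture(alpha=5.0))   # narrower potential
print(std_weak, std_strong)
```

Under these assumptions the standard deviation is smaller for the larger α, in line with the definition of attractor strength given above. In the continuous-time limit of this noise model, the stationary standard deviation is sigma / (2 * sqrt(alpha)), so the simulated values should lie close to that.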
Here I present a brief summary of the dynamic notions that will be used in
defining articulatory gestures. Each gesture is task-oriented and specified for an
articulatory parameter x and the target value of that parameter xo. The movement of
an articulator toward this target is described by the motion of a ball in a potential
$V(x) = \alpha(x - x_o)^2 + F(t)$ where α is the weight of a gesture and F(t) is the noise. This is
a general description of articulatory dynamics applicable to all gestures discussed in
this chapter.
73
The weight parameter must be positive (α > 0), otherwise the attractor of the
function would change to a repeller, an unstable point.
The gestural theory of phonological representations described above is
exemplified in Fig. 37. The left panel of Fig. 37 shows the kinematic trajectories of
the receivers placed on the upper lip (UL), lower lip (LL), tongue body (TB), and
tongue dorsum (TD) during a production of ‘bib’ in Hungarian.74 In this plot, the
dashed lines show vertical and the solid lines horizontal movement. The x-axis is
time and the y-axes represent a particular spatial dimension: the higher the value, the
more advanced (solid) and elevated (dashed) is the position of the receiver.
The solid ovals represent the bilabial gestures for /b/. Vertical raising of the
lower lip and to a lesser extent lowering of the upper lip can be observed. This
coordinative movement corresponds to the tract variable value for lip aperture (LA)
‘closed’. Therefore, this consonantal gesture can be specified as
$V(x) = \alpha(x - x_o)^2 + F(t)$ where x is lip aperture and xo equals zero (no distance
between lips). The potential on the right illustrates V(x).
The dotted oval shows the movement of the tongue body to achieve the
palatal narrow constriction required for /i/. Both raising and advancement of the
tongue body can be observed. Hence, this vowel gesture can be specified with two
potentials where x corresponds to TBCL and TBCD respectively. These potentials are
shown on the bottom right with arbitrary values xo = 2 for ‘TBCL = palatal’, and xo =
1 for ‘TBCD = narrow’ respectively.
74
These data were collected with the EMMA technique described in detail in Chapter 3.
Fig. 37 – Left: kinematic trajectories of the vertical (dashed) and horizontal
(solid) movement of lips and tongue body during ‘bib’. The solid ovals point
to the bilabial closure for /b/ and the dotted oval to the tongue advancement
and raising for /i/. Right: dynamic specifications that underlie the kinematic
movements.
In the case of palatal vowel harmony, the relevant parameter whose variation
we wish to express dynamically is the [±back] specification for vowels. This feature
is expressed in Articulatory Phonology with the parameter of the Tongue Body
Constriction Location (TBCL). For example, /i/ has TBCL = ‘palatal’ whereas /a/ has
TBCL = ‘uvular’. Although the model of Browman & Goldstein uses labels such as
‘palatal’ and ‘uvular’, the CL parameter is essentially a continuous numeric
parameter representing the vocal tract from the lips to the pharynx. A label such as
‘palatal’ then corresponds to an interval of CL values. In the following, the TBCL
value for [+back] vowels is set arbitrarily at TBCL = –2, and for [–back] vowels at
TBCL = 2.75 The potentials G(x) and F(x) corresponding to these gestures are shown
in Fig. 38. The arbitrariness in setting the values of the parameter is justified because,
as discussed in Section 5.2.1, the model developed in this chapter is a qualitative
model.
Fig. 38 – Dynamic formalism of [±back] vocalic feature.
75
In Browman & Goldstein (1995), the target CL values for the tongue movement are
specified in degrees. For example, 56 degrees corresponds to
‘alveolar’, and 90 degrees to ‘palatal’.
A specific assumption of the model formalized in this chapter is that a single
arbitrary specification of the TBCL for the [+back] vowels covers the three target locations
available for the tongue body (e.g. Wood 1979): TBCL = {velar, uvular, pharyngeal}.
This is a simplification due to the fact that the experimental stimuli did not allow us
to test for the differences among /u/, /o/ and /a/ in their effect on the retraction of the
front vowels in Hungarian. Testing these differences is an area for future research.
At the outset of this section, gestures were defined as spatio-temporal
dynamically defined units of action. So far, the dynamic nature and the spatial aspect
of gestures were discussed. The third characteristic in the definition of a gesture is its
inherently temporal character. Consider the left panel of Fig. 37. Each gesture,
illustrated with an oval, unfolds in time. Given some threshold values, typically
expressed as an interval around two velocity zero crossings, approximate temporal
points when a gesture begins, reaches its target, and then is released can be
identified.76
76
See Nam (2004) for a proposal that the release portion of a gesture is an active
gesture in itself, not just passive return to a neutral position. See Gafos (2002) for a
theory of gestural coordination based on temporal landmarks and its application in the
phonology of Moroccan Arabic, and Hall (2004) for the application of temporal
landmarks in the analysis of vowel intrusion.
There are two functionally different ways in which the temporal character of
gestures is formalized in Articulatory Phonology. The first one is defined at the
interarticulator level (Saltzman & Munhall 1989: 336). At this level, the dynamic
parameters of inertia, stiffness, and damping determine a ‘settling time’ for a gesture.
This is the time required by the specifications of a dynamic system to move from zero
velocity to the spatial target. This temporal domain thus results from the dynamics for
the tract variables such as TBCL, and roughly corresponds to a segmental level.
The second level of time specification is at the inter-gestural coordination
level. At this level, an activation interval is assigned to each gesture. The activation
intervals are step functions of time and determine when the gesture is ‘on’ and ‘off’
(Saltzman 1995).77 During this interval, the gesture actively shapes the movements of
the articulators in the vocal tract. The patterns of relative timing and cohesion among
the activation intervals of various gestures are specified at this level. An example of
the gestural score for the word ‘pan’ is shown in Fig. 39 below.
Fig. 39 – Gestural descriptors with activation intervals and generated
movement of the articulators (Browman & Goldstein 1995: 187).
77
Nam et al. (2004) present an improved version of the task-dynamic model for
speech where the activation intervals are not step functions but ramp functions.
To summarize, the advantage of the gestural theory of representation sketched
in this section is twofold. On the one hand, gestures are dynamic structures that are
sufficiently abstract to capture qualitative phonological generalizations. This is
because gestures are task oriented. Their goal is to achieve some well-defined
macroscopic state, e.g. a lip closure, which could be formalized with attractor
dynamics. At the same time, the actual spatio-temporal realizations of these gestural
goals are necessarily continuous and directly linked to the observable articulatory
characteristics of speech. Therefore, the model of Articulatory Phonology offers
representations with a lawful relationship between macroscopic phonological goals
and microscopic phonetic properties of speech.
In addition, temporal overlap of gestures, formalized with activation intervals
of the gestural score, is a fundamental property of speech. For example, the gesture
for /i/ in the left panel of Fig. 37 starts well before the preceding lip gesture is
released. Similarly, the second lip gesture starts while the tongue body gesture is still
active. Without temporal overlap, the primary function of speech, the communication
of information, would be much less efficient. In the following section, I propose that
gestural overlap plays an important role in the phonological process of vowel
harmony. See Browman & Goldstein (1990) for the analysis of several phonological
processes that is based on the notion of gestural overlap.
5.4. Stem-internal blending
5.4.1. Assumptions
Phonological assimilation has been claimed to have a phonetic basis (e.g. Ohala
1990, Gafos 1999, Bakovic & Wilson 1999, Padgett (to appear)). In line with this
research, it is assumed in this dissertation that vowel harmony arises as a
phonologization of the articulatory patterns of blending between adjacent vowel
gestures. The aim of this sub-section is to motivate this assumption.
Overlap of articulatory gestures is a fundamental and universal characteristic
of speech production. Gestural blending is the formal mechanism that accounts for
this overlap in the gestural approach to speech production. Saltzman and Munhall
(1989) define gestural blending as follows:
“…coproduction occurs whenever the activation of two or more gestures
overlaps partially (or wholly) in time within and/or across tract variables.
Spatial overlap occurs whenever two or more coproduced gestures share some
or all of their articulatory components. In these cases, the influences of the
spatially and temporally overlapping gestures are said to be blended”
(Saltzman & Munhall 1989: 345).
Therefore, there are two necessary requirements for blending. The first one is
the temporal overlap of the activation intervals of at least two gestures. The second
requirement states that two blending gestures share at least some articulatory
components. These two requirements are discussed below.
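The two requirements can be stated as a simple check on pairs of gestures. The sketch below is a deliberate simplification and not part of Saltzman & Munhall's model: it assumes that each gesture controls a single tract variable, that activation intervals are step functions with an onset and an offset, and that sharing articulatory components can be approximated by controlling the same tract variable. The class Gesture, the function names, and the time values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Gesture:
    """Hypothetical, simplified gesture: one tract variable, one activation interval."""
    tract_variable: str   # e.g. "TBCL", "LA"
    onset: float          # time at which the activation interval switches 'on'
    offset: float         # time at which it switches 'off'


def overlap_window(a: Gesture, b: Gesture) -> Optional[Tuple[float, float]]:
    """Return the interval in which both gestures are 'on', or None if there is none."""
    start = max(a.onset, b.onset)
    end = min(a.offset, b.offset)
    return (start, end) if start < end else None


def can_blend(a: Gesture, b: Gesture) -> bool:
    """Both requirements: temporal overlap and a shared tract variable."""
    return a.tract_variable == b.tract_variable and overlap_window(a, b) is not None


# Illustrative values only: two vowel gestures on TBCL and one lip gesture.
v1 = Gesture("TBCL", onset=0.0, offset=0.3)
v2 = Gesture("TBCL", onset=0.2, offset=0.5)
lips = Gesture("LA", onset=0.1, offset=0.25)

print(overlap_window(v1, v2), can_blend(v1, v2), can_blend(v1, lips))
```

In this simplified setting the two vowel gestures qualify for blending, whereas the vowel and lip gestures overlap in time but are merely coproduced, since they do not compete for the same tract variable.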
In the framework of Articulatory Phonology, gestural coproduction is
characterized with a coordination relationship between gestures. It has been shown
that these coordinations exist both within the structures traditionally called segments
as well as between them. The intra-segmental coordinations include oral-velum
coordinations in nasals (Krakow 1989), oral-glottal coordinations relevant for
distinction in voicing (e.g. Munhall and Löfqvist 1992), or tongue tip and tongue
dorsum for laterals (Sproat and Fujimura 1993, Gick 1999). The inter-segmental
coordinations include specific consonant-vowel coordinations within syllables (e.g. c-center effect, Honorof & Browman 1995, Byrd 1995), or coordinations between
consonants (e.g. Gafos 2002, Davidson 2004).
A special case of inter-segmental coordination is the requirement of V-to-V
contiguity. Since Öhman’s (1966) seminal work we know that (at least some) vocalic
gestures in a VCV sequence are contiguous. Öhman found ample evidence for this
claim in acoustic as well as articulatory data involving VCV sequences. In other
words, the consonantal constriction does not preclude the overlap of the vowel
constrictions. In contrast, consonantal gestures in a CVC sequence are not contiguous
in the sense that the consonantal constriction of the first consonant is completely
‘undone’ by the (open) constriction of the following vowel. This result has been
supported by other later studies (e.g. Fowler 1983), and used as a motivation for the
asymmetry between vowel and consonant harmony systems (Gafos 1999).
The studies mentioned above thus provide converging evidence for the first
requirement of blending that vowels in adjacent syllables are temporally overlapped.
This contiguity may be informally illustrated as in Fig. 40. This figure shows the
spatio-temporal evolution of two adjacent vowel gestures and the temporal window
of overlap.
Fig. 40 – Spatio-temporal evolution of two adjacent vowel gestures: time is on
the x-axis and a spatial parameter such as Constriction Location on the y-axis. The
shaded window represents the temporal interval of gestural overlap. The
boxes represent the activation intervals for the TBCL tract variable of the
gestural score module.
The second requirement for blending is that the gestures involved in blending
must control the same articulator or a tract variable (e.g. TTCL, TBCD, etc.). In the
case of palatal vowel harmony as in Hungarian, the relevant articulatory dimension of
the vowel gestures is the constriction location of the tongue body (TBCL). This is
because the actions of the tongue body are the major determinant of the [±back]
quality of vowels (Wood 1979). Based on this, specific settings of the TBCL
parameter result in the constriction in the palatal, velar, uvular, and pharyngeal areas.
The first one corresponds to the traditional [–back] feature while the other three
correspond to the [+back] feature. Hence, in the relevant dimension of [±back]
specification, adjacent vowel gestures share the common tract variable Tongue Body
Constriction Location (TBCL).
It is then assumed that the vowel gestures in a VC(C)V sequence satisfy the
required conditions and undergo blending. However, in Saltzman & Munhall’s
model, blending occurs only within the interval of actual temporal overlap of the
activation intervals. In other words, when a gesture is ‘off’, it does not influence the
result of blending with another gesture that is ‘on’. Under this assumption, in the case
illustrated in Fig. 40, the first gesture would have an effect on the second gesture only
within the interval marked with the shaded window. Yet, the results of the
experiments discussed in Chapter 3 showed that it was the target location for a front
vowel gesture that was affected by the coproduction with a preceding back vowel
gesture.78 This is because the effect of environment was measured at the target
(maximal displacement) of the relevant gestures. Therefore, a model of vowel
harmony should capture the fact that the stem-final front vowel gesture is affected
globally by the environment, and not just in the limited time interval near the onset of
the gesture.
78
As pointed out by L. Goldstein, the reported experiments showed differences in
retraction degree rather than in actual constriction location. The assumed
correspondence between retraction degree and constriction location must be verified
with the use of palate trace data that, unfortunately, were not collected during the
experiments.
Support for the hypothesis that languages with vowel harmony display global
effects of coarticulation while languages without vowel harmony show only local
effects comes from experimental work by Boyce (1988). Boyce studied the activity of
lips during the consonant portion of VC(C)V sequences in Turkish and English.
Turkish is a language with rounding vowel harmony whereas English does not have
this phonological pattern. Boyce found that the rounding gesture in V1CV2 sequences
in which both vowels are round is present during the consonant as well. For example,
lip rounding showed a plateau pattern in sequences like ‘utu’. Furthermore,
continuous rounding was present even when three consonants intervened, as in
ViCCCVi exemplified with the sequence ‘uktlu’. In contrast, the same sequences in
English showed a trough pattern. The lip rounding gradually decreased during the
consonant portion and then increased again for the second vowel.
This result suggests that, in English, the rounding for the two vowels is a
separate event because it is ‘on’ for the first vowel, then turned ‘off’ for the
consonants, and finally turned back ‘on’ for the vowel. In Turkish, however, rounding
is a global gesture that is ‘on’ during the whole portion of the VCV sequence. It is
assumed that in Hungarian, the tongue body target value originating from the initial
vowel is a global gesture active for the duration of the whole word.
Therefore, while overlap between adjacent articulatory gestures is a
fundamental property of speech common in all human languages, languages differ as
to the relevance of this property in their phonological systems. It is hypothesized that
in the languages without harmony, coarticulation patterns do not play a role in
determining the outcome of phonological processes. However, in vowel harmony
systems, articulatory constraints of gestural overlap interact with perceptual factors
and both participate in the phonological system that guides the harmonic alternations
in stems and suffixes. A formalism of the difference between languages with and
without vowel harmony is presented in Chapter 6.
To summarize, this sub-section reviewed evidence that vowel gestures in
adjacent syllables overlap temporally and spatially and that the vowel gesture in the
initial syllable globally influences the target of the following gesture(s). Based on
this, it is assumed that the phonological pattern of vowel harmony is phonetically
motivated in the sense that it arises through phonologization of the overlap between
the gestures of the adjacent vowels. The upshot is that both phonological as well as
phonetic generalizations can be related to a single process of articulatory overlap. In
the model developed in this chapter, this overlap is formalized as blending of adjacent
gestures controlling the movement of the tongue body. At the same time, as will be
modeled in Section 5.5, this blending interacts with other mechanisms of speech
production and perception in determining the suffix forms. In this view, the
conclusion reached in Chapter 4, that the phonetic and phonological patterns linked to
vowel harmony are inter-dependent, follows from a model based on articulatory
overlap.
5.4.2. Dynamic model of stem-internal blending
Consider two vowel gestures A, B in adjacent syllables. Following the discussion in
Section 5.3, these gestures are defined by the potentials VA(x), VB(x), and weight
parameters α,β. The simplest working hypothesis for formalizing blending of these
two gestures is to take the linear combination of the input potentials VA(x)+VB(x) where
both potentials contribute equally to the blended output (α = β).
Let us illustrate this first hypothesis about blending with a specific example
where gesture A corresponds to a front vowel and gesture B to a back vowel.
Following the discussion in the previous section, the [±back] quality of the vowel
gesture is formalized with a point attractor dynamic system $V(x) = (x - x_o)^2$ where x
corresponds to tongue body constriction location (TBCL). Front vowels are specified
with xo = 2, and back vowels with xo = –2. The potentials with
these specifications characterizing a front ($F(x) = (x - 2)^2$) and a back vowel
($G(x) = (x - (-2))^2$) were shown in Fig. 38 above, and are combined into a single plot in
Fig. 41 below. The potential shown with the dashed line represents the blended
gesture after addition of the two potentials αF(x) + βG(x).
Fig. 41 – Blending of two gestures represented with potentials F(x) and G(x)
respectively. The left panel shows the output when the two gestures have
equal weights. In the right panel blending is biased since the gesture with the
front attractor is weighted more than the gesture with the back attractor.
It can be seen that the minimum of the resulting potential is at CL = 0 if α = β.
Hence, the resulting attractor for CL of the blended gesture would be at the midpoint
between FRONT and BACK attractors. However, the experimental results showed that
transparent vowels are retracted only slightly when they are preceded by a back
vowel. In other words, a front vowel adjacent to a back vowel is still a front vowel.
Therefore, the influence of the two blending gestures on the outcome of blending is
not equal. We therefore propose a minimal extension of the blending function where
the two gestures involved in blending have unequal weights. This is shown in the
right panel of Fig. 41 where the gesture for the front vowel is weighted more than that
for the back vowel (α = 3, β = 1). Consequently, the result of blending, the function
αF(x) + βG(x) shown with the black dashed line, has its minimum tilted more toward
the attractor for the front vowel.
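The location of the blended attractor can also be obtained in closed form. The following worked derivation is a sketch that uses the arbitrary targets 2 (front) and –2 (back) adopted above; the symbol $x^{*}$ for the minimum of the blended potential is introduced here only for illustration.

```latex
\begin{align*}
V(x) &= \alpha (x - 2)^2 + \beta (x + 2)^2 \\
\frac{dV}{dx} &= 2\alpha (x - 2) + 2\beta (x + 2) = 0 \\
x^{*} &= \frac{2(\alpha - \beta)}{\alpha + \beta} \\
\alpha = \beta &\;\Rightarrow\; x^{*} = 0, \qquad
\alpha = 3,\ \beta = 1 \;\Rightarrow\; x^{*} = 1
\end{align*}
```

The blended attractor is thus the weighted average of the two input targets, which is why equal weights place it at the midpoint and a heavier front gesture pulls it toward 2.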
In the case of disyllabic stems with a back vowel followed by a front vowel
such as papír, R corresponds to the retraction degree of a stem-final vowel. The
difference between the left and right panels of Fig. 41 results in different values of the
parameter R. This parameter is calculated as the absolute value of the difference
between the attractor of the ‘stronger’ input gesture and the attractor of the gesture
after blending. In the left panel of Fig. 41, the difference between the minimum of the
input αF(x) and the minimum of blended (αF(x) + βG(x)) is R = |2 – 0| = 2. In the
right panel, R = 1. The parameter R serves as the output of the stem-internal
blending.
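The computation of R can be sketched numerically. The code below is an illustration under stated assumptions, not the simulation used for the figures: it finds the minimum of the blended potential by a ternary search (valid because the potential is convex) and takes R as the distance from the front attractor, assuming, as in the examples above, that the front gesture is the stronger input. The function names and the tolerance are hypothetical.

```python
def blended_potential(x, alpha, beta, x_front=2.0, x_back=-2.0):
    """alpha * F(x) + beta * G(x) with quadratic potentials F and G."""
    return alpha * (x - x_front) ** 2 + beta * (x - x_back) ** 2


def blended_attractor(alpha, beta, x_front=2.0, x_back=-2.0, tol=1e-7):
    """Locate the minimum of the blended potential between the two input
    attractors by ternary search."""
    lo, hi = min(x_back, x_front), max(x_back, x_front)
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if (blended_potential(m1, alpha, beta, x_front, x_back)
                < blended_potential(m2, alpha, beta, x_front, x_back)):
            hi = m2   # the minimum lies to the left of m2
        else:
            lo = m1   # the minimum lies to the right of m1
    return (lo + hi) / 2.0


def retraction(alpha, beta, x_front=2.0, x_back=-2.0):
    """R: distance between the front attractor and the blended attractor."""
    return abs(x_front - blended_attractor(alpha, beta, x_front, x_back))


r_equal = retraction(alpha=1.0, beta=1.0)    # equal weights
r_biased = retraction(alpha=3.0, beta=1.0)   # front gesture weighted more
print(r_equal, r_biased)
```

With these settings the two results should come out close to the values R = 2 and R = 1 given in the text for the left and right panels of Fig. 41.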
Next, I explore the model of blending sketched above in more detail. I will
concentrate on the linguistic significance of the parameter q that expresses the ratio
of the weights in adjacent gestures. I will discuss two properties of blending that are
possible to express with q: directional bias and articulatory stability.
First, the simulations of the dynamic blending model V(x) = αF(x) + βG(x)
show that the parameter q = α/β provides a good approximation of the blending
asymmetry. If the ratio of weights is lower than one (q < 1), the gesture represented
by G(x) with the BACK attractor is stronger. The result of such blending is an
‘advanced back gesture’ with a target between the values of CL (0, –2). If, however,
q > 1, the gesture represented by F(x) with the FRONT attractor is stronger and the
result of blending is a ‘retracted front gesture’ with a target between the values of CL
(0, 2).
Therefore, the division of the q values into two intervals, q ∈ (0, 1) and
q ∈ (1, 4), corresponds to what is traditionally called progressive and regressive
coarticulation, respectively. If α > β (i.e. q > 1), the first gesture is ‘stronger’. Hence, the effect of
the weaker second gesture on the first one is limited. The gesture resulting from such
blending would be similar to the first gesture with minor influence from the second
gesture. This is the case of regressive coarticulatory influence from the second
gesture on the first gesture. In the opposite case, α < β (q < 1), the result of blending
would be closer to the specification of the second gesture. The effect of the weaker
first gesture would be limited. The result of blending will be a gesture similar to the
second gesture with minor influence from the first gesture. This corresponds to the
descriptive term of progressive coarticulation.
It has been shown, for example by Öhman (1966), that the effect of both
progressive and regressive coarticulation can be observed in any given V1CV2
sequence. Therefore, the V1 gesture affects the V2 gesture, and at the same time, V2
affects V1. The blending computation provides the output V1 if the input V1 is more
weighted than V2 (α > β => q >1). Conversely, for the computation of the output V2
gesture, the input V2 is more weighted than the input V1 (α < β => q < 1). Hence, both
progressive and regressive coarticulation can be modeled.
Apart from being able to express directional bias in blending, the parameter q
also captures differences in articulatory stability of vowels. Articulatory stability
expresses resistance of gestures to coarticulation from adjacent gestures controlling
the same tract variable. For example, take a back vowel gesture V1 with weight α = 1,
and two front vowel gestures V2 and V3 with weights β = 2 and γ = 3 respectively.
Comparing the two front gestures when adjacent to a back vowel gesture (V1CV2 vs.
V1CV3), V2 is less articulatorily stable and consequently more influenced by the back
gesture than V3. This is because the blending model determines that V2 is more
retracted, i.e. influenced more by the adjacent back gesture, than V3.
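This comparison can be made concrete with a short worked calculation. The sketch below assumes the arbitrary targets 2 (front) and –2 (back) used in this chapter; the symbols $w$ (weight of the front gesture, with the back gesture at weight 1) and $x^{*}$ (blended attractor) are introduced only for this illustration.

```latex
\begin{align*}
V(x) &= w\,(x - 2)^2 + (x + 2)^2, \qquad
\frac{dV}{dx} = 0 \;\Rightarrow\; x^{*} = \frac{2(w - 1)}{w + 1} \\
R &= \lvert 2 - x^{*} \rvert = \frac{4}{w + 1} \\
V_2\ (w = 2): &\quad x^{*} = \tfrac{2}{3}, \quad R = \tfrac{4}{3} \\
V_3\ (w = 3): &\quad x^{*} = 1, \quad R = 1
\end{align*}
```

Under these assumptions R decreases monotonically as the weight of the front gesture grows, so the more heavily weighted V3 is retracted less than V2.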
Therefore, the parameter q governs the degree of retraction R of front vowels
when adjacent to back vowels.79 This lawful relationship between q and R is
important in our modeling because I argued in Chapter 4 that retraction degree is the
key phonetic factor that correlates with the quality of the suffix. In the following, I
examine the differences among the front vowels in the way they undergo blending
when preceded by back vowels. I will formalize these differences with the
articulatory stability parameter q of the gestural blending model. In order to illustrate
this relationship, the [±back] specification of V1 is kept constant, which corresponds
to CL = –2.
79
In addition to q, the other factor that influences the retraction of underlyingly front
V2 gesture in a V1-V2 sequence is the [±back] specification of the V1. If the V1 is a
front vowel, the attractor for the CL value will be close to the value of (+2). Given
that both V1 and V2 are specified as front, the difference between the respective
attractors will be minimal. In that case, irrespective of the value of q, the retraction of
V2 would be minimal since there is no back vowel gesture that would induce this
retraction via blending. In the same way, the model predicts that differences in
backness among the back vowels also affect the retraction of the front vowel. This
prediction will be tested in subsequent experiments.
5.4.2.1. Effect of front vowel rounding
The Hungarian data discussed in Chapter 2 showed that there is a fundamental
difference between unrounded and rounded front vowels with respect to their
phonological behavior in disharmonic stems. It was observed that the unrounded
vowels behave transparently whereas the rounded ones behave opaquely. For
example, the stem papír selects suffixes with back vowels, and parfüm selects
suffixes with front ones.
With respect to tongue body position, Section 4.6 of the previous chapter
presented converging evidence that coarticulation with back vowels affects rounded
and unrounded vowels differently. Stevens (1989), using simple tubes, and Wood
(1979) using natural human vocal tract profiles, showed that the acoustic outputs for
non-low front vowels are insensitive to a limited amount of variation in the horizontal
position of the tongue body. Therefore, /í/ can potentially be retracted to some degree
Rí without losing its perceptual identity.
As argued in Section 4.6.2, however, the acoustic output of front rounded
vowels is very sensitive to even a small amount of articulatory variation if the tongue
body is retracted from the pre-palatal area. Consequently, if /ü/ is subject to blending
with a preceding back vowel and is retracted to a degree comparable to Rí, it would
lose its perceptual identity. Therefore, /ü/ may be retracted only slightly, to a degree
Rü that is smaller than Rí.
What is at stake, then, is the relationship between flexibility and
recoverability of vowel gestures. The vowel gesture for /í/ has high flexibility with
respect to the constriction location because its region of recoverability is quite large.
Hence, even if the tongue body is slightly retracted, it is still recovered as ‘palatal’. In
contrast, the vowel gesture for /ü/ has lower flexibility because the recoverability
region in the palatal area is decreased. If the vowel /ü/ were allowed the same
articulatory flexibility as the vowel /í/, the normally stable system of acoustic-articulatory correspondence would become increasingly unstable. As a result, chances
for recovering the constriction location as ‘palatal’ for /ü/ from the acoustics would
be significantly decreased.
Hence, there are competing requirements on coarticulation (gestural overlap).
On the one hand, overlap should be maximized because it facilitates perception (e.g.
Fowler 1981). On the other hand, it should be limited in those cases when it decreases
the stability of the acoustic-articulatory correspondence, because it diminishes the
chances of successful recovery of the gestures.
In traditional accounts, the opaque behavior of front rounded vowels in vowel
harmony was ascribed to their markedness. They were considered less perceptually
stable, less salient or requiring more articulatory effort; in short, /ü/ was considered
more marked than /í/. In the proposed model, markedness is not an inherent property
of vowels. Rather, adjacency of /ü/ to back vowels coupled with the blending
requirement of harmony makes /ü/ dispreferred because it limits the chances of
successful recovery of the /ü/ gesture.
In the approach taken in this dissertation, differences in the degree of
retraction of stem-final vowels in vowel harmony are linked to the way they blend
articulatorily with preceding vowels. In this blending then, differences in the degree
of articulatory flexibility derive from differences in the parameter q. Fig. 42 shows
how the parameter q affects the result of blending between a back and a front vowel
in our model. The left panel simulates blending of a back vowel with a transparent
vowel, and the right panel shows blending with an opaque vowel. The transparent
vowels are articulatorily flexible (they allow a lower degree of articulatory precision). This
is expressed with a low value of q, which results in a high degree of retraction R. The
opaque vowels are less flexible (require higher articulatory precision); hence, the
value of q is high because the weight of the front attractor is increased. As a result,
the degree of retraction R is smaller for opaque vowels compared to R of transparent
vowels. Hence, Rü < Rí corresponds to qü > qí and the parameter q expresses the
quantal specification of individual vowels.
Fig. 42 – Model of blending of a transparent (left) and an opaque (right) vowel
with a preceding back vowel.
In addition to the difference in flexibility, another difference between the
rounded and unrounded front vowels shown in Fig. 42 is the difference in the strength
of the attractor of the blended gesture. It can be seen that the output gesture has a
stronger attractor for the opaque vowels than for the transparent vowels. Therefore,
the model predicts less articulatory variability for front rounded vowels than for
unrounded ones when preceded by a back vowel (qü > qí) as well as independently of
context (αí < αü).
5.4.2.2. Effect of front vowel height
As discussed in Chapters 2 and 4, in addition to the major phonological
distinction between transparent and opaque vowels, there are minor distinctions
among the members of the set of transparent vowels. Most significantly, short low /e/
behaves less transparently than the other three vowels. This is shown most clearly in
the suffix selection of disyllabic BT stems whereby /e/ triggers vacillation (hotel-nak/nek)
but {/i/, /í/, /é/} trigger back suffixes almost exclusively. It was observed
both cross-linguistically and in the Hungarian data that the lower (and more
retracted) transparent vowels are more likely to select suffixes with front vowels.
Based on this it was argued that the vowel /e/ is less transparent than {i, í, é} and that
the height of the front unrounded vowels has a direct relationship to their phonological
transparency.
The assumption of the dynamic model developed in this chapter is that the
differences in suffix selection correlate with differences in the degree of retraction.
Logically, if Rü < Rí applies to the difference between transparency and opacity, then
Rü<Re<Rí should hold for medially transparent /e/. Let me review the evidence that
supports this intermediate degree of retraction for /e/ in Hungarian.
Unfortunately, the experiments discussed in Chapter 3 did not contain reliable
data to investigate the effect of environment on the production of /e/ in Hungarian.80
However, there is indirect evidence supporting the crucial difference between low
and non-low front vowels in terms of their quantal properties. In the proposed model,
quantal properties of individual vowels (flexibility and recoverability of their
gestures) are directly related to their pattern of blending with back vowels. Less
flexibility and decreased recoverability translate into a lower degree of retraction.
Therefore, evidence for decreased flexibility and recoverability of /e/ compared to the
other three transparent vowels supports a difference in the degree of retraction
Re < Ri, Rí, Ré.
Section 4.6 in Chapter 4 reviewed three types of evidence supporting
decreased flexibility and recoverability of /e/ compared to the other three transparent
vowels. Firstly, the studies by Stevens (1989) and Wood (1979) showed that
increased acoustic stability related to insensitivity to articulatory retraction of the
tongue only applies to front non-low unrounded vowels. As discussed in Chapter 4,
the ultrasound imaging showed that /e/ is notably lower and more retracted compared
to the vowels {i, í, é}, among which only minimal differences in height and backness
can be observed. Secondly, observations from Finnish vowel harmony (Välimaa-Blum 1999)
support the hypothesis that minor deviations from the intended
horizontal articulatory target for the low vowels are transformed to readily
perceivable differences in vowel quality. Thirdly, some Hungarian speakers perceive
/e/ in hotel-ban as different from /e/ in hotel-ben while they do not report this
difference for other transparent vowels {i, í, é}.
80
For the pilot study, a limited set of stimuli with these vowels was constructed.
However, it was not possible to determine reliably the target position in many tokens.
In the follow-up study, the time available for data collection did not permit the inclusion of
more stimuli. Finally, in the experiment designed to test the behavior of /e/, data were
corrupted due to problems with the receivers placed on the tongue and had to be
discarded.
To summarize, there is evidence that the quantal features of low front vowels
are different from those of the non-low ones. More specifically, the acoustic output of
the low front vowels is more sensitive to articulatory perturbations in the horizontal
position of the tongue body than the acoustic output of the non-low vowels.
Therefore, low vowels tolerate less articulatory variation. As a result, it is proposed
that /e/ is less flexible than {/i/, /í/, /é/}, and that blending of a back vowel and /e/
causes a smaller degree of front vowel retraction than the retraction resulting from
blending of a back vowel with any of the three other transparent vowels.
Quantal differences between /i/ and /ü/ were formalized above with the
parameter q = α/β, where α and β represent the weights for the two adjacent gestures
undergoing blending (αF(x) + βG(x)). It was observed that for the input variables
qi < qü, the model of blending a back vowel with a front vowel generates a more
retracted output gesture for unrounded /i/ than for rounded /ü/ (Rü < Rí). This
difference between /i/ and /ü/ is shown in Fig. 43 as the difference between the
leftmost and the rightmost panels. On the left, q = 3, which results in R = 1.0. On the
right, q = 10, which results in R = 0.36. Therefore, the parameter q expresses both the
flexibility and recoverability of front vowels. The unrounded vowel /i/ has a greater
region of recoverability, which makes it more flexible than the rounded vowel /ü/.
Following the formalization of differences between /i/ and /ü/, it is proposed
that low front /e/, when subject to blending with a back vowel, can be retracted by an
intermediate amount due to its different quantal properties from the other two vowels.
The middle panel in Fig. 43 shows that intermediate values of q result in intermediate
values of R (e.g. qe = 5, Re = 0.66).
Fig. 43 – Blending of a back and a front vowel gesture: differences in
articulatory elasticity expressed with the parameter q result in differences in
the degree of retraction R of the front vowel.
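The dependence of R on q can be illustrated with a minimal numerical sketch. It assumes quadratic gesture potentials of the form V(x) = (x – CLo)², a front target at CL = 2, a back target at CL = –2, and β = 1 (so that α = q). The function names and the grid search are illustrative only; this is not the simulation code behind Fig. 43.

```python
def potential(x, center):
    """Quadratic gesture potential V(x) = (x - center)**2 (assumed form)."""
    return (x - center) ** 2


def retraction(q, front_cl=2.0, back_cl=-2.0, step=0.001):
    """Retraction R of a front gesture blended with a preceding back gesture.

    The blend is alpha*V_front + beta*V_back with beta fixed at 1 (assumption),
    so alpha equals q. R is the distance between the minimum of the front
    potential alone (front_cl) and the minimum of the blended potential,
    located here by a simple grid search between the two targets.
    """
    alpha, beta = q, 1.0
    n = int(round((front_cl - back_cl) / step))
    grid = [back_cl + i * step for i in range(n + 1)]
    blend_min = min(
        grid,
        key=lambda x: alpha * potential(x, front_cl) + beta * potential(x, back_cl),
    )
    return abs(blend_min - front_cl)


for q in (3, 5, 10):
    print(f"q = {q:>2}: R = {retraction(q):.2f}")
```

Under these assumptions the sketch returns values that agree with those reported for the three panels (R = 1.0, 0.66 and 0.36) to within rounding.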
Fig. 43 thus shows how the variation in q results in different degrees of
retraction R. These graphs also show the output of the first component of our
dynamic model. It can be seen that in terms of the tongue body constriction location
(CL), the proposed model of blending produces various degrees of retraction R of a
front vowel gesture when it blends with a preceding back vowel gesture. These
differences in R stem from differences in quantal properties of front vowels.
5.4.2.3. Effect of front vowel advancement
As observed in Chapter 4, the frontness of the transparent vowels in the front
environment correlates with the observed degree of retraction in the back
environment. The vowel with the most advanced position in the front environment is
retracted the most in the back environment. Fig. 44 shows that the proposed dynamic
model predicts this fact.
The left panel models how the value of q = 3 results in the degree of retraction
R = 1 for the front vowel gesture. Crucially, the horizontal target for the input front
vowel gesture is at CL = 2. In contrast, the attractor of the front gesture in the right
panel is shifted slightly to the right (CL = 2.2 > 2). Therefore, this gesture is slightly
more advanced than the front vowel gesture in the left panel. Consequently, this
change in the value of the CL results in a small increase in the value of R
(R = 1.05 > 1.0).
Fig. 44 – Modeling different retraction degree as a result of increased
frontness of the front vowel.
The graphs in Fig. 44 show that changes in the retraction degree R of the front
vowel gesture can be brought about by variation in the position of the attractor for
that front vowel. More specifically, the more advanced the front vowel is, the more it
is retracted when adjacent to a back vowel.81 Thus, the variation in the retraction
degree of stem-final transparent vowels when preceded by a back vowel derives from
two sources: the index of their articulatory flexibility expressed as the parameter q,
and their horizontal target position expressed as the CL parameter.
81
This applies under the assumption that all else is equal, namely q and the dynamic
specification for the back gesture.
To summarize the effects of the parameters q and CL on the retraction degree
R, consider Table 29 below. In the table, the increase in the value of q (shown in
columns) results in a decreased value of R for any value of CL. On the other hand,
for any fixed value of q, a decrease in the value of the CL attractor of the front vowel
gesture (shown in rows) also results in a decreased value of R.82
Constriction                     q
Location (CL)      2.5     3       5       10
2.4                1.26    1.1     0.73    0.40
2.2                1.2     1.05    0.70    0.38
2                  1.14    1       0.66    0.36
1.8                1.09    0.95    0.63    0.35
1.6                1.03    0.9     0.60    0.33
Table 29 – Degree of retraction as a function of variation in quantal features
(q) and input value of Constriction Location (CL) of the front gesture
In short, this table shows the prediction of our model with respect to the
degree of retraction of the transparent vowels when preceded by a back vowel. The
model predicts that both the quantal properties of the transparent vowels (formalized
with q values) as well as the differences in their articulatory frontness (formalized as
target CL values) affect the degree of retraction.83
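Both trends in Table 29 follow from a short derivation, sketched here under the assumption that the two gestures have quadratic potentials, that the back gesture has its attractor at –m, and that the front gesture has its attractor at CL_F > –m. The symbols CL_F and x* are introduced only for this illustration.

```latex
\begin{align*}
\frac{d}{dx}\Bigl[\alpha\,(x - CL_F)^2 + \beta\,(x + m)^2\Bigr] = 0
  \;&\Longrightarrow\;
  x^{*} = \frac{\alpha\,CL_F - \beta\,m}{\alpha + \beta}
        = \frac{q\,CL_F - m}{q + 1},
  \qquad q = \frac{\alpha}{\beta}, \\[4pt]
R = \bigl\lvert x^{*} - CL_F \bigr\rvert
  \;&=\; \frac{CL_F + m}{q + 1}.
\end{align*}
```

On this reading, R decreases as q grows and increases as the front target CL_F moves forward. With m = 2, the expression gives R = 1 for CL_F = 2 and q = 3, and it agrees with every entry of Table 29 to within 0.01.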
82
The CL value of the first back gesture, which is another possible source for the
variation in retraction, is kept constant.
83
It is not clear what the precise relationship between quantal properties and
frontness is. It is possible that the two are tightly linked so that more front vowels
always exhibit less articulatory stability for the purposes of retraction. The table only
shows that our model is capable of deriving the generalization observed in our data. I
leave the question of the exact relationship between frontness and quantal properties of
vowels for future research.
I summarize the first component of our dynamic model – blending of adjacent
vowel gestures – in (37) below. In the left column, I list the experimentally
observable properties of the transparent vowels. In the right column, I summarize the
dynamic formalism of these properties.
(37) Summary of the blending model

Speech observables                         Formalism in Dynamics
Phonological category [± back]             Vi(CL) = (CL – CLo)²
  [–back]                                  CLo = m
  [+back]                                  CLo = –m
V-to-V coarticulation                      Blending of adjacent vowel gestures: αVi(x) + βVj(x)
Flexibility & Recoverability of gesture    q = α/β, q > 0
Retraction of the stem-final vowel         Ri = |argmin(αVi(x) + βVj(x)) – argmin(αVi(x))|

In the first three rows, I summarize the proposal that articulatory gestures are
formalized with attractor dynamics. Phonological backness corresponds to the
Tongue Body Constriction Location parameter (CL). The stable values of this
parameter in the dynamic system formalized in the first row correspond to [± back]
featural specification.84
We have known since Öhman (1966) that adjacent vowels influence each other.
This fundamental property of speech is implemented in our model as linear addition,
blending, of the two adjacent vowel gestures. This is shown in the fourth row.
Differences in quantal features observed in the studies by Stevens and Wood
enter into our dynamic model as the ratio of the weights of the adjacent gestures. This
is expressed by the parameter q of the blending process.
The final row shows the output of the first component of the model expressed
as the degree of retraction of the stem final vowel. This is computed in the model as
the absolute value of the difference between the x-values of the minima from the
potential Vi(x) of the stronger gesture, and the potential αVi(x)+βVj(x) of the gesture
after blending.
5.5. Model of suffix selection
In this section, I present the second component of the dynamic model and show how
the variation of the continuous control parameter R results in the front or back quality
of the suffix vowel.
84
As mentioned in Section 5.4.1, a single value of CL corresponding to all back
vowels (/u/, /o/, and /a/) is a simplification. The effect of the quality of the back
vowel on the retraction is an area for future research. See also Section 7.1.
In Chapter 4, I argued that traditional implementational models of the
relationship between phonetics and phonology are not suitable for characterizing the
link between retraction of Hungarian transparent vowels and the choice of the suffix
that follows them. The core of the argument was the fact that the phonetic and
phonological properties of the transparent vowels are interdependent whereas the
derivational models only allow dependency of continuous phonetics on categorical
phonology.
In this section, I propose an alternative model based on the mathematics of
non-linear dynamics. This formal language allows one to express, within a single
system, the complex relationship between qualitative and quantitative aspects of that
system. In particular, the qualitative aspects represent phonological generalizations
related to the form of the suffix, and correspond to variation in an order parameter.
The quantitative aspects represent continuous phonetic details of the stem-final front
vowels, and correspond to control parameter(s). I take advantage of non-linearity as
one of the fundamental properties of dynamic systems where smooth variation of the
control parameter brings about discontinuous changes in the order parameter (e.g.
Fig. 32 in Section 5.2). In the case of transparency in Hungarian vowel harmony,
non-linearity is the key notion that links the [± back] feature of the suffix (the order
parameter) with phonetic retraction of stem-final vowels (the control parameter).
In the proposed dynamic model, the two discrete forms of an alternating
suffix (e.g. dative -nak vs. -nek) are mapped to attractors of a single dynamic system.
In order to model both continuous and discrete parameters in one system, the choice
of the attractor must be modulated by variation in R. This parameter represents the
retraction degree of the stem-final vowel and is computed in the first component of
the model. Mathematically, these ideas can be stated in the equation
dx/dt = N(x, R) + F(t). This equation expresses the temporal evolution of the suffix
vowel (tongue body) constriction location variable, denoted by x, as a nonlinear
function N of the current state x and the control parameter of retraction degree R.
A dynamic system modeling suffix alternation in vowel harmony is required
to have a two-attractor potential representing the two stable forms of a front and a
back suffix. Given this requirement, I follow Gafos (in press) and propose that a good
candidate for N is the ‘tilted’ anharmonic oscillator, whose dynamics are described
by N(x, R) = (3R – 2) + x – x³. The important fact is that the non-linear function N
has to be at least cubic, i.e. with the largest exponent of x at least 3. This is because
we need two distinct attractors, one for -nak and one for -nek. Arnold (2000) showed
that a polynomial of degree less than three does not allow for more than one attractor.
Finally, the factor F(t) represents the noise and can be ignored for now.
I employ the formal apparatus described in Section 5.2.4 where the value of
the constriction location for a suffix vowel is interpreted by the position of a ball
running downhill in the potential landscape V(x) = (2 – 3R)x – ½x² + ¼x⁴, which can
be obtained by integrating N(x, R). The asymptotic behavior of x in this equation can
be visualized by looking at the simulations shown in Fig. 45. These graphs provide a
qualitative view of solutions to the differential equation
dx/dt = N(x, R) = (3R – 2) + x – x³ in the sense described in Section 5.2.4.
Fig. 45 – Suffix form as a function of retraction degree, graphs of
V(x) = (2 – 3R)x – ½x² + ¼x⁴ for three values of R obtained from simulations in the
stem-internal component.85
85
In the simulation shown in the figure, the coefficient for x⁴ was 1/10 for exposition
purposes.
The graph on the left simulates the suffix selection in stems like papír. The
output parameter of stem-internal blending was calculated as the retraction degree Rí
= 1. The function N(x, R) for this value of R provides a potential V(x) with one
attractor close to the value of CL = –2. This value corresponds to the back variant of
the suffix and is denoted as the BACK attractor. There is high probability that a ball
left in this potential ends up in the vicinity of the attractor. Since the position of the
ball represents the output of the dynamic modeling for the CL parameter of the suffix
vowels, the suffix surfaces as back, e.g. papír-nak.
The graph in the middle panel shows how the potential V(x) changes when R
decreases. For the value of R = 0.36 a qualitative change is evident in the shape of
V(x). The BACK attractor has been replaced by a FRONT attractor that corresponds to
the front variant of the suffix. The FRONT attractor is located at the other end of the
constriction location axis around the value CL = 2. We saw in the previous section
that this value of R = 0.36 was derived by blending of a back vowel gesture with a
front rounded vowel gesture, as in parfüm. Hence, the proposed dynamic model
provides a front suffix for opaque vowels, which is consistent with the Hungarian
data.86
86
Although disharmonic stems with front rounded vowels such as parfüm always
select front suffixes in Hungarian, this is not the case in Finnish. Apart from the
mentioned perceptual instability of front rounded vowels when adjacent to back
vowels, there are cases when front rounded vowels, mostly [y] (= /ü/), may behave
transparently in disharmonic stems. For example, Campbell (1980) reports that either
front or back suffixes are possible in disharmonic loan stems, for some even back
harmony is more frequent: analyysi-sta, analyysi-stä ‘from analysis’, or marttyyre-ja,
marttyyre-jä ‘martyrs, partitive pl.’.
I argue that this Finnish data is not problematic for our analysis of [y] in Hungarian
for two reasons. Firstly, Campbell (1980: 250-1) also notices that “back harmony is
considered more prestigious, more learned while front harmony is more colloquial”. I
follow the claim advocated in van der Hulst (1985: 288) that “…in colloquial styles
we find the things that are phonologically natural. The prestigious style may just as
well contain a non-phonological rule inserting the autosegment [+B] in the case of
loans that typically belong to learned vocabulary”. This autosegment might enter our
dynamic model as additional retraction in the stem-final vowel since in all cases
where Campbell reports transparent behavior of [y], this front rounded vowel is
followed by another front unrounded transparent vowel (mostly [i]). Secondly, it is
possible that in many cases, these front rounded vowels are actually realized as either
[u] or [u] (Wiik 1995, Välimaa-Blum 1999), in which the selection of back harmony
suffixes is predicted.
The change from the left panel to the middle panel illustrates the fundamental
property of non-linearity: a small change in the control parameter (degree of
retraction) results in a large change in the order parameter (quality of the suffix). In
other words, there are intervals where changes of the parameter R of certain
magnitude do not cause macroscopic changes of the order parameter. For example, if
the control parameter is tweaked by one unit from R = 1 to R = 2, the model of suffix
selection still predicts a back suffix. However, if the control parameter is tweaked
within an ‘unstable’ interval of the R values, e.g. from R = 1 to R = 0, the change of
the same magnitude of one unit results in a system where the stable value of the order
parameter is different (a front suffix in our case).
Finally, the graph on the right in Fig. 45 shows the behavior of this dynamic
system for intermediate values of the parameter R. In nonlinear dynamics, a change
from one macroscopic state of the system to another implies an intermediate stage of
fluctuation. The potential V(x) is shown for the intermediate value of R = 0.6. We see
that there are now two minima representing the presence of two stable states, FRONT
and BACK. An intermediate value of R thus gives two possible states of the system:
the ball may end up in the front or back attractor.
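The mono-stable and bi-stable regimes can be checked with a minimal sketch that counts the attractors of dx/dt = (3R – 2) + x – x³ from the discriminant of the cubic. It uses the coefficient ¼ for x⁴ as in the equation for V(x), not the value 1/10 used for the plots in Fig. 45, so the boundaries of the bi-stable interval in the figure will differ. The function names are illustrative. The sketch only counts attractors and does not label them FRONT or BACK, since that depends on the orientation of the CL axis.

```python
import math


def tilt(R):
    """Constant term of N(x, R) = (3R - 2) + x - x**3."""
    return 3.0 * R - 2.0


def count_attractors(R):
    """Number of stable fixed points of dx/dt = (3R - 2) + x - x**3.

    Fixed points solve x**3 - x - (3R - 2) = 0. For a depressed cubic
    x**3 + p*x + c the discriminant is -4*p**3 - 27*c**2; here p = -1.
    A positive discriminant means three real roots, i.e. two attractors
    separated by one repeller; otherwise there is a single attractor.
    """
    c = -tilt(R)
    discriminant = 4.0 - 27.0 * c * c
    return 2 if discriminant > 0 else 1


def bistable_interval():
    """Range of R with two attractors: |3R - 2| < 2 / (3 * sqrt(3))."""
    critical = 2.0 / (3.0 * math.sqrt(3.0))
    return (2.0 - critical) / 3.0, (2.0 + critical) / 3.0


for R in (1.0, 0.6, 0.36):
    print(f"R = {R}: {count_attractors(R)} attractor(s)")
low, high = bistable_interval()
print(f"two attractors for {low:.2f} < R < {high:.2f}")
```

Under these assumptions, R = 1.0 and R = 0.36 each yield a single attractor, while R = 0.6 falls inside the bi-stable interval.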
However, for each particular run of the dynamic model there must be a unique
solution (there is only one ball and one suffix). In order to determine the quality of
the suffix in this situation, the effects of noise (F(t)) and initial position of the ball
must be considered. Neither of these was relevant in the other two runs where R had a
value of 1.0 and 0.36, as seen in the left and middle graphs. The ball in any initial
position within the potential well would still be drawn to the same attractor. Only
small variation in the final position of the ball would be caused by noise.
In contrast, both the initial position, and noise could have a significant effect
on the final state of the dynamic system illustrated in the rightmost potential of
Fig. 45. To see this, consider, for example, a ball at a position around (0,0) in this bistable potential. Due to the random kicks introduced by fluctuations, the ball will end
up either in the left attractor or the right one.
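The role of noise can be illustrated with a small stochastic simulation of dx/dt = N(x, R) + F(t). The sketch below is an assumption-laden illustration, not the procedure used for the figures: F(t) is taken to be Gaussian white noise, the equation is integrated with the Euler-Maruyama scheme, and the noise strength, time step, run length, starting point x = 0 and seed are arbitrary choices. Only the sign of the final x is reported; mapping that sign to FRONT or BACK depends on the orientation of the CL axis and is not assumed here.

```python
import random


def simulate(R, x0=0.0, sigma=0.3, dt=0.01, steps=1000, rng=None):
    """One Euler-Maruyama run of dx/dt = (3R - 2) + x - x**3 + noise.

    sigma, dt, steps and x0 are illustrative values, not taken from the text.
    """
    rng = rng or random.Random()
    x = x0
    for _ in range(steps):
        drift = (3.0 * R - 2.0) + x - x ** 3
        x += drift * dt + sigma * (dt ** 0.5) * rng.gauss(0.0, 1.0)
    return x


def outcome_counts(R, runs=200, seed=0):
    """Return (runs ending at x > 0, runs ending at x <= 0) for a given R."""
    rng = random.Random(seed)
    finals = [simulate(R, rng=rng) for _ in range(runs)]
    positive = sum(1 for x in finals if x > 0)
    return positive, runs - positive


for R in (1.0, 0.6):
    pos, neg = outcome_counts(R)
    print(f"R = {R}: {pos} runs end at x > 0, {neg} runs end at x <= 0")
```

With a mono-stable potential (R = 1.0) every run settles on the same side, whereas with the bi-stable potential (R = 0.6) individual runs end on either side, which is the behavior that corresponds to vacillation.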
The bi-stable stage with two attractors is a welcome result of the dynamic
modeling since the corresponding situation is found in Hungarian data where
vacillating stems may select either of the two suffixes.87 There are two sources of
vacillation in Hungarian. As I discussed earlier, vacillation is particularly salient in
the stems in which a back vowel is followed by the low /e/. I argued that the
intermediate degree of retraction for this vowel is due to the fact that its quantal
properties differ from other transparent vowels. Specifically, its low and somewhat
retracted constriction location of the tongue body gesture allows only limited
retraction when such a gesture blends with a back vowel gesture.
87
Vacillation is also possible in disharmonic loan words in Finnish.
The other source of vacillation in Hungarian comes from the stems where a
back vowel is followed by multiple transparent vowels (BTT stems). Suffix selection
in these stems is variable and thus they may select either front or back suffixes.
Recall from Chapter 2 that the most common pattern is to select [–back] suffixes, but
many BTT stems also allow vacillation and are thus followed by either [–back] or
[+back] suffixes. An example of a vacillating stem where two transparent vowels
follow a back vowel is aszpirin where both aszpirin-ban and aszpirin-ben ‘aspirin-Inessive’ are possible.
I show below that not only can the dynamic model account for this behavior,
but it also offers a principled explanation of the robust generalization in Hungarian
that BTT stems are more likely to select front suffixes than BT stems.
Fig. 46 shows the application of the first component of the model for trisyllabic BTT stems such as aszpirin. The two panels correspond to the two cases of
blending since there are two pairs of immediately adjacent vowels: /a-i/, and /i-i/. The
panel on the left shows blending between the initial /a/ and the following /i/. The
weight of the /i/ gesture is α = 3, and the weight of the /a/ gesture is β = 1. Hence, the
index of articulatory flexibility is q = α/β = 3. The potential for the front vowel
gesture after blending with a back gesture is shown with the dashed line. We see that
this blended gesture is retracted compared to the input /i/ gesture. It has the attractor
at CL = 1, which gives a retraction degree R for the first /i/ vowel Ri = 1.
Fig. 46 – Stem-internal blending in the BTT stems, between the initial back
vowel and adjacent transparent one on the left, and between the two
transparent vowels on the right.
Recall from the previous section that the model of blending generates the
same retraction degree (R = 1) for the front vowel in BT stems such as papír. In this
case, /í/ is in the stem-final position. Hence, its retraction degree R enters into the
computation of the following suffix vowel. We know from the left panel of Fig. 45
that the second component would yield a suffix with a back vowel for this value of R.
However, the /i/ in the second syllable of aszpirin is not in the stem-final position
because it is followed by another vowel.
The right panel of Fig. 46 shows the second blending between a retracted
/i/, the output of the first blending, and another /i/ vowel. The weighting of the
gestures is kept the same as for the first blending. The α value for the stem-final front
vowel was kept at 3 while the β for the blended gesture of the second stem vowel was
set to 1, which yielded the value of q = 3. This blending yields the potential shown
with the dotted line whose attractor is at CL = 1.4. The degree of retraction R of the
stem-final vowel in aszpirin is then R = 0.6. Effectively, the second transparent
vowel is less retracted than the first one (RT2 = 0.6, RT1 = 1.0).
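The two blending steps for a BTT stem can be traced with a short sketch. It assumes quadratic gesture potentials with unit stiffness for the input vowels (targets at CL = 2 and CL = –2) and, crucially, that the output of the first blending enters the second blending with its summed stiffness. That last assumption is mine; it is the reading under which the sketch reproduces the attractor near CL = 1.4. The class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Gesture:
    """Quadratic potential stiffness * (x - target)**2, constants dropped."""
    stiffness: float
    target: float


def blend(front, previous, alpha, beta):
    """Blend alpha*front + beta*previous.

    The weighted sum of two quadratic potentials is again quadratic: its
    stiffness is the sum of the weighted stiffnesses and its target is the
    stiffness-weighted mean of the two targets.
    """
    k = alpha * front.stiffness + beta * previous.stiffness
    target = (alpha * front.stiffness * front.target
              + beta * previous.stiffness * previous.target) / k
    return Gesture(k, target)


back_a = Gesture(1.0, -2.0)   # initial /a/
front_i = Gesture(1.0, 2.0)   # unblended /i/

t1 = blend(front_i, back_a, alpha=3, beta=1)  # /a/ with the first /i/
t2 = blend(front_i, t1, alpha=3, beta=1)      # retracted /i/ with stem-final /i/

r_t1 = front_i.target - t1.target
r_t2 = front_i.target - t2.target
print(f"T1: attractor at CL = {t1.target:.2f}, R = {r_t1:.2f}")
print(f"T2: attractor at CL = {t2.target:.2f}, R = {r_t2:.2f}")
```

Under these assumptions the first transparent vowel has its attractor at CL = 1 (R = 1), and the stem-final vowel has its attractor at roughly CL = 1.4 with a retraction of roughly 0.6, so that the second transparent vowel is less retracted than the first.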
The left panel in Fig. 47 shows the potential landscape of the non-linear
model for the value of the control parameter R = 0.6. We see that the potential V(x)
for the suffix vowel following a stem-final vowel with this retraction is bistable with
a bias toward the front value of the suffix. The right panel shows the potential for the
suffix vowel preceded by a stem-final vowel retracted by R = 1. The difference in the
potential wells for the retraction degrees of R = 0.6 and R = 1 translates into a
difference in suffix selection between BTT (R = 0.6) and BT (R = 1) stems. The
former are more likely to select front suffixes than the latter. This prediction of the
model is matched in the Hungarian data.
Fig. 47 – Potential for the CL value of the suffix vowel when retraction degree
is 0.6.
As I discussed in Chapter 2, the difference between BT and BTT stems in
terms of their suffix selection is problematic for the traditional analyses of
transparency. Because the transparent vowels in the BT stems do not affect suffix
selection, the transparent vowels in the BTT stems should not either. Therefore, the
phonology of suffix selection would have to include separate explanations for the two
patterns.
By linking suffix selection with non-phonemic features of the transparent
vowels, the proposed model explains the difference between BT and BTT stems in a
unified way. All stem vowels participate in harmony since all vowel gestures undergo
blending. Because there are more front vowels in the BTT stems than in the BT
stems, the stem-final vowel in the BTT stems is less retracted than the one in the BT
stems. In the proposed non-linear dynamic system, the difference in retraction
corresponds to qualitatively different outputs for the suffix selection. Therefore, the
generalization from Hungarian that the BTT stems are more likely to select front
suffixes than the BT stems is also predicted.
Finally, the proposed model also accounts for the observation made by Ringen
& Kontra (1989) that “… suffix vowel choice can be influenced by the harmonic
quality of a vowel in a preceding morphologically identical suffix” (Ringen & Kontra
1989: 184). Crucially, vowel harmony is assumed to operate only within words;
hence the [±back] quality of a suffix in a preceding word should not affect the suffix
choice in the target word.
Following the modeling of the effect of intentions on grammar (Gafos, to
appear), the effect of the [±back] quality of the preceding suffix vowel may be
formalized by adding a potential E(x) (environment) with a single fixed point
(E(x) = a(x – xo)²) to the output potential for the suffix vowel
V(x) (V(x) = (2 – 3R)x – ½x² + ¼x⁴). The value of xo corresponds to the [±back]
quality of the preceding suffix. Assimilatory influence of the preceding suffix on the
target suffix occurs if a > 0 whereas a dissimilatory effect is obtained if a < 0. Fig. 48
shows the effect of a preceding front vowel suffix on the target suffix as addition of
V(x) and E(x). The potentials V(x) are marked with the dashed lines and those for
E(x) with the dotted lines. The potentials resulting from the addition are shown with
solid lines.
Fig. 48 – Effect of preceding front suffix (x0 = 2) on the target suffix vowel.
Plots a-c show assimilatory influence (a > 0), and Plots d-f show dissimilatory
influence (a < 0).
It can be seen that for the setting of the parameters in Fig. 48, assimilatory
influence (a > 0) of the preceding front suffix (xo = 2) decreases the stability of the
back attractor for both papír-nak, shown in Fig. 48a, and hotel-nak, shown in Fig.
48c. In contrast, plots in Fig. 48d and Fig. 48f show that a dissimilatory influence
increases this stability. With respect to the front suffixes in parfüm-nek and hotel-nek,
the assimilatory influence of the front suffix slightly increases the stability of the
front attractor, while the dissimilatory influence increases the probability of the back
suffix. The strength of the environmental effect is modeled as the absolute value of
the parameter a. As with other control parameters, if |a| reaches some critical value, it
can potentially affect suffix selection not only quantitatively but also qualitatively.
This can be seen in the plot in Fig. 48e where the output potential for the suffix vowel
changes from mono-stable to bi-stable.
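The qualitative effect of the environment term can be explored with a minimal sketch that counts the local minima of V(x) + E(x) on a grid. The values of a used here (0.3 and –0.3) are hypothetical and are not the parameter settings of Fig. 48; xo = 2 follows the caption of that figure, and the coefficient ¼ for x⁴ follows the equation for V(x). The sketch counts minima only and does not label them FRONT or BACK.

```python
def total_potential(x, R, a, x0):
    """V(x) + E(x), with V = (2 - 3R)x - x**2/2 + x**4/4 and E = a(x - x0)**2."""
    v = (2.0 - 3.0 * R) * x - 0.5 * x ** 2 + 0.25 * x ** 4
    e = a * (x - x0) ** 2
    return v + e


def count_minima(R, a, x0=2.0, lo=-4.0, hi=4.0, step=0.001):
    """Count local minima of V + E at interior grid points of [lo, hi]."""
    n = int(round((hi - lo) / step))
    values = [total_potential(lo + i * step, R, a, x0) for i in range(n + 1)]
    return sum(
        1
        for i in range(1, n)
        if values[i] < values[i - 1] and values[i] <= values[i + 1]
    )


# Hypothetical settings: no influence, assimilatory (a > 0), dissimilatory (a < 0).
for R, a in ((1.0, 0.0), (1.0, -0.3), (0.6, 0.0), (0.6, 0.3)):
    print(f"R = {R}, a = {a:+.1f}: {count_minima(R, a)} minimum/minima")
```

With these illustrative values, a sufficiently strong negative a turns the mono-stable potential at R = 1.0 into a bi-stable one, and a sufficiently strong positive a turns the bi-stable potential at R = 0.6 into a mono-stable one, in line with the claim that |a| can affect suffix selection qualitatively once it reaches a critical value.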
5.6. Summary and Conclusion
Let us summarize the predictions of the dynamic model and review how the model
captures the correlations between phonetics and phonology of transparent vowels
presented in Chapters 2-4. The model of suffix selection is based on the retraction in
the stem-final vowel. Significant retraction generates back suffixes and minimal
retraction generates front suffixes. In this aspect, the model is descriptively adequate
and matches the production data discussed in Chapter 3.
I modeled the retraction of stem-final front vowels as originating from
articulatory blending with preceding vowel(s). This blending is in turn constrained by
independent facts about the quantal (non-linear) relationship between articulation and
perception of front vowels. The model predicts that the most transparent vowels in
palatal vowel harmony are those vowels that can be maximally retracted without
losing their perceptual recoverability. As discussed in Chapter 4, this quantal property
of front non-high unrounded vowels has been established in independent phonetic
experimental work, and conforms to cross-linguistic generalizations about the nature
of transparent vowels.
Additionally, the model of blending predicts that T2 in a BT1T2 stem is less
retracted than the T1 in a BT1 stem. Since the choice of the suffix is linked to the
stem-final retraction, the model predicts more back suffixes in the BT stems than in
the BTT stems. This prediction is also supported by the data from Hungarian and
further strengthens the claim that sub-phonemic retraction of the stem-final vowel is
relevant for the phonological alternation in suffixes.
Finally, the proposed dynamic model predicts a state of bistability where more
than one value of the order parameter is available. This is borne out in the Hungarian
data in the pattern of vacillation between the [+back] and [–back] value of the suffix
vowel. Moreover, the model is flexible and allows incorporation of other external
effects on the pattern of suffix selection. Overall then, the model provides a uniform
treatment for the three major phonological patterns in Hungarian vowel harmony:
transparency, opacity, and vacillation.
CHAPTER 6
OT formalism of vowel harmony: Integrating OT and dynamics
6.1. Introduction
It was argued in previous chapters that both phonetic and phonological qualities play
a role in the phonology of Hungarian vowel harmony. The experimental results
presented in Chapter 3 showed that sub-phonemic retraction correlated with the
phonemic form of the suffix. Chapter 4 presented converging evidence for, and
Chapter 5 an explicit model of, the relationship between articulatory retraction of the
stem-final vowel and the form of the suffix following that vowel.
The model in Chapter 5 developed two major ideas. First, it was proposed that
continuous aspects of stem-final retraction are related to the discrete choice of the
suffix with the use of non-linear dynamics. Second, the relationship of stem-final
retraction with other vocalic dimensions such as height, rounding, and adjacent vowel
context was modeled as stem-internal blending of vowel gestures. In this sense, the
model accounted for Hungarian phonetic and phonological data and proposed
explanations for cross-linguistic generalizations.
However, the model did not account for cross-linguistic typological
differences related to vowel harmony. For example, adjacent stem vowels in English
presumably undergo a form of articulatory blending similar to the one modeled for
Hungarian. Hence, /i/ in happy might be retracted to a greater degree than /i/ in hippy
due to the difference in the initial vowel. Yet, the phonemic quality of English
suffixes is not dependent on this coarticulatory blending: the suffix vowel in happi-ness is phonemically identical to the suffix vowel in hippi-ness. In contrast, Turkish
presents a system that differs from English as well as from Hungarian. In Turkish, an
initial back vowel imposes its backness on all following vowels, e.g. kapı ‘door’ but
not *kapi. This backness also triggers phonemic variation in the [±back] quality of
suffixes, e.g. kapı-lar ‘door-Nom.Pl.’. Informally, the stem-internal blending of
adjacent vowels in Turkish behaves differently from Hungarian because a front
unrounded vowel may be retracted to such a degree that it is perceived as a back
vowel.88
Hence, there are systematic differences among languages such as English,
Hungarian, and Turkish. The model presented in Chapter 5 identified several
phonological demands on the vowel systems of these languages. For example, the
blending of vowel gestures arises from pressure to harmonize articulatory targets for
adjacent vowels. Or, vowels differ as to the strength with which they preserve their
articulatory specifications. While the dynamic model does not handle differences
between English, Turkish, and Hungarian, the formalism of Optimality Theory
(Prince & Smolensky 1993) is ideally suited to deal with typological patterns. In
addition, many instances of variation within a language have been successfully
modeled with OT (e.g. Anttila 2002). Hence, the OT framework has proven successful in
explaining both cross-linguistic differences such as those between Hungarian,
English, and Turkish, and variation within a language such as the tendencies in
suffix selection in Hungarian described in Chapter 2.
88
Exceptionally, stems where a back vowel is followed by a front unrounded vowel
do occur in Turkish: e.g. hamsi ‘anchovies’, anne ‘mother’. In these cases, however,
Turkish differs from Hungarian in terms of suffix selection: the Turkish front
unrounded vowels behave opaquely and are followed by front suffixes, whereas front
unrounded vowels in similar Hungarian stems behave transparently.
This chapter provides an analysis of transparency couched in the framework
of Optimality Theory. This analysis is based on the conflict between the pressures for
articulatory agreement of adjacent vowels on the one side, and perceptual faithfulness
of these vowels on the other side. As in any OT grammar, this conflict is resolved
with constraint ranking. The constraints and their evaluation derive from a set of
assumptions described in the previous section and packaged as a model of stem-internal blending and stem-suffix agreement.
The proposed analysis involves a novel approach to the definition and
evaluation of OT constraints. This approach provides a framework for analyzing the
relationship between phonetic and phonological variables and thus follows from the
insights gained from dynamic modeling described in the previous chapter. Relevant
phonological constraints are defined formally as dynamic systems that control
phonetic dimensions. These constraints establish preferred and/or dispreferred values
of these phonetic dimensions for the OT evaluation.
In this way, the cognitive system that models transparency in vowel harmony
is phonetically grounded because constraints are defined over phonetic dimensions.
Crucially, such a system is also phonologically stable because it does not ascribe
relevance to all phonetic details. Rather, the proposed OT model transforms high-dimensional phonetic variation into a low-dimensional system of (dis)preferred
phonetic values.
The chapter is laid out as follows. Section 6.2 provides a brief summary of the
main concepts in traditional OT. Section 6.3 introduces the proposed extensions of
OT: dynamic definition of constraints and gradient vs. categorical constraint
evaluation. The tools introduced in this section are applied in the description of
constraints used in the analysis of Hungarian vowel harmony in Section 6.4. The
phonological patterns of transparency, opacity, vacillation, and exceptional suffix
selection in monosyllabic stems are analyzed in Section 6.5. Finally, Section 6.6
discusses typological implications of the proposed analysis, and Section 6.7
concludes the OT analysis of Hungarian palatal vowel harmony.
6.2. Optimality Theory
The analysis provided in this section assumes the constraint-based framework of
Optimality Theory (Prince & Smolensky 1993). In this framework, grammar is
conceived as a hierarchy of ranked and violable constraints that evaluate the well-formedness
of output forms. An OT grammar consists of three main components:
GEN, CON, and EVAL. The function GEN associates each input with a (possibly
infinite) set of output candidates. CON is a set of universal constraints that are ranked
on a language-particular basis. EVAL is a function that assesses the output candidates
and orders them based on how well they satisfy a particular ranking of the
constraints. EVAL selects one candidate that best satisfies the constraints as the
optimal output.
To illustrate ranking and violability of the constraints, consider an abstract
example where CON consists of three constraints A, B, and C, and GEN provides
three output candidates cand1, cand2, and cand3. Given a language-specific ranking
‘A dominates B’ (A>>B), and ‘B dominates C’ (B>>C), the EVAL(uation) takes the
form presented in the following tableau.
(38) Example of an OT tableau

    input        | A  | B | C
    a. cand1     |    | * | **!
    b. cand2     | *! |   |
    ☞ c. cand3   |    | * | *
The input and the most competitive output candidates are in the leftmost column and
the constraints are arranged in the following columns with ranking descending from
left to right. An asterisk in a box means that the candidate in the corresponding row
violates the constraint in the corresponding column.
The evaluation proceeds from higher-ranked constraints to lower-ranked ones;
hence, it starts with the constraint A. The candidate in (38b) incurs a violation of this
constraint. This is a fatal violation (marked by ‘!’) because it disqualifies this
candidate from further evaluation despite the fact that (38b) does not violate any
other constraints. Shading shows the irrelevance of other constraints in the evaluation
of this candidate. Candidates (38a) and (38c) do not violate A, hence, they ‘qualify’
for the evaluation by the next constraint in the hierarchy: the constraint B. Both
remaining candidates equally violate this constraint, so the decision for the optimal
output rests on the constraint C. The violations of this constraint are computed
gradiently on some basis, and thus C presents a case of gradient constraint violation.89
EVAL deems candidate (38c) as the actual output (marked by ‘☞’) because it incurs
fewer violations of the lower-ranked constraint C.
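The evaluation just described can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the analysis: the function names are hypothetical, the violation counts are those read off tableau (38), and strict domination is approximated by comparing violation profiles lexicographically in ranking order.

```python
from typing import Dict, List, Tuple

# Violation counts read off tableau (38); empty cells count as zero.
TABLEAU_38: Dict[str, Dict[str, int]] = {
    "cand1": {"B": 1, "C": 2},
    "cand2": {"A": 1},
    "cand3": {"B": 1, "C": 1},
}


def violation_profile(candidate: str, ranking: List[str],
                      tableau: Dict[str, Dict[str, int]]) -> Tuple[int, ...]:
    """Violations of one candidate, ordered from highest- to lowest-ranked constraint."""
    return tuple(tableau[candidate].get(constraint, 0) for constraint in ranking)


def eval_optimal(ranking: List[str], tableau: Dict[str, Dict[str, int]]) -> str:
    """Pick the candidate whose profile is lexicographically smallest.

    Tuple comparison is lexicographic, so a single violation of a
    higher-ranked constraint outweighs any number of lower-ranked ones.
    """
    return min(tableau, key=lambda cand: violation_profile(cand, ranking, tableau))


optimal = eval_optimal(["A", "B", "C"], TABLEAU_38)
print(optimal)
```

Under the ranking A >> B >> C the sketch selects cand3, in line with the tableau: cand2 is excluded by A, the remaining candidates tie on B, and C decides between them.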
The example tableau in (38) showed that the ranking of the constraints is
strict, and each constraint has absolute priority over any constraint it dominates. In
this way, satisfaction of a higher-ranked constraint can drive a violation of a lower-ranked
one. As a result, an optimal output candidate typically violates some lower-ranked
constraints. Constraints represent demands that are satisfied whenever
possible and violated in a particular language only to avoid violation of some higher-ranked,
more crucial, demands in that language.
89
The difference between gradient and categorical evaluation will be discussed in
more detail in the next section.
Finally, cross-linguistic variation in OT arises by re-ranking of the constraints.
Hence, the ranking A >> B >> C in one language may be permuted to A >> C >> B
in another language.
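The effect of re-ranking can be illustrated with a small sketch that evaluates the abstract candidates of (38) under every permutation of the three constraints. The violation counts are those of tableau (38); the helper names are hypothetical and the code is meant only as an illustration of how permuting a ranking can change the optimal output.

```python
from itertools import permutations

# Violation counts from tableau (38); missing entries count as zero.
VIOLATIONS = {
    "cand1": {"B": 1, "C": 2},
    "cand2": {"A": 1},
    "cand3": {"B": 1, "C": 1},
}


def winner(ranking):
    """Optimal candidate under strict domination for one ranking."""
    return min(
        VIOLATIONS,
        key=lambda cand: tuple(VIOLATIONS[cand].get(c, 0) for c in ranking),
    )


# Map each of the six possible rankings to its optimal candidate.
typology = {" >> ".join(r): winner(r) for r in permutations("ABC")}

for ranking, cand in typology.items():
    print(ranking, "->", cand)
```

In this toy example cand3 is optimal only when A is ranked highest, cand2 is optimal under the remaining rankings, and cand1 is never optimal because cand3 has a subset of its violations.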
6.3. Dynamic definition of OT constraints and their evaluation
As mentioned in the introduction, the OT constraints used in this analysis are
dynamically-defined attractor systems that control relevant phonetic dimensions. Let
me illustrate this idea with the constraint *VOICE. A typical OT definition of this
constraint is a statement similar to the one in (39).
(39) *VOICE – Consonants are voiceless.
In addition to stating the constraints in this verbal fashion, the relevant phonetic
dimension will be explicitly specified. In the case of *VOICE, the relevant dimension
is the position of vocal cords.
In dynamic terms, the constraint in (39) represents a system where open
glottis, i.e. the abducted position of vocal cords, constitutes a stable, preferred state.
This concept is illustrated in Fig. 49 below. The x-axis represents the distance
between the vocal cords. This exposition is simplified by the assumption that when x
is zero, the vocal cords are abducted, which results in the absence of voicing.90
[Figure: potential V(x) plotted against x (vocal cords’ distance), with x = 0 marked on the x-axis.]
Fig. 49 – Illustration of a dynamic system defining the constraint *VOICE.
The bold line represents the dynamic potential V(x) mathematically defined
by some non-linear function f(x).91 The value x = 0 represents a stable fixed point, an
attractor of this dynamic system. Intuitively, this is because a ball placed on the line
representing the potential would be drawn toward this point, as shown with the
arrows. Therefore, the demand of the phonological system for the abduction of the
vocal cords is expressed with the dynamically-defined potential V(x).92
90
In Articulatory Phonology terms, the x dimension may represent the CD parameter
of the glottal gesture, and zero may correspond to the label ‘wide’.
91
Section 5.2 provides a discussion of the relationship between f(x) and V(x) in
mathematical terms.
92
Another phonological constraint might require x to be different from zero.
Dynamically, such a system corresponds to a V(x) with a negative quadratic term and
the value x = 0 then represents an unstable fixed point, a repeller. This is because a
ball placed on this potential would be drawn away from this point.
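The ball metaphor for attractors and repellers can be made concrete with a short numerical sketch. The text leaves the exact form of the potential open, so the sketch assumes, purely for illustration, V(x) = x^2 for the attractor case and V(x) = –x^2 for the repeller case, and follows the ball with simple Euler steps along the negative gradient of V. The function names and step sizes are hypothetical.

```python
def simulate(force, x0, dt=0.01, steps=2000):
    """Follow dx/dt = force(x) from x0 with fixed-step Euler integration."""
    x = x0
    for _ in range(steps):
        x += dt * force(x)
    return x


def attractor_force(x):
    """-dV/dx for the assumed potential V(x) = x**2 (minimum at x = 0)."""
    return -2.0 * x


def repeller_force(x):
    """-dV/dx for the assumed potential V(x) = -x**2 (maximum at x = 0)."""
    return 2.0 * x


# A ball released away from zero settles at the attractor x = 0.
settled = simulate(attractor_force, x0=1.5)

# A ball released very close to zero drifts away from the repeller.
escaped = simulate(repeller_force, x0=0.01, steps=200)

print(settled, escaped)
```

With these assumed potentials, the first trajectory ends arbitrarily close to zero and the second ends farther from zero than it started, which is the qualitative difference between a stable and an unstable fixed point.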
Finally, the evaluation of the constraints such as *VOICE must be formalized.
In traditional OT notation, a constraint that disallows a certain property ‘p’ is denoted
*P (e.g. *VOICE in (39)) and is evaluated either categorically or gradiently. In
categorical evaluation, the OT grammar considers only two types of candidates: those
that have ‘p’ and those that do not. Candidates where ‘p’ is present violate the
constraint *P whereas candidates where ‘p’ is absent satisfy *P. In contrast, gradient
evaluation is a process of determining the quantity or degree of the property ‘p’, as
opposed to its presence or absence. Therefore, more or less of some property ‘p’ in a
candidate translates into a greater or smaller number of violation marks for that
candidate from a gradiently violable constraint.
Even gradient evaluation in traditional OT requires a well-defined discrete
scale. This is because OT is “… a purely symbolic theory of grammar … [SB: based
on] … non-numerical domination hierarchies” (Prince & Smolensky 1993: 226).
Sometimes, the relevant scale for gradient evaluation has a discrete nature. For
example, the distance between the boundaries of a foot and a prosodic word is
calculated based on the number of intervening syllables (McCarthy & Prince 1993).
Such a scale straightforwardly transforms into a basis for gradient evaluation in OT:
if a constraint requires perfect alignment between the boundaries of a foot and a
prosodic word, every syllable that intervenes between the two boundaries
corresponds to one violation mark.
In other cases, the relevant scale has a continuous nature. For example,
sonority is a continuous property of speech sounds that roughly corresponds to the
amount of acoustic energy (e.g. Ladefoged 1993). Interestingly, sonority plays a role
in determining well-formedness of segments in the syllable nucleus position (e.g.
Dell & Elmedlaoui 1985). The division of a continuous scale into discrete units was
typically formalized with binary feature matrixes (e.g. sonority scale, Clements
1990), or arbitrary chunking of physical dimensions (e.g. formants, Flemming 2001).
In OT, such a scale of discrete units was then translated into a hierarchy of ordered
constraints that were (partly) defined using these units (e.g. Prince & Smolensky
1993).
The problem with binary feature matrixes is that they do not capture fine
phonetic details that might be relevant for phonological patterning. For example,
Section 4.3 discussed the deficiency of binary features representing vowel height in
capturing the levels of transparency in Hungarian palatal vowel harmony.
In contrast, proposals for defining OT constraints by arbitrary chunking of
phonetic dimensions lead to instability of such phonological systems over changes in
environmental conditions. For example, Flemming (2001) defines perceptual distance
between vowels using the first three formants. Each formant represents one axis of
the three-dimensional vowel space. Each axis is then divided into 7 arbitrary units, where 1
corresponds to a low value and 7 to a high value. In Flemming’s theory then, vowel
sounds are defined by matrixes of formant values, e.g. [F1 1, F2 6, F3 3] for [i], that
are used by OT constraints in defining distinctiveness of vowel sounds. However, the
problem is that formant values depend, for example, on the size of speakers’ vocal
tracts. Therefore, defining phonological constraints with the use of actual phonetic
values does not guarantee the stability of the phonological system.93
In this chapter I propose that a theory that defines constraints in a non-linear
fashion and allows both categorical and gradient evaluation of such constraints
captures the importance of phonetic detail as well as maintains the necessary stability
of phonological patterns. Consider the dynamically defined constraint *VOICE illustrated
in Fig. 49 above. The two balls represent two candidates with two different values of
the phonetic parameter x. If *VOICE is a categorical constraint, neither of the
candidates would satisfy the constraint because x ≠ 0 for both of them. Therefore,
each candidate would receive a violation mark and *VOICE would treat them as
identical with respect to the property ‘x’. If, on the other hand, *VOICE is a gradient
constraint, the two candidates would receive a different number of the violation
marks. The candidate whose x-value is closer to the preferred value (x = 0) would
violate the constraint less than the other candidate. Therefore, such a gradient
constraint would preserve the phonetic difference between the two candidates
because they would receive a different number of violations with respect to this
constraint. If there is no higher-ranked constraint that decides between these two
candidates, the phonetic difference between the two candidates is phonologically
relevant.94
93
McCarthy (2004) argues for the elimination of gradient constraints from CON.
However, he assumes that all relevant scales in phonology are divided into some
finite number of steps. As I discussed, this assumption might be problematic in view
of the role of phonetic detail in Hungarian vowel harmony.
Informally then, gradient evaluation follows from the difference between a
candidate’s value of x and the value of x corresponding to the stable fixed point. The
difference between two candidates with values x1 and x2 respectively can be imagined
as the difference in the force that prevents the balls in Fig. 49 from falling into the
attractor. The farther the candidate is from the stable fixed point, the stronger the
force needed to prevent the ball’s movement, and the more violation marks the
constraint assigns.
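The contrast between the two modes of evaluation can be sketched as follows. The sketch assumes two hypothetical candidates with x-values 0.4 and 1.2 and, for the gradient case, an assumed potential V(x) = x^2 around the attractor x = 0; none of these values come from the analysis itself.

```python
ATTRACTOR = 0.0  # preferred value of x for *VOICE


def categorical_violation(x, tol=1e-9):
    """Return 1 if the candidate departs from the attractor at all, else 0."""
    return 0 if abs(x - ATTRACTOR) <= tol else 1


def gradient_cost(x):
    """Assumed potential V(x) = x**2: grows with distance from the attractor."""
    return (x - ATTRACTOR) ** 2


# Two hypothetical candidates, both with x different from zero.
x1, x2 = 0.4, 1.2

# Categorical evaluation cannot tell the two candidates apart.
same_categorically = categorical_violation(x1) == categorical_violation(x2)

# Gradient evaluation prefers the candidate closer to the attractor.
closer_candidate = min(("x1", x1), ("x2", x2), key=lambda pair: gradient_cost(pair[1]))[0]

print(same_categorically, closer_candidate)
```

Only the ordering of the two costs matters here; the particular numbers returned by `gradient_cost` carry no significance beyond deciding which candidate is closer to the attractor.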
In sum, the OT constraints used in this analysis are defined dynamically. Each
constraint identifies a phonetic dimension x and is defined with a potential function
V(x). This dynamic potential in turn defines one or more fixed points that represent
the demands of the phonological system on the phonetic variables. Furthermore, the
constraints might be evaluated categorically or gradiently. Minor differences in the
phonetic parameter do not affect the outcome of the categorical evaluation whereas
the same differences do matter in gradient evaluation. Hence, in addition to a
traditional verbal statement, each constraint will be formally defined by a phonetic
dimension x, a dynamic potential V(x), and the type of evaluation.
94
The described gradient evaluation is orthogonal to the ‘size’ of the difference between two
candidates. In theory, any difference between two candidates on the phonetic dimension evaluated by a
gradient constraint results in a different number of violations for the two candidates.
6.4. OT constraints for vowel harmony and their evaluation
After illustrating general aspects of the proposed theory in the previous section, this
section defines the OT constraints required for the analysis of Hungarian palatal
vowel harmony. In this analysis, I assume that the basic units of phonological
representation are dynamically defined articulatory gestures (Browman & Goldstein
1995, Gafos 2002), as described in Section 5.3. The gestural parameter Tongue Body
Constriction Location, referred to as CL from now on, is the crucial parameter for this
analysis of transparency.
6.4.1. Markedness constraints: AGREE
6.4.1.1. Stem-internal harmony
As in any OT grammar, stem-internal blending is modeled as a conflict between
markedness and faithfulness constraints. In this case, the markedness constraint
expresses the phonological demand that vowels within a word must agree in terms of
certain phonetic properties. This requirement drives vowel harmony, a systematic
phonological pattern found in many languages. Thus, in Hungarian, all vowels in a
word are drawn either from the ‘front’ set [i í e é ö ő ü ű], articulated with a frontward
movement of the tongue body, or from the ‘back’ set [u ú o ó a á], articulated with a
backward movement of the tongue body.95
The phonological demand for stem-internal vowel harmony is construed as an
articulatory requirement of the AGREE constraint stated in (40). AGREE(CL) mandates
that consecutive stem vowels have identical values of tongue body constriction
location.
(40) AGREE(CL)St – Verbal statement
Consecutive stem vowels minimize their difference (distance) in terms
of tongue body constriction location.
To illustrate the effect of this constraint, consider a sequence of stem vowels
V1-V2 where V1 is a back vowel and V2 is a front vowel. A candidate with more
retracted V2 better satisfies AGREE(CL)St than a candidate with less retracted V2. More
formally, the degree of backness (retraction) of V2 is inversely proportional to the
number of the violations of the AGREE constraint.
As proposed in Section 6.3, the first step in a formal definition of an OT
constraint is the identification of the phonetic dimension on which the constraint
operates. For AGREE(CL)St, this dimension is Articulatory Distance (AD). The AD
value is directly related to the CL parameter of the gestural representation and is
computed as the absolute value of the difference between the CL values of two
adjacent vowel gestures.
95
A functional explanation for this markedness constraint might be based on easier
segmentation of speech into words (Trubetskoy 1939, Suomi et al. 1997).
The second step in formalizing a constraint is the definition of a dynamic
system that controls the phonetic parameter. For AGREE(CL)St, the preferred value of
the AD parameter is zero. This is because zero difference means that two adjacent
gestures are identical with respect to CL, i.e. they are in perfect agreement. A
dynamic system with a single attractor at the value AD = 0 can be defined with the
potential V(x) = x^2. Such a potential is shown in (41).
(41) AGREE(CL)St – Dynamic formalism and evaluation
[Figure: potential V(x) plotted against x, with candidates a and b shown as balls on the curve.]

          AD   AGREE(CL)St
    a.    1    m*
    b.    2    n*, n > m
To conclude the definition of the AGREE(CL)St constraint, its evaluation must
be established. Independent of the type of evaluation, any candidate with AD greater
than zero incurs a violation. The issue is whether the phonetic difference between
candidates (41a) and (41b), represented as balls in the above potential, is deemed
relevant by the constraint. It was argued in Chapter 4 that the result of stem-internal
blending is relevant for the phonology of Hungarian palatal harmony. Therefore, it is
proposed that the evaluation of AGREE(CL)St is gradient.
In gradient evaluation, any difference between the AD values of the
candidates translates into a difference in their harmony as determined by
AGREE(CL)St. Take two candidates for the input papír. For both candidates, the initial
back vowel gesture is specified with CLa = –1. The candidates differ on the
specification for the second front vowel gesture. For candidate (41a), CLí = 0, and for
candidate (41b), CLí = 1. We compute articulatory distance for the two candidates by
finding the absolute value of the difference between the CL values of the two stem
vowels. Hence, AD = |–1–0| = 1 for (41a), and AD = |–1–1| = 2 for (41b).
The two candidates are represented as empty balls in the potential V(x) in (41)
based on their AD values. Note that only candidates with non-negative AD values are
possible because the AD parameter computes the absolute value of the difference.
Both of the candidates violate the constraint because their AD values are greater than
zero. However, candidate (41a) fares better on the constraint because its value of AD
(AD = 1) is closer to the preferred value (AD = 0) than the AD value of candidate
(41b). As a result, candidate (41a) receives m violations whereas candidate
(41b) receives n violations, with n > m. In OT terms then, candidate (41a) is more
harmonic than candidate (41b).
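The computation for papír can be sketched as below. The sketch follows the labelling of tableau (41), in which candidate (a) has AD = 1 and candidate (b) has AD = 2, and uses the potential V(x) = x^2 given for AGREE(CL)St. The function names are hypothetical, and the returned potential values serve only to order the candidates, not to count violation marks.

```python
def articulatory_distance(cl_first, cl_second):
    """AD: absolute difference between the CL values of two adjacent vowel gestures."""
    return abs(cl_first - cl_second)


def agree_cl_st_potential(ad):
    """Potential V(x) = x**2 with its single attractor at AD = 0."""
    return ad ** 2


CL_A = -1.0  # CL of the initial back vowel of papír

# CL of the second vowel in each candidate, labelled as in tableau (41).
CL_I = {"41a": 0.0, "41b": 1.0}

ad_values = {name: articulatory_distance(CL_A, cl) for name, cl in CL_I.items()}

# Gradient evaluation: the candidate lower in the potential is more harmonic.
more_harmonic = min(ad_values, key=lambda name: agree_cl_st_potential(ad_values[name]))

print(ad_values, more_harmonic)
```

The candidate with the more retracted second vowel has the smaller AD and is therefore preferred by AGREE(CL)St, which corresponds to m < n in (41).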
The proposed OT mechanism compares the AD values of candidates and
decides which candidate is more harmonic relative to the other candidate(s).
Importantly, actual distance in millimeters for the two candidates is not relevant.
What is important, and determined by AGREE(CL)St, is the relation among candidates
based on the phonetic values of articulatory distance. Therefore, the notion of
distance in this evaluation is relational, not absolute. This is important because the
relation between the two candidates is assumed to hold across different conditions
such as speech rate or the size of the speaker’s vocal tract. In contrast, the actual AD
values are arguably unstable across these conditions. Hence, a system that evaluates
phonetic parameters using relational implications is more stable than a system that is
based on some (arbitrary) absolute values of these parameters.
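The stability of a relational evaluation can be illustrated with a small sketch. It assumes, as a simplification not made in the text, that a change in condition such as vocal tract size rescales all CL values by a common positive factor; the scale factors and names are hypothetical.

```python
def harmony_order(cl_pairs):
    """Candidate names sorted from most to least harmonic by articulatory distance."""
    return sorted(cl_pairs, key=lambda name: abs(cl_pairs[name][0] - cl_pairs[name][1]))


# (CL of first vowel, CL of second vowel) for two candidates of one input.
base = {"41a": (-1.0, 0.0), "41b": (-1.0, 1.0)}

# Hypothetical scale factors standing in for different speakers or conditions.
scales = [0.5, 1.0, 3.0]

orders = [
    harmony_order({name: (s * v1, s * v2) for name, (v1, v2) in base.items()})
    for s in scales
]

# The absolute AD values change with the scale, but the ordering does not.
stable = all(order == orders[0] for order in orders)

print(orders, stable)
```

Under this assumption the absolute distances differ across conditions while the relation between the candidates is preserved, which is the property the relational evaluation relies on.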
The definition of the constraint AGREE(CL)St is summarized in (42) with a
verbal statement, a dynamic definition with the value of the stable fixed point
(attractor) specified as xA, and the type of evaluation.
(42) AGREE(CL)St – Consecutive stem vowels minimize their difference
(distance) in terms of tongue body constriction location. V(x) = x^2;
xA = 0, x = articulatory distance between adjacent vowel gestures,
gradient evaluation.
6.4.1.2. Stem-suffix harmony
In addition to stem-internal agreement, harmony is also manifested as agreement
between the stem and the suffix. In traditional analyses, it was the last non-transparent vowel in a stem that selected the quality of a suffix vowel. This idea is
here extended by arguing that the quality of the suffix vowel is always determined by
the quality of the stem-final vowel, including the case where the stem-final vowel is a
transparent vowel.
Consider first the observation that the process of selecting alternating suffixes
varies qualitatively depending on the [±back] feature of a stem-final vowel. Two
generalizations arise from Hungarian data. First, if the stem-final vowel is [+back],
the following suffix is always [+back]. Second, stem-final [–back] vowels are
sometimes followed by [+back] suffixes (papír-nak, hotel-nak) and sometimes by
[–back] suffixes (öröm-nek, parfüm-nek, hotel-nek). These two generalizations are
respected without exception in Hungarian data. It is therefore assumed that the
cognitive system for suffix selection following a [+back] vowel is slightly different
from that for suffix selection following a [–back] vowel. This difference is captured
as the difference between two constraints mandating the agreement between stem-final and suffix vowels.
The constraint in (43) is proposed to account for the first generalization that
stem-final back vowels are always followed by a back suffix. Because there are two
stable forms of an alternating suffix in Hungarian, the constraint determines which of
the two candidates, one with a back suffix or one with a front suffix, is more
harmonic given a certain value of the Constriction Location of the stem-final vowel.
This is an example of categorical evaluation.
(43) AGREEA-Suff (CL) – Statement
Stem-final [+back] vowels minimize their articulatory distance with
the suffix vowel(s) in terms of constriction location.
[Figure: potential V(x) with candidate positions and violation marks (*) for AGREEA-Suff (CL).]
Fig. 50 – AGREEA-Suff (CL) – dynamic formalism and evaluation.
The relevant phonetic dimension for the dynamic definition of the constraint
is articulatory distance (AD). The potential that formalizes AGREEA-Suff(CL) has one
attractor and two repellers. The attractor corresponds to the value AD = 0, the
maximal agreement between two gestures. This is a stable point because if we place a
ball in the potential in the vicinity of this stable point, the ball would be drawn
towards that point. This attractor corresponds to the value of the physical parameter
that ‘best represents’ the demand of the phonological system. The relevant repeller
(the one with the positive AD value) corresponds to AD = 1. This is an unstable point
because a ball placed in the potential around this point would be drawn away from it.
A potential that has these properties can be defined as V(x) = –1/4*x^4 + 1/2*x^2, and it
is illustrated in Fig. 50.
The method for calculating articulatory distance is the same as for stem-internal AGREE(CL)St. For example, let us take the values from Chapter 5, where all
front vowels correspond to CL = 2 and all back vowels to CL = –2. The two crucial
candidates for the input város-n_k ‘town.Dat’ are város-nek and város-nak. Because
articulatory distance (AD) is the absolute value of the difference between the
respective values of the CL for the adjacent gestures, AD between stem-final and
suffix vowel for város-nek is AD(város-nek) = |–2 – 2| = 4. Following the same method,
AD(város-nak) = |–2 – (–2)| = 0. Because AD is defined as an absolute value, only the
potential for the positive x-values is relevant.
All candidates that have their AD values within the interval defined by the
values of the attractor and the repeller satisfy the constraint. This is because
AGREEA-Suff(CL) evaluates candidates categorically. The candidates város-nek and város-nak correspond to the empty dots in Fig. 50; the former is shown with a light line and
the latter with a bold line. The candidate város-nek violates AGREEA-Suff (CL) because
its AD value 4 is beyond the critical value represented by the repeller in the potential
V(x). In contrast, város-nak satisfies the constraint.
The candidate with the filled dot in Fig. 50 has an AD value slightly higher
than zero. It is assumed that the stem-final [+back] vowel in disharmonic stems such
as béka ‘frog’ is slightly fronted due to the influence from the preceding [–back]
vowel.96 Therefore, the CL value for the stem-final vowel will be slightly greater than –2 (i.e. less than 2 in absolute value).
Consequently, for the input béka-n_k ‘frog.Dat’, the AD for the candidate béka-nak
will be slightly more than zero. Nevertheless, due to categorical evaluation, this
candidate also satisfies the constraint. Therefore, back vowels in stem-final position
of disharmonic stems still select back suffixes.
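The categorical evaluation of AGREEA-Suff(CL) can be sketched as follows. The sketch assumes that the sign of the potential is chosen so that AD = 0 is a minimum and AD = 1 a maximum, i.e. V(x) = –1/4*x^4 + 1/2*x^2, and that a candidate satisfies the constraint when its AD lies below the repeller. The CL value –1.8 for the slightly fronted stem-final vowel of béka is a hypothetical illustration, as are the function names.

```python
REPELLER_AD = 1.0  # critical AD value separating satisfaction from violation


def curvature(ad):
    """Second derivative of the assumed V(x) = -x**4/4 + x**2/2."""
    return -3.0 * ad ** 2 + 1.0


def satisfies_agree_a_suff(cl_stem_final, cl_suffix):
    """Categorical evaluation: satisfied iff AD lies between attractor and repeller."""
    ad = abs(cl_stem_final - cl_suffix)
    return ad < REPELLER_AD


# Positive curvature marks a minimum (attractor), negative a maximum (repeller).
attractor_at_zero = curvature(0.0) > 0
repeller_at_one = curvature(REPELLER_AD) < 0

results = {
    "város-nek": satisfies_agree_a_suff(-2.0, 2.0),   # AD = 4
    "város-nak": satisfies_agree_a_suff(-2.0, -2.0),  # AD = 0
    "béka-nak": satisfies_agree_a_suff(-1.8, -2.0),   # AD slightly above 0 (hypothetical CL)
}

print(attractor_at_zero, repeller_at_one, results)
```

Both város-nak and béka-nak fall inside the basin of the attractor and are treated alike, whereas város-nek lies beyond the repeller and violates the constraint.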
It is important to note that the model built in this chapter, similar to the one in
Chapter 5, is a qualitative model. The importance of the relations between parameters
rather than their actual value is shown with the comparison of AD values for város-nak and béka-nak. Due to differences between /a/ and /o/ (e.g. Wood 1979), it is
unrealistic to consider AD in város-nak to be zero. However, the crucial information
that the model captures is that despite the fact that AD in béka-nak is somewhat
greater than AD in város-nak, this difference does not affect the choice of the suffix.
96
This is due to the stem-internal agreement formalized with AGREE (CL). Recall that
front vowels exert readily perceived coarticulatory influences on the following back
vowels (Beddor et al. 2001).
After formalizing the agreement triggered by stem-final [+back] vowels, I
now turn to the more complex issue of suffixes following stem-final [–back] vowels.
It was argued in Chapter 4 that the retraction degree of a stem-final [–back] vowel
correlates with the choice of the following suffix. In Chapter 5, this correlation was
modeled using non-linear dynamics. Based on that formalism, an AGREE constraint in
(44) maintains that the choice of the suffix depends on the value of the retraction in
the stem-final vowel.
(44)
AGREEI-Suff(R) – Statement
Stem-final [–back] vowels determine the quality of the suffix based on
their retraction degree R.
In this OT analysis, constraints are formalized as dynamic systems controlling
physical domains such as articulatory distance or constriction location. The same
applies to AGREEI-Suff(R) as well because it defines the preferred values of constriction
location (CL) for the suffix vowels. The novel idea is that the dynamic potential for
this constraint that formalizes (44) is not fixed. This is because the quality of the
suffix that follows a front vowel is variable. The exact shape of the potential is
modulated by variation in the parameter of retraction degree R. This parameter is
calculated as the difference between the input and output CL values for the stem-final
vowel.97
To express these ideas formally, the constraint in (44) is defined with the non-linear function $N(x, R) = -x^3 + x + R$, where x is the order parameter representing the
constriction location of the suffix vowel, and R is the control parameter representing
the retraction degree of a stem-final vowel. Following the discussion in Chapter 5, the
potential V(x), whose negative gradient is N(x, R), is obtained by integrating N(x, R):
$V(x) = \tfrac{1}{4}x^4 - \tfrac{1}{2}x^2 - Rx$. The potentials in Fig. 51 illustrate the effect of the
control parameter R on the order parameter CL.
Fig. 51 – AGREEI-Suff(R) – dynamic formalism. Potential for the CL of the
suffix vowel for three values of stem-final retraction R={1,0,3}.
97
Ultimately, all constraint potentials are parametrized for additional phonetic
properties. See for example the effect of lip rounding in IDENT(front) discussed in
Section 6.5.2.
The non-linearity of the dynamic system that underlies AGREEI-Suff(R) is best
demonstrated by comparing the potentials in the left and the middle panels of Fig. 51.
The value of retraction degree R = 1 results in the potential in the left panel. This
potential specifies that the stable CL value for the suffix vowel is around CL = –2.
Continuing the use of the example CL values, this value of the attractor corresponds
to a back suffix such as –nak. In contrast, for zero retraction (R = 0), the potential in
the middle panel specifies that the stable CL value for the suffix vowel is around CL
= 2. This value of the attractor corresponds to a front suffix such as –nek. Hence, a
small increase in the control parameter of stem-final retraction, from R = 0 to R = 1,
causes a substantial change in the preferred form of the suffix vowel, from –nek
to –nak.
The rightmost panel in Fig. 51 shows the potential for the suffix vowel if the
stem-final vowel is maximally retracted (R = 3). Comparing the potentials on the left
and the right it can be seen that the change is not qualitative because the stable region
is still around CL = –2, which corresponds to a back suffix. However, the strength of
the attractor in the right panel is increased. This translates to higher probability that
the suffix following a significantly retracted stem-final vowel will be back.
The qualitative behavior of the system shown in Fig. 51 represents the
essence of non-linearity in dynamics discussed in Chapter 5. A dynamic system is
stable over some interval of the control parameter, e.g. between R = 1 and R = 3. However,
a small change of the control parameter around some critical value brings about a
global change in the number or position of fixed points.
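The qualitative behavior described here can be checked numerically. The following is a minimal sketch and not part of the original analysis: it locates the fixed points of N(x, R) = –x^3 + x + R with a grid scan plus bisection and classifies each one by the sign of the slope dN/dx. The function names, the search interval, and the grid resolution are illustrative assumptions. The roots of this unscaled function are not the example CL labels ±2 used in the text, and which side of the axis counts as back depends on the orientation chosen in Fig. 51, so only the number of attractors and their relative position should be read off the output.

```python
def drift(x, r):
    """N(x, R) = -x^3 + x + R, the rate of change of the suffix-vowel CL."""
    return -x**3 + x + r


def drift_slope(x):
    """dN/dx; a fixed point is an attractor where this slope is negative."""
    return -3.0 * x**2 + 1.0


def fixed_points(r, lo=-3.0, hi=3.0, steps=6000):
    """Roots of N(x, r) on [lo, hi], in ascending order.

    The interval and resolution are assumptions chosen for illustration.
    """
    roots = []
    h = (hi - lo) / steps
    for i in range(steps):
        a, b = lo + i * h, lo + (i + 1) * h
        fa, fb = drift(a, r), drift(b, r)
        if fa == 0.0:                 # grid point lies exactly on a root
            roots.append(a)
        elif fa * fb < 0.0:           # a sign change brackets a root
            for _ in range(60):       # bisection refinement
                m = 0.5 * (a + b)
                fm = drift(m, r)
                if fa * fm <= 0.0:
                    b = m
                else:
                    a, fa = m, fm
            roots.append(0.5 * (a + b))
    return roots


def attractors(r):
    """Stable fixed points (negative slope of N)."""
    return [x for x in fixed_points(r) if drift_slope(x) < 0.0]


def repellers(r):
    """Unstable fixed points (positive slope of N)."""
    return [x for x in fixed_points(r) if drift_slope(x) > 0.0]


for r in (0.0, 1.0, 3.0):
    print(r, [round(x, 3) for x in attractors(r)],
          [round(x, 3) for x in repellers(r)])
```

With R = 0 the untilted function has two attractors separated by a repeller at zero, which is the configuration of xA and xR listed in (45) up to scale. Once the magnitude of R exceeds 2/(3√3), roughly 0.38, only one attractor remains, and increasing R from 1 to 3 moves that attractor further from zero without changing the number of fixed points.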
After dynamically defining the potential, the last step in the definition of
AGREEI-Suff(R) is to determine its mode of evaluation. It is assumed that suffix
selection is a categorical process because there are only two available stable forms of
the suffix in Hungarian palatal harmony.98 Therefore, the evaluation of AGREEI-Suff(R)
is categorical. The evaluation compares the potential determined by the value of R
and the dynamic definition of AGREEI-Suff(R) on the one hand, with the suffix vowel of
each candidate on the other hand. If the CL value of the output suffix vowel is within
the region of the CL values that correspond to the attractor, the candidate satisfies the
constraint. If the output CL value of the suffix vowel is outside of this region, the
candidate violates the constraint.99
For example, a candidate with the value R = 1 and the suffix –nak satisfies the
constraint because the potential, shown in the left panel of Fig. 51, identifies the
stable value of the suffix vowel around CL = –2, which corresponds to a back vowel.
98
Hungarian also has a rounding harmony, hence some suffixes have three or four
stable forms: -hez, -hoz, -höz for the Allative suffix, or -at, -et, -ot, -öt for the
Accusative suffix.
99
For the purpose of AGREEI-Suff(R) evaluation, the region of the CL values that
corresponds to the attractor is delimited by zero: all positive CL values correspond to
the attractor –nek, and all negative values to –nak.
In contrast, a candidate with the same value of R but with the suffix –nek, CL = 2,
violates the constraint.
The proposal for the AGREEI-Suff(R) constraint is summarized in (45).
(45)
AGREEI-Suff(R) – Stem-final [–back] vowels determine the quality of the
suffix based on their retraction degree R. $V(x) = \tfrac{1}{4}x^4 - \tfrac{1}{2}x^2 - Rx$,
xA = {–2, 2}, xR = 0, x = CL of the suffix vowel, R = retraction of the
stem-final vowel, categorical evaluation.
6.4.2. Faithfulness IDENT constraints
Faithfulness constraints militate against any difference (change) between the
representation provided by the lexicon and the one that is realized. Therefore,
faithfulness constraints protecting the input values of the relevant phonetic
parameters come in conflict with markedness constraints such as AGREE(CL)St. It is
proposed that faithfulness between the input and output forms is evaluated separately
for perceptual and articulatory domains. This is motivated by the quantal relationship
between articulation and perception (Stevens 1989, Wood 1979). As a result, the OT
grammar computes the articulatory and perceptual identities of output forms
differently. First, consider the definition of articulatory identity in (46). It is evaluated
similarly to the AGREE(CL)St constraint, using the same phonetic scale of articulatory
distance based on the gestural parameter of Constriction Location. The only
difference is that, while AGREE(CL)St compares the CL values in two separate output
gestures, IDENT(CL) compares the CL value in the input with the CL value in the
output of the same gesture.
(46)
IDENT(CL) – Corresponding vowel gestures in the input and output
have identical specifications for constriction location. $V(x) = x^2$, xA =
0, x = articulatory distance between input and output vowel gestures,
gradient evaluation.
To illustrate the relationship between IDENT(CL) and AGREE(CL)St, consider
two candidates similar to the ones illustrated in (41). Both contain a V1-V2 sequence
where V1 is a back vowel gesture and V2 is specified in the input as a front vowel
(CL = 2). This is the case of Hungarian words like papír. The results presented in
Chapter 3 showed that the vowel /í/ preceded by back vowels (e.g. /í/ in papír) is
retracted in Hungarian. The degree of retraction in the output of the front vowel
gesture is proportional to the number of IDENT(CL) violations. This is because the
back gesture for V1 pulls the output V2 gesture away from its input specification. On
the other hand, the same retraction of the V2 gesture is inversely proportional to the
number of AGREE(CL)St violations. In other words, the more the V2 gesture is
retracted, the more similar it is to the back V1 gesture with which it is required to
agree. And the more similar the two gestures are to each other, the fewer violation
marks the candidate incurs from AGREE(CL)St.
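The opposing pull of the two gradient constraints can be made concrete with the example CL values. The sketch below is illustrative only: the helper names are assumptions, the back V1 is held at CL = –2, and both distances are passed through the quadratic potential V(x) = x^2 given for IDENT(CL) in (46).

```python
CL_V1 = -2.0        # output CL of the back vowel V1 (example value)
CL_V2_INPUT = 2.0   # input CL of the front vowel V2 (example value)


def potential(x):
    """Quadratic potential V(x) = x^2 with its attractor at x = 0."""
    return x ** 2


def tradeoff(output_cls):
    """For each output CL of V2, return (CL, V for AGREE(CL)St, V for IDENT(CL))."""
    rows = []
    for cl_out in output_cls:
        agree_distance = abs(CL_V1 - cl_out)        # V1 output vs. V2 output
        ident_distance = abs(CL_V2_INPUT - cl_out)  # V2 input vs. V2 output
        rows.append((cl_out, potential(agree_distance), potential(ident_distance)))
    return rows


for cl_out, v_agree, v_ident in tradeoff([2.0, 1.0, 0.0, -1.0, -2.0]):
    print(f"CL = {cl_out:+.0f}   AGREE(CL)St: {v_agree:4.0f}   IDENT(CL): {v_ident:4.0f}")
```

As the output CL of V2 moves from 2 toward –2, the AGREE(CL)St value falls while the IDENT(CL) value rises, which is the inverse relationship described above.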
Because the experimental results showed that the transparent vowels are
retracted when preceded by a back vowel, the harmony constraint must dominate the
articulatory faithfulness constraint: AGREE(CL)St >> IDENT(CL). This situation is
shown in tableau (47).
(47)
Harmony dominates articulatory identity
However, this ranking generates maximal retraction of the stem-final vowel in
the winning candidate (10c). This is not the case in Hungarian, although it is the case
in other languages, e.g. Turkish. As we know, stem-final vowels in stems like papír
are retracted only to such a degree that allows them to still be perceived as front.
Therefore, the ranking in (47) produces a degree of articulatory retraction not attested
in Hungarian. It is proposed that a perceptual faithfulness constraint limits extensive
articulatory retraction and allows only those articulatory perturbations that do not
jeopardize perceptual identity of a front vowel gesture.
The rationale behind this faithfulness constraint is best illustrated with an
example from categorical perception of sounds. Liberman et al. (1957) demonstrated
the existence of perceptual discontinuities across a continuously varying physical
dimension. In this pioneering experiment, subjects were presented with pairs of
consonant-vowel syllables with experimentally manipulated values of voice onset
time (VOT). The phonetic continuum of voicing (VOT) was divided into several
equidistant steps. For example, the /p/ ↔ /b/ continuum was split into eleven
consonants C1, C2, … C11 with C1 closest to /p/ and C11 closest to /b/. Then, the
subjects were asked if the pairs of CnV syllables where n belongs to {1, 2, …, 11}
were the same or different. The participants reported C1V vs. C4V to be the same,
whereas C4V vs. C7V to be different. Crucially, this is the case even though the
physical differences between C1V vs. C4V on the one hand, and C4V vs. C7V on the
other hand were identical. Therefore, continuous variation of a physical parameter (of
voicing) has discontinuous consequences for the perception of sound categories.
Similar results were obtained in experiments testing categorical perception of
color. Bornstein & Korda (1984) showed that two stimuli representing two values of
wavelength that cross a category boundary (e.g. green-yellow) are easier to
discriminate than two values that belong to the same category (e.g. green-green). As
in speech perception, the physical differences in wavelengths were identical within
the two pairs.
Experiments such as these established the reality of perceptual categories as
well as the presence of critical values of the physical parameters that define them.
The contrast between a front and a back vowel is similar to the contrast between /pa/
and /ba/, or between green and yellow color.100 The only difference is in the physical
parameter that defines the category. It is Voice Onset Time (VOT) for consonant
voicing and wavelength for color. What is the phonetic parameter that defines the
front-back quality of vowels?
To answer this question, consider a prototypical case of the relationship
between a perceptual category and its underlying physical parameter. A typical case
in human perception is that continuous changes in the physical parameter result in
discontinuities in category judgments. It was argued in Chapter 5 that the relationship
between tongue body constriction location (CL) and the perceptual category such as
FRONT also displays this non-linear quality. Therefore, it is proposed that CL is the
relevant continuous parameter that determines the perception of the FRONT/BACK
distinction in vowels. This non-linearity of the CL vs. FRONT/BACK relationship is
illustrated in Fig. 52, where the values for the articulatory parameter CL used in
Chapter 5 serve as examples.
100
The perception of vowel categories is more similar to the perception of color than
voicing. This is because the boundaries for vowel categories are smoother than for
voicing.
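[Figure 52: perceived category on the Front-Back axis (Front, Back) plotted against Constriction Location (axis ticks at –2, –1, 0, 1, 2), with Regions I, II, and III marked.]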
Fig. 52 – Non-linearity between the horizontal position of the tongue (CL) and
perceptual frontness. The circles represent the position of the candidates in the
following tableau.
Region I represents the stable region for front vowels, and CL = 2
corresponds to the ideal CL value for a prototypical front vowel. Region III
represents the stable region for back vowels, and the value CL= –2 corresponds to the
center of that region. In these two regions, small changes in the CL parameter do not
affect the perception of the respective categories. In contrast, Region II is an unstable
region where even small changes in tongue body position significantly affect the
perceptual output.
After these considerations, the perceptual faithfulness constraint that limits
changes in the articulatory CL values may be defined. The first step of this definition,
a verbal statement, is presented in (48).
(48)
IDENT(front) – Statement
Corresponding vowel gestures in the input and output are perceived as
front.
Informally, IDENT(front) is violated when the input CL value of a vowel
gesture belongs to Region I but the output value does not. For example, the input
stem-final front vowel corresponding to CL = 2 is shown with the filled dot in
Fig. 52. Articulatory retraction due to the harmony constraint AGREE(CL)St pulls the
stem-final vowel gesture toward more retracted values. Two possible output
candidates are shown with empty dots. They correspond to the CL values 1 and –1
respectively. The candidate in a different region from the input violates IDENT(front).
In contrast, the candidate whose output form belongs to the same region as the input
satisfies IDENT(front).
Formally, the non-linear relationship between the articulatory parameter and
the perceptual category follows from the dynamic system that underlies IDENT(front).
This constraint controls the phonetic dimension of tongue body constriction location
(CL). Following the model in Chapter 5, the relationship between CL and FRONT is
non-linear. Therefore, a non-linear function is used to define the dynamic system that
controls the relationship between CL and FRONT. The requirement for that function
is that it defines a stable interval where multiple values of the CL parameter
correspond to a single macroscopic phonological category FRONT. A good candidate
for this function is $f(x) = -(x - x_o)^3 + (x - x_o)$, where ‘x’ corresponds to CL. The
potential of this function, shown in Fig. 53, has f(x) as its negative gradient and can be obtained
by integrating f(x): $V(x) = \tfrac{1}{4}(x - x_o)^4 - \tfrac{1}{2}(x - x_o)^2$.
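[Figure 53: the potential V(x) with the candidate balls (left) and a table of their IDENT(front) violation marks (right).]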
Fig. 53 – IDENT(front) – dynamic formalism and evaluation. The position of
the balls corresponds to the positions in Fig. 52.
It can be seen that this dynamic system has three fixed points. One is the
attractor, defined with the value ‘xo’. This attractor has the value of the physical
parameter that ‘best represents’ the category. In our case, the prototypical FRONT
vowel is the one with constriction location CL = 2. The other two fixed points are
repellers. These are unstable points and represent the boundary of the category.
Intuitively, the dynamic system expressed with V(x) can be imagined as a
force that pulls balls with different values of the CL-parameter toward and away from
the attractor. The balls attracted to the stable fixed point represent the tokens that
belong to the category defined by this dynamic system. The balls driven away from
the stable fixed point represent the tokens that do not belong to this category.101
It is proposed that this dynamic system describes the OT constraint
IDENT(front) stated in (48), hence it represents the second step in its definition. The
parameter that remains to be determined in the definition of IDENT(front) is its
evaluation. This constraint defines a system where candidates representing multiple
values of the continuous parameter correspond to a single category. Hence, certain
differences among candidates are not relevant for the ‘membership’ in the category.
As discussed in Section 6.3, this is achieved by categorical evaluation.
This evaluation proceeds in the following way. Candidates have various
values of the control parameter CL, as supplied by the GEN function of OT. Some of
these candidates are illustrated as balls positioned in the potential V(x). If left in the
potential, these balls would either fall into the attractor or not. A candidate is
perceptually faithful to the input specification ‘front’ if the ball representing its CL
101
As we will see later, the input non-linear function depends on the type of the
vowel, e.g. the shape of the potential will be slightly different for /i/ and /ü/.
Nevertheless, the basic shape with one attractor and two repellers is shared in all
these cases.
value is drawn towards the attractor. These candidates satisfy the constraint. In
contrast, a candidate violates the constraint if the ball representing its CL value is
drawn away from the attractor. Informally, perceptual recovery of CL specifications
corresponds to the ball falling either into or outside of the attractor. Articulatorily
retracted /i/ is recovered as front if the ball corresponding to its CL value falls into the
attractor. It is not recovered as front if the ball does not fall into the attractor.
The table on the right of Fig. 53 gives the evaluation of the three candidates
shown on the left. It can be seen that some retraction is tolerated by the constraint
because the candidate represented with the bold empty ball does not violate it.
However, more retraction is penalized because the light empty ball is outside the
attractor basin, which represents the fact that a vowel with this value of CL is not
perceived as front.
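The categorical evaluation just described can be sketched as a membership test. This is a hedged illustration rather than the author's implementation: the region bounds are the example repeller values 1 and 3 that flank the attractor at CL = 2 (the values summarized in (49)), the function names are hypothetical, and treating the bounds as inclusive is an assumption made so that the candidate with CL = 1 counts as front, as it does in Fig. 53.

```python
FRONT_ATTRACTOR = 2.0           # prototypical front vowel (example value)
FRONT_REPELLERS = (1.0, 3.0)    # example category boundaries


def perceived_as_front(cl):
    """True if the CL value lies in the stable region around the attractor.

    Inclusive bounds are an assumption; the text only fixes that CL = 1
    is tolerated and that more retracted values are not.
    """
    lower, upper = FRONT_REPELLERS
    return lower <= cl <= upper


def ident_front_violations(cl_input, cl_output):
    """Return 1 if a front input is not perceived as front in the output, else 0."""
    if not perceived_as_front(cl_input):
        return 0  # the constraint only protects inputs in the front region
    return 0 if perceived_as_front(cl_output) else 1


for cl_out in (2.0, 1.0, -1.0):
    print(cl_out, ident_front_violations(2.0, cl_out))
```

The faithful candidate and the slightly retracted candidate receive the same evaluation, whereas the candidate at CL = –1 incurs a violation, so differences in CL inside the region are invisible to the constraint.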
The categorical evaluation of IDENT(front) is crucially different from gradient
evaluation described for AGREE(CL) and IDENT(CL). In gradient evaluation, two
candidates with different values of the relevant physical parameters are always
ordered: one candidate is more harmonic than the other candidate.
In categorical evaluation such as the one shown in Fig. 53, certain ranges of
phonetic values correspond to a single phonological entity (e.g. [–back]). A
candidate’s CL value either belongs to a certain region or not. Hence, depending on
the region of the input specification, a candidate either violates perceptual identity or
not. In other words, differences among candidates in terms of their respective values
of CL do not necessarily affect their harmony with respect to IDENT(front). For
example, two candidates with values CL = 1 and CL = 2 fare equally on IDENT(front)
because they belong to the same region. Thus, this type of evaluation ensures the
stability of phonological categories as well as the phonetic basis for defining them.
The definition of the constraint IDENT(front) is summarized in (49) with its
verbal statement, the dynamic definition, and the type of evaluation.
(49)
IDENT(front) – Corresponding vowel gestures in the input and output
are perceived as front. $V(x) = \tfrac{1}{4}(x - x_o)^4 - \tfrac{1}{2}(x - x_o)^2$, xA = 2, xR =
{1, 3}, x = constriction location of the tongue body, categorical
evaluation.
6.4.3. Phonological categories and dynamic OT
It is known that perceptual categories ‘overlap’, which means that their boundaries
are not strict. For example, it is not the case that all values within a given interval of
wavelength values are always perceived as green and the values outside that interval
as yellow. Rather, the values around the middle of the region (the attractor) are
mostly perceived as green. As wavelength is scaled away from this value, the
judgments are less consistent. Finally, around the critical values for the wavelength
parameter, the judgments reach no significant difference between the two categories
and the same wavelength is sometimes perceived as green and sometimes as yellow.
The observation that perceptual categories are stochastic can be
straightforwardly modeled with attractor dynamics. Recall that noise is a necessary
component of any dynamic system. In this case, noise enters the computation of the
final position of the ball. As a result, only the probability that a ball will be in a
certain interval can be computed. The function that models this probability is similar
to a typical bell-shaped Gaussian distribution in that it has high y-values around the x-values for the attractor, and low y-values around the x-values for the repellers. Hence,
the probability that a token minimally different from xo is perceived as belonging to
the category defined by V(x) is high. In contrast, the probability that a token
maximally different from xo is perceived as belonging to the category defined by
V(x) is low. This is a welcome result of the model because it reflects the perception
data reported for categorical perception (e.g. Liberman et al. 1957). Therefore,
dynamic definition of perceptual categories together with categorical evaluation of
the corresponding OT constraints allows for modeling the continuity in judgments as
well as the proposed discreteness of the category in the cognitive system.
In addition to capturing stochasticity of perceptual categories, the dynamic
definition naturally explains the stability of the categories over the variation in
extralinguistic parameters such as speech rate or voice quality. This is because the
dynamic model is flexible and incorporates these influences as changes in the
position and strength of the attractor. However, these influences do not affect the
presence of the attractor itself.102
Finally, the proposed notion of perceptual category may shed light on a
puzzling phenomenon called incomplete merger (Labov et al. 1990, Pierrehumbert
2003). Labov et al. observed that, despite the fact that certain phonetic contrasts have
been claimed to be neutralized and subjects do not perceive the contrast, subjects
consistently maintain the contrast in their productions for sociolinguistic reasons.
Pierrehumbert (2003) proposed that, in order for the contrast to persist in production,
the maintenance of the contrast must have been motivated in the past while speakers
were younger but was subsequently lost.
In Hungarian, the contrast between retracted and ‘regular’ productions of
transparent vowels is also claimed to be lost in perception, yet the data established its
presence in production. This suggests that the motivation for maintaining a contrast in
production despite its assumed loss in perception might not be restricted to
sociolinguistic reasons. It may be that the phonological constraint of vowel harmony,
realized as gestural blending, and a subsequent reliance of the system that computes
the suffix form on the results of gestural blending provide motivation for maintaining
the contrast.
102
For modeling the effect of intention on word-final devoicing, see Gafos (in press).
In sum, defining perceptual categories with attractor dynamics operating over
gestural parameters provides a means of accounting for the stochastic nature of the
perceptual categories, their stability over extralinguistic influences, and assumed
divergence of production from perception in cases of incomplete mergers.
6.4.4. Summary of the developed OT tools
Each constraint is defined as a dynamic system that controls a phonetic parameter.
Markedness constraints control articulatory distance between adjacent vowel
gestures. Faithfulness constraints control the CL dimension of each gesture. One of
the characteristics of the dynamic definition of constraints is the presence of a fixed
point (attractor, xo) that represents a preferred state of the system. A candidate with
the value of the phonetic parameter equal to that of the attractor (xo) maximally
satisfies the constraint. However, OT constraints are also violable and their violation
is assessed in two modes. In gradient evaluation, the further the candidate’s value of
the phonetic parameter is from xo, the less harmonic it is with respect to other
candidates. In categorical evaluation, a certain range of values around xo does not
violate the constraint but any candidate with the value of the phonetic parameter
outside that range violates the constraint.
6.5. OT analysis of Hungarian vowel harmony
6.5.1. Transparency
This section applies the OT tools developed in the previous section in the analysis of
suffix selection in Hungarian. The tableau in (50) shows the interaction of the
proposed constraints in Hungarian stems. The input is a disyllabic stem where a back
vowel is followed by one of the transparent vowels {i, í, é}, e.g. papír. The
candidates show the perceptual output on the top in ‘[ ]’ and the articulatory output
on the bottom in ‘{ }’. The degree of retraction is illustrated with arrows, where two
arrows mean more retraction than a single one. In this tableau, the effect of the initial
vowel on the following one is formalized assuming the left-to-right direction of
harmony in Hungarian. The directionality is captured by an undominated constraint
expressing positional faithfulness to the initial vowel (e.g. Beckman 2004) or
positional markedness of the non-initial vowel(s) (e.g. Walker 2001, de Lacy 2002).
Given the high ranking of the positional constraint, the input CL value of the initial
back vowel (CL(a) = –2) does not change in the output and each candidate contains the
output CL value of the front vowel only.103
103
This simplifying assumption is taken in order to illustrate the crucial aspect of the
analysis: variation in the front vowel gesture. It is likely that the CL of the initial back
vowel is affected by the following vowel. The model of blending can derive this
outcome, as was discussed in Section 5. Similarly, a dynamically defined and
categorically evaluated positional faithfulness or markedness constraint would allow
certain advancement of the CL value while preserving the backness of the initial
vowel.
(50)
TVs maximize articulatory agreement with the initial vowel without
compromising perceptual identity
Input: ( a - i )St, (pa-pír)St, CL(a) = –2, CL(i) = 2
Constraint definitions:
IDENT(front): x = TBCL, $V(x) = \tfrac{1}{4}(x - x_o)^4 - \tfrac{1}{2}(x - x_o)^2$, categorical
AGREE(CL)St: x = |CLO – CLO| (the two output gestures), $V(x) = x^2$, gradient
IDENT(CL): x = |CLI – CLO| (input and output of the same gesture), $V(x) = x^2$, gradient

| Candidate | IDENT(front) | AGREE(CL)St | IDENT(CL) |
| a. a - [i:] {i} CL = 2 | | **! | |
| b. a - [ɯ:] {i} CL = –2 | *! | | ** |
| c. a - [i:] {i} CL = 1 | | * | * |

Candidate (50a) is faithful to the input both articulatorily and perceptually.
Hence, it receives no violation of the two IDENT constraints. In contrast, it receives
the most violations of the AGREE(CL)St among all three relevant candidates. This is
because the distance between the two output gestures is large: x = AD = |CLa – CLi| =
| –2 – 2| = 4. This value is far from the preferred value of x = 0 for AGREE(CL)St. As a
result, candidate (50a) is the least harmonic candidate on this constraint.
To avoid the fatal violation of AGREE(CL)St, the gesture for vowel /í/ in
candidate (50b) is significantly retracted to {i}, exemplified with the value
CL = –2. As a result, the output gesture for the front vowel is distant from its input
specification, and this candidate receives the largest number of violations from the
IDENT(CL) constraint. This significant articulatory retraction, however, pulls the front
vowel gesture close to the back vowel gesture in terms of articulatory distance.
Therefore, candidate (50b) is the most harmonic with respect to the AGREE(CL)St
constraint (AD=|CLa–CLi| = |–2 – (–2)| = 0). The significant articulatory retraction
also results in a vowel that is perceptually not front. As shown in the potential V(x) of
IDENT(front), a CL value smaller than 1 results in the violation of IDENT(front).
The gesture for vowel /í/ in candidate (50c) is retracted less than in (50b),
{i}, CL = 1. Due to this limited retraction, it incurs a violation of IDENT(CL) but the
articulatory agreement with the initial stem vowel is better than in (50a) and worse
than in (50b) (AD = |CLa – CLi| = |– 2 – 1| = 3). Since candidate (50c) has the stem-final vowel retracted to an intermediate degree with respect to the other two
candidates, the gradient evaluation described for this constraint determines that (50c)
is harmonically ordered in between the other two candidates. As a result, (50c)
receives more violation marks from AGREE(CL)St than candidate (50b) but fewer than
candidate (50a). Crucially, this retraction of /í/, represented with the value CL = 1,
does not change its perceptual identity. IDENT(front) is not violated for this value,
and hence, candidate (50c) does not receive a violation mark from this constraint.
Given that (50c) is the output, the ranking is IDENT(front) >> AGREE(CL)St >>
IDENT(CL). AGREE(CL)St must dominate IDENT(CL) because the opposite ranking
would favor (50a) over (50c). Similarly, IDENT(front) must dominate AGREE(CL)St
since the opposite ranking would favor (50b) over (50c). Intuitively, tableau (50)
expresses the idea that transparent vowels can maximize articulatory agreement with
the initial back vowel while preserving their perceptual identity.
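The ranking argument for tableau (50) can be replayed mechanically. The sketch below rests on several assumptions: strict domination is modeled as lexicographic comparison of violation profiles, the gradient constraints are scored by the raw distance x rather than by V(x) = x^2 (the two give the same ordering for non-negative x), and the threshold CL ≥ 1 for IDENT(front) follows the statement that a CL value smaller than 1 violates it. All identifiers are hypothetical.

```python
from typing import Callable, List, NamedTuple

CL_V1 = -2.0         # initial back vowel, unchanged in the output
CL_V2_INPUT = 2.0    # input CL of the transparent vowel


class Candidate(NamedTuple):
    label: str
    cl_out: float    # output CL of the stem-final vowel


def ident_front(c: Candidate) -> float:
    """Categorical: one violation if the vowel is no longer perceived as front."""
    return 0.0 if c.cl_out >= 1.0 else 1.0


def agree_cl_st(c: Candidate) -> float:
    """Gradient: articulatory distance between the two output gestures."""
    return abs(CL_V1 - c.cl_out)


def ident_cl(c: Candidate) -> float:
    """Gradient: distance between input and output CL of the same gesture."""
    return abs(CL_V2_INPUT - c.cl_out)


Constraint = Callable[[Candidate], float]


def winner(candidates: List[Candidate], ranking: List[Constraint]) -> Candidate:
    """Pick the candidate whose violation profile is lexicographically smallest."""
    return min(candidates, key=lambda c: tuple(con(c) for con in ranking))


CANDIDATES = [Candidate("50a", 2.0), Candidate("50b", -2.0), Candidate("50c", 1.0)]

print(winner(CANDIDATES, [ident_front, agree_cl_st, ident_cl]).label)
```

Reordering the list passed to `winner` reproduces the two ranking arguments in the text: placing IDENT(CL) above AGREE(CL)St selects (50a), and placing AGREE(CL)St above IDENT(front) selects (50b).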
To complete the analysis of transparency in Hungarian, the suffix selection
must be discussed. The final tableau is shown in (51). The input consists of a stem
with two vowels, the first one is back and the second one is front. Each vowel gesture
is specified for its constriction location (CL) following the dynamic gestural
representations. A stem back vowel is specified for CL = –2, and a following front
vowel is specified for CL = 2. Each candidate is also specified for the output CL
value of the stem-final vowel. This allows the computation of the values for all
phonetic parameters relevant in the evaluation. Retraction degree R is the difference
between the input and output CL values of the same vowel. Articulatory distance AD
is the absolute value of the difference between the output CL values of the adjacent
stem vowels. In subsequent tableaux I include the values of R and AD for each
candidate for ease of exposition but the only parameter generated by GEN is the
output CL value. Finally, following the notation from (50), the articulatory
specification of the output stem-final vowel is included in curly brackets and the
perceptual output in square brackets. It should be noted, however, that the symbols in
the brackets are only labels for the parameters such as CL and R. Hence, the actual
evaluation of the candidates depends on these parameters and their values only.
(51)
Transparency, final tableau
Input: ( a - i )St –(V)Suff, (pa-pír)St, Dat., CLa = –2, CLí = 2

| Candidate | AGREE(R)I-Suff | IDENT(front) | AGREE(CL)St | IDENT(CL) |
| a. (a - [i])-a {i} CLi=2, Ri=0, AD=4 | *! | | *** | |
| b. (a - [i])-e {i} CLi=2, Ri=0, AD=4 | | | ***! | |
| c. (a - [ɯ])-a {i} CLi=–1, Ri=3, AD=1 | | *! | * | ** |
| d. (a - [ɯ])-e {i} CLi=–1, Ri=3, AD=1 | * | *! | * | ** |
| e. (a - [i])-a {i} CLi=1, Ri=1, AD=3 | | | ** | * |
| f. (a - [i])-e {i} CLi=1, Ri=1, AD=3 | *! | | ** | * |
Candidates (51a,b) are the faithful ones. Both stem vowels are identical to the
input, which results in the greatest violation of the markedness stem-internal harmony
constraint AGREE (CL)St. The value of Retraction degree (R = 0) produces the
potential for the evaluation of AGREE(R)I-Suff shown in the first row. This potential has
a single attractor around the value of CL = 2, which corresponds to a front suffix
–nek. Consequently, (51a), which has a back suffix, violates the stem-suffix harmony
constraint AGREE(R)I-Suff whereas (51b) does not.
Candidates (51c,d) best satisfy stem-internal harmony since the stem-final
front vowel is maximally retracted. This degree of retraction, however, falls into the
range where IDENT(front) is violated and the stem-final vowel is not perceived as
front anymore. Moreover, (51d) has a front suffix whereas the potential for the suffix
vowel for R = 3 gives only one stable value that corresponds to CL = –2, a back
suffix. Hence, (51d) receives an additional violation mark from AGREE(R)I-Suff .
Finally, candidates (51e,f) have a medially retracted stem-final vowel with
respect to (51a,b) and (51c,d). As a result, they incur a medial violation of the
gradient constraints AGREE(CL)St and IDENT(CL) compared to (51a,b) and (51c,d). The
retraction degree R = 1 falls into the range where IDENT(front) is not violated and the
stem-final vowel is still perceived as front. The stem-suffix harmony constraint
decides between (51e) and (51f). The constraint AGREE(R)I-Suff with the retraction
degree R = 1 yields the potential for the suffix corresponding to a back vowel.
Candidate (51f), however, has a front suffix and thus violates AGREE(R)I-Suff .
The rankings IDENT(front) >> AGREE(CL) and AGREE (CL) >> IDENT(CL) have
been determined already in tableau (50). They ensure that (51e) is more harmonic
than (51c) and (51a) respectively. The ranking among the three AGREE constraints
cannot be determined since they never come into conflict. The OT grammar of
transparency is summarized in (52).
(52)
Ranking for Hungarian vowel harmony
                  IDENT(front)                      … Perceptual faithfulness
                /      |      \
AGREE(CL)St, AGREE(R)I-Suff, AGREE(CL)A-Suff        … Harmony
                \      |      /
                   IDENT(CL)                        … Articulatory faithfulness
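The evaluation of tableau (51) under this ranking can be sketched computationally. The Python fragment below is only an illustration under stated assumptions and not part of the proposed formalism: gradient violations are represented by the numeric values of AD and R rather than by asterisks, the perceptual tolerance for /í/ is taken to be R = 1, and the cutoff BACK_FROM at which AGREE(R)I-Suff prefers a back suffix is a hypothetical value chosen to be compatible with the candidates discussed in this section. All names in the code are illustrative.

```python
from dataclasses import dataclass

CL_A = -2.0      # constriction location of the initial back vowel /a/
CL_I = 2.0       # input constriction location of the front vowel /í/
TOLERANCE = 1.0  # assumed largest R still perceived as front for /í/
BACK_FROM = 0.8  # hypothetical smallest R whose only attractor is the back suffix


@dataclass(frozen=True)
class Candidate:
    label: str
    cl_out: float  # output CL of the stem-final vowel
    suffix: str    # "back" (-nak) or "front" (-nek)


def violations(c):
    """Violation profile ordered as IDENT(front) >> AGREE >> IDENT(CL)."""
    r = CL_I - c.cl_out                      # retraction degree R
    ident_front = 1 if r > TOLERANCE else 0  # categorical, perceptual
    preferred = "back" if r >= BACK_FROM else "front"
    agree_r = 0 if c.suffix == preferred else 1  # categorical, stem-suffix
    agree_cl = c.cl_out - CL_A               # gradient: articulatory distance AD
    ident_cl = r                             # gradient: articulatory faithfulness
    return (ident_front, agree_r, agree_cl, ident_cl)


def optimal(candidates):
    """Return the labels of all candidates with the best violation profile."""
    best = min(violations(c) for c in candidates)
    return [c.label for c in candidates if violations(c) == best]


TABLEAU_51 = [
    Candidate("a", 2.0, "back"), Candidate("b", 2.0, "front"),
    Candidate("c", -1.0, "back"), Candidate("d", -1.0, "front"),
    Candidate("e", 1.0, "back"), Candidate("f", 1.0, "front"),
]

print(optimal(TABLEAU_51))  # ['e']
```

Because the AGREE constraints never come into conflict here, placing AGREE(CL)St before AGREE(R)I-Suff in the tuple would select the same candidate.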
After illustrating the dynamically-based OT constraints in the analysis of
transparency, I now discuss some of the proposed OT constraints and their
evaluations in more detail. First, AGREEI-Suff(R) determines whether a front or a back
suffix is more harmonic given a value of the stem-final front vowel retraction degree
R. In this sense, this is a categorical evaluation similar to that of IDENT(front) and
AGREEA-Suff(CL).
However, there is also an important difference between AGREEI-Suff(R) on the
one hand, and IDENT(front) and AGREEA-Suff(CL) on the other. The difference is that
the former constraint defines potentially two stable categories (back suffix –nak, front
suffix –nek), whereas the latter two constraints define only one (FRONT and BACK
respectively). The additional task for AGREEI-Suff(R) is to select between the two stable
CL values. This is formalized with the addition of the parameter R into the dynamic
function that defines this constraint. The variation in this parameter then allows the
dynamic system of the OT constraint to capture the relationship between stem-final
retraction and the suffix quality.
Second, it is important to mention that the apparent pair-wise evaluation of
candidates by AGREEI-Suff(R) is different from OT accounts of transparency with
targeted constraints (Baković & Wilson 2000) or sympathy (Walker 1998). In the
former, a targeted constraint governs pair-wise evaluation of the winning candidate
with the most similar candidate based on some phonetic scale. All other candidates
are excluded from this evaluation. In the latter, Walker assumes an additional
'sympathetic' correspondence relation (McCarthy 1999). This relation holds between
the optimal candidate and the candidate that does not violate the crucial markedness
constraint. In tableau (51), the sympathetic candidate would be (51c) ‘a-[ɯ]-a’. The
special sympathetic constraint and the markedness constraint would then dictate
maximal similarity between this candidate and the output. Because (51e) differs from
(51c) in one segment and (51f) in two segments, (51e) would be the winner. In both
of these accounts, the pair-wise evaluation requires additional OT machinery (an
abstract correspondence relation) that mediates between the input and output.
In the proposed analysis, a special relationship between the winning candidate
and some other candidate is not assumed. All candidates that are specified according
to the requirements of the constraint, i.e. they have a value of R and a suffix vowel,
are evaluated. Due to categorical evaluation of the constraint, the candidates are
divided into two groups: those that violate and those that satisfy the constraint. As a
byproduct of this evaluation, the candidates can be compared in a pair-wise fashion.
For each pair of candidates with an identical value of retraction degree but different
suffix forms, AGREEI-Suff(R) determines which candidate incurs a violation and which
does not. Therefore, no additional machinery for constraint evaluation is needed in
our analysis. The novel idea is the use of dynamic equations in formalizing the
relationship between a phonetic dimension and constraint evaluation.
Third, the gestural representation allows the generation of multiple
candidates, each with a slightly different value of the CL parameter. As we argued,
the presence of sub-phonemic variation in retraction degree is crucial in the cognitive
system of suffix selection. At the same time, the role of phonetic detail in the analysis
is limited by the nature of the proposed OT constraints.
To exemplify this point, consider the input ‘papír’ and three corresponding
output candidates with the CL values of the stem-final vowel {1.2, 1, 0.8}. Recall that
the retraction degree R is calculated as the difference between the input and output
values of the constriction location, R = CLInput – CLOutput. Given that the CLInput for
the front vowel is CL = 2, the three candidates would have the values of R = {0.8,
1.0, 1.2} respectively. Crucially, the choice between candidates papír-nak (Rí = 0.8),
papír-nak (Rí = 1), and papír-nak (Rí = 1.2) is not made by AGREEI-Suff(R), because all
three candidates satisfy the constraint. This is due to the fact that all three potentials
generated from the three R values are qualitatively identical. They have a single
attractor around the value CL = –2 that corresponds to the back suffix –nak. Hence, in
this case, small differences in retraction degree do not affect the form of the suffix.
The proposed OT grammar, however, always determines which candidate is
the optimal one. The choice among these three candidates is made by the interaction
of stem-internal agreement formalized with AGREESt(CL), and perceptual identity
formalized with IDENT(front). This interaction was shown in tableau (50). Because
IDENT(front) >> AGREESt(CL), the winning candidate is the one that minimally
violates AGREESt(CL) while simultaneously satisfying IDENT(front).
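The selection among the three papír-nak candidates can be illustrated with a short Python sketch. It assumes, for illustration only, that IDENT(front) for /í/ tolerates retraction up to exactly R = 1 (the constant MAX_R_FRONT) and that AGREESt(CL) is measured by the articulatory distance from the back vowel at CL = –2. The function names are hypothetical.

```python
CL_INPUT = 2.0     # input CL of the front vowel /í/
CL_BACK = -2.0     # CL of the preceding back vowel /a/
MAX_R_FRONT = 1.0  # assumed largest R that still satisfies IDENT(front)


def retraction(cl_out):
    """Retraction degree R = CL_input - CL_output."""
    return round(CL_INPUT - cl_out, 3)


def articulatory_distance(cl_out):
    """Gradient AGREE_St(CL) violation: distance from the back vowel."""
    return round(cl_out - CL_BACK, 3)


def best_output(cl_values):
    """Keep outputs that satisfy IDENT(front), then minimize AGREE_St(CL)."""
    perceived_front = [cl for cl in cl_values if retraction(cl) <= MAX_R_FRONT]
    return min(perceived_front, key=articulatory_distance)


candidates = [1.2, 1.0, 0.8]
print([retraction(cl) for cl in candidates])  # [0.8, 1.0, 1.2]
print(best_output(candidates))                # 1.0
```

Under these assumptions the candidate with CL = 0.8 is excluded by IDENT(front), and of the remaining two the more retracted one (CL = 1, R = 1) incurs the smaller AGREESt(CL) violation.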
Finally, recall the generalization that either front or back suffixes may follow
a stem-final front vowel. The proposed dynamically defined constraint AGREEI-Suff(R)
identifies stable region(s) of the tongue body constriction location that correspond to
the two stable forms of the suffix vowels, e.g. –nak and –nek. The parameter of
stem-final retraction determines the particular location of this stable region for the suffix
vowel. Therefore, the phonetic parameter R plays a role in determining the
phonological [±back] value of the suffix. In other words, a certain value of the order
parameter (e.g. CL = 2) constitutes a dynamically stable fixed point only in
connection with a certain value of R. This is the formalism of inter-dependency
between phonetics (the retraction degree) and phonology (the form of the suffix).
6.5.2. Opacity
The OT machinery applied for the analysis of transparency can be straightforwardly
extended to analyze opacity. Recall that opaque vowels impose their own [±back]
specification on the suffix vowel, by which they block the agreement between the
preceding stem vowel(s) and following suffix vowel(s). In Hungarian, for example,
front rounded vowels when preceded by a back vowel are followed by front suffixes:
parfüm-nek ‘perfume-Dat.’, sofőr-nek ‘driver-Dat.’. This contrasts with transparent
vowels that allow agreement between initial and suffix vowels: papír-nak ‘paper-Dat.’,
kávé-nak ‘coffee-Dat.’.
It was argued in Chapter 5 that the crucial difference between the transparent
and opaque vowels lies in their quantal properties. While some retraction of the front
unrounded vowels is perceptually tolerated, the same degree of retraction for the front
rounded vowels significantly affects their perceptual characteristics (Wood 1979,
1986, Stevens 1989). The proposed OT formalism captures the effect of retraction on
the perception with the IDENT(front) constraint. Hence, quantal differences among
front vowels are formalized as modifications of the dynamic potential defining
IDENT(front).
The role of IDENT(front) in this analysis is to define the relationship between
the constriction location of a vowel gesture (CL) and the perceptual category
FRONT. This is achieved by formalizing the constraint with a non-linear potential
function that determines the attractor and the slope of the potential. The changes in
the potential function that underlies the evaluation of IDENT(front) arise from
differences in the quantal properties of front vowels. These changes, however, do not
affect the global characteristics of the function such as the existence of one stable
region. It is the position and strength of the attractor that changes.
The difference between front unrounded and rounded vowels in terms of their
quantal features is illustrated as the difference between the two potentials in
Fig. 54.104
Fig. 54 – Formalization of non-linearity between the horizontal position of the
tongue (CL) and perceptual frontness.
104
Another way to express the generalization that front rounded vowels can be
retracted to a smaller degree than the unrounded vowels involves changing the input
CL values for the rounded vowels. If these values are smaller than the CL value of
the unrounded vowel, i.e. more to the left in the figure, the retraction degree for the
rounded vowels is smaller than for the unrounded vowels. However, our articulatory
data as well as the data in Wood (1986) show that the CL value for the front rounded
vowels is similar to that for the front unrounded vowels.
The two potentials in Fig. 54 represent a formalization of the observation that
[–round] front vowels are more acoustically stable and thus allow more articulatory
variation than [+round] front vowels (Wood 1986). The range of CL values that
satisfy the IDENT constraint for the [–round] vowels is bigger than the range of values
that satisfy the constraint for the [+round] vowels. As a result, a front unrounded
vowel can be retracted by R = 1 to CL = 1 and still be perceived as front. On the other
hand, a front rounded vowel can only be retracted by R = 0.3 to CL = 1.7; hence,
Rí > Rü.
It is assumed that the quantal relationship between articulation and perception
is a universal property of speech (Stevens 1989). The comparison of the two
potentials in Fig. 54 shows that the definition of what counts as a violation of
IDENT(front) depends on the [±round] quality of the front vowel.105 The relevant
potential for the evaluation of IDENT(front) is provided by independently known facts
about the connection between rounding and quantal properties of vowels. This is a
possible way of grounding phonological constraints phonetically.106
105
In standard OT, the definition of constraints does not depend on the type of input
they evaluate. A standard OT solution of this situation is to posit an IDENT(front)
constraint for each input vowel. In this way, the non-linear function specifying
IDENT-(i) would provide the potential landscape shown in the left panel of Fig. 54.
IDENT-(ü) would then be defined with a slightly different function that gives the
potential in the right panel.
106
Following Wood’s (1986) study of quantal differences between rounded and
unrounded front vowels in multiple languages, it is also assumed that the difference
in the potentials for rounded and unrounded vowels for IDENT(front) evaluation is
universal. This assumption is subject to empirical research. If it turns out that
languages differ in this respect, language-specific quantal relationships might be
encoded via grammar-controlled parameterization of the non-linear equations that
express these relationships.
107
It seems that the parameters α and β capture the same phenomenon as the
parameter q of the dynamic model. Both are used in formalizing the quantal
differences between /i/ and /ü/. However, the similarity is only apparent. In the
proposed OT account, quantal properties of vowels are defined with IDENT, and
stem-internal harmony with AGREE. The ranking of these constraints then gives rise to the
differences in stem-final retraction. In the proposed dynamic account in Chapter 5,
the parameter q expresses the instantiation of this ranking in the particular process of
gestural blending.
108
Support for this simplification is in the ultrasound data presented in chapter 4. The
tongue body horizontal position for /ü/ was only minimally different from that for /i/.
The quantal differences between unrounded and rounded front vowels present
evidence for extending the original formulation of the dynamic potential defining
IDENT(front). Mathematically, the difference in the potentials shown in Fig. 54 results
from varying the parameters α, β of the potential function. Therefore, the new
formalism includes these parameters:
$V(x) = \alpha\,\tfrac{1}{4}(x - x_0)^4 - \beta\,\tfrac{1}{2}(x - x_0)^2$. In
general, increasing β widens the potential and strengthens the attractor, whereas
increasing α narrows the potential and weakens the attractor.107
Tableau (53) shows the OT formalism of opacity. In the input, a back vowel is
followed by a front rounded vowel. For simplicity, the values of CL for /i/ and /ü/ are
assumed to be identical (CLi = CLü = 2).108 The evaluation of the candidates is
similar to the evaluation in Tableau (51). Candidates (53a-b) are the faithful ones and
fare the worst on AGREE (CL)St because the stem-final front vowel in these candidates
is not retracted. In candidates (53c-f), the front rounded vowel is retracted, which
causes a fatal violation of IDENT(front) for each of them. The front vowel in
candidates (53g-h) is retracted minimally. Consequently, these candidates satisfy
IDENT(front) while minimally violating AGREE (CL)St.
(53)
Opacity; perceptual constancy prevents significant articulatory retraction
Input: ( a - ü )St –(V)Suff, (par-füm)St, Dat.; CLa=–2, CLü=2

Candidate                                   | AGREE(R)I-Suff | IDENT(front) | AGREE(CL)St | IDENT(CL)
a. (a - [y])-a  {y} CLü=2, Rü=0, AD=4       | *!             |              | ****        |
b. (a - [y])-e  {y} CLü=2, Rü=0, AD=4       |                |              | ****!       |
c. (a - [u])-a  {y} CLü=–1, Rü=3, AD=1      |                | *!           | *           | ***
d. (a - [u])-e  {y} CLü=–1, Rü=3, AD=1      | *!             | *!           | *           | ***
e. (a - [u])-a  {y} CLü=1, Rü=1, AD=3       |                | *!           | **          | **
f. (a - [u])-e  {y} CLü=1, Rü=1, AD=3       | *!             | *            | **          | **
g. (a - [y])-a  {y} CLü=1.7, Rü=0.3, AD=3.7 | *!             |              | ***         | *
h. (a - [y])-e  {y} CLü=1.7, Rü=0.3, AD=3.7 |                |              | ***         | *
In more detail, candidates (53a-b) are not retracted and thus cause maximal
violation of AGREE (CL)St. Additional violation of AGREE(R)I-Suff is incurred by (53a)
due to its back suffix. Candidates (53c-d) best satisfy AGREE (CL)St. However, the
value of the retraction degree falls in the range where IDENT(front) is violated and the
stem-final vowel is not perceived as front.
Candidates (53e-f) have a stem-final vowel retracted to an intermediate degree.
As a result, they incur medial violation of the gradient constraints AGREE(CL)St and
IDENT(CL) with respect to (53a-b) and (53c-d). However, the retraction value R = 1
does translate into violation of the IDENT(front) constraint for both (53e) and (53f), as
shown in the right panel of Fig. 54. As a result, the winning candidate is (53h) where
the front vowel is minimally retracted and selects a front suffix, which satisfies
AGREE(R)I-Suff. Candidate (53g) is sub-optimal since the agreement of a minimally
retracted vowel (R = 0.3) with a back suffix violates AGREE(R)I-Suff. This can be seen
by inspecting the attractor potential for this constraint for the value R = 0.3 shown in
the tableau.
Due to the difference in the [±round] quality of the vowel, modeled as a
parametric variation in the dynamic specification for IDENT(front), the candidates
with CL = 1 in tableaux (51) and (53) are evaluated differently by this constraint.
This can be observed in Fig. 55 with the evaluation of (51e-f) on the left, and (53e-f)
on the right.
Fig. 55 – Evaluation of candidates with CL = 1 by IDENT(front)[–round] on
the left and IDENT(front)[+round] on the right.
It can be observed that for front unrounded vowels, the value of CL = 1
satisfies IDENT(front)[–round] because the ball left in the potential would fall into the
attractor. However, the same CL value for a rounded vowel violates
IDENT(front)[+round] because a ball with this value would not fall into the attractor.
This difference can be observed by comparing (51e-f) and (53e-f). In all of these
candidates, CL = 1 and R = 1. This retraction is tolerated by IDENT(front) for (51e-f)
but not for (53e-f).
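The effect of the parameters α and β on the potential V(x) given earlier in this section can be checked numerically. The following Python sketch takes the formula at face value and assumes α, β > 0. It computes two quantities that are my own illustrative measures rather than definitions from the analysis: the distance from x0 at which the potential reaches its minimum (as a measure of width) and the depth of that minimum relative to V(x0) (as a measure of attractor strength). How these quantities map onto the curves in Fig. 54 is not specified here.

```python
import math


def potential(x, alpha, beta, x0=0.0):
    """V(x) = alpha * (x - x0)**4 / 4 - beta * (x - x0)**2 / 2."""
    u = x - x0
    return alpha * u**4 / 4 - beta * u**2 / 2


def minimum_offset(alpha, beta):
    """Distance |x - x0| at which dV/dx = alpha*u**3 - beta*u vanishes
    with positive curvature (2*beta), i.e. where V has its minimum."""
    return math.sqrt(beta / alpha)


def well_depth(alpha, beta):
    """V(x0) - V(minimum) = beta**2 / (4 * alpha)."""
    return beta**2 / (4 * alpha)


for alpha, beta in [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0)]:
    print(f"alpha={alpha}, beta={beta}: "
          f"offset={minimum_offset(alpha, beta):.3f}, "
          f"depth={well_depth(alpha, beta):.3f}")
```

With these measures, doubling β from the baseline α = β = 1 increases both the offset and the depth, whereas doubling α decreases both, in line with the generalization stated for the two parameters.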
6.5.3. Vacillation
Sections 6.5.1 and 6.5.2 described the use of the proposed OT constraints in
explaining two phonological patterns related to vowel harmony in Hungarian:
transparency and opacity. In addition to these, the third phonological pattern in
disharmonic stems is vacillation between front and back suffixes. For example, in
most stems with a back vowel followed by /e/, both front and back suffixes are
possible, e.g. hárem-nak, hárem-nek ‘harem.Dat.’. Recall that this pattern receives a
natural explanation in the dynamic analysis with the notion of bi-stability, an
essential characteristic of any dynamical system described in Chapter 5. In this
section I propose an analysis of bi-stability within the developed OT framework
crucially using the dynamic definition of OT constraints.
The underlying source of the medial retraction of /e/ lies in its quantal
properties. It was argued in Chapter 5 that the low and somewhat retracted
articulatory target for this vowel gesture allows only limited retraction when such a
gesture blends with a back vowel gesture. In the previous section, the differences in
quantal features between /i/ and /ü/ were formalized by varying the potential function
of the IDENT(front) constraint. Following this approach, the potential in the rightmost
panel of Fig. 56 is proposed to control the perceptual faithfulness constraint for the
vowel /e/.
Fig. 56 – Quantal properties of /i/, /ü/, and /e/ as differences in respective
potentials defining the IDENT(front) constraint.
It is important to note that the differences in the three potentials derive from
scalar differences in lip rounding and tongue body retraction and height. Compared to
/i/, /e/ has less lip spreading, more retracted horizontal position, and lower vertical
position. Each of these three phonetic dimensions has the effect of decreasing the
acoustic stability of front vowels. More research is needed to determine the precise
influence of rounding, retraction, and height on the shape of the IDENT(front)
potentials. With more information, the mathematical formalization of these influences
could be sharpened and then modeled by varying the parameters in the dynamic
specification of IDENT(front).
The tableau in (54) shows how the proposed set of constraints and their
evaluation generates medial retraction of the stem-final vowel that results in vacillation
in suffix selection. The stem-final /e/ in candidates (54a,b) is not retracted. The cost
of satisfying IDENT(CL) is in multiple AGREE (CL)St violations. A back suffix in (54a)
additionally violates AGREE(R)I-Suff. Candidates (54c,d) are maximally retracted,
which results in maximal IDENT(CL) violation but minimal AGREE (CL)St violation.
Based on the rightmost potential in Fig. 56 above, both (54c) and (54d) violate
perceptual identity. Candidates (54e,f) avoid the fatal violation of IDENT(front), and
improve on AGREE (CL)St in comparison to (54a,b). Therefore, in this tableau, as in
the two tableaux above, the candidate with maximal retraction within the limits of
perceptual constancy is the optimal candidate.
(54) Vacillation; perceptual constancy allows medial articulatory retraction
Input: ( a - e )St –(V)Suff, (há-rem)St, Dat.; CLa=–2, CLe=2

Candidate                                    | AGREE(R)I-Suff | IDENT(front) | AGREE(CL)St | IDENT(CL)
a. (a - [e])-a  {e} CLe=2, Re=0, AD=4        | *!             |              | ***         |
b. (a - [e])-e  {e} CLe=2, Re=0, AD=4        |                |              | ***!        |
c. (a - [ʌ])-a  {e} CLe=–1, Re=3, AD=1       |                | *!           | *           | **
d. (a - [ʌ])-e  {e} CLe=–1, Re=3, AD=1       |                | *!           | *           | **
e. (a - [e])-a  {e} CLe=1.3, Re=.7, AD=3.3   |                |              | **          | *
f. (a - [e])-e  {e} CLe=1.3, Re=.7, AD=3.3   |                |              | **          | *
However, the crucial difference between this tableau and tableaux (51) and
(53) is in the evaluation of candidates with medially retracted stem-final vowels by
the constraint AGREE(R)I-Suff. The potential for the evaluation of this constraint for
candidates (54e,f) is shown in the bottom row of the tableau. All constraints
discussed so far were defined by a monostable dynamic potential, i.e. a potential
with a single attractor. Consequently, in the evaluation of two candidates with
identical R values and different backness of the suffix, it was possible to determine
which candidate satisfies the constraint and which violates it. The CL value of the
suffix was either within the range of values corresponding to the attractor or not.
The last potential in (54) shows that for certain values of R, there are two
stable values of CL for the suffix vowel. As a result, a candidate with the front suffix
and a certain value of R is equally harmonic with respect to AGREE(R)I-Suff as another
candidate with the back suffix and the same value of R. Candidates (54e) and (54f)
fare equally on all constraints, and therefore, both are optimal.
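How a single control parameter can switch a potential between one and two attractors can be illustrated with a small Python sketch. The potential used below, V(x) = kx – 2x² + x⁴/4, is a generic tilted double-well chosen for illustration; it is not claimed to be the function that defines AGREE(R)I-Suff in this dissertation. The linear mapping from R to the tilt k, including the constants GAIN and R_MID, is likewise hypothetical and was picked only so that the outcomes agree with the R values discussed in tableaux (51), (53), and (54).

```python
GAIN = 20.0   # hypothetical scaling from retraction degree to tilt
R_MID = 0.6   # hypothetical retraction degree at which the tilt is zero


def tilt(r):
    """Hypothetical linear map from retraction degree R to the tilt k."""
    return GAIN * (r - R_MID)


def stable_points(k, lo=-6.0, hi=6.0, steps=12000):
    """Minima of V(x) = k*x - 2*x**2 + x**4/4, found by scanning for sign
    changes of V'(x) = k - 4*x + x**3 and refining them by bisection."""
    def dv(x):
        return k - 4 * x + x**3

    found = []
    step = (hi - lo) / steps
    a = lo
    for i in range(steps):
        b = lo + (i + 1) * step
        if dv(a) * dv(b) <= 0 and dv(b) != 0:
            left, right = a, b
            for _ in range(60):
                mid = (left + right) / 2
                if dv(left) * dv(mid) <= 0:
                    right = mid
                else:
                    left = mid
            root = (left + right) / 2
            if 3 * root**2 - 4 > 0:  # positive curvature: an attractor
                found.append(round(root, 2))
        a = b
    return found


def suffix_options(r):
    """Suffix categories with a stable CL value for retraction degree r."""
    return sorted({"back" if x < 0 else "front" for x in stable_points(tilt(r))})


for r in (0.0, 0.3, 0.7, 1.0):
    print(r, stable_points(tilt(r)), suffix_options(r))
```

Under these assumptions, R = 0 and R = 0.3 yield a single attractor on the front side, R = 1 yields a single attractor on the back side, and R = 0.7 yields two attractors, so that both suffix forms satisfy the constraint.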
6.5.4. Monosyllabic stems
As described in Chapters 2 and 3, transparent vowels in some monosyllabic stems
select back suffixes, e.g. híd-nak ‘bridge-Dat.’. The experimental results reported in
chapter 3 showed that the transparent vowels in these stems are more retracted than in
monosyllabic stems that select front suffixes. This was the case even when the stems
were produced in bare, unsuffixed forms, e.g. híd ‘bridge-Nom’.
It is likely that this difference in retraction originates in contextual differences
between the two groups. More specifically, híd-type stems are often followed by back
suffixes. It is not clear if retraction triggers the form of the suffix or the suffix induces
retraction. The crucial point, however, is that this coarticulatory retraction is then
stored and produced even if there is no following suffix that could induce such
retraction.
Due to this capacity for storing phonetic details, it is assumed that transparent
vowels in híd stems are lexically specified for some degree of retraction (e.g. CL ≅ 1).
In this way, they differ from canonical front vowels that are specified as CL = 2. This
underlyingly specified retraction is then interpreted as a retraction degree R ≅ 1.
Because this value of retraction is compatible with retraction arising through
stem-internal harmony, AGREE(R)I-Suff mandates back suffixes in both cases.
Crucially, it is the phonetic information about retraction that unites the
patterns of suffix selection in disyllabic and monosyllabic stems. The vowel /í/ in víz
‘water’ is similar to /í/ in szépít ‘beautify’ in that neither of them is retracted enough
and thus both select front suffixes. In contrast, /í/ in híd ‘bridge’ is similar to /í/ in
papír ‘paper’ in that both are retracted and thus induce back suffixes. The only
difference between híd and papír is that the difference from a canonical /í/ in híd is
stored whereas in papír it results from stem-internal blending.
The pattern of suffix selection in monosyllabic híd-type words is formalized
in (55). AGREE (CL)St and AGREE (CL)A-Suff are not included in the tableau: AGREE
(CL)St is vacuously satisfied by all candidates because there are no adjacent vowels in
monosyllabic stems, and AGREE (CL)A-Suff is not relevant because the stem vowel in
the input is front.
(55)
Lexically specified retraction in abstract stems
Input: (i)St –(V)Suff, (híd)St, Dat.; CLí=1, R=1

Candidate                       | IDENT(front) | AGREE(R)I-Suff | IDENT(CL)
a. ([i])-a  {i} CLi=2, Ri=–1    |              | *!             | *
b. ([i])-e  {i} CLi=2, Ri=–1    |              |                | *!
c. ([ɯ])-a  {i} CLi=–1, Ri=2    | *!           |                | **
d. ([ɯ])-e  {i} CLi=–1, Ri=2    | *!           | *              | **
e. ([i])-a  {i} CLi=1, Ri=1     |              |                |
f. ([i])-e  {i} CLi=1, Ri=1     |              | *!             |
Candidate (55e) is the winner because it reproduces the lexically specified
articulatory retraction, it is still perceived as front, and the retraction value prefers a
back suffix. We can also observe the effect of IDENT(CL) that resolves the conflict
between (55b) and (55e) in favor of the latter. Candidate (55b) violates IDENT(CL)
because its stem vowel is fronted compared to the input. Hence, the articulatory
distance between the input and output CL values of candidate (55b) is greater than for
the faithful candidate (55e).
6.6.
Typological considerations
OT constraints capture general demands of the phonological system that are
employed in particular languages with different priorities. As mentioned in Section
6.2, the possibility of constraint re-ranking in the OT framework constitutes an
elegant and parsimonious way for dealing with cross-linguistic variation. This section
explores the application of the developed OT tools in explaining typological
differences among languages. It is important, however, to keep in mind that these
considerations are based on impressionistic transcription data. A careful experimental
study of the production and perception of transparent and opaque vowels in various
languages might reveal patterns that were missed in traditional phonological
descriptions of these vowels (as was shown for Hungarian for example).
The proposed OT grammar for suffix selection as well as stem-final vowel
retraction is repeated in (56) below. The analysis can be reconstructed in three steps.
First, the most general conflict between AGREE and IDENT was established. Second,
both of these constraints were sub-divided to explain the most general Hungarian
patterns. AGREE was argued to differ for stem-internal and stem-suffix agreement
whereas IDENT was split to account for the discussed non-linearity between
articulation and perception into IDENTPerc and IDENTArt. Third, stem-suffix agreement
in Hungarian was proposed to be analyzed differently for stem-final front vowels and
back vowels.
(56)
OT grammar for Hungarian
                   IDENTPerc
                /      |      \
AGREE(CL)St, AGREE(R)I-Suff, AGREE(CL)A-Suff
                \      |      /
                   IDENTArt
This section presents typological considerations and explores the applicability
of the proposed OT constraints to account for vowel harmony patterns of other
languages. Providing a complete analysis of these patterns would require
collecting information about these languages (either experimentally or from the
literature) similar to what was obtained for Hungarian. Therefore, rather than
presenting the full OT typology, the discussion is organized along the three steps
mentioned above that guided the construction of the OT grammar for Hungarian.
The first (and the most coarse) difference related to vowel harmony is
between languages that display this pattern and those that do not. For example,
vowels in English or French words co-occur freely. In addition, the form of the affix
vowels does not depend on the form of the stem vowels. In contrast, vowels in
Hungarian and Turkish display co-occurrence restrictions and do not combine freely.
This difference is analyzed in OT by language-specific resolution to the conflict
between the harmony constraint AGREEArt and the faithfulness constraint IDENT:
AGREEArt dominates IDENT in vowel harmony languages whereas the opposite
ranking applies to languages without vowel harmony.
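The effect of re-ranking these two constraints can be made concrete with a minimal Python sketch. The two schematic candidates and their violation counts are illustrative assumptions, not data from any particular language: a faithful disharmonic output violates AGREE once, and a harmonized output violates IDENT once.

```python
# Schematic violation profiles for one disharmonic input (assumed values).
VIOLATIONS = {
    "faithful, disharmonic": {"AGREE": 1, "IDENT": 0},
    "harmonized": {"AGREE": 0, "IDENT": 1},
}


def winner(ranking):
    """Return the candidate whose violations are best under the ranking,
    comparing constraints from the highest ranked to the lowest."""
    return min(
        VIOLATIONS,
        key=lambda cand: tuple(VIOLATIONS[cand][c] for c in ranking),
    )


print(winner(["AGREE", "IDENT"]))  # harmonized
print(winner(["IDENT", "AGREE"]))  # faithful, disharmonic
```

The same candidate set thus yields a harmonic output under AGREE >> IDENT and a disharmonic one under the opposite ranking.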
With respect to the second step, the division of AGREEArt into two constraints
governing stem-internal and stem-suffix agreement is also supported in other
languages. For example, Hungarian front vowel /i/ behaves transparently because it is
followed by back suffixes in certain environments. In Turkish, on the other hand, /i/
does not behave transparently even when it is exceptionally preceded by a back
vowel. Hence, /i/ is followed by front suffixes whereas /ɯ/ is required for back
suffixes in Turkish. Therefore, we can speculate that Turkish stem-suffix harmony is
controlled by AGREEStem-Suff. This constraint is identical to AgreeA-Suff but applies to
all stem-final vowels in Turkish and not just the back ones. Hence, the Agree(R)St-Suff
is lower ranked in Turkish and stem-suffix harmony arises from the ranking
AGREEStem-Suff >> IDENTArt.
In Hungarian, stem-internal agreement is not fully respected because a back
vowel might be followed by a front unrounded vowel. In Turkish, the stem-internal
agreement is higher-ranked because a back initial vowel typically implies the
presence of a back unrounded vowel not a front one. This can be analyzed with
ranking AGREEArt >> IDENTPerc in Turkish, and the opposite ranking for Hungarian.
The division of the IDENT constraints into perceptual and articulatory seems to
be supported by the patterns found in Uyghur (Lindblad 1990, Vaux 2001) and
certain dialects of Estonian (Kiparsky & Pajusalu 2003). For example, the Uyghur vowel
harmony pattern is slightly different from that of Hungarian and Turkish. Similar to
Hungarian, stems with only /i/ behave transparently and may be followed by either
front or back suffixes; there are even several minimal pairs that are difficult to find in
Hungarian. In Uyghur T stems, however, the generalization is that back suffixes are
an unmarked, default choice whereas front suffixes are marked and exceptional. In
underived stems, /i/ may behave transparently and opaquely, and, in the cases where
/i/ derives from low vowels through raising, the behavior is typically transparent.
Interestingly, Lindblad (1990) reports that /i/ (and the assumed underlying
back /ɯ/) have between 8 and 14 allophones depending on adjacent sounds, mostly
consonants. It is possible to assume that a seemingly transparent /i/ has retracted
allophones in back harmony domains, and non-retracted ones in front harmony
domains. Lindblad’s analysis assumes an underlying contrast between front /i/ and
back /ɯ/, followed by a vowel harmony process, then a contrast neutralization rule by
fronting (/ɯ/ → /i/), and finally allophonic rules that determine the surface form based
on the environment. See Vaux (2001) for an analysis that avoids abstract absolute
neutralization and accounts for the derived transparency of /i/. The fact relevant for
the OT analysis developed in this chapter, and not discussed in Vaux (2001), is that
the retraction of transparent /i/ in a back harmony domain is readily perceptible and
transcribed as allophonic variation.
A possible analysis, though speculative at this moment in the absence of
careful phonetic investigation, is that Uyghur is similar to Hungarian in that phonetic
details of tongue body retraction correlate with the suffix selection. The difference
from Hungarian is in the ranking of IDENT(front): while it is undominated in
Hungarian resulting in vowels recovered as /i/, it is ranked below AGREE constraint(s)
in Uyghur resulting in more extreme retraction, which violates IDENT(front).
The considerations presented in this section are only exploratory but provide
directions for future research as will be discussed in Chapter 7.
6.7.
Summary of the OT model
The model of transparency developed in this section is an integrated model of
phonetics and phonology that is couched in the OT framework and enriched with the
notion of dynamic definition of constraints. The units of representation are
dynamically defined articulatory gestures with parameters such as Constriction
Location and Constriction Degree. Palatal vowel harmony is construed as a pattern of
articulatory agreement between adjacent vowel gestures in terms of Tongue Body
Constriction Location.
There are markedness and faithfulness constraints acting upon this parameter
and determining the phonetic and phonological outputs of the alternations. The
faithfulness IDENT constraints are of two types: articulatory and perceptual. Both
militate against changes in the quality of stem vowels caused by markedness
constraints. The division into articulatory and perceptual IDENT constraints is
supported by independent facts about non-linear relationship between the two
dimensions. The markedness AGREE constraints induce harmony between vowel
gestures. They require that the TBCL parameter of adjacent vowels is minimally
different.
Some constraints are gradiently evaluated and, because they operate on
continuous articulatory parameters such as CL, they establish the relevance of these
parameters in phonology. This is the case for AGREE(CL), for instance. In contrast,
other constraints are categorically evaluated, which ensures the macroscopic
stability of phonological categories. The ranking of these two types of constraints
with respect to each other determines the degree to which individual languages use
phonetic details in their phonological systems. The crucial novelty of the model is in
the dynamic definition of the constraints. This extension, together with the
categorical and gradient modes of evaluation, enables an account of phonological
alternations that depends on phonetic information, yet it does not compromise
qualitative and cognitively essential features of discreteness and stability.
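The contrast between gradient and categorical evaluation can be illustrated with a minimal sketch. It is not the formal definition of the constraints given in this chapter: the normalized TBCL scale (0.0 = fully front, 1.0 = fully back), the perceptual boundary at 0.5, and the function names `agree_cl` and `ident_front` are all assumptions introduced only for illustration.

```python
from typing import Sequence

# Hypothetical scale: TBCL normalized so that 0.0 = fully front, 1.0 = fully back.
CATEGORY_BOUNDARY = 0.5  # assumed perceptual front/back boundary, illustration only


def agree_cl(tbcl: Sequence[float]) -> float:
    """Gradient evaluation: summed TBCL distance between adjacent vowel gestures."""
    return sum(abs(a - b) for a, b in zip(tbcl, tbcl[1:]))


def ident_front(underlying: Sequence[float], surface: Sequence[float]) -> int:
    """Categorical evaluation: count vowels whose front/back category changed."""
    return sum(
        (u < CATEGORY_BOUNDARY) != (s < CATEGORY_BOUNDARY)
        for u, s in zip(underlying, surface)
    )


underlying = [1.0, 0.0]  # a back vowel followed by a front vowel such as /i/
candidates = {
    "no retraction": [1.0, 0.0],
    "slight retraction": [1.0, 0.3],
    "extreme retraction": [1.0, 0.8],
}
for name, surface in candidates.items():
    # Print the gradient AGREE(CL) cost and the categorical IDENT(front) violations.
    print(name, round(agree_cl(surface), 2), ident_front(underlying, surface))
```

Under these assumed values, slight retraction lowers the gradient AGREE(CL) cost without any categorical IDENT(front) violation, whereas extreme retraction lowers it further at the price of one violation. The latter candidate could only be optimal under a ranking in which AGREE dominates IDENT(front).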
CHAPTER 7
Conclusion
A major problem in the study of speech is how to relate the symbolic aspects of our
speaking competence to their continuous physical manifestation in terms of vocal
tract action. The study of these two aspects of speech has traditionally been pursued
under separate domains, with the mental or symbolic aspects being the domain of
phonology and the physical or continuous aspects being the domain of phonetics.
This dissertation presented the results from a study of a particular instance of the
phonetics-phonology relation: transparency in Hungarian palatal vowel harmony.
Chapters 2 and 3 showed that looking at the same system from more than one
angle provides useful information. Depending on a particular level of observation,
different types of generalizations can be observed. For example, the phonetic
investigation revealed a systematic pattern of stem-final vowel retraction correlated
with the phonological quality of the suffix vowel. Phonological investigation revealed
that transparency of front vowels is a scalar property. That is, the likelihood that a
stem-final front vowel is followed by back suffixes depends on its phonetic features
of height, lip rounding and tongue body retraction.
Chapters 5 and 6 presented a model in which phonetic and phonological
knowledge are integrated into a single system. In the proposed analysis, it is assumed
that both phonetic and phonological properties participate in determining which
suffix is chosen. In other words, the traditional domains of phonetics and phonology
are construed as two inter-dependent layers of a single system that operates at
different levels of granularity (abstraction). It was proposed that the lawful
relationship between these two levels can be formalized using the properties of
non-linear dynamics.
The advantage of this approach is that it allows one to express both qualitative
and quantitative aspects without losing sight of the essential distinction between the
two. Applied to the particular case of transparency, this approach covers a broad
range of phonetic and phonological data in a unified and explanatory way. The
rationale of the model can be summarized in Fig. 57.
[Figure 57, diagram not reproduced. Dashed oval: 'Gestural blending', 'Lip rounding',
and 'Vowel height', each connected to 'Stem-final Retraction'. Dotted oval: stem types
and their suffixes: {í, i, é} -nek; {í, i, é} -nak; e -nek; B{í, i, é} -nak; Bü -nek;
B{í, i, é} -nak; Be -nek/nak; BT -nak; BTT -nek/nak.]
Fig. 57 – Transparency as an integrated system of phonetics (dashed oval) and
phonology (dotted oval).
The phonetic level of our integrated system is shown with the dashed oval. It
illustrates the relationship between stem-final retraction on the one hand, and gestural
blending, vowel height, and lip rounding on the other hand. First, coarticulation
formalized as gestural blending is the source of retraction in disharmonic stems. A
back vowel induces retraction on the adjacent front vowel. Second, the degree of this
retraction depends on lip rounding. Following Wood (1986) it was argued that more
lip rounding implies less tongue body retraction. The basis for this claim is in the
non-linear relationship between articulation and perception: the region of acoustic
stability that is insensitive to the horizontal movement of tongue body is very limited
for the rounded vowels. Third, it was argued that the degree of retraction also depends on
vowel height. The lower the front vowel, the less it is retracted when adjacent to a
back vowel.
Hence, the continuous phonetic dimension that combines the effect of gestural
blending, lip rounding and vowel height is the degree of retraction for the stem-final
vowels. This is depicted by the lines connecting the box labeled ‘Stem-final
Retraction’ with the other three boxes in the dashed oval in Fig. 57. It is important to
note that all three relationships are independent of vowel harmony. Gestural blending
is a natural phenomenon of speech in general, and correlations between retraction and
lip rounding and height stem from independent quantal properties of vowels.
The central parameter that provides the link between the phonetics and
phonology is stem-final retraction. The phonological level that controls the selection
of suffixes is shown with the dotted oval in Fig. 57. The oval illustrates that, if
phonetic retraction of stem-final vowels participates in determining the form of the
suffix, all observed phonological generalizations receive a unified explanation.
First, monosyllabic stems with a transparent vowel (T stems) usually select
front suffixes, but some T stems exceptionally select back suffixes. Our experimental
results showed that the transparent vowels in the stems that select back suffixes are
more retracted than those in the stems that select front suffixes, even in bare, unsuffixed
forms. If this retraction is stored lexically and has a lawful relationship to the form of
the suffix, as in the proposed dynamic model, an explanation for both phonetic
differences in retraction and the phonological differences in suffix selection for T
stems is achieved.109
Second, front unrounded vowels behave transparently in that they select back
suffixes when preceded by back vowels. In contrast, front rounded vowels behave
opaquely and always select front suffixes in this environment. The differences in
retraction between rounded and unrounded front vowels stem from their quantal
phonetic properties. In the proposed model, this difference enters a non-linear
dynamic system for suffix selection as a control parameter, and predicts that the
rounded vowels are followed by front suffixes, whereas the unrounded ones are
followed by back suffixes.
109 Suffix selection and tongue body retraction in T stems are an instance of the
chicken-and-egg problem. As discussed in Chapters 5 and 6, it is not clear if
retraction triggers the form of the suffix or the suffix induces retraction. Crucially for
this model, however, whatever the origin of stem-final retraction, its phonological
encoding is necessary to account for the correlation between the retraction and the
form of the suffix.
Third, the major difference within the set of transparent vowels in Hungarian
is the behavior of /e/ when compared to the other three vowels {/i/, /í/, /é/}. When
these vowels are preceded by a back vowel, /e/ can be followed by either a front
suffix or a back suffix. In contrast, {/i/, /í/, /é/} in the same environment are typically
followed by a back suffix. Chapter 4 provided converging evidence that blending of
back vowels with low /e/ results in a medial retraction degree for /e/. In the proposed
dynamic system, this degree of retraction, or the value of the control parameter,
produces bistability where both front and back suffixes are available.
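How a single control parameter can yield either one available suffix or two can be sketched with a generic tilted double-well system. This is only an illustration under stated assumptions: the potential V(x) = -k*x - x**2/2 + x**4/4, the interpretation of the state x (negative = front suffix, positive = back suffix), the rescaling of retraction to the parameter k, and the function names are all hypothetical and need not match the equations of the model developed in the earlier chapters.

```python
from typing import List


def force(x: float, k: float) -> float:
    """Negative gradient of the assumed potential V(x) = -k*x - x**2/2 + x**4/4."""
    return k + x - x**3


def attractors(k: float, lo: float = -2.0, hi: float = 2.0, steps: int = 4000) -> List[float]:
    """Locate stable fixed points: grid cells where the force changes from + to -."""
    dx = (hi - lo) / steps
    xs = [lo + i * dx for i in range(steps + 1)]
    found = []
    for a, b in zip(xs, xs[1:]):
        if force(a, k) > 0 and force(b, k) <= 0:
            found.append((a + b) / 2)
    return found


def available_suffixes(k: float) -> List[str]:
    """Map each attractor to a suffix type (hypothetical: x > 0 is back, x < 0 is front)."""
    return sorted({"back" if x > 0 else "front" for x in attractors(k)})


# k stands in for rescaled stem-final retraction (assumed mapping).
for k in (-0.6, 0.0, 0.6):
    print(k, available_suffixes(k))
```

In this sketch, low and high values of k leave a single attractor (only front or only back suffixes), while intermediate values leave two attractors, which corresponds to the vacillation described above for stems ending in /e/.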
Finally, increasing the number of transparent vowels that follow a back vowel
affects the quality of the suffix. Thus, stems where a back vowel is followed by two
transparent vowels (BTT stems) are more likely to vacillate or take front suffixes than
BT stems. This difference is predicted in the dynamic model because all vowels,
including the transparent ones, participate in gestural blending. Hence, the retraction
degree for the stem-final vowel of a BTT stem is lower than in a BT stem. This is
because the additional front vowel in the BTT stem eliminates partially the influence
of the initial back vowel. In the developed model of suffix selection, the difference in
stem-final retraction translates into the difference between exclusively back suffixes
in BT stems and availability of both front and back suffixes in BTT stems.
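The reasoning about BT versus BTT stems can be made concrete with a toy blending computation. The recursion, the blending weight `w`, and the normalized retraction scale (1.0 = back target, 0.0 = front target) are assumptions chosen for illustration; they are not the blending parameters estimated from the experimental data.

```python
from typing import List


def blend(targets: List[float], w: float = 0.4) -> List[float]:
    """Each vowel is realized as a weighted blend of the preceding realized vowel
    (weight w) and its own target (weight 1 - w). The first vowel keeps its target."""
    realized = [targets[0]]
    for target in targets[1:]:
        realized.append(w * realized[-1] + (1 - w) * target)
    return realized


bt = blend([1.0, 0.0])        # back vowel + one transparent vowel
btt = blend([1.0, 0.0, 0.0])  # back vowel + two transparent vowels
print("BT stem-final retraction:", round(bt[-1], 2))
print("BTT stem-final retraction:", round(btt[-1], 2))
```

With these assumed values the stem-final vowel of the BTT stem ends up less retracted than that of the BT stem, because the intervening front vowel absorbs part of the influence of the initial back vowel.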
To summarize, this dissertation presented novel experimental data from the
production of transparent vowels in Hungarian. The data support the claim that
continuous phonetic details of stem vowels are relevant for the phonological
alternation in suffixes. The dissertation proposed an integrated model that relates
phonetics and phonology using the formal language of non-linear dynamics. This
model achieves a unified explanation of both the phonetic and phonological
generalizations that were observed.
7.1. Future work
In line with Kosslyn’s (1978) ideas about modeling and its scientific value, a model
built on the information gathered from original experiments opens new issues and
predictions that await further experimental work. Here I present a brief outline of
such issues and predictions related to the dynamic model of Hungarian palatal
harmony.
The first issue involves building empirical support for the explicit predictions
of the dynamic model. There are three such predictions. First, the model predicts that
front rounded vowels in stems like parfüm or sofőr are retracted less than front
unrounded vowels in stems like papír or kávé. Additionally, front rounded vowels are
predicted to be less articulatorily flexible, displaying a lower degree of variance
compared to front unrounded vowels. Second, the model also predicts that the second
transparent vowel in BTT stems such as aszpirin is less retracted than the first one.
Third, the low vowel /e/ is predicted to be affected by a preceding back vowel to an
intermediate degree compared to /í/ and /ü/.
The second issue involves examining differences among the back vowels and
how these differences affect blending with front vowels. Individual transparent
vowels are affected differently by the back environment, as observed both with
EMMA and Ultrasound (Chapter 3). It is therefore expected that individual back
vowels would affect the same transparent vowel in a different way. For example, one
would expect that the harmonic feature [+back] is realized differently on the
transparent vowel in words of the ‘babi-ba’ type and ‘bobi-bo’ type. This assumption
is supported by Wood (1979) who showed that lingual constrictions for /u/, /o/, /a/ are
produced in the velar, uvular, and pharyngeal areas respectively. Given this
difference among the vowels /u/, /o/, and /a/, it is reasonable to question whether it is
justified to group their effect on front vowels under the single notion of tongue body
retraction. In other words, what is the best way to capture the articulatory basis of the
cognitively essential quality [+back] in palatal vowel harmony systems?
Another area for future research is to find whether predictions of OT typology
are instantiated in natural languages. As discussed in Chapter 6, the system of
proposed OT constraints which account for Hungarian vowel harmony predicts the
existence of language(s) in which articulatory retraction for the transparent vowels
like /i/ is not constrained by perceptual faithfulness. In other words, the ranking
AGREE >> IDENT(front) predicts the kind of retraction of transparent vowels that is
readily perceptible. Based on impressionistic reports in the literature, languages like
Uyghur (Lindblad 1990) or several dialects of Estonian (Kiparsky & Pajusalu 2003)
may belong to this predicted type.
Finally, although both coarticulation and harmony are considered within a
single model of vowel harmony in this dissertation, there might be tests to distinguish
between them in a cross-linguistic study. In Turkish palatal harmony, /i/ behaves
harmonically because it is retracted to the back /ɯ/ in the back environment. In
Uyghur palatal harmony, /i/ is claimed to be transparent (therefore not changing to /ɯ/
due to harmony) but a retracted version of /i/ is present as a result of an assumed
allophonic rule that changes regular /i/ to its retracted allophone in words with back
adjacent vowels. In Hungarian, the literature does not report a retracted allophone,
but systematic articulatory retraction was still found in the experiments reported in
this dissertation. These observations raise a research question about the difference between
coarticulation and harmony: Are the retractions of /i/ in Turkish,
Uyghur, and Hungarian just three different degrees of the same process, or are they
functionally different? Additionally, is there a fundamental difference between the
three mentioned languages and a language that does not display a vowel harmony
pattern such as English?
A similar question can be asked for consonants in vowel harmony systems.
Some claim that harmony affects all elements within a domain (Gafos 1999, Ní
Chiosáin & Padgett 2001), and others claim that locality is relativized and vowel
harmony does not affect consonants (see van der Hulst & van de Weijer 1995 for
review). In some languages, velar consonants are claimed to participate in vowel
harmony (e.g. /k/, /g/, and /l/ in Turkish) and in others they are claimed not to
participate, as in Hungarian for instance. It is reasonable to predict that even
Hungarian would show some variation of the velars with respect to the vowel
environment (as is the case, for example, with the English difference between the velars in
‘car’ and ‘key’). What is then the real difference between the retracted velars in
languages like Turkish, languages like Hungarian with vowel harmony but assumed
transparency of consonants, and languages like English which do not have vowel
harmony?
To summarize, there are several areas for future research, whose results would
either lead to the strengthening of the original model or, perhaps more likely, to
reconsidering some of the assumptions of the model. Importantly, however, the
results presented in this dissertation provide strong support for a general approach
that recognizes the need for application of rigorous phonetic methods in phonological
research.
APPENDIX A
List of stimuli for subjects ZZ and BU
Disyllabic stems

      Back           Gloss          Front          Gloss          Suffix
/í/   zafír-ban      sapphire       zefír-ben      zephyr         Iness.
      zafír-ból      sapphire       zefír-ből      zephyr         Elat.
      zúdít-ott      to hail        szédít-ett     to beguile     3rd sg. past indef.
      tompít-ó       attenuative    tömít-ő        obturating     adject. suff.
      normatív-nél   normative      primitív-nál   primitive      Adess.
      passzív-hoz    passive        esszív-hez     essive         Allat.
      szólít-od      to address     bővít-ed       to let out     2nd sg. def.

/i/   bácsi-ban      uncle          bécsi-ben      of Vienna      Iness.
      buli-val       party          bili-vel       pot            Instr.
      kocsi-tól      carriage       öcsi-től       buster         Abl.
      polip-om       polyp          Filip-em       Filip          1st sg poss.
      szolid-nak     solid          rövid-nek      short          Dat.
      bólint-ott     to nod         érint-ett      to touch       3rd sg. past indef.
      Tomi-hoz       Tom.Dim.       Imi-hez        Imre.Dim.      Allat.
      lutri-hoz      lottery        csitri-hez     flapper        Allat.

/é/   szatén-ban     satin          kretén-ben     cretin         Iness.
      tányér-nál     plate          tenyér-nél     palm           Adess.
      málé-hoz       spoon          filé-hez       fillet         Allat.
      sasszé-val     shuffle        esszé-vel      essay          Instr.
      málés-an       stupid         békés-en       peaceful       adject. suff.
      ganéz-ott      be composted   intéz-ett      manage         3rd sg. past indef.
      kadét-tól      cadet          bidé-től       bidet          Abl.

Monosyllabic stems

Back    Gloss       Front   Gloss
vív     fence       ív      bow
híd     bridge      íz      flavor
ír      write       hír     rumor
víg     cheerful    míg     while
síp     whistle     cím     address
nyit    open        hisz    believe
cél     aim         szél    wind
héj     crust       éj      night
Appendix B
List of stimuli for subject CK (pilot)
      Back           Gloss       Front           Gloss               Suffix
í     zafír-ban      sapphire    zefír-ben       zephyr              Iness.
      zafír-tól      sapphire    zefír-től       zephyr              Abl.
      zafír-hoz      sapphire    zefír-hez       zephyr              Allat.
      aktív-ál       active      beszív-el       to draw             Adj.
      naív-ul        naïve       beív-el         to lob              Adj.
      masszív-val    massive     műszív-val      art. heart          Inst.
      masszív-hoz    massive     műszív-hez      art. heart          Allat.
      masszív-ba     massive     műszív-be       art. heart          Illat.
      passzív-val    passive     kőszív-val      heart of adamant    Inst.
      passzív-hoz    passive     kőszív-hez      heart of adamant    Allat.
      zúdít-ott      to hail     szédít-ett      to beguile          3rd sg. past indef.
      jobbít-om      to amend    kisebbít-em     to lessen           1st sg. poss.
      kábít-om       to daze     repít-em        to send             1st sg. poss.

i     náci-val       nazi        nőci-vel        bimbo               Inst.
      náci-ban       nazi        nőci-ben        bimbo               Iness.
      náci-hoz       nazi        nőci-hez        bimbo               Allat.
      bácsi-val      uncle       bécsi-vel       of Vienna           Inst.
      bácsi-ban      uncle       bécsi-ben       of Vienna           Iness.
      bácsi-hoz      uncle       bécsi-hez       of Vienna           Allat.
      buli-val       party       telivér         full-blood(ed)      Inst.
      buli-ban       party       belibeg         to breeze in        Iness.
      cumi-ban       title       semmibe         to ignore           Iness.
      kocsi-tól      coach       kicsi-től       small               Abl.
      lutri-val      lottery     csitri-vel      flapper             Inst.
      lutri-hoz      lottery     csitri-hez      flapper             Allat.
      lutri-ba       lottery     csitri-be       flapper             Illat.
      mázli-val      fluke       müzli-vel       muesli              Inst.
      nyuszi-tól     bunny       tenyészidő      breeding season     Nominal
      polip-on       polyp       zsilip-en       sluice              Superess.
      polip-om       polyp       Fillip-em       name                1st sg poss.
      cuclitam       pacifier    filiszter       philistine          Root
      kap-ni         to get      köp-ni          to gob              Inf.
      lop-ni         to pinch    lép-ni          to step             Inf.

é     acél-nak       steel       beszél-nek      to address          Dat.
      affér-ban      affair      térbeli         spatial             Iness.
      bode-tól       hut         bidet-től       bidet               Abl.
      kávé-val       coffee      végé-vel        end                 Inst.
      soltész-ból    name        tengerész-ből   mariner             Elat.
      tányér-hoz     plate       tenyér-hez      palm                Allat.
      málé-val       spoon       felé-vi         terminal            Inst.
      málé-hoz       spoon       léhű-tő         loafer              Allat.
      sasszé-val     shuffle     esszé-vel       essay               Inst.
      sasszé-ból     shuffle     esszé-ből       essay               Elat.
      sasszé-hoz     shuffle     esszé-hez       essay               Allat.
      csálé-val      crooked     meggylé-vel     sour cherry juice   Inst.
      csálé-ban      crooked     meggylé-ben     sour cherry juice   Iness.
      csálé-hoz      crooked     meggylé-hez     sour cherry juice   Allat.
      vám-ért        duty        fém-ért         metal               Caus.
      hám-ért        harness     hím-ért         dog                 Caus.
      púp-ért        hump        pép-ért         cream of wheat      Caus.

e     totem-mal      totem       tetem-mel       dead body           Inst.
      totem-tól      totem       tetem-től       dead body           Abl.
      hárem-ban      harem       érem-ben        medal               Iness.
      hárem-mal      harem       érem-mel        medal               Inst.
      hárem-on       harem       érem-en         medal               Superess.
      hárem-ba       harem       érem-be         medal               Illat.
      hárem-ból      harem       érem-ből        medal               Elat.
      hárem-hoz      harem       érem-hez        medal               Allat.

Monosyllabic stems

Back    Gloss       Front   Gloss
víg     cheerful    míg     while
síp     whistle     cím     address
cél     aim         szél    wind

Additional frame sentence:
Ekkor azt láttam, hogy “____” akkor pedig azt láttam, hogy “____” mégegyszer.
‘Now I see ____ and then I read _____ once again.’
Appendix C
Post-hoc Tukey test: effect of vowel type on the tongue position in the front and
back environments
Multiple Comparisons, subject ZZ, FRONT environment
Tukey HSD

                            Mean Difference                        95% Confidence Interval
Receiver  (I) TV  (J) TV    (I-J)            Std. Error   Sig.    Lower Bound   Upper Bound
TD        /i/     /i:/        .233821        .1914347     .441     -.216887       .684529
                  /e:/       -.989083(*)     .1914347     .000    -1.439791      -.538374
          /i:/    /i/        -.233821        .1914347     .441     -.684529       .216887
                  /e:/      -1.222904(*)     .1896864     .000    -1.669496      -.776312
          /e:/    /i/         .989083(*)     .1914347     .000      .538374      1.439791
                  /i:/       1.222904(*)     .1896864     .000      .776312      1.669496
TB2       /i/     /i:/       -.297614        .2941172     .570     -.990075       .394847
                  /e:/      -2.057903(*)     .2941172     .000    -2.750364     -1.365442
          /i:/    /i/         .297614        .2941172     .570     -.394847       .990075
                  /e:/      -1.760289(*)     .2914311     .000    -2.446426     -1.074153
          /e:/    /i/        2.057903(*)     .2941172     .000     1.365442      2.750364
                  /i:/       1.760289(*)     .2914311     .000     1.074153      2.446426
TB1       /i/     /i:/       -.786079(*)     .3089814     .031    -1.513536      -.058622
                  /e:/      -1.901592(*)     .3089814     .000    -2.629049     -1.174136
          /i:/    /i/         .786079(*)     .3089814     .031      .058622      1.513536
                  /e:/      -1.115513(*)     .3061596     .001    -1.836327      -.394700
          /e:/    /i/        1.901592(*)     .3089814     .000     1.174136      2.629049
                  /i:/       1.115513(*)     .3061596     .001      .394700      1.836327

* The mean difference is significant at the .05 level.
Multiple Comparisons, subject ZZ, BACK environment
Tukey HSD

Dependent                   Mean Difference                        95% Confidence Interval
Variable  (I) TV  (J) TV    (I-J)            Std. Error   Sig.    Lower Bound   Upper Bound
TD        /i/     /i:/       -.792725(*)     .2379112     .003    -1.352738      -.232712
                  /e:/      -1.291158(*)     .2379112     .000    -1.851171      -.731145
          /i:/    /i/         .792725(*)     .2379112     .003      .232712      1.352738
                  /e:/       -.498433        .2467439     .109    -1.079237       .082371
          /e:/    /i/        1.291158(*)     .2379112     .000      .731145      1.851171
                  /i:/        .498433        .2467439     .109     -.082371      1.079237
TB2       /i/     /i:/      -2.134391(*)     .3458925     .000    -2.948578     -1.320203
                  /e:/      -2.461490(*)     .3458925     .000    -3.275677     -1.647302
          /i:/    /i/        2.134391(*)     .3458925     .000     1.320203      2.948578
                  /e:/       -.327099        .3587340     .633    -1.171514       .517316
          /e:/    /i/        2.461490(*)     .3458925     .000     1.647302      3.275677
                  /i:/        .327099        .3587340     .633     -.517316      1.171514
TB1       /i/     /i:/      -2.790666(*)     .3756004     .000    -3.674782     -1.906550
                  /e:/      -1.995064(*)     .3756004     .000    -2.879180     -1.110948
          /i:/    /i/        2.790666(*)     .3756004     .000     1.906550      3.674782
                  /e:/        .795602        .3895449     .104     -.121338      1.712541
          /e:/    /i/        1.995064(*)     .3756004     .000     1.110948      2.879180
                  /i:/       -.795602        .3895449     .104    -1.712541       .121338

* The mean difference is significant at the .05 level.
BIBLIOGRAPHY
Anderson, L. 1980. Using asymmetrical and gradient data in the study of vowel
harmony. In R. Vago (ed.), Issues in Vowel Harmony, 271-340. Amsterdam:
John Benjamins.
Anderson, S. 1980. Problems and Perspectives in the Description of Vowel Harmony.
In R. Vago (ed.), Issues in Vowel Harmony, 1-48. Amsterdam: John
Benjamins.
Anttila, A. 2002. Morphologically conditioned phonological alternations. Natural
Language and Linguistic Theory 20(1): 1-42.
Archangeli, D. and D. Pulleyblank. 1994. Grounded phonology. Cambridge: MIT
Press.
Archangeli, D., B. Kennedy, A. Baker, and S. Racy. 2004. Ultrasonic techniques for
phonological research. Ms., University of Arizona.
Arnold, V. I. 2000. Nombres d’Euler, de Bernoulli et de Springer pour les groupes de
Coxeter et les espaces de morsification: le calcul de serpents. In É.
Charpentier and N. Nikolski (eds.), Leçons de Mathématiques d’Aujourd’Hui,
61-98. Paris: Cassini.
Baković, E. 2000. Harmony, dominance, and control. Doctoral dissertation, Rutgers
University, New Brunswick, NJ. [ROA-360].
Baković, E. and C. Wilson. 2000. Transparency, strict locality, and targeted
constraints. In R. Billerey and D. B. Lillehaugen (eds.), WCCFL 19
Proceedings, 43-56. Somerville, MA: Cascadilla Press.
Beckman, J. 2004. Positional Faithfulness. In J. McCarthy (ed.), Optimality Theory in
Phonology: A Reader, 310-342. Oxford: Blackwell.
Beckman, M. and J. Kingston. 1990. Introduction. In M. Beckman and J. Kingston
(eds.) Papers in Laboratory Phonology I, 1-16. Cambridge: Cambridge
University Press.
Beddor, P. S., Krakow, R. A., and S. Lindemann. 2001. Patterns of perceptual
compensation and their phonological consequences. In E. Hume and K.
Johnson (eds.), The Role of Perceptual Phenomena in Phonology, 55-78. San
Diego: Academic Press.
Beller, Y. 2004. A study of tongue height during production of neutral vowels in
Hungarian. Ms., New York University.
Boersma, P. 1998. Functional phonology. The Hague: Holland Academic Graphics.
Bornstein, M. H. and N.O. Korda. 1984. Discrimination and matching within and
between hues measured by reaction times: Some implications for categorical
perception. Psychological Research 46: 207-222.
Boyce, S. 1988. The influence of phonological structure on articulatory organization
in Turkish and in English: vowel harmony and coarticulation. Doctoral
dissertation, Yale University, New Haven, CT.
Browman, C. P. and L. Goldstein. 1986. Towards an articulatory phonology.
Phonology Yearbook 3: 219-252.
Browman, C. P. and L. Goldstein. 1989. Articulatory gestures as phonological units.
Phonology 6: 201-251.
Browman, C. P. and L. Goldstein. 1992. Articulatory phonology: An overview.
Phonetica, 49(3-4): 155-180.
Browman, C. P. and L. Goldstein. 1995. Dynamics and articulatory phonology. In T.
van Gelder and R. Port (eds.), Mind as Motion, 175-193. Cambridge, MA:
MIT Press.
Byrd, D. 1995. C-centers revisited. Phonetica 52: 285-306.
Campbell, L. 1980. The psychological and sociological reality of Finnish vowel
harmony. In R. Vago (ed.), Issues in Vowel Harmony, 245-271. Amsterdam:
John Benjamins.
Chinchor, N. 1979. On the treatment of Mongolian vowel harmony. Cunyform papers
in Linguistics 8: 171-187.
Chomsky, N. 1965. Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Chomsky, N. and M. Halle. 1968. The sound pattern of English. New York: Harper
and Row.
Clements, G.N. 1977. Neutral vowels in Hungarian vowel harmony: An
autosegmental interpretation. In J. Kegl, D. Nash, & A. Zaenen (eds.),
Proceedings of NELS 7, 49-64.
Clements, G.N. and E. Hume. 1995. The internal organization of speech sounds. In J.
Goldsmith (ed.), The Handbook of Phonological Theory, 245-307. Oxford:
Blackwell.
Davidson, L. 2003. The Atoms of Phonological Representation: Gestures,
Coordination, and Perceptual Features in Consonant Cluster Phonotactics.
Doctoral dissertation, Johns Hopkins University, Baltimore, MD.
Davidson, L. 2004. Fitting phonotactics into gestural phonology: Evidence from
non-native cluster production. Poster presented at LabPhon 9.
Davidson, L. and M. Stone. 2004. Epenthesis versus gestural mistiming in consonant
cluster production. In M. Tsujimura and G. Garding (eds.), Proceedings of the
West Coast Conference on Formal Linguistics (WCCFL) 22, 165-178.
Somerville, MA: Cascadilla Press.
De Lacy, P. 2002. The formal expression of markedness. Doctoral dissertation,
University of Massachusetts, Amherst, MA. (Published by GLSA Publications,
Amherst, MA.) [ROA-542].
Demirdache, H. 1988. Transparent vowels. In H. van der Hulst and N. Smith (eds.)
Features, segmental structure and harmony processes, 39-76. Dordrecht:
Foris.
Dienes, P. 1997. Hungarian neutral vowels. In Z. Kiss, Á. Lukács, B. Surányi and P.
Szigetvári (eds.), The Odd Yearbook 1997, 151-180. Budapest: ELTE SEAS
Undergraduate Papers in Linguistics.
Fant, G. 1965. Formants and cavities. Proceedings of the 5th ICPhS, 120-141. Basle:
Karger.
Fant, G. 1970. Acoustic theory of speech production. The Hague and Paris: Mouton.
Farkas, D. and P. Beddor. 1987. Privative and equipollent backness in Hungarian. In:
A. Bosch, B. Need and E. Schiller (eds.), 23rd annual regional meeting of the
Chicago Linguistics Society. Part Two: Parasession on autosegmental and
metrical phonology, 90-105. Chicago, IL: Chicago Linguistics Society.
Farnetani, E., Vagges, K., and E. Magno-Caldognetto. 1985. Coarticulation in Italian
VtV sequences: A palatographic study. Phonetica 42: 78-99.
Flemming, E. 1995. Auditory Representations in Phonology. Doctoral Dissertation,
UCLA.
Flemming, E. 2001. Contrast and perceptual distinctiveness. Ms. Stanford University.
Flemming, E. 2004. Contrast and perceptual distinctiveness. In B. Hayes, R. Kirchner,
& D. Steriade (eds.) Phonetically-Based Phonology, 232-276. Cambridge:
Cambridge University Press.
Fónagy, I. 1966. Iga es ige. Magyar Nyelv: 323-324.
Fowler, C. A. 1981. Production and perception of coarticulation among stressed and
unstressed vowels. Journal of Speech and Hearing Research 24: 127-139.
Fowler, C. A. 1983. Converging sources of evidence on spoken and perceived
rhythms of speech: cyclic production of vowels in monosyllabic stress feet.
Journal of Experimental Psychology: General 112: 386-412.
Füredi, M., Kornai, A., and G. Prószéky. 2004. The SZÓTÁR database. (In
Hungarian). ms. URL http://www.szoszablya.hu/.
Gafos, A. 1998. Eliminating Long Distance Consonantal Spreading. Natural
Language and Linguistic Theory 16(2): 223-278.
Gafos, A. 1999. The articulatory basis of locality in phonology. New York: Garland.
(1996 Doctoral dissertation, Johns Hopkins University, Baltimore, MD)
Gafos, A. 2002. A grammar of gestural coordination. Natural Language and
Linguistic Theory 20(2): 269-337.
Gafos, A. 2003. Greenberg’s Asymmetry in Arabic: a consequence of stems in
paradigms. Language 79 (2): 317-355.
Gafos, A. To appear. Dynamics in grammar: Comment on Ladd and Ernestus &
Baayen. In L. Goldstein, D. Whalen, C. Best & S. Anderson (eds.), Varieties
of Phonological Competence (Laboratory Phonology 8). Berlin, New York:
Mouton de Gruyter.
Gelder, T. van and R. Port. 1995. It’s about time: An overview of the dynamical
approach to cognition. In R. Port and T. van Gelder (eds.) Mind as motion:
Explorations in the dynamics of cognition. Cambridge, MA: MIT Press.
Gick, B. 1999. The Articulatory Basis of Syllable Structure: A Study of English
Glides and Liquids. Doctoral dissertation, Yale University, New Haven, CT.
Gick, B. 2002. The use of ultrasound for linguistic phonetic fieldwork. Journal of
International Phonetic Association, 32: 113-121.
Goldsmith, J. A. 1976. Autosegmental phonology. Doctoral dissertation, MIT,
Cambridge, MA. [Published by Garland Press, New York, 1979].
Goodwin, B. 1994. How the leopard changed its spots: the evolution of complexity.
New York: Simon & Schuster.
Gordon, M. 1999. The “neutral” vowels of Finnish: How neutral are they?
Linguistica Uralica 1: 17-21.
Gósy, M. 2000. Vowel harmony: Interrelation of speech production, speech
perception, and the phonological rules. In Ch. Kreidler (ed.), Phonology:
Critical Concepts in Linguistics, Vol. VI, 124-146. London and New York:
Routledge.
Guy, G. To appear. Language Variation and Linguistic Theory. Oxford: Blackwell.
Haken, H. 1990. Synergetics as a tool for the conceptualization and mathematization
of cognition and behavior – How far can we go?. In H. Haken and M. Stadler
(eds.), Synergetics of Cognition, 2-31. Heidelberg: Springer-Verlag.
Hall, N. 2004. Implications of vowel intrusion for a gestural grammar. Ms.,
University of Haifa.
Hanon, B. and M. Ruth. 1997. Modeling dynamical biological systems. New York:
Springer-Verlag.
Hayes, B. 1995. Metrical stress theory: Principles and case studies. Chicago:
University of Chicago Press.
Hayes, B. 2004. Stochastic Phonological Knowledge: The Case of Hungarian Vowel
Harmony. Paper presented at NYU.
Hulst, H.G. van der. 1985. Vowel harmony in Hungarian. A comparison of segmental
and autosegmental analyses. In H. van der Hulst and N. Smith (eds.),
Advances in Nonlinear Phonology, 267-303. Dordrecht: Foris.
Hulst, H.G. van der. 1988. The geometry of vocalic features. In H. van der Hulst and
N. Smith (eds.), Features, segmental structure and harmony processes, Vol 2,
77-125. Dordrecht: Foris.
Hulst, H.G. van der and N. Smith (1986). On neutral vowels. In K. Bogers, H. van
der Hulst and N. Smith (eds.), The Phonological Representation of
Suprasegmentals. Dordrecht: Foris, 233-281.
Hulst, H.G. van der and J. van de Weijer 1995. Vowel Harmony. In J. Goldsmith
(ed.), The handbook of Phonological Theory, 495-534. Oxford: Blackwell.
Honorof, D. N. and C.P. Browman. 1995. The center or edge: How are consonant
clusters organized with respect to the vowel? In K. Elenius and P. Branderud
(eds.), Proceedings of the XIIIth International Congress of Phonetic Sciences
Vol. 3, 552-555. Stockholm: KTH and Stockholm University
Iskarous, K. Submitted. Edge detection and shape measurement of the edge of the
tongue. Clinical Linguistics and Phonetics.
Itô, Junko, Mester, A., and J. Padgett. 1995. Licensing and underspecification in
Optimality Theory. Linguistic Inquiry 26: 571-613.
Johnson, K. 1996. Speech perception without speaker normalization. In K. Johnson
and J.W. Mullennix (eds.), Talker Variability in Speech Processing. San
Diego: Academic Press.
Ka, Omar. 1988. Wolof phonology and morphology: A non-linear approach. Doctoral
dissertation, University of Illinois, Urbana.
Kaburagi, T. and M. Honda 1997. Calibration methods of voltage-to-distance
function for an electro-magnetic articulometer (EMA) system. Journal of the
Acoustical Society of America 101: 2391-2394.
Kálmán, L. and A. Nádásdy. 1994. A Hangsúly [Stress]. In F. Kiefer (ed.)
Strukturális magyar nyelvtan II. Fonológia. [A Structural Grammar of
Hungarian II. Phonology.] Budapest: Akadémiai Kiadó.
Kaun, A. 1995. The typology of rounding harmony: An Optimality Theoretic
approach. Doctoral dissertation, UCLA. [Published as UCLA Dissertations in
Linguistics, No. 8.].
Kenstowicz, M. 1994. Phonology in generative grammar. Oxford: Blackwell.
Kaye, J., Lowenstamm, J., and J-R. Vergnaud. 1985. Constituent structure and
Government in Phonology. Phonology 7: 193-231.
Kelso, S., E. Saltzman, and B. Tuller. 1986. The dynamical perspective on speech
production: data and theory. Journal of Phonetics 14: 29-59.
Kelso, S., M. Ding, and G. Schöner. 1993. Dynamic pattern formation: a primer. In L.
Smith and E. Thelen (eds.), A dynamic systems approach to development,
13-50. Cambridge, MA: MIT Press.
Kim, J. 2000. Mind in a Physical World. Cambridge, MA: MIT Press.
Kiparsky, P. 1973. Phonological Representations. In O. Fujimura (ed.), Three
Dimensions of Linguistic Theory, pp. 1-135, Tokyo: TEC Co.
Kiparsky, P. 1982. Vowel harmony. Ms., MIT.
Kiparsky, P. and K. Pajusalu. 2003. Toward a typology of disharmony. The
Linguistic Review 20: 217-241.
Kirchner, R. 1993. Turkish Vowel Harmony and Disharmony: An Optimality
Theoretic Account. [ROA-4].
Kirchner, R. 1999. Preliminary thoughts on 'phonologization' within an
exemplar-based speech processing system. In M. Gordon (ed.), UCLA working papers
in linguistics 1, 207-231.
Kornai, A. 1991. Hungarian vowel harmony. In I. Kenessei (ed.), Approaches to
Hungarian III, 183-240.
Kosslyn, S. M. 1978. Imagery and internal representation. In E. Rosch and B. Lloyd
(eds.), Cognition and categorization, 217-257. Hillsdale, NJ: Erlbaum
Associates.
Krakow, R.A. 1989. The articulatory organization of syllables: A kinematic analysis
of labial and velar gestures. Doctoral dissertation, Yale University, New
Haven, CT.
Krämer, M. 2001. Yucatec Maya Vowel Alternations – Harmony as Syntagmatic
Identity. Ms. Heinrich-Heine-Universität, Düsseldorf.
Krämer, M. 2002. Local constraint conjunction and neutral vowels in Finnish
harmony. In K. Brunger (ed.), Belfast Working Papers in Language and
Linguistics Vol. 15, 38-64. University of Ulster.
Labov, W., M. Karen, and C. Miller. 1990. Near mergers and the suspension of
phonemic contrast. Language Variation and Change 3: 33-74.
Ladefoged, P. 1975. A Course in Phonetics. (1993, 3rd ed.). New York: Harcourt,
Brace, Jovanovich.
Leben, W. 1973. Suprasegmental phonology. Doctoral dissertation, MIT, Cambridge,
MA.
Liberman, A. M., K. S. Harris, H. S. Hoffman, and B. C. Griffith. 1957. The
discrimination of speech sounds within and across phoneme boundaries.
Journal of Experimental Psychology 54: 358-368.
Lightner, T. 1965. On the description of vowel and consonantal harmony. Word 21:
244-250.
Lindblad, V. 1990. Neutralization in Uyghur. MA Thesis, University of Washington.
Lombardi, L. 1999. Positional faithfulness and voicing assimilation in Optimality
Theory. Natural Language and Linguistic Theory 17: 267-302.
MacKay, I. 1987. Phonetics, the science of speech production. Austin, TX: Pro-Ed.
Magdics, K. 1969. Studies in the Acoustic Characteristics of Hungarian Speech
Sounds. Bloomington: Indiana University.
Magen, H. 1984. Vowel-to-Vowel coarticulation in English and Japanese. Journal of
the Acoustical Society of America, Supplement 1, 75: S41.
Magen, H. 1997. The extent of vowel-to-vowel coarticulation in English. Journal of
Phonetics 25: 187-205.
McCarthy, J. 1989. Linear order in phonological representation. Linguistic Inquiry
20: 71-99.
McCarthy, J. 1998. Morpheme Structure Constraints and Paradigm Occultation. In
M. C. Gruber, D. Higgins, K. Olson, and T. Wysocki (eds.), CLS 32, vol. II:
The Panels, 123–150. Chicago: Chicago Linguistic Society.
McCarthy, J. 2003. OT constraints are categorical. Phonology 20: 75-138.
McCarthy, J. and A. Prince. 1993. Generalized alignment. In G. Booij and J. van
Marle (eds.), Yearbook of Morphology, 79-153. Kluwer: Boston. [ROA-7].
Medio, A. and M. Lines. 2001. Non-linear dynamics: A primer. Cambridge:
Cambridge University Press.
Munhall, K. and A. Löfqvist. 1992. Gestural aggregation in speech: Laryngeal
gestures. Journal of Phonetics 20: 111-126.
Nam, H. 2004. A competitive, coupled oscillator model of moraic structure. Paper
presented at Labphon 9.
Nam, H., L. Goldstein, E. Saltzman, and D. Byrd. TADA: An enhanced, portable
Task Dynamics model in MATLAB. Poster presented at 75th ASA meeting,
New York.
Ní Chiosáin, M. and J. Padgett. 2001. Markedness, segment realization, and locality
in spreading. In L. Lombardi (ed.), Constraints and representations: segmental
phonology in Optimality Theory, 118-156. Cambridge: Cambridge University
Press. [ROA-188].
Odden, D. 1994. Adjacency Parameters in Phonology. Language 70: 289-330.
Ohala, J. 1990. The phonetics and phonology of aspects of assimilation. In J.
Kingston and M. Beckman (eds.), Papers in Laboratory Phonology I: Between
the grammar and the physics of speech. Cambridge: Cambridge University
Press. 258-275.
Ohala, J. 1994a. Towards a universal, phonetically-based, theory of vowel harmony.
Proceedings of International Conference on Spoken Language Processing,
491-494.
Ohala, J. 1994b. Hierarchies of environments for sound variation; plus implications
for “neutral” vowels in vowel harmony. Acta Linguistica Hafniensia 27(2):
371-382.
Öhman, S. 1966. Coarticulation in VCV utterances: Spectrographic measurements.
Journal of the Acoustical Society of America 39: 151-168.
Papp, F. 1982. Foreign Language environment and linguistic change: Two examples.
In F. Kiefer (ed.), Hungarian Linguistics, 427-445. Amsterdam: John
Benjamins.
Penny, R. 1969. Vowel harmony in the speech of the Montes de Pas (Santander).
Orbis 18: 148-166.
Penny, R. 1970. Mass nouns and metaphony in the dialects of north-western Spain.
Archivum Linguisticum 1: 21-30.
Percival, I. and D. Richards. 1982. Introduction to dynamics. Cambridge: Cambridge
University Press.
Perkell, J., Cohen, M., Svirsky, M., Matthies, M., Garabieta, I., and M. Jackson.
1992. Electromagnetic midsagittal articulometer (EMMA) systems for
transducing speech articulatory movements. Journal of the Acoustical Society
of America 92: 3078-3096.
Pierrehumbert, J. 2000. The phonetic grounding of phonology. Les Cahiers de l'ICP,
Bulletin de la Communication Parlée 5: 7-23.
Pierrehumbert, J. 2001. Exemplar dynamics: word frequency, lenition, and contrast.
In J. Bybee and P.J. Hooper (eds.), Frequency and the Emergence of
Linguistic Structure, 137-158. Amsterdam: John Benjamins.
Pierrehumbert, J. 2003. Probabilistic Phonology: Discrimination and Robustness. In R.
Bod, J. Hay, and S. Jannedy (eds.), Probability Theory in Linguistics.
Cambridge, MA: MIT Press.
Pierrehumbert, J., M. E. Beckman, and D.R. Ladd. 2001. Conceptual foundation of
phonology as a laboratory science. In N. Burton-Roberts, P. Carr, G. Docherty
(eds.), Phonological Knowledge, 273-304. Oxford: Oxford University Press.
Piggott, G. L. 1996. Implications of Consonant Nasalization for a Theory of
Harmony. Canadian Journal of Linguistics 41: 141-174.
Prince, A. and P. Smolensky. 1993. Optimality Theory: Constraint Interaction in
Generative Grammar. Ms., Rutgers University and University of Colorado.
Recasens, D. 1987. An acoustic analysis of V-to-C and V-to-V coarticulatory effects
in Catalan and Spanish VCV sequences. Journal of Phonetics 15: 299-312.
Recasens, D. 1999. Lingual coarticulation. In W.J. Hardcastle and N. Hewlett (eds.),
Coarticulation: Theory, Data and Techniques in Speech Production, 78-104.
Cambridge: Cambridge University Press.
Ringen, C.O. 1975. Vowel harmony: Theoretical implication. Doctoral dissertation,
Indiana University. [Published by Garland Press, New York, 1988].
Ringen, C.O., and M. Kontra. 1989. Hungarian neutral vowels. Lingua 78: 181-191.
Ringen, C. O., and R. M. Vago. 1998. Hungarian vowel harmony in Optimality
Theory. Phonology 15: 393-416.
Ringen, C. O. and O. Heinämäki. 1999. Variation in Finnish vowel harmony: An OT
account. Natural Language and Linguistic Theory 17: 303-337.
Ritter, N. 1995. The role of Universal Grammar in phonology: A Government
Phonology approach to Hungarian. Doctoral dissertation, New York
University.
Saeed, J. 1999. Somali. Amsterdam: John Benjamins.
Sagey, E. 1986. The representation of features and relations in non-linear phonology.
Doctoral dissertation, MIT, Cambridge, MA. [Published by Garland Press,
New York, 1990].
Saltzman, E. and S. Kelso. 1987. Skilled actions: A task-dynamic approach.
Psychological Review 94(1): 84-106.
Saltzman, E. and K. Munhall. 1989. A dynamic approach to gestural patterning in
speech production. Ecological Psychology 1(4): 333-382.
Siptár, P. and M. Törkenczy. The Phonology of Hungarian. Oxford: Oxford University
Press.
Smolensky, P. 1993. Harmony, markedness and phonological activity. Paper
presented at the Rutgers Optimality Workshop-1, Rutgers University, October
1993. [ROA-87].
Sproat, R. and O. Fujimura. 1993. Allophonic variation in English /l/ and its
implication for phonetic implementation. Journal of Phonetics 21: 291-311.
Steriade, D. 1987. Locality conditions and feature geometry. In Papers of NELS 17,
595-617. Amherst, MA: GLSA.
Steriade, D. 1995. Underspecification and Markedness. In J. Goldsmith (ed.), The
handbook of Phonological Theory, 114-174. Oxford: Blackwell.
Steriade, D. 1997. Phonetics and Phonology: The Case of Laryngeal Neutralization.
Ms. UCLA.
Stevens, K. 1989. On the quantal nature of speech. Journal of Phonetics 17: 3-45.
Stone, M. 1997. Laboratory Techniques for Investigating Speech Articulation. In J.
Hardcastle and J. Laver (eds.), The Handbook of Phonetic Sciences, 11-32.
Oxford: Blackwell.
Stone, M. To appear. A summary of ultrasound instrumentation. Clinical Linguistics
and Phonetics.
Stone, M. and A. Lundberg. 1999. Three-dimensional tongue reconstruction:
Practical considerations for ultrasound data. Journal of the Acoustical Society
of America 106: 2858-2867.
Svantesson, J. 1985. Vowel harmony shift in Mongolian. Lingua 67: 283-327.
Suomi, K., J. McQueen, and A. Cutler. 1997. Vowel harmony and speech
segmentation in Finnish. Journal of Memory and Language 36: 422-444.
Tiede, M.K., E. Vatikiotis-Bateson, P. Hoole, and H. Yehia. 1999. Magnetometer
data acquisition and analysis software for speech production research. ATR
Technical Report TR-H 1999. ATR Human Information Processing Labs.
Topping, Donald M. 1968. Chamorro vowel harmony. Oceanic Linguistics 7: 67-79.
Trubetskoy, N. 1939. Grundzüge der Phonologie. [Osnovy fonologii. 1960. Moscow:
Izdatelstvo Inostrannoj Literatury.]
Tu, P. 1992. Dynamical systems: an introduction with applications in economics and
biology. Berlin, New York: Springer-Verlag.
Vago, R. M. 1980. The Sound Pattern of Hungarian. Washington: Georgetown
University Press.
Vaux, B. 2001. Disharmony and derived transparency in Uyghur vowel harmony. In
M. Hirotani, A. Coetzee, N. Hall, and J-Y. Kim (eds.), Proceedings of NELS
30, 671-698. Amherst: GLSA.
Välimaa-Blum, R. 1999. A feature geometric description of Finnish vowel harmony
covering both loans and native words. Lingua 108: 247-268.
Vroomen, J., J. Tuomainen, and B. de Gelder. 1998. The roles of word stress and
vowel harmony in speech segmentation. Journal of Memory and Language
38: 133-149.
Walker, R. 1998. Nasalization, neutral segments, and opacity effects. Doctoral
dissertation, University of California, Santa Cruz. [Published by Garland
Press, New York, 2000].
Walker, R. 2001. Positional markedness in vowel harmony. In C. Fery, A. D. Green,
and R. van de Vijver (eds.), Proceedings of HILP 5. Linguistics in Potsdam,
Vol. 12, 212-232. University of Potsdam.
Walker, R. 2003. Reinterpreting transparency in nasal harmony. In J. van de Weijer,
V. van Heuven, and H. van der Hulst (eds.), The Phonological Spectrum, Part
I: Segmental Structure, 37-72. Amsterdam: John Benjamins.
Wiik, K. 1995. Finno-Ugric prosodic substrata in Germanic languages and vice versa.
Proceedings of the XIIIth International Congress of Phonetic Sciences, Vol. 4,
168-171.
Wood, S. 1979. A radiographic analysis of constriction location for vowels. Journal
of Phonetics 7: 25-43.
Wood, S. 1986. The Acoustic Significance of Tongue, Lip, and Larynx Maneuvers in
Rounded Palatal Vowels. Journal of the Acoustical Society of America 80:
391-401.