Journal of the Phonetic Society of Japan (音声研究), Vol.14 No.1, April 2010, pp.00–00

Optimality Theory: Experimental Extensions

Jeroen van de WEIJER*

Abstract: This paper discusses a methodological proposal for grounding markedness constraints in phonetics (ease of articulation) and faithfulness constraints in psycholinguistics (efficiency of word recognition). The basic tenet of Optimality Theory is that all constraints are universal and that variation between individual languages resides only in the constraint hierarchy (= the grammar); it is precisely a grounding-based approach that derives this tenet in substance, rather than merely assuming it. Furthermore, the ordering of constraints in a particular hierarchy depends not only on grounding in phonetics and psycholinguistics but also on grounding in typological predictions. The paper therefore focuses on how such a grounding-based approach can be extended experimentally, and also considers its implications for grammatical theory as a whole.

Key words: Optimality Theory, markedness, phonetics, faithfulness, psycholinguistics, typology, experimental linguistics

1. Introduction

Recent times have witnessed two parallel but contradictory trends in phonology. First, the field is becoming increasingly specialized: some phonologists have come to concentrate almost exclusively on particular subfields, such as tonology, vowel harmony, intonation studies, metrical structure, or the interface of phonology with morphology or syntax (van Oostendorp and van de Weijer 2005). At the same time, there is an increasing need of and call for a “holistic” theory of linguistics, in which researchers cross disciplinary boundary lines (see Jackendoff 2007, van de Weijer 2009a). Due to the advance of psycholinguistics, there is increased interest in the competence and performance of single individuals on specific occasions, and in the study of variation on a small scale, in communities or across generations. Increased technological possibilities, e.g. in phonetics, have made possible empirical verification and experimental testing that were unheard of ten or twenty years ago, as well as other kinds of research, such as corpus linguistics and increasingly realistic modelling in computational theories such as connectionism.
As a result, further integration between disciplines is called for and now seems more feasible than ever. In this paper I will address one area in which theoretical phonology, specifically Optimality Theory (OT), might connect with more “concrete” approaches such as phonetics and psycholinguistics.

2. Grounding Optimality Theory

The introduction of OT (Prince and Smolensky 1993/2004) represented a “paradigm revolution” (Kuhn 1962) in many ways. OT proposed a new view of how phonology (and other aspects of language) works, and was highly successful in its application to a number of old problems, especially where these involved inherent conflict, such as the sometimes contradictory demands of the prosodic and segmental levels, or the need to fit morphology into a particular accepted phonological shape. However, this did mean that many questions that were fashionable before the advent of OT suddenly no longer received the attention they had previously enjoyed (see again van Oostendorp and van de Weijer 2005 for discussion and partial reparation). These included some of the basic issues in phonology, such as the content of the set of distinctive features, the “arity” of these features, their universality, and their organization into a geometry. The same goes for higher prosodic levels, such as the structure of the syllable, questions regarding foot structure and inventory, and the role of such notions as government and licensing.

* Full Professor, College of English Language and Literature, Shanghai International Studies University

[Special feature: “Experimental Verification of Optimality Theory and Theoretical Consolidation of Experimental Phonetics”]

One piece of unjustified criticism that has been levelled at Optimality Theory is that in OT “anything goes”, which seems a wrongful interpretation of the fact that all constraints in OT are directed at the level of the output, while inputs, or underlying representations, are constraint-free (“Richness of the Base”).
This position underestimates the extent to which all parts of OT can and should be justified: markedness constraints should be grounded from a phonetic point of view, faithfulness constraints should be anchored in the basics of word recognition, constraint hierarchies make predictions as to the typological recurrence of linguistic patterns, and inputs should be constrained by important principles such as Lexicon Optimization. In all these respects, previous theories, such as standard generative phonology, do not even begin to compare with OT in terms of accountability and restrictiveness. It is natural to extend this accountability to candidate outputs (see below).

Of course, the traditional position (Prince and Smolensky 1993/2004) is that constraints are universal. This does not entail at all that “constraints are part of Universal Grammar and therefore don’t have to be learned”, which in turn could again be (spuriously) equated with “constraints are part of Universal Grammar and therefore don’t have to be motivated”. An increasing number of linguists have taken the more interesting position and require that constraints be grounded (see e.g. Archangeli and Pulleyblank 1994, Hayes, Kirchner and Steriade 2004). In fact, the basic model of OT can be derived from a simple model of communication between a speaker and a listener (or communication in other media, such as sign language): speakers wish to put their message across with as little articulatory effort as possible, while listeners wish to retrieve the message with the least amount of perceptual effort (Jurafsky et al. 2001, van de Weijer 2009a; see also Passy 1891, cited in Boersma 1998). It makes sense that speakers (who are listeners too) take into account the needs of listeners, so that grammar will typically mediate between the two opposing needs and form a kind of compromise.
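As an illustration of how such a compromise between speaker-oriented and listener-oriented demands can be computed, the following is a minimal sketch in Python, not an implementation taken from the literature cited here. It assumes a toy input /pat/, three hand-picked candidates, and deliberately crude, length-based definitions of MAX and DEP; the function names (`no_coda`, `max_io`, `dep_io`, `evaluate`) are purely illustrative. Strict domination is modelled by comparing violation profiles lexicographically.

```python
from typing import Callable, List, Sequence

# A constraint maps an (input, output) pair to a number of violations.
Constraint = Callable[[str, str], int]

VOWELS = set("aeiou")  # assumption: a toy five-vowel inventory


def no_coda(inp: str, out: str) -> int:
    """Markedness (speaker-oriented): one violation if the output ends in a consonant."""
    return int(bool(out) and out[-1] not in VOWELS)


def max_io(inp: str, out: str) -> int:
    """Faithfulness (listener-oriented): toy count of deleted segments, by length only."""
    return max(len(inp) - len(out), 0)


def dep_io(inp: str, out: str) -> int:
    """Faithfulness (listener-oriented): toy count of inserted segments, by length only."""
    return max(len(out) - len(inp), 0)


def evaluate(inp: str, candidates: Sequence[str], ranking: List[Constraint]) -> str:
    """Return the candidate whose violation profile is best under strict domination.

    Profiles are tuples ordered from the highest-ranked constraint downwards,
    so Python's lexicographic tuple comparison implements strict domination.
    """
    def profile(cand: str) -> tuple:
        return tuple(constraint(inp, cand) for constraint in ranking)

    return min(candidates, key=profile)


if __name__ == "__main__":
    candidates = ["pat", "pa", "pata"]  # faithful, deletion, epenthesis
    print(evaluate("pat", candidates, [max_io, dep_io, no_coda]))  # pat
    print(evaluate("pat", candidates, [no_coda, max_io, dep_io]))  # pata
    print(evaluate("pat", candidates, [no_coda, dep_io, max_io]))  # pa
```

Reranking the same three constraints yields the faithful form, epenthesis, or deletion. This is the sense in which a language-particular hierarchy mediates between the two opposing needs.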
Since both speaking and listening are common to the human species (neither ears nor mouths nor brains differ much across the planet), constraints derived from these basic activities must be universal too. In this way, a theory which demands that its constraints be grounded derives rather than stipulates universality. Let us, in the next subsections, examine the subparts of OT in more detail, with special attention to experimental verification and grounding.

2.1 Markedness

Markedness constraints should, and mostly can, be grounded in phonetics, taking into account both ease of articulation and perception. All well-known markedness constraints, such as Identical Cluster constraints (Pulleyblank 1997) and constraints relating to syllable well-formedness (ONSET, NOCODA, *COMPLEX), can be motivated in this way. Two factors deserve special attention. First, the formulation of these constraints depends crucially on the theories of segmental structure and syllable structure adopted. In a theory such as “CVonly” (Lowenstamm 1996), for instance, in which a word like priest is represented as in (1), there is no need for any of the three syllable structure constraints mentioned just above, because there are no complex clusters and no codas to begin with, and C-positions and V-positions always come in pairs:

(1)  C  V  C  V  C  V  C  V  C  V
     |  |  |  |  |  |  |  |  |  |
     x  x  x  x  x  x  x  x  x  x
     |     |  |           |     |
     p     r  i           s     t

However, in such a theory, constraints or principles of a similar nature are necessary to explain why in English the various empty positions in (1) are allowed to persist, and why languages differ in which structures they allow and which they do not. There is a large body of literature in Government Phonology and related frameworks (see e.g. van der Hulst and Ritter 1999 for discussion) which tries to do exactly this. In the end, there may be no fundamental difference between this exercise and the standard OT programme.
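To make the three syllable-structure constraints mentioned above concrete, the following Python sketch counts violations of ONSET, NOCODA and *COMPLEX for a conventionally syllabified form. It is an illustration under stated assumptions only: syllables are supplied by hand as onset, nucleus and coda strings, each character stands for one segment, priest is transcribed in simplified form as "prist", and *COMPLEX is assumed to be violated once per complex margin. The names `Syllable` and `violations` are hypothetical.

```python
from typing import Dict, List, NamedTuple


class Syllable(NamedTuple):
    """A hand-syllabified unit; each character is taken to be one segment."""
    onset: str
    nucleus: str
    coda: str


def violations(syllables: List[Syllable]) -> Dict[str, int]:
    """Count violations of three syllable well-formedness constraints."""
    return {
        # ONSET: every syllable should have an onset.
        "ONSET": sum(1 for s in syllables if len(s.onset) == 0),
        # NOCODA: no syllable should have a coda.
        "NOCODA": sum(1 for s in syllables if len(s.coda) > 0),
        # *COMPLEX: one violation per complex margin (assumed definition).
        "*COMPLEX": sum((len(s.onset) > 1) + (len(s.coda) > 1) for s in syllables),
    }


if __name__ == "__main__":
    priest = [Syllable("pr", "i", "st")]                  # simplified transcription
    pata = [Syllable("p", "a", ""), Syllable("t", "a", "")]  # hypothetical CV.CV form
    print(violations(priest))  # {'ONSET': 0, 'NOCODA': 1, '*COMPLEX': 2}
    print(violations(pata))    # {'ONSET': 0, 'NOCODA': 0, '*COMPLEX': 0}
```

Under this conventional parse, the monosyllable incurs violations that a sequence of plain CV syllables does not. A representation built exclusively from CV pairs would need some other device to regulate its empty positions.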
It remains to be seen whether other theories of syllable structure (e.g. mora theory, Hayes 1989, or X-bar syllable structure, Levin 1985) offer particular advantages in this respect.

Second, it should be noted that feature frameworks have a specific contribution to make where the formulation of segmental constraints is concerned. Consider, for instance, the constraint *VOICE (cf. Kager 1999, p.40): obstruents must not be voiced, or in terms of distinctive features: *[–son, +voice]. There is a large body of research on the arity of the feature [voice], which suggests that there is only one value ([voice]; see Wetzels and Mascaró 2001 for discussion and an opposing view). This entails that the constraint *VOICE can be simplified to *[–son, voice] (or *[son, voice], on the view that sonorants and obstruents form two distinct, equipollently opposed, classes). Crucially, there is no room in the theory for a corresponding constraint *NonVoice (*[–son, –voice]), because there is no object [–voice] that can be referred to in phonological constraints. In other words, the results obtained in unary frameworks (e.g. Dependency Phonology, Anderson and Ewen 1987) remain highly valid in OT. To the extent that those results and research in OT point in the same direction, they provide confirmation of both approaches, both in terms of the content of phonological representations and the content of constraints.

2.2 Faithfulness

Faithfulness constraints, such as DEP, MAX, IDENT and LINEARITY, penalize changes to inputs. Faithfulness constraints can be motivated from a listener perspective. There is a direct connection to psycholinguistics here, in terms of (speed of) word recognition, because changes that have occurred in outputs make it harder to relate them to a corresponding input, that is, roughly, harder “to recognize them”, all things being equal.
(In actual fact, things are hardly ever going to be “equal”, because in context markedness constraints will affect the input candidate; in such cases, the output will be matched with an input form according to the grammar.) Note that an important assumption is made here, viz. that there is one input form to which surface outputs can be compared. What if there is no single abstract input form, as in theories like Exemplar Theory (see e.g. Bybee 2006 and many references cited there)? In Exemplar Theory, (multiple) surface forms are stored directly in some degree of phonetic detail. No abstraction takes place but only categorization (see e.g. Dell 2000).¹ In that case, two approaches are possible. In the first, faithfulness could still be evaluated by comparing an output form to which a listener is exposed with any of the exemplars present in the lexicon. This would probably diminish, but not obviate, the need for a faithfulness component in grammar: some “recognition mechanism”, involving a likeness evaluation metric akin to faithfulness, is still needed. A second approach would be to consider the possibility that some degree of abstraction among exemplars still takes place, so that there is still a role for an abstract, generalized form with special status, to which output forms may be compared. This abstract form may or may not correspond to the traditional “underlying form” of generative grammar or the “input” in Optimality Theory. In recent work, Sloos (2009; see references cited there) investigates the latter approach for a variable set of data in Dutch and finds that it makes the correct predictions. In short, the operation of faithfulness constraints is intimately related to our theory of the lexicon, to which we turn next.

2.3 The lexicon

One aspect of OT that might be improved upon in terms of psycholinguistic realism is its assumptions about the lexicon.
A lexicon on which no constraints hold, as well as an infinite set of generated candidates which need to be checked by the Evaluator, is often criticized for not being “psycholinguistically realistic” (see Goldrick, to appear, for discussion of this notion). In generative grammar, the lexicon consisted of a list of words of the language, which specified only nonredundant information. All predictable alternations (and other properties of the output) were supplied by rule. That is, generative grammar is a so-called dual-processing model. Whether this is adequate is still a matter of great controversy (see e.g. Pinker 1999, Plaut 2003 for discussion). The question is whether OT is necessarily a dual-processing model like its generative predecessors or whether it is in fact more malleable and could contribute to a “compromise model” which is both theoretically and psycholinguistically attractive (see Alderete and Frisch 2009 for recent discussion).

Two remarks on the nature of underlying representations should be made here. First, instead of solitary inputs, why shouldn’t the lexicon consist, as in Exemplar Theory, of the words (and phrases) that a speaker/listener has actually been exposed to, with a rather good reflection of what these sound like and in which contexts they are used, and with words that have been encountered often stored more robustly than less frequently encountered words? In such a conception, both phonological and semantic generalizations are made between words, leading to a multidimensional array. Grammar does still play a role: there is a selection mechanism to pick out one form, suited to its specific phonological environment and to the required speech style and other contextual factors, while incoming forms must be matched to existing exemplars (see above). In this approach, both production and perception are subject to a language-specific hierarchy of constraints: both can be well expressed as an OT grammar.
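The selection mechanism just described can be sketched as follows. This is a minimal illustration in Python under several assumptions that are not part of the proposal itself: the class `ExemplarCloud`, the contextual penalty `final_cluster_penalty`, the ASCII forms "west" and "wes", and the token counts are all hypothetical. An exemplar is selected first by how well it fits its phonological context, and ties are resolved in favour of the more robustly stored (more frequent) exemplar.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable

VOWELS = set("aeiou")  # assumption: a toy vowel inventory over ASCII forms


@dataclass
class ExemplarCloud:
    """Stored surface forms of one lexical item, with token counts."""
    tokens: Counter = field(default_factory=Counter)

    def store(self, form: str) -> None:
        """Each exposure strengthens (entrenches) the stored exemplar."""
        self.tokens[form] += 1

    def select(self, penalty: Callable[[str], int]) -> str:
        """Pick the exemplar with the lowest contextual penalty;
        among equally good exemplars, prefer the most frequent one."""
        return min(self.tokens, key=lambda f: (penalty(f), -self.tokens[f]))


def final_cluster_penalty(form: str, next_segment: str) -> int:
    """Hypothetical contextual markedness: penalise a word-final consonant
    cluster when the following word begins with a consonant."""
    ends_in_cluster = (
        len(form) >= 2 and form[-1] not in VOWELS and form[-2] not in VOWELS
    )
    return int(ends_in_cluster and next_segment not in VOWELS)


if __name__ == "__main__":
    cloud = ExemplarCloud()
    for _ in range(7):
        cloud.store("west")  # full form, heard more often (invented count)
    for _ in range(3):
        cloud.store("wes")   # reduced form, heard less often (invented count)

    # Before a vowel both forms fit equally well, so frequency decides.
    print(cloud.select(lambda f: final_cluster_penalty(f, "e")))  # west
    # Before a consonant the reduced form avoids the penalised cluster.
    print(cloud.select(lambda f: final_cluster_penalty(f, "s")))  # wes
```

Ordering the penalty before frequency is a design choice made for the sketch; a weighted combination of the two would be an equally conceivable alternative.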
A second remark concerns the specification of forms which alternate. Consider the example of the English plural, which alternates between [s], [z] and [ɪz]. Will a plural morpheme be stored in the lexicon? In Exemplar Theory, it will not be, which accords well with word-based theories of the lexicon (e.g. Bybee 1988).² If it is, as in morpheme-based models and dual-processing theories, there are two possibilities for such lexical entries in a surface-oriented approach. Consider the point of view of the learner: she is exposed to three different allomorphs with exactly the same meaning. One possibility is that all three forms are stored and applied to new forms in conformity with emergent generalizations or the constraint grammar, or that the most frequent form is stored and adapted to novel situations in analogical fashion. The second logical possibility would be that the learner stores the “lowest common denominator”, e.g. a “coronal fricative” underspecified for voice. This possibility, which can be shown to work (van de Weijer 2009b), would involve underspecification, again showing the importance of considering the role of theories of segmental structure in OT and the emergent lexicon.³

Finally, we must allow for the role of frequency (cf. also Diessel 2007 for a review of the role of frequency in language acquisition, language use and diachronic change). Exemplar Theory offers a direct translation of the notion of token frequency, namely as the degree of entrenchment of tokens in exemplar clouds, and of type frequency, which is represented by the connections between exemplars which have, for instance, the same affix and are therefore related in meaning. These frequencies could be (partially and/or indirectly) related to the weight of constraints in novel approaches in Optimality Theory, such as Harmonic Serialism (see e.g. McCarthy 2008).
It is also possible that these weights are not related to constraints themselves, but are, rather, a property of lexical entries (see below).

2.4 The structure of the grammar

The architecture of OT, and in particular the notion of “freedom of generation” in the component GEN, does not lend itself well to psycholinguistic implementation. The generation of an infinite number of candidates which must be evaluated, although perhaps unproblematic from a computational linguistics point of view (e.g. Bíro 2006), does not fit well with the idea of real language production, which takes place in real time. In this respect, Harmonic Serialism (see again McCarthy 2008 and references cited there) also presents an improvement, since it dramatically decreases the power of GEN: instead of generating an infinite number of candidates, this component is only allowed to make one change to a candidate at a time; if this change improves markedness, further changes are possible, but if there is no markedness improvement, the derivation stops. As before, markedness and faithfulness constraints play a crucial role in evaluation. On the one hand, Harmonic Serialism re-introduces the concept of derivation (and rule ordering, albeit in a precisely limited and restrictive way) into OT, which many researchers have been arguing for ever since OT was first introduced. On the other hand, it is also welcome because it permits a much smaller role for grammar, more compatible with a real-time approach. Thus, in this conception of grammar it is important again to delve into segmental structure, so that it can be defined what counts as a “single step” in a derivation (see also McCarthy 2009).

Secondly, any constraint grammar makes predictions in terms of the “factorial typology”; in other words, the introduction of a new constraint must be motivated by exploring its typological consequences. In this sense, too, OT grammars are highly accountable.

3. Conclusion

As a surface-oriented theory, Optimality Theory stands a better chance than some of matching up with psycholinguistically realistic models of language production and perception. One area in which it could accommodate the latter is its theory of the lexicon, which might be formulated more realistically than is presently usually the case. Rather than an unstructured list without redundant information (as in earlier generative grammar), the lexicon will be a rich repertoire, with a wealth of redundant information on lexical entries and manifold relations (phonological as well as semantic) between them. It remains to be seen whether morphemes are separate entries in such a lexicon. Lexical entries may be strongly entrenched due to frequent use and therefore easily retrieved and recognized. Lexical entries may also be forgotten or become obscured. No two individuals have the same lexical repertoire, paving the way for interspeaker variation and language change. Stylistic variation (or other forms of intraspeaker variation) can be regarded as the operation of slightly different grammars (or slightly differently weighted constraints therein) within the same speaker. The fact that lexicons are shared to a large extent makes it reasonable to speak of the “same language” for a group of speakers. Smaller differences can be referred to as “dialect differences”, noting that terms such as “language” or “dialect” are essentially meaningless.

If we accept that there is not a single underlying form for a given lexical entry, but a small (or large) “exemplar cloud”, it is still the case that one of these exemplars must be picked out for production on any given occasion. Some exemplars within the cloud may be more prominent and therefore stand a better chance of being selected (and of being selected more quickly): this is the frequency effect which has been observed in many ways.
Still, the planned output will appear in a particular phonological context, be uttered in a particular speech style, and be responsive to its context in many other ways. That is, the selection of the output must meet, in the best possible way, a number of constraints at the same time. For this purpose, and for the reverse procedure, i.e. that of matching an output form with an already stored exemplar, OT remains eminently suited.

Acknowledgements

This paper was presented at the August 2009 workshop on Experimental Optimality Theory, Kobe University, in the context of the project “Autonomy, Harmony and Typology”. I would like to thank the project leader, Prof. Shosuke Haraguchi (Meikai University), for making this workshop possible, and the presenters in this workshop, Andries Coetzee, Jongho Jun and René Kager, for their stimulating presentations. Thanks to Marjoleine Sloos for comments on a prefinal version.

Notes

1) A de facto diminished role of inputs is also seen in Optimality approaches assigning a (smaller or larger) role to Output-Output correspondence (Benua 1997, Burzio 2000, and others).
2) In a word-based approach, morphological boundaries may still be assumed to exist in lexical entries. On this view, Alignment constraints should be regarded as faithfulness constraints.
3) Note that the build-up of such a lexicon and the acquisition of the constraints that play a role in an emergent language must go hand in hand. See Kager (this volume) for discussion.

References

Alderete, John and Stefan Frisch (2009) “Phonotactic Learning without A Priori Constraints: A Connectionist Analysis of Arabic Cooccurrence Restrictions.” Ms. ROA-1055.
Anderson, John M. and Colin J. Ewen (1987) Principles of Dependency Phonology. Cambridge: Cambridge University Press.
Archangeli, Diana and Douglas Pulleyblank (1994) Grounded Phonology. Cambridge, MA: MIT Press.
Benua, Laura (1997) Transderivational Identity: Phonological Relations Between Words. Doctoral dissertation, University of Massachusetts, Amherst.
Bíro, Tamás (2006) “Squeezing the Infinite into the Finite: Handling the OT Candidate Set with Finite State Technology.” In Anssi Yli-Jyrä, Lauri Karttunen and Juhani Karhumäki (eds.) Finite-State Methods and Natural Language Processing (pp.21–31). Berlin: Springer.
Boersma, Paul (1998) Functional Phonology: Formalizing the Interactions between Articulatory and Perceptual Drives. Doctoral thesis, University of Amsterdam. The Hague: Holland Academic Graphics. [LOT International Series 11].
Burzio, Luigi (2000) “Segmental Contrast meets Output-to-Output Faithfulness,” The Linguistic Review 17, 368–384.
Bybee, Joan L. (1988) “Morphology as Lexical Organization.” In Michael Hammond and Michael Noonan (eds.) Theoretical Morphology: Approaches to Modern Linguistics (pp.119–142). San Diego: Academic Press.
Bybee, Joan L. (2006) “From Usage to Grammar: The Mind’s Response to Repetition,” Language 82, 711–733.
Coetzee, Andries W. (2008) “Grammaticality and Ungrammaticality in Phonology,” Language 84, 218–257.
Dell, Gary S. (2000) “Commentary: Counting, Connectionism, and Lexical Representation.” In Michael B. Broe and Janet B. Pierrehumbert (eds.) Papers in Laboratory Phonology V: Acquisition and the Lexicon (pp.335–348). Cambridge: Cambridge University Press.
Diessel, Holger (2007) “Frequency Effects in Language Acquisition, Language Use, and Diachronic Change,” New Ideas in Psychology 25, 108–127.
Goldrick, Matthew (to appear) “Using Psychological Realism to Advance Phonological Theory.” Slightly revised version to appear in John Goldsmith, Jason Riggle and Alan Yu (eds.) Handbook of Phonological Theory (2nd edition). Oxford: Blackwell. ROA-1039.
Hayes, Bruce (1989) “Compensatory Lengthening in Moraic Phonology,” Linguistic Inquiry 20, 253–306.
Hayes, Bruce, Robert Kirchner and Donca Steriade (2004) Phonetically-Based Phonology. Cambridge: Cambridge University Press.
Hulst, Harry van der and Nancy A. Ritter (1999) The Syllable: Views and Facts. Berlin: Mouton de Gruyter.
Jackendoff, Ray (2007) “A Whole Lot of Challenges for Linguistics,” Journal of English Linguistics 35, 253–262.
Jurafsky, Daniel, Alan Bell, Michelle Gregory and William D. Raymond (2001) “Probabilistic Relations between Words: Evidence from Reduction in Lexical Production.” In Joan B. Bybee and Paul Hopper (eds.) Frequency and the Emergence of Linguistic Structure (pp.229–253). Amsterdam: John Benjamins.
Kuhn, Thomas S. (1962) The Structure of Scientific Revolutions. Chicago: University of Chicago Press.
Levin, Juliette (1985) A Metrical Theory of Syllabicity. Doctoral dissertation, MIT, Cambridge (Mass.).
Lowenstamm, Jean (1996) “CV as the Only Syllable Type.” In Jacques Durand and Bernard Laks (eds.) Current Trends in Phonology: Models and Methods (pp.419–442). University of Salford: European Studies Research Institute.
McCarthy, John J. (2008) “The Gradual Path to Cluster Simplification,” Phonology 25, 271–319.
McCarthy, John J. (2009) “Studying GEN,” Journal of the Phonetic Society of Japan 13. ROA-1049.
Oostendorp, Marc van and Jeroen van de Weijer (2005) “Phonological Alphabets and the Structure of the Segment.” In Marc van Oostendorp and Jeroen van de Weijer (eds.) The Internal Organization of Phonological Segments (pp.1–23). Berlin: Mouton de Gruyter.
Passy, Paul (1891) Etude sur les changements phonétiques et leurs caractères généraux. Paris: Librairie Firmin-Didot.
Pinker, Steven (1999) Words and Rules: The Ingredients of Language. New York: Basic Books.
Plaut, David C. (2003) “Connectionist Modeling of Language: Examples and Implications.” In Marie T. Banich and Molly Ann Mack (eds.) Mind, Brain, and Language: Multidisciplinary Perspectives (pp.143–167). Mahwah, NJ: Erlbaum.
Prince, Alan and Paul Smolensky (1993/2004) Optimality Theory: Constraint Interaction in Generative Grammar. Technical Report TR-2, Rutgers Center for Cognitive Science, Rutgers University, New Brunswick, NJ. [Reproduced by Blackwell, New York in 2004.]
Pulleyblank, Douglas (1997) “Optimality Theory and Features.” In Diana Archangeli and Terry Langendoen (eds.) Optimality Theory: An Overview (pp.59–101). Cambridge: Blackwell.
Sloos, Marjoleine (2009) Frequency Effects are Sensitive to Phonological Grammar: The Interaction of Resyllabification and Pretonic Schwa Deletion as a Frequency Effect in Dutch. Research MA thesis, Leiden University.
Weijer, Jeroen van de (2009a) “Optimality Theory and Exemplar Theory,” Phonological Studies 12, 117–124.
Weijer, Jeroen van de (2009b) Cats and Dogs Revisited for the Twelfth Time: An Optimality Analysis of the Plural and Ordinal Suffix in English. Ms.
Wetzels, Leo and Joan Mascaró (2001) “The Typology of Voicing and Devoicing,” Language 77, 207–244.

(Received Nov. 14, 2009, Accepted May 12, 2010)