Cognitive Linguistics Investigations

Human Cognitive Processing is a forum for interdisciplinary research on the nature and organization of the cognitive systems and processes involved in speaking and understanding natural language (including sign language), and their relationship to other domains of human cognition, including general conceptual or knowledge systems and processes (the language and thought issue), and other perceptual or behavioral systems such as vision and nonverbal behavior (e.g. gesture). ‘Cognition’ should be taken broadly, not only including the domain of rationality, but also dimensions such as emotion and the unconscious. The series is open to any type of approach to the above questions (methodologically and theoretically) and to research from any discipline, including (but not restricted to) different branches of psychology, artificial intelligence and computer science, cognitive anthropology, linguistics, philosophy and neuroscience. It takes a special interest in research crossing the boundaries of these disciplines.

Editors
Marcelo Dascal, Tel Aviv University
Raymond W. Gibbs, University of California at Santa Cruz
Jan Nuyts, University of Antwerp

Editorial address
Jan Nuyts, University of Antwerp, Dept. of Linguistics (GER), Universiteitsplein 1, B 2610 Wilrijk, Belgium. E-mail: [email protected]

Editorial Advisory Board
Melissa Bowerman, Nijmegen; Wallace Chafe, Santa Barbara, CA; Philip R.
Cohen, Portland, OR; Antonio Damasio, Iowa City, IA; Morton Ann Gernsbacher, Madison, WI; David McNeill, Chicago, IL; Eric Pederson, Eugene, OR; François Recanati, Paris; Sally Rice, Edmonton, Alberta; Benny Shanon, Jerusalem; Lokendra Shastri, Berkeley, CA; Dan Slobin, Berkeley, CA; Paul Thagard, Waterloo, Ontario

Volume 15
Cognitive Linguistics Investigations: Across languages, fields and philosophical boundaries
Edited by June Luchjenbroers
John Benjamins Publishing Company
Amsterdam/Philadelphia

The paper used in this publication meets the minimum requirements of American National Standard for Information Sciences – Permanence of Paper for Printed Library Materials, ANSI Z39.48-1984.

Library of Congress Cataloging-in-Publication Data
Australian Linguistics Institute (4th : 1998 : University of Queensland)
Cognitive Linguistics Investigations : Across languages, fields and philosophical boundaries / edited by June Luchjenbroers.
p. cm. (Human Cognitive Processing, ISSN 1387–6724 ; v. 15)
Chiefly revisions of papers presented at a 4th Australian Linguistics Institute workshop, held in July, 1998, at the University of Queensland.
Includes bibliographical references and indexes.
1. Cognitive grammar--Congresses. I. Luchjenbroers, June. II. Title.
P165.A96 1998
415--dc22
ISBN 90 272 2368 8 (Hb; alk. paper)
2005058866

© 2006 – John Benjamins B.V.
No part of this book may be reproduced in any form, by print, photoprint, microfilm, or any other means, without written permission from the publisher.
John Benjamins Publishing Co. · P.O. Box 36224 · 1020 ME Amsterdam · The Netherlands
John Benjamins North America · P.O.
Box 27519 · Philadelphia PA 19118-0519 · USA

Table of contents

Preface
Biographical information
Chapter 1. Introduction: Research issues in cognitive linguistics (June Luchjenbroers)

Part I. Cultural models and conceptual mappings
Chapter 2. When does cognitive linguistics become cultural? Case studies in Tagalog voice and Shona noun classifiers (Gary Palmer)
Chapter 3. Purple persuasion: Deliberative rhetoric and conceptual blending (Seana Coulson and Todd Oakley)
Chapter 4. Depicting fictive motion in drawings (Teenie Matlock)
Chapter 5. Discourse, gesture, and mental spaces manoeuvers: Inside versus outside F-space (June Luchjenbroers)

Part II. Computational models and conceptual mappings
Chapter 6. In search of meaning: The acquisition of semantic structures and morphological systems (Ping Li)
Chapter 7. Grammar and language production: Where do function words come from? (Joost Schilperoord and Arie Verhagen)
Chapter 8. Word recognition and sound merger (Paul Warren)

Part III. Linguistic components and conceptual mappings
Chapter 9. Verbal explication and the place of NSM semantics in cognitive linguistics (Cliff Goddard)
Chapter 10. “How do you know she’s a woman?”: Features, prototypes and category stress in Turkish kadin and kiz (Robin Turner)
Chapter 11. Cross-linguistic polysemy in tactile verbs (Iraide Ibarretxe-Antuñano)
Chapter 12. How experience structures the conceptualization of causality (Maarten Lemmens)
Chapter 13. Internal state predicates in Japanese: A cognitive approach (Satoshi Uehara)
Chapter 14. Figure, ground and connexity: Evidence from Xhosa narrative (David Gough)
Chapter 15. Discourse organization and coherence (Ming-Ming Pu)

Name index
Subject index

Preface

The origin of this book was a workshop held at the University of Queensland, during the 4th Australian Linguistics Institute, in July 1998. Researchers from around the world offered papers on a range of research topics of specific interest to the cognitive linguistics paradigm, and a number of those papers have been revised and modified for this volume. Since that workshop several additional papers were also sought from exciting researchers in the field, so that this monograph would capture the diversity of research activity from various parts of the world and across a range of languages, relevant to the Cognitive Linguistics orientation toward language and cognition.

My thanks to the many colleagues who volunteered their time to give peer reviews of the papers included in this volume (listed below). Without their help this monograph would not have been possible. Many thanks are also due to the contributors themselves, many of whom have tolerated countless delays and innumerable requests; their patience and good humour have made the task of collating this monograph a satisfying experience.
Thanks also to the editors of this series and their reviewers; and a final thanks to the Centre for Language & Cognition Groningen (CLCG), Rijks Universiteit Groningen, where this manuscript was finally completed, as well as the Linguistics Department at the University of Wales, Bangor for supporting my visit there.

List of guest reviewers

Michel Achard: French/Linguistics, Rice University, USA
Keith Allan: Linguistics, Monash University, Australia
Edith Bavin: Psychology, La Trobe University, Australia
Frank Brisard: Germanic Languages, University of Antwerp, Belgium
Wallace Chafe: Linguistics, University California at Santa Barbara, USA
Alan Cienki: Russian Studies, Emory University, USA
Hubert Cuyckens: English Linguistics, Katoliek University Leuven, Belgium
Dirk Geeraerts: Linguistics, Katoliek University Leuven, Belgium
Ray Gibbs: Psychology, University California at Santa Barbara
Adam Glaz: Linguistics, University Marie-Curie Sklodowskiej, Poland
Andrej A. Kibrik: Applied Linguistics, Lomonosov University, Russia
Ronald Langacker: Linguistics, University California at San Diego, USA
David Lee: English Linguistics, University Queensland, Australia
Eric Pederson: Linguistics, University Oregon, USA
Bill Raymond: Linguistics, University Columbus Ohio, USA
Giesela Redeker: Communication, Rijks University Groningen, Netherlands
Wilbert Spooren: Dutch/Communication, Vrije University, Netherlands
Mark Turner: Arts & Sciences, Case Western Reserve University, USA

Biographical information

Seana Coulson – is an associate professor in the Cognitive Science Department at the University of California, San Diego where she heads the Brain & Cognition Laboratory.
The author of Semantic Leaps: Frame-Shifting And Conceptual Blending In Meaning Construction, her research involves an interdisciplinary approach to the study of communication and conceptual structure.

Cliff Goddard – works primarily in the natural semantic metalanguage (NSM) theory originated by Anna Wierzbicka. He has published widely on cross-linguistic semantics, ethnopragmatics, descriptive linguistics, and language typology. His books include Semantic Analysis (OUP, 1998), Meaning and Universal Grammar (co-edited with Anna Wierzbicka, Benjamins, 2002) and The Languages of East and Southeast Asia (OUP, 2005). He is a full Professor in Linguistics at the University of New England, Australia.

David Gough – is currently Head of the School of English Language at Christchurch Polytechnic Institute of Technology, New Zealand, where he has been for the past 5 years. Prior to this, David, a South African, was professor of Linguistics at the University of the Western Cape, Cape Town. He has research interests in, and has published on, African linguistics, pragmatics, and language and literacy education.

Iraide Ibarretxe-Antuñano (PhD Edinburgh, 1999) – is currently a lecturer in Linguistics at the University of Zaragoza, Spain. She was a research fellow at UC Berkeley (1999–2001), the International Computer Science Institute (2000–2001), and the University of Deusto, Spain (2001–2003). She is especially interested in issues related to cross-linguistic polysemy, constructions, semantic change, semantic typology, sound symbolism, metaphor and metonymy, perception, space and motion.

Maarten Lemmens – is senior lecturer of English linguistics at the University of Lille, France, where he teaches cognitive and English linguistics and English phonetics.
His research centers around three main areas: (i) English lexical causatives and their constructional alternations, (ii) a lexical semantic analysis of posture verbs in Dutch, English and Swedish, and (iii) a typological study of the expression of static location, as a complement to existing research on movement verbs.

Ping Li – is Professor of Psychology and Cognitive Science at the University of Richmond, USA. His main research interests are in the areas of psycholinguistics and cognitive science. He specializes in crosslinguistic studies of language acquisition, bilingual language processing, and neural network modeling of monolingual and bilingual lexical development.

June Luchjenbroers – received her PhD from La Trobe University in 1994, and joined the Linguistics Department at the University of Wales, Bangor in 1999 after appointments with the Hong Kong Polytechnic University and the University of Queensland. Her research involves Discourse Analysis from a cognitive linguistics perspective, including gender and gestural analyses of video discourse data.

Teenie Matlock – is founding faculty in Social and Cognitive Sciences at University of California, Merced, and a visiting scholar in Psychology at Stanford University. An experimental psychologist and cognitive linguist, Matlock has published numerous articles on conceptual structure and imagery in language, especially non-literal spatial language.

Todd Oakley – is associate professor of English and Cognitive Science at Case Western Reserve University in Cleveland, Ohio. His principal areas of scholarship are in rhetoric, linguistics, and cognitive science. His interest in Cognitive Linguistics dates from the early 1990s, when he began investigating the conceptual basis of rhetorical effect, a project that drew heavily on Langacker’s Cognitive Grammar and Fauconnier’s Mental Spaces Theory.
This project has since expanded to focus on the relationship between attention and meaning construction in general, hence its title, Elements of Attention: Explorations in Mind, Language, and Culture.

Gary B. Palmer – is Professor Emeritus at the University of Nevada, Las Vegas. He is the author of Toward a Theory of Cultural Linguistics (1996), translated as Hacia una Teoría de la Lingüística Cultural (2000) by Enrique Bernárdez. He co-edited Talking about Thinking across Languages. Cognitive Linguistics 14/2,3 (2003) with Cliff Goddard and Penny Lee, Cognitive Linguistics and Non-Indo-European Languages (2003) with Eugene Casad, and Languages of Sentiment (1999) with Debra Occhi.

Ming-Ming Pu – is an Associate Professor of Linguistics at the University of Maine, Farmington. She obtained her PhD in psycholinguistics from the University of Alberta, Canada. Her current research interests include cognitive linguistics, comparative discourse analysis and Chinese linguistics.

Joost Schilperoord – is a psycholinguist with a special interest in cognitive and rhetorical aspects of text production and communication processes. His research focuses on regularities in language use derived from text analysis and experimentally elicited usage data. He is assistant professor at the Communication Department of Tilburg University, where he teaches statistics and text linguistics.

Robin Turner – teaches English at Bilkent University in Ankara, Turkey. His interests include cognitive, cultural and corpus linguistics, Turkish language and culture, constructed languages, and computer programming.

Satoshi Uehara – has a PhD in linguistics from the University of Michigan (1995). He is professor of Japanese language and linguistics at the Center for International Exchange and the Graduate School of International Cultural Studies, Tohoku University, Japan. He has also taught at the University of Michigan and Wellesley College.
His areas of specialization are cognitive linguistics, linguistic typology, discourse analysis, pragmatics, and Japanese and East and Southeast Asian linguistics.

Arie Verhagen – received his PhD in 1986 at the Free University in Amsterdam. He has taught at the Free University, Utrecht University, and the University of Leiden. He was editor-in-chief of Cognitive Linguistics from 1996 until 2004. Since 1998, he has held the chair of Dutch Linguistics at the University of Leiden. Recent publications include Usage-Based Approaches to Dutch (co-edited with Jeroen van de Weijer, LOT, 2003) and Constructions of Intersubjectivity (Oxford University Press, 2005).

Paul Warren – is Associate Professor in the School of Linguistics and Applied Language Studies at Victoria University of Wellington, New Zealand. Paul’s primary research interests are in psycholinguistics, in particular spoken word recognition and the use of intonation in sentence processing. Since moving to New Zealand in 1994, he has combined these interests with a growing fascination with the development of New Zealand English.

Chapter 1
Introduction: Research issues in cognitive linguistics
June Luchjenbroers
University of Wales, Bangor

The cognitive linguistics agenda

Linguistics as a discipline aspires to capture the essence of communication, and how language is processed in the human brain. The exact path to achieving this aspiration has, however, in past decades split into two major and substantially different approaches: the now more traditional approach to language processing, referred to as the ‘Formal’ or ‘Orthodox’ approach (cf. Langacker 1988), and the Cognitive Linguistics approach.
A significant point of contrast between these two theoretical approaches lies in whether or not linguistic processes are deemed essentially different from other cognitive processes, and thus whether or not linguistic phenomena should be investigated separately (cf. Chomsky 1980; Fodor 1983). Although the goal of the Formal, generativist paradigm has been to provide cognitively oriented explanations rather than structural taxonomies, researchers from within the Cognitive Linguistics research community have challenged a range of fundamental elements of the Formalist approach to language and cognition. In particular, cognitive linguistics challenges whether the brain is modular, as well as the role of logic and deduction as cognitive strategies for information processing (e.g., Langacker 1987, 1990); whether language in the brain is hardwired; and the validity of ‘mentalese’ (the supposed language of the mind, thought to be propositional in structure and to possess logical attributes – cf. Fodor 1975; Pylyshyn 1984).

The Formalist paradigm has consistently reinforced the view that the representation of language is best seen as involving basic, symbolic building blocks and rules; and further that those building blocks are also autonomously processed – i.e., grammar is distinct from both the lexicon and semantics (cf. Newmeyer 1986; Kempson 1991), and semantics is distinct from pragmatics. However, researchers from within the cognitive linguistics community have repeatedly shown how a full appreciation of individual linguistic units requires the researcher to consider all parts of language analysis (cf. Fauconnier 1994; Lakoff 1987; Lakoff & Johnson 1990; Talmy 1996).
The papers of this volume have been collected to illustrate how otherwise separate areas of linguistic concern can better clarify the linguistic distributions in which units are produced in talk, as well as provide a deeper appreciation of the semantic richness of those linguistic units, which is not captured by Formalist approaches.

The cognitive linguistics agenda is to work toward a cognitively real approach to language processing; and for researchers from within the cognitive linguistics community that means making ourselves amenable to research from disciplines outside the linguistics domain, such as psychology, artificial intelligence, anthropology and philosophy, in addition to language-related studies done within the linguistics spectrum. The papers in this volume are also drawn from a number of areas from within the cognitive sciences, to provide a more comprehensive appreciation of the multiplicity of the language units under investigation, as predicted and advocated by the cognitive linguistics approach to language and cognition.

However, the full breadth of the cognitive linguistics agenda involves more than identifying the nature of language processing, which in itself includes both language production and comprehension processes; it also presupposes the more primary concern of language categorization and representation in the mind. In this volume a number of papers illustrate how our understanding of grammar units is essentially semantic, and other papers are devoted specifically to clarifying the nature of conceptual structures.

Janda (2000) has also described the cognitive linguistics community as a group of researchers who embrace a concatenation of core concepts and goals, and who are immersed in the empirical observation of language behaviours across languages and disciplines.
This does not subsume a single philosophical perspective toward the exact relation between language and mind; instead these core concepts capture the unifying principle that language, as representations in the mind and as the product of cognitive events, reflects the interaction of cultural, psychological, communicative and functional considerations.

Outline of this volume

As promised in the title of this collection, the total body of papers presents research across a variety of languages and language groups, and shows how particular elements of linguistic description draw upon otherwise separate aspects (or fields) of linguistic investigation. The languages include European languages – Basque, Dutch, Spanish and Turkish, as well as different varieties of English (American, Australian, New Zealand, and Old English); Asian languages – Chinese and Japanese; Austronesian languages – Malay and Tagalog; Bantu languages – Shona and Xhosa; as well as a number of examples drawn from Australian Aboriginal languages and cultures, such as Dyirbal and Western Australian communities.

Despite possible differences in philosophical approach to the role of language in cognitive tasks, and differences in the methodology used as an avenue for linguistic investigation, these papers are similar in a fundamental way: they all share a commitment to the view that human categorization involves mental concepts that have fuzzy boundaries and are culturally and situationally based. The papers selected for this volume all concern how language comprehension and production involve conceptual mappings between varying domains of cognitive function. The three thematic subsections captured in this collection include (a) conceptual mappings involving cultural models. These involve specific types of knowledge that impact and sculpt the language outputs produced in talk.
The second subsection (b) deals with computational models that emulate and hypothesize different features of the cognitive programming dealing with morphology, grammar, and sociolinguistic variation; while the third subsection of papers (c) focuses on specific components of linguistic description: semantics, grammar and discourse.

A very appropriate start to the first subsection, and to this volume, is the paper by Gary Palmer, “When does cognitive linguistics become cultural? Case studies in Tagalog voice and Shona noun classifiers” (Chapter 2). In this paper, Palmer outlines important fieldwork that addresses key theoretical concerns about grammatical representation and processing. He argues for the cognitive and semantic underpinnings of grammatical phenomena in the form of ‘cultural schemas’. Evidence for his argument is provided by cross-linguistic data (from Dyirbal, Tagalog, and Shona), which illustrate how many lexical domains and grammatical constructions link either directly or indirectly to significant cultural models. Well-known concepts from the cognitive sciences, such as ‘scenarios’ from Artificial Intelligence and psychology, and ‘Idealized Cognitive Models’ from linguistics, are incorporated in his treatment of grammatical voice and noun classifiers, which are presented as extraordinary polycentric categories that provide the key to understanding the discourse of these language communities.

After Palmer’s consideration of the role of culture (and thus experience) in explaining linguistic structure, the first thematic subsection continues with three other papers dealing with how different linguistic choices reflect each speaker’s conceptual representations of the world – Coulson & Oakley; Matlock; and Luchjenbroers.
These papers, each drawing on different methodologies (discourse, experiment, and gesture), deal with different aspects of conceptual representation: Coulson & Oakley’s paper deals with conceptual blending; Matlock’s paper with how information in memory is manifest in lexical retrieval; and Luchjenbroers’ paper with how cognitive strategies are evident in conversational gesture.

In the chapter by Seana Coulson and Todd Oakley, “Purple persuasion: Deliberative rhetoric and conceptual blending” (Chapter 3), the authors consider semantic structure in the form of ‘Conceptual Integration Theory’ (‘Blending Theory’) and illustrate how blending is recruited in persuasive discourse. The data used include an email message encouraging people to vote in a US congressional election, and a church letter sent to encourage monetary donations to that church. With excerpts from these data, the authors show how simplified input models are blended to form integrated event scenarios, and how the strategic choice of input frames can provide a writer (or speaker) with the means to encourage a particular construal of events that will likely result in the target action(s). Coulson and Oakley argue that persuasion depends on ‘objects of agreement’, and that the strategic choice of inputs to create a convincing blend will promote the perception of such agreement.

The following chapter (4), “Depicting fictive motion in drawings”, by Teenie Matlock, puts Len Talmy’s proposed ‘fictive motion’ (1996) to the test, and thereby also cognitive theory dealing with conceptual representation and language processing. In this paper Matlock deals with motion verbs, and asks whether fictive motion plays a role in their comprehension.
With a number of drawing experiments, she uncovers reliable evidence of a link between motion verbs and the mental simulation of the action conveyed by the verb: a link that involves a mentally simulated traversal or scanning of a trajectory. For example, manner information (such as slow, fast, or neutral) is reflected in the drawings: fast verbs are depicted with longer, thinner or straighter lines than slow verbs. The results from three experiments challenge many traditional approaches to lexical representation, and provide strong evidence that comprehension taps into knowledge acquired from embodied experience.

The final paper of this subsection (Chapter 5), “Discourse, gesture, and mental spaces manoeuvers: Inside versus outside F-space”, by June Luchjenbroers, investigates the dynamics of conversational gestures in terms of the physical space in which they occur during discourse. That space, also called the ‘comfort zone’ or ‘F-space’, is where speakers produce most of their gestures during discourse, and Luchjenbroers argues that speakers convey added meaning, relevant to mental spaces navigations (i.e., movements around conceptual structure), when they choose to locate their gestures inside the boundaries of that space, or when they physically stretch to place a gesture outside it. The examples offered in this paper also illustrate how a speaker’s choice of gesture can amplify, and sometimes supplement, information provided by the lexical component; they also show how the location of a gesture in relation to a speaker’s F-space conveys role relations relevant to the subject matter being discussed. As such, a speaker’s gestural F-space can be an important source of information for all discourse participants to establish, navigate and disambiguate the many mental spaces that may be required during discourse.
These chapters are then followed by a new thematic subsection that brings together research dealing with different computational models of the human cognitive system. These papers discuss different computational models for describing cognitive processes associated with the mental lexicon, in relation to morphology (Li); grammar (Schilperoord & Verhagen); and the phonological system (Warren).

The paper by Ping Li (Chapter 6), “In search of meaning: The acquisition of semantic structures and morphological systems”, presents a very different approach to cognitive processing, in that he utilizes computational models in the form of a connectionist network. In this paper Li challenges the Formalist assumption, embraced by many areas in the cognitive sciences, that language is best seen as involving basic, symbolic building blocks and rules. Using child language acquisition data, and in particular parental speech from the CHILDES database, Li begins with the observation that young children learn word meanings by exploiting contextual information in the input; thus, lexical categories can be acquired by the computation of statistical regularities involving multiple constraining factors, and meaning is the emergent property of that process. The major part of this paper, however, is his consideration of a puzzle involving a ‘cryptotype’, in the form of the reversive prefix ‘un-’. The un- problem is described as essentially semantic, with no apparent regular rule governing its use – e.g., we can ‘untie’ a bow but not ‘unmove’ a desk. Li’s study illustrates how the semantic features that unite different members of a cryptotype are represented in a complex distributed fashion (where feature overlaps occur across categories); such a representation is accessible to native intuition but appears to defy traditional symbolic analysis.
In Chapter 7, by Joost Schilperoord and Arie Verhagen, “Grammar and language production: Where do function words come from?”, the authors deal with the characterization of linguistic knowledge, in particular, organizational features of the mental lexicon and mental grammar. The practical application of this bigger-picture issue is to ask the question, “how are function words selected during language production?” In this quest, the authors first offer a theoretical consideration of language production models and the predictions that result from them. This is then followed by a usage-based consideration of function words (prepositions and articles) and pauses, as they appear in Dutch oral dictations of routine business letters. The authors use cognitive linguistic views on the nature of linguistic knowledge to explain the evidence they have obtained regarding function words and how they are cognitively processed. In particular, they call into question assumptions in the literature that function words are stored independently of their lexical heads, and that there is a principled difference between functional and lexical words in the mental lexicon.

In the final paper of this subsection (Chapter 8), “Word recognition and sound merger”, by Paul Warren, language processing models are again considered, although in this case the field of research deals with comprehension, in the form of psycholinguistic models of spoken word recognition. Warren questions how the human recognition system copes with phonetic variability across inputs: a matter of key interest for cognitive and computational theories dealing with how linguistic units (words and phones) are represented and processed for talk. The primary focus of this paper is a phenomenon Warren refers to as (word) ‘sound merger’, as in New Zealand ear/air neutralization.
In NZ English, merger occurs when two originally phonologically distinct words progressively lose their phonetic contrast and become homophones, a progression that can be partial or complete. He then draws on the strong sociolinguistic literature addressing this phenomenon in New Zealand English to give evidence that merger is definitely in progress. These studies provide the corpus data to consider frequency and context effects, as well as social variables such as age difference, as predictors of when sounds merge and when they do not. Warren suggests that aspects of the sentential and extralinguistic context will resolve homophone ambiguity in the case of merged ear and air forms just as they do for other homophones.

The final subsection of papers in this volume deals specifically with different and sometimes overlapping aspects of linguistic description: semantics, grammar and discourse. The first paper in this subsection, by Goddard, has many features in common with the first paper in this volume (Palmer), in that it also deals with cultural models, computational arguments, and semantic structure. However, Goddard presents a slightly different orientation from the earlier papers, in that he focuses not only on conceptual representations of lexical entries and the semantic relations they involve, but also on key aspects of cognitive linguistics theory itself, in terms of the intellectual contribution made to the field by Anna Wierzbicka. In his paper, “Verbal explication and the place of NSM semantics in cognitive linguistics” (Chapter 9), Cliff Goddard considers areas of cognitive linguistics endeavour compatible with or anticipated by Wierzbicka’s approach to conceptual structure.
However, the core of Goddard’s paper is the argument, with examples from Aboriginal cultures, Malay, English and Japanese, that the verbal explication of conceptual categories and lexical entries is indispensable to the field of cognitive linguistics, and that diagrams cannot stand alone without verbal support. In fact, Goddard argues that diagrams often rely on complex culture-specific iconographic conventions (to be interpreted), and that only through a fine-grained approach to verbal explication can the subtle nuances of abstract, culture-rich vocabulary be dealt with. Any theorist who researches how linguistic and language-relevant information is cognitively stored, retrieved and illustrated, as well as how analysts can illustrate their representations, must also make theoretical decisions concerning the issues raised in this paper.

This argument is very relevant and important to bear in mind with the papers collected in the final subsection of this volume that deal with specific components of linguistic description: semantic analyses (Turner; Ibarretxe-Antuñano; and to some extent Lemmens); grammatical choices (Lemmens; and Uehara); and finally discourse in the form of narrative (Gough; and Pu). Many of the component arguments raised and dealt with in these papers also have resonance with earlier papers placed in other subsections. For example, in Chapter 10, “‘How do you know she’s a woman?’: Features, prototypes and category stress in Turkish ‘kadın’ and ‘kız’” by Robin Turner, a number of concepts raised by Palmer (this volume) are considered with Turkish data, including noun classification, story schemas and scenarios, as well as prototype effects.
In this paper, Turner asks the question relevant to the Turkish choice of ‘kız’ (‘girl’) or ‘kadın’ (‘woman’) as a descriptor of an adult woman: “When is a girl a woman?” Using descriptive elements from componential semantics (i.e., + or – some semantic feature), Turner nevertheless illustrates the ‘fluid’ nature of meaning and shows that category membership is not absolute; descriptive components like [+virgin] are merely convenient for naive descriptions because they fit the minimum criteria for the prototype of a lexical entry, such as ‘kız’. A number of different approaches to lexical semantics are considered, including Palmer’s (1996) view that categorization is influenced by scenarios that define sequences of (expected) states and actions. An important contribution made by Turner’s paper is the concept of ‘category stress’, which occurs when there is a disparity between the results of feature-based and prototype-based categorizations. This stress has a direct impact on how users deal with category membership in production as well as comprehension.

Complementing Turner’s research, the following paper, “Cross-linguistic polysemy in tactile verbs” (Chapter 11) by Iraide Ibarretxe-Antuñano, looks at how the semantic content of the tactile verb ‘touch’ in three languages (Basque, Spanish and English) interacts and contributes to the creation of semantic extensions, while taking into account the different lexicalization patterns needed to express the different senses this tactile verb can convey. The resulting polysemy is explained in terms of different experiential domains, triggered by the different senses of this verb, such as the mapping onto emotions, as well as other semantic fields.

Even though the following chapter (12) by Maarten Lemmens, “How experience structures the conceptualization of causality”, is in principle about syntactic choices, it also deals with lexical semantics.
In this paper he focuses specifically on verbs of ‘killing’, such as ‘suffocate’, ‘choke’ and ‘kill’. Variations in the conceptualization of the different causative events are considered, with regard to which verb of ‘killing’ is chosen and the consequences that choice has for the selection of the syntactic pattern in which it is to appear. His consideration includes case categories, such as Agent, Affected, Goal and Instigator, and their significance for transitive vs. ergative syntactic choices. For example, he argues that a more volitional participant who is engaged in some causative process is more likely to be represented as a volitional Actor in a transitive construction. Lemmens’ research uses Old English, corpus data, and goes beyond description to focus on the experiential bases for a speaker’s choice of verb within a specific semantic class.

Lemmens’ paper on syntactic choices is then followed by Satoshi Uehara’s cognitive grammar paper “Subjective predicates in Japanese: A cognitive approach” (Chapter 13). Here again, as in several earlier chapters, the discussion of grammatical elements involves semantic concepts, in this case feelings and emotional reactions. Uehara’s main interest in this paper is subjectification, and the construal of the speaker (i.e., the conceptualizer), to explain the use of particular grammatical elements in discourse, for example to account for the use of the nominative particle -ga with grammatical objects. Uehara’s many examples illustrate his claim that subjective predicates in Japanese can best be characterized as ‘deictic’, as they profile the object of conception from the vantage point of the speaker.

The final two papers of this collection both deal with narrative. The first is by Dave Gough: “Figure, ground and connexity: Evidence from Xhosa narrative” (Chapter 14).
This is a usage-based study of folk narrative discourse, which is the stimulus to show how discourse factors, pragmatic and cognitive processing should be described in terms outside language itself. Like Palmer, in Chapter 2, he argues that grammatical terms like ‘mood’ and ‘tense’ refer to quite diverse verbal categories; and similarly, like Pu, in the following chapter, he uses a functionally based account of narrative discourse, with categories such as ‘foregrounding’ and ‘backgrounding’, in addition to the more general process of ‘grounding’, and ‘connexity’ (or ‘dependence’), to reveal systematic (conceptual) organization. His ultimate claim is that the concepts of ‘grounding’ and ‘connexity’ are fundamental to the organisation of the Xhosa verbal system, and further that verbal forms referred to as the participial, consecutive and indicative moods, as well as the so-called ‘continuous tense’, are structured around those concepts.

In the final chapter in this volume, “Coding events in oral and written discourse” (Chapter 15), Ming-Ming Pu also investigates discourse, although her focus is on discourse organization in terms of thematic structure and information units. In particular, Pu examines episodic structure and how speakers relate events within and between episodes. This is followed by a consideration of the relation between spoken and written narratives, as well as universality in narrative production. Pu uses narrative data produced by English and Mandarin Chinese speakers, drawn from a children’s picture book. Her research is part of a larger tradition that sees conversations and written texts as more than unordered strings of utterances; instead she argues for structures with levels of organization that require conceptual management.
Pu’s study provides further evidence of the cognitive constraints upon speakers to accommodate their addressee’s processing needs by signaling discourse units and prompting the retrieval of information.

A wide range of language issues is relevant to cognitive linguistics research, and this range is reflected in the collection of papers included in this volume. The now traditional cognitive linguistics areas include: lexical semantics, cognitive grammar, metaphor and prototypes, pragmatics, narrative and discourse, and computational models. In this volume, however, these general concerns have been considered in harmony with other important fields including: language acquisition, language and culture, video data analysis and gesture, Blending Theory, fictive motion and others. Devising an order for these papers, or my summation of them for this chapter, was made all the more difficult because they all illustrate how a full appreciation of particular elements of linguistic description, and the cognitive processing involved in their use, requires a synthesis of different (and traditionally separate) areas of linguistic investigation; and that aspects of situated meaning and cultural semantics are relevant to the cognitive processing of language phenomena, and should not be divorced from them.

References

Chomsky, Noam (1980). Rules and Representations. Oxford: Basil Blackwell.
Fauconnier, Gilles (1994/1985). Mental Spaces: Aspects of Meaning Construction in Natural Language. Cambridge, UK: CUP.
Fodor, Jerry A. (1975). The Language of Thought. New York: Crowell.
Fodor, Jerry A. (1983). The Modularity of Mind. Cambridge, MA: MIT Press.
Janda, Laura (2000). Cognitive Linguistics. Paper presented at SLING2K Workshop.
Kempson, Ruth (1991). The Language Faculty and Communication. Reading materials, 1991 Linguistics Institute, Univ. of California at Santa Cruz.
Lakoff, George (1987). Women, Fire, and Dangerous Things: What categories reveal about the mind. Univ. Chicago Press.
Lakoff, George & Mark Johnson (1990). Philosophy in the flesh. New York: Basic Books.
Langacker, Ron (1987). The Cognitive Perspective. CRL Newsletter, 1(3). University of California, San Diego.
Langacker, Ron (1988). An Overview of Cognitive Grammar. In B. Rudzka-Ostyn (Ed.), Topics in Cognitive Linguistics. Amsterdam: Benjamins.
Langacker, Ron (1990). The Rule Controversy: a Cognitive Grammar Perspective. CRL Newsletter, 4(3). University of California, San Diego.
Newmeyer, F. J. (1986). Linguistic Theory in America: The first Quarter-Century of Transformational Generative Grammar. New York: Academic Press.
Palmer, Gary (1996). Towards a Theory of Cultural Linguistics. Austin: University of Texas Press.
Pylyshyn, Zenon (1984). Computation and Cognition: Towards a Foundation for Cognitive Science. Cambridge, MA: MIT Press.
Talmy, Len (1996). Fictive motion in language and “ception”. In P. Bloom, M. A. Peterson, L. Nadel, & M. F. Garrett (Eds.), Language and space (pp. 211–276). Cambridge, MA: MIT Press.

Cultural models and conceptual mappings

Chapter 2

When does cognitive linguistics become cultural? Case studies in Tagalog voice and Shona noun classifiers

Gary Palmer
University of Nevada at Las Vegas

In cultural linguistics, grammar is seen as governed by cultural schemata rather than universal innate or emergent cognitive schemata. Sources of linguistically determinant schemata include mythology, social structure, repetitive domestic and subsistence activities, and salient rituals. Two noteworthy types of cultural schemata are scenarios, which model social action and discourse, and polycentric categories, which elaborate the complex and radial category types of Langacker (1987) and Lakoff (1987).
These concepts will be demonstrated in two case studies. In Tagalog, an Austronesian language, grammatical voice used in emotional expression expresses elementary scenarios of control and non-control. In Shona, a Bantu language, noun classifiers are governed by polycentric categories pertaining to salient domestic and ritual scenarios.

Keywords: categories, Bantu, Austronesian, scenarios, cultural linguistics

1. Introduction¹

Ronald Langacker (1999: 13) has noted that “language is an essential instrument and component of culture, whose reflection in linguistic structure is pervasive and quite significant” (1999: 16). This observation provides an excellent starting point for cultural linguistics, an approach which foregrounds cultural schemata in explanations of grammar and semantic patterns (Palmer 1996). In this respect, it contrasts with the typical practice of cognitive linguistics, which foregrounds universal cognitive processes such as figure-ground relations, force dynamics, emergent categories, and Idealized Cognitive Models, leaving cultural dimensions of language somewhere in the background, or at least unlabeled as such. Cultural linguistics is not so much a new theory as a shift in emphasis. It draws on the theory of cognitive linguistics for many essential analytical concepts, but it takes a point of view from the margin of cognitive linguistics as it is typically practiced. It is an extension of cognitive linguistics into cultural domains, as foreshadowed in the writings of Langacker (1987, 1991a, b), Lakoff (1987), and others. Specifically, I am claiming that many grammatical phenomena are best understood as governed by cultural schemata rather than universal innate or emergent cognitive schemata.
The sources of such cultural schemata include mythology, such as the Australian Dyirbal myth of the sun and moon, which George Lakoff used to explain membership in Dyirbal noun classes (Lakoff 1987). They also include social structure, repetitive domestic and subsistence activities, salient rituals, and a host of other cultural phenomena. This cultural emphasis makes it essential that the linguist either produce or survey ethnography pertaining to the linguistic topic under study. As Mylne (1995) argued in a critique of Lakoff’s interpretation of Dyirbal classifiers, linguists cannot rely solely upon their own intuitions about the semantics of complex domains, but should instead attempt to discover which concepts have particular relevance for speakers.

Unlike postmodernist cultural theory, which posits no fixed points of reference or stable meanings, cultural linguistics depicts grammar as an entrenched system of meaning and form. Following Langacker’s (1987, 1991a, b, 1999) theory of cognitive linguistics, the minimal units of grammar are verbal symbols, each of which represents a linkage of two kinds of units, one phonological, the other semantic. Semantic units are characterized relative to semantic domains (Langacker 1987: 63). Since these may include any concept or knowledge system, linguistic semantics is encyclopedic and very much a cultural entity. When a class of linguistic expressions is seen as relative to one or more semantic domains of relatively extensive scope with complex category structures and rich details, then cognitive linguistics becomes decidedly cultural. It is this difference in emphasis and elaboration of the cultural dimension, not an underlying difference in theory, which justifies the new label of cultural linguistics.
The label also differentiates the approach from that of contemporary linguistic anthropology, which is typically discourse-oriented and heavily invested in pragmatism and political economic or feminist theory, often displaying scant interest in cultural categories or cognitive processes. In my view, culture and cognition are not separate entities, just two views on the process whereby people with minds, which are embedded in physical bodies situated in social and physical environments, communicate, learn, think, and pursue social goals. Similarly, Edwin Hutchins (1996: 354) proposed an integrated view of human cognition, “in which a major component of culture is a cognitive process . . . and cognition is a cultural process.”

Certain types of cultural models merit special attention from linguistic anthropologists and culturally oriented linguists. These are scenarios (including discourse scenarios) and polycentric categories. The use of these concepts will be demonstrated in two case studies: (1) voice and emotional expression in Tagalog, an Austronesian language; and (2) noun classifiers in Shona, a Bantu language. The first case will deal with elemental scenarios underlying grammatical voice in the emotion language that appears in a Tagalog video melodrama dealing with a couple living in transnational circumstances. I will show how the highly abstract scenarios underlying voice are instantiated in the emotional discourse of melodrama and provide the key to understanding that discourse. In the case study of Shona, I demonstrate that a better understanding of noun classifiers can be achieved by analyzing each classifier as a polycentric category. The latter is a synthesis of Langacker’s (1987) concept of complex category with Lakoff’s (1987) concept of radial category.
Unlike the radial category, which has a single central prototype category, a polycentric category has multiple central categories connected by conceptual metonymies. In the next section I will elaborate on these concepts. Then, in the following sections, I will apply them to the case studies.

2. Operational concepts

Scenarios

Scenarios are schematic cultural models of action. Cultural linguistics is based on the premise that grammar is relative to cultural models and culturally defined imagery. Cultural models are cognitive entities, but they are often more richly elaborated and further removed from basic physical and cognitive experience than the spatial-mechanical schemas and figure-ground relations typically investigated within cognitive linguistics. Examples of cultural models include the conventional knowledge systems governing kinship, ways of preparing food, navigation, rituals, myths, ceremonies, games, and speech events such as conversations. Imagery arises from construing models at different levels of abstraction, from different points of view, or at different stages in a process,² and from admitting various features of models within the scope of attention (Langacker 1987; Lakoff 1987; Palmer 1996).

Cultural models include some, but perhaps not all, of what Lakoff (1987: 113–114) termed Idealized Cognitive Models, in which he included propositional, image-schematic, metaphoric, and metonymic models. Universal image-schemas derived solely from the common experience of inhabiting a human body would not in themselves be cultural models. However, universal image-schemas may be incorporated into cultural models, and in fact most physical experience reflects not only universal constraints, but also cultural modifications or culturally specific uses of tools, dwellings, and habitats. Embodied universal categories may simultaneously belong to cultural domains.
With respect to metaphoric and metonymic models, it seems more accurate to speak of metaphoric relations between models or parts of models, or to say that models comprise functional relations, which provide the material for verbal metonymy. But again, these distinctions are not theoretically crucial so long as cognitive linguistics provides a role for cultural constraints on grammar, as Langacker and Lakoff have done. It is useful to explicitly recognize the elements of convention and social construction by referring to some kinds of linguistically significant models as cultural, while conceding that all cultural models are also cognitive. Most ICMs are cultural products, and the same may be said for domains of experience (Lakoff 1987). Thus, it seems appropriate to refer to an approach which examines such cultural constraints on language as cultural linguistics. By using the term, we make it obvious that existing ethnographic studies contain a wealth of information of potential immediate use to linguistic theory.

Relatively abstract or decontextualized images are called schemas or image-schemas. Those involving actions and sequences of actions are scenarios. The scenario concept is particularly important in cultural linguistics because the term directs attention to the imagery of social action and discourse, which has largely been overlooked by cognitive linguistics, particularly in the study of non-Indo-European languages. The reason for this neglect may lie in the fact that scenarios are strongly influenced by history and socio-cultural context and are therefore relatively independent of the more basic cognitive processes of attention, accessibility or saliency of information, and basic concept formation, which many linguists regard as the strongest determinants of grammar.
It is true that Langacker (1987: 63) included as possible semantic domains “the conception of a social relationship” and “the speech situation”, but at the very least, one can say that social scenarios have not been clearly delineated as a type of imagery having linguistic significance to the same extent as, for example, spatial imagery. And yet, humans probably direct as much verbal attention to orienting in society as they do in space, if not more. Not all of this social orientation can be reduced to metaphors of force and space.

The approach pursued here resembles that of Anna Wierzbicka in that her cultural scripts are something like scenarios (Wierzbicka 1996, 1997; Palmer 2000). However, unlike Wierzbicka, I do not reduce scenarios to statements composed of a small set of semantic primes arranged according to the rules of a semantic metalanguage. I take scenarios to be gestalts or constructions built up from lower-level scenarios and event-schemas.

Discourse scenarios and discursives

The discourse-relevant content of forms and constructions is not always obvious. Much attention has been devoted to discourse particles, but verbs or verbal morphology may also predicate information pertaining to discourse and human interaction, notably information on the agency of actors or interlocutors, as I will show in the Tagalog case study. Cultural linguistics approaches discourse by following two principles: (1) part of the meaning of every lexeme or construction is its habitually situated use in discourse; (2) discourse is governed by scenarios of verbal and social interaction. The first principle follows from Langacker’s premise that “any facet of the context [of an usage event] that consistently recurs across a set of usage events can be retained as a specification of the schema that emerges from them” (Palmer 1996: 40; see also Langacker 2001).
The usage principle may seem obvious, but the implications for cognitive linguistics have not been clearly drawn. Of course it means that discourse follows culturally specific patterns and sequences, but it also means that most discourses consist partly of verbal particles, lexemes, and longer utterances whose predicational content is the discourse itself, meaning its participants, verbal events, and prosodic qualities. Since verbal discourse is so pervasive in human life, much of the lexicon and grammar of any language must be about discourse scenarios. Thus, we have metadiscursive terms and expressions like lie, gossip, shut up, be attentive, and be on the stump (give speeches in a political campaign). The domain of terms and expressions that predicate discourse scenarios includes that of speech act terms, but it is more comprehensive. For example, the construction be attentive is not, strictly speaking, a speech act, but it does predicate a construal of one aspect of a discourse scenario.

Terms whose main function is to predicate some aspect of ongoing discourse in which the speaker is engaged may be termed discourse indexicals, or just discursives (Palmer 1996: 207).³ These would include discourse particles such as English um, oh, and uh huh, Japanese yo, some tag questions, and English like when used as a presentative or quotative (e.g. She was like [quote, pseudoquote or experiential state]). The so-called discourse particles are seen not as mere non-propositional forms (Stubbs 1983), non-referential indexicals (Silverstein 1976), conversational reflexes, pointers, meaningless elements, or strategic moves (Clark 1996) that are qualitatively different from other terms, but as terms that predicate much as other terms do. They are verbal symbols whose semantic domain happens to be the ongoing and ambient discourse itself as performed by both speaker and listener. Thus, discursives may even be evaluative, as when English So?
is used to question the significance of a preceding statement and is riposted with a Sooo?! that sarcastically questions the validity of the original question. Discursives may pertain to situation, interactional structure, pragmatic intensions, ideological content, or phonological shape of discourse. Since each culture develops its own unique discourse imagery, this is a potentially important topic in cultural linguistics.⁴ Many other terms and expressions may be said to have discourse indexicality or discursiveness as a peripheral part of their meaning (compare Langacker 1987: 63).

Figure 1. Complex category as envisioned by Langacker (1987). [Diagram: a SCHEMA is elaborated by both a PROTOTYPE and a VARIANT; the PROTOTYPE is linked to the VARIANT by extension.]

Langacker (1991b: 318) defined the term ground as “the speech event, its participants, and its immediate circumstances . . . .” Since discursives predicate about ongoing discourse, which is necessarily part of the grounding situation, one might theorize that they will sometimes predicate speakers’ perspectives. One could investigate their distribution across the dimension of subjectivity-objectivity (Langacker 1990). A participant may take a subjective perspective on the speech event, in which case she herself lies outside the perceptual field; or she may construe the event and her own role in it objectively, in which case she herself lies within the perceptual field. Japanese yo, for example, has a sense something like I am telling you or pay attention to what I just said, but the participants are tacit, suggesting a subjective perspective and a focus on the discourse events rather than the participants, whereas an English tag question such as “Am I right?”, with an explicit pronoun for the speaker, is a discursive suggesting an objective perspective on the speaker in Langacker’s sense.
The topic of discursives will not be discussed further in this paper, but I mention it as meriting further cross-linguistic and cross-cultural study.

Categories: Complex, radial, and polycentric

Cognitive linguistics presents us with at least two types of complex categories. The first is Langacker’s, which he characterizes simply as a complex category (Langacker 1987: 373; see also Palmer 1996: 96–97). It begins with a prototype and a variant. Since these necessarily have something in common, there is also a schema, which is elaborated by both the prototype and the variant (Figure 1). Langacker’s complex category appears to have no place for conceptual metonymy.

Another kind of complex category is the radial category as described by Lakoff (1987). A radial category has a central subcategory and non-central extensions or variants. This is very much like Langacker’s model, except that Lakoff does not include the schemas which can be abstracted from each extension of the prototype to a variant. In his discussion of Dyirbal noun classes, Lakoff also states that “complex categories are structured by chaining; central members are linked to other members, which are linked to other members, and so on” (1987: 95). Some of the links which he describes are conceptual metonymies (the sun is linked to sunburn); others are similarities (sunburn is linked to the sting of the hairy mary grub), or prototype to variant relations (women to the sun, who is a mythical woman).

Figure 2. Radial category balan as envisioned by Lakoff (1987: 103)

Rather vaguely, he asserted that Experiential Domains and Idealized Cognitive Models can “characterize links in category chains” (1987: 95). A bit of cultural theory seeps in as well: “Experiential Domains . . . are basic domains of experience, which may be culture-specific” [bold face added].
I hold that such linguistically significant experiential domains are in most instances actually cultural scenarios that have been given high salience by virtue of occurring in myth, ritual, crisis, social structure, or even the daily drudgery of domestic life. The functional links within domains are what we regard as conceptual metonymies. In a further suggestion of the importance of conceptual metonymy over schematization, Lakoff asserted that “specific knowledge (for example knowledge of mythology) overrides general knowledge” (1987: 96). We are left with a picture of a category that has a central prototype from which radiate a number of chains based on similarity and conceptual metonymy (Figure 2).

Lakoff used this concept to develop a theory of Dyirbal noun classifiers. Three of the four classifiers were characterized as radial categories (bayi, balan, balam). The fourth (bala) was characterized as an ‘everything else’ category. Noun classifiers represent a common and important kind of grammatical category, which was once thought to be arbitrarily organized. Lakoff (1987) demonstrated that a class may have hundreds of members that share no common features of meaning. In my opinion, this important advance in the theory of linguistic categories depended crucially on understanding the governing role of cultural scenarios.

Tom Mylne (1995) took issue with Lakoff’s (1987) analysis of Dyirbal noun classifiers, accusing him of imposing a Western world view on the Dyirbal system because it proposed human males and females as prototypes for the classes bayi and balan. Mylne proposed instead that the linguist should seek to discover which concepts have particular relevance for the Dyirbal and use these as the basis for the analysis.
He proposed that the four classes of bala, balam, bayi, and balan could each be defined by combinations of values on the dimensions of potency and harmony, which have special relevance in Dyirbal culture and society. Thus, Mylne’s critique appears to be an argument for an explanation that is more cultural than cognitive, but based on parameters or features, rather than on scenarios or cultural models.

My analysis of classifiers is like Mylne’s in two respects. First, I am arguing that the important criteria for classification are concepts that are culturally salient. Second, I am arguing that one finds no single prototype at the center of a typical noun class. But unlike Mylne, I do not try to explain the category by replacing the prototype with one or two abstracted dimensions. Similar approaches have been attempted in Bantu studies (Contini-Morava 1994; Spitulnik 1987, 1989) with unsatisfactory results, as discussed by Palmer and Arin (1999) and Palmer and Woodman (1999).

A third type of complex category is the polycentric category as proposed by Palmer and Woodman (1999). A polycentric category has multiple central categories, each of which may be a scenario or a prototype derived from the scenario. I show only scenarios in the central region of Figure 3. I treat the central categories as a functional complex, rather than as parameters which must have contrasting values across categories, though I would not rule out the possibility of a level of contrast that would apply across classes to subsets of category members. The central categories are related to one another and to more peripheral categories and instances either by function (contiguity, conceptual metonymy), by similarity (prototype to variant, metaphor), or by schematization (schema to instantiation). I call these complexes polycentric categories.
They consist in part of complex categories as defined by Langacker (1987: 373) and of radial categories as defined by Lakoff (1987). Since the cognitive links of polycentric categories are all embedded in cultural scenarios and other sorts of cultural models, the polycentric category (PC) is at once both cognitive and cultural.

[Figure 3. Schematic of polycentric category as proposed by Palmer and Woodman (1999). Central scenarios A, B, and C are linked to schemas, prototypes, and variants; the key distinguishes links of elaboration, extension, and metonymy (f).]

Case studies

Case 1: Grammatical voice and emotion language in Tagalog5

The notion of agency itself represents a very abstract schema of social interaction in which the subject or focal participant initiates or performs an action. In many languages it is uncommon to explicitly mention agents of transitive constructions, so that sentence subjects are often experiencers or objects of transitive actions. Mention of a transitive agent may require explicit ergative marking on the noun. In Western Samoa, Alessandro Duranti (1994: 114–143) found that participants in village council meetings were reluctant to define agents in the beginning part of the meetings. In transcriptions of the meetings, transitive clauses with ergative agents were not very frequent (1994: 125). They appeared only where participants were receiving credit or blame, or where “the power of certain individuals or groups to affect others through their actions or to cause or initiate events is at least acknowledged” (1994: 126). The person with the highest incidence of ergative agents in his speech was the senior orator who chaired the meeting and also acted as prosecutor or instigator.
References to actions of the Almighty also place the Lord in the ergative case, as in example (1) (1994: 126).

(1) e fa’alava e le Akua mea ‘uma.
    ta caus+enough erg art Lord thing all
    “The Lord makes all things sufficient.”

Speakers avoid focusing the agency of participants by placing actors in prepositional or genitive phrases. While some are fixing responsibility and laying blame with ergative constructions, others are dodging responsibility and denying blame with genitive or prepositional constructions, or with vague language. Duranti pointed out that speaking with ergative agents constructs relations of power as much as it reflects them. The powerful may use ergative constructions to frame the situation, but the less powerful use them at their own risk.

By demonstrating the usage of the ergative construction in political scenarios, Duranti has shown that the grammar of agency participates in the culture of power. Making the connection is not as straightforward as relating a deictic term or a spatial preposition to a physical scene, because the construal of social events is much more problematic than the construal of basic spatial conformations. Further complicating the analysis is the fact that the language of agency is not independent of the social process. The discourse and its grammar participate in the scenario, co-constituting it along with other symbolic acts, such as seating arrangements, turn-taking, and presentations of gifts or titles. There are ways to evaluate dimensions of social scenarios independently from their discourse, but for the moment I am resorting to an interpretive approach. This is not an unusual limitation, because linguists are seldom able to provide rigorous proofs of the semantic basis for their grammatical categories.

My case study of voice in Tagalog emotion language is parallel to Duranti’s study of ergative constructions in Samoan council discourse. Agency schemas underlie grammatical voice at its semantic pole.
Tagalog lacks an ergative case construction, but there are other grammatical similarities, no doubt based on the fact that Samoan and Tagalog are distantly related Austronesian languages. For example, both languages commonly place non-focused agents in genitive phrases.6 In Tagalog, several verbal affixes predicate the agency or the lack of agency of the focal participant in a clause. Interpreting their meaning with respect to agency of participants in a discourse is not always straightforward, because a speaker may be referring to the agency of self, of interlocutor, or of some third party. Nevertheless, the attribution or denial of agency can be shown to make sense in the discourse context. I will not be arguing that emotion language in Tagalog differs greatly from language in other domains of culture and discourse, only that making the governing scenarios explicit helps us to understand agency and voice in Tagalog. When comparable studies become available, the scenarios governing voice can be compared cross-linguistically. However, it does seem likely that emotional language is particularly sensitive to the nuances of semantic agency evocable by voice constructions. It therefore provides a good domain for the study of voice.

This case study examines grammatical voice in the emotion language that appears in a Tagalog video melodrama dealing with a couple living in transnational circumstances. I will demonstrate that the protagonists in this melodrama most often present themselves and one another either as grammatical experiencers or patients. Similarly, others also represent them as patients or as needing to acquire agency.
In those instances when they are assigned actor roles, they are seldom placed in grammatical focus.7 It is only in moments of crisis that they assume the language of strong personal agency by using forms in which they, as grammatical participants, take on active focus.

Nominal participants in Tagalog are said to be focused if they are preceded by the referential (ref) determiner ang, which contrasts with the genitive (gn) marker ng [nang] and the directional (drc) preposition sa. There are also pronouns and personal name markers that correspond to ang, ng, and sa phrases. Each of the voice affixes places certain kinds of nominal participants in focus. The most common transitive construction occurs with a null voicing affix (though often with the ni- (-in-) realis prefix or -in irrealis suffix which is sometimes regarded as a voicing affix). The construction, regarded as a kind of passive in less technical grammars, requires that a profiled actor – if one is profiled – be genitive and a profiled undergoer be focused, as in (2). Some linguists regard this construction as evidence that Tagalog is an ergative language (Cooreman, Fox, & Givón 1984). Focused undergoers have low topicality.

(2) Gagaw-in ko ang lahat upang ma-kamt-an ito.
    do-irr:uf 1p:gn ref everything in.order.to irr:nc-obtain-loc prox:ref
    “I will do everything in order to obtain this.”

The non-control affix ma- also places a patient or experiencer in focus, as in (3), which uses the prefix in its realis form na- and the referential focus pronoun ako rather than an ang-phrase.

(3) na-tatawa ako, hi, hi, hi, hi, sa ‘yo8
    nc:rl-incm-laugh 1s:ref hee, hee,. . . drc 2s:drc
    “I was amused, hee, hee, hee, hee, at you.”

Other affixes (mag-, -um-) place actors in focus. Two examples appear in (4).
(4) Ngayon ako’y nag-sisisi kung bakit ako nag-‘I love you’!!!9
    now 1s:ref-inv af:rl-incm-regret cond cond 1s:ref af:rl-‘I love you’
    “Now I am regretting ever saying ‘I love you’!!!”

Tagalog has many ways of verbalizing or predicating emotional experience. Example (3) illustrates the Use of Emotion Terms (na-tatawa) and the Mimesis of Psycho-ostensives (hi, hi, hi, hi). Example (4) illustrates the Use of English Emotion Terms in Tagalog or Mixed Text. Other ways of verbalizing emotions are listed in (5) to (11).

(5) Obscenity
    the lady just kept swearing banal na aso, santong kabayo10
    the lady just kept swearing holy lg dog, pious-lg horse
    “the lady just kept swearing ‘holy dog, horse saint’”

(6) Description of Psycho-ostensives
    katakot-takot na kamot si kaka’y napadaing11
    st-fear-r2 lg scratch pn prnm rl:ncf-ger-cry.out
    “horrific scratches, Kaka cried out”

(7) Repetition
    ako, mahal kita, mahal na mahal12
    1s:ref love 2s:1s love lg love
    “I love you, love of love”

(8) Use of Verb with Process that Results in Emotion or Feeling
    Hindi mo alam kung gaano mo ako sasaktan.13
    neg 2s:gn know cond how 2s:gn 1s:ref incm:injure-loc
    “You don’t know how much you hurt me.”

(9) Description of Facial Expressions (Conceptual Metonymy)
    gumulong at nagkaduling-duling14
    af:roll and af:rl-st-cross.eyed-r2
    “he rolled on the floor and got cross-eyed”

(10) Use of Metaphor
    nababato ako gusto kong umuwi15
    ncf:rl-incm-stone 1s:ref like 1s:gn af:irr-go.home
    “I am turned to stone, bored, my desire is to go home.”

(11) Denial of Emotion
    Matuto kang maging manhid.16
    ncf:irr-learn 2s:ref-lg ncf:irr-become numb
    “Learn to become insensitive.”

What one quickly notices in these expressions is that many emotion terms have verbal affixes, each of which conveys mood as well as voice.
Examples include (3) na-ta-tawa ‘I was amused’, (4) nag-si-sisi ‘I am regretting’ and the interesting nag- ‘I love you’, which uses an English phrase as a verb stem, (6) na-pa-daing ‘he cried out’, (8) sa-sakt-an ‘(someone) hurt (someone)’, (9) nag-ka-duling-duling ‘got cross-eyed’, and (10) na-ba-bato ‘was turned to stone’. These forms all happen to be realis mood. Irrealis forms would be matatawa, mapadaing, magkadulingduling, etc. My arguments regarding voice in Tagalog emotion language hinge mainly on the distribution of non-control (ncf), undergoer-focus (uf), and agent-focus (af) forms. The distinction between realis and irrealis is not without interest, but it is not crucial to the argument. Aspect is most often either completive, which is unmarked, or incompletive, signified by reduplication operating on the first syllable of the root as in (3) na-ta-tawa, (4) nag-si-sisi, and (10) na-ba-bato.

Voice affixes ma-, i- and -an put non-agentive participants in focus. The focal participant of ma- may be merely an experiencer, but i- and -an require focal participants to be undergoers. All three may be said to have undergoer focus (ug), but they are usually designated as stative focus (sf), undergoer-focus (uf), and locative focus (lf). Rather than stative, I use the term non-control (ncf), because it more accurately subsumes the variety of meanings. The undergoer-focus affixes i- and -an contrast with affixes mag- and -um-, which have active agent-focus (af). The forms with initial m- (ma- and mag-) are irrealis. They have realis counterparts na- and nag- and gerund forms pa- and pag-. Related to mag- is maN-, a form that has more idiosyncratic semantics. Suffix -in is irrealis, but it occurs frequently in undergoer focus constructions. The semantics of the voicing affixes is summarized in Table 1.17 The concern in this paper is how these forms are used in actual discourse to communicate agency or lack of agency on the part of the central participants.
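The affix alternations just described can be pictured as a small lookup table. The Python sketch below is only an illustration and not part of the original analysis: the names MOOD_FORMS, FOCUS_OF_AFFIX, first_syllable, and incompletive are hypothetical, the hyphenated output merely mimics the segmentation used in the examples, and the reduplication rule is deliberately simplified to roots that begin with a vowel or with a single consonant plus vowel (it makes no attempt to handle digraphs such as ng or onsets such as the kw of kwento).

```python
# Illustrative sketch only: names and simplifications are assumptions,
# not the author's notation.

VOWELS = "aeiou"

# Mood alternation of the m-initial prefixes, as described in the text:
# irrealis ma-/mag-, realis na-/nag-, gerund pa-/pag-.
MOOD_FORMS = {
    "ma-": {"irrealis": "ma", "realis": "na", "gerund": "pa"},
    "mag-": {"irrealis": "mag", "realis": "nag", "gerund": "pag"},
}

# Focus category of each voicing affix, following the labels used in the text.
FOCUS_OF_AFFIX = {
    "ma-": "NCF",   # non-control focus: experiencer or patient
    "i-": "UF",     # undergoer focus
    "-an": "LF",    # locative focus: goal or location
    "-um-": "AF",   # agent focus
    "mag-": "AF",
    "maN-": "AF",
}


def first_syllable(root: str) -> str:
    """Return the initial V or CV of a root (simplified root shapes only)."""
    if root and root[0] in VOWELS:
        return root[0]
    if len(root) >= 2 and root[1] in VOWELS:
        return root[:2]
    raise ValueError(f"root shape not handled by this sketch: {root!r}")


def incompletive(affix: str, mood: str, root: str) -> str:
    """Build prefix + reduplicated first syllable + root, hyphenated."""
    prefix = MOOD_FORMS[affix][mood]
    return f"{prefix}-{first_syllable(root)}-{root}"


if __name__ == "__main__":
    for affix, root in [("ma-", "tawa"), ("mag-", "sisi"), ("ma-", "bato")]:
        print(incompletive(affix, "realis", root), FOCUS_OF_AFFIX[affix])
```

Under these assumptions the three calls reproduce the realis incompletive forms cited in (3), (4), and (10): na-ta-tawa, nag-si-sisi, and na-ba-bato. Passing "irrealis" instead yields the corresponding m- forms.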
Since I am positing that the affixes of voice predicate elemental scenarios of action and agency, it seems useful to represent their semantics with a few heuristic diagrams, as in Figures 4 to 9 in which the stick figures represent the protagonists Agnes and Jerry, when they are speaking or when they are being spoken to by others. As speakers, they may speak to each other, or more often to a third person. I have no examples in which Agnes or Jerry are spoken about as third persons. The stick figures may seem gratuitous, but I use them to emphasize that the voice affixes in the verbs of emotion language predicate scenarios with human agents and patients.

Table 1. Focus and semantics of Agency in the voicing affixes

Focus              Morphology  Semantics of figure                                  Example
Non-control (NCF)  ma-         Experiencer or patient                               ma-rinig ‘(x) be able to hear’
Undergoer (UF)     i-          Reason for doing; Conveyance of patient; Instrument  i-kukwento ‘tell story-(x)’
Locative (LF)      -an         Goal or location                                     sasakt-an ‘injuring (x)’
Agent (AF)         -um-        Performs or initiates action                         um-uwi ‘(x) went home’
                   mag-        Performs or initiates action                         nag-sisisi ‘(x) is regretting’
                   maN-        Performs or initiates action                         nang-galing ‘(x) came from’

* In full clauses the arguments corresponding to “(x)” in the examples would appear as focused nominals. The prefix ni-/-in- is treated as modal rather than voicing, though it commonly occurs with otherwise unmarked undergoer focus.

I am regarding Tagalog grammatical focus as a means of profiling participants and processes. Profiling means that an expression specifically designates a particular substructure within a conceptual base or scope of predication (Langacker 1999: 27). “The entity designated by a predication – what I will . . . call its profile – is maximally prominent and can be thought of as a kind of focal point” (Langacker 1987: 118). Thus, I take grammatical focus in Tagalog to be a marker of salience.
If an actor has grammatical focus, I take it as a marker of the salience of agency. If an experiencer or undergoer has grammatical focus, I take it to mark lack of agency. In Figures 4–8, profiled elements are drawn with bold lines. Figure 4 represents the situation in which an actor or agent is in focus, and it follows that the action must also be salient, so the arrow also appears in bold. Actually, the grammatical actor in Figure 4 is ang kalooban, ‘inner feelings’, which I have represented with the gray circle in the chest region of the stick figure. Figure 5 reverses the focus, placing it on an undergoer. Participants in this scenario lack personal agency – they are acted upon. Figure 6 represents the participant as experiencer, another situation in which personal agency is lacking. Figure 7 represents the conceptualization underlying a clause in which the agency of a central participant is denied. Predication of denial is accomplished by the construction hindi hindi . . . kayang ‘neg neg . . . be able’. Figure 8 shows a scenario in which an actor surrenders personal agency by a metaphorical act on an object in the body. The bold box represents an abstract entity, here instantiated by pride, which was metaphorically swallowed (ni-Ø-lunok), or the heart (puso), which was allowed to prevail (< ni-pa-iiral). Both constructions require undergoers, as indicated by the prefix pa- and the null prefix. The ni-/-in- prefix in both verbs is realis mode rather than voice. The dotted line indicates that the two figures represent the same person. The metaphorical actor is the actual experiencer. The figure in the target concept (box) is drawn in light lines to show that the resulting status is not explicitly verbalized. The scenarios in these figures are sufficient to characterize most of the emotion language in Sana’y Maulit Muli.

[Figure 4. Agent Focus (AF). Affixes: -um-, mag-. Nag-su-s-um-igaw ang kalooban ko* ‘I am shouting out my inner feelings’. *‘My inner feelings are shouting out.’]

[Figure 5. Non-control, Undergoer & Locative Focus (NCF, UF, LF). Affixes: ma-, i-, -an, Ø-. Na-niwal. Bakit niya ako ni-Ø-loko? ‘I believed him. Why did he fool me?’]

[Figure 6. Experiencer (Non-Control) Focus (NCF). Affix: ma-. Na-ba-bato ako. ‘I am turned to stone (bored)’.]

[Figure 7. Denial of Agency. Alam mong hindi hindi ko kayang mag-mahal... ‘You know I can’t love...’]

[Figure 8. Metaphorical surrender of Agency. Affixes: pa-, Ø-. Labels: metaphor, source, target, Experiencer, Undergoer Focus (UF). Ni-Ø-lunok ko ang pride ko. ‘I have swallowed my pride.’ Hindi puweding ang puro puso p-in-a-iiral. ‘You cannot let the heart prevail.’]

The video Sana’y Maulit Muli ‘I Hope It Will Be Repeated Again’ dramatizes several facets of the predicament of the Tagalog transnational community, at least as experienced by two young middle-class lovers, Agnes and Jerry.18 The two experience anguished separation from home, family, and friends, as well as from each other. They encounter dehumanizing ideologies and onerous social demands of the market economy. They are exploited by callous employers and immigration officials. Agnes discovers the freedom, danger, and loneliness of feminine self-reliance. Both succumb to the temptations and comforts of consumerism. The emotional conversations of Jerry and Agnes, and of each with others, appear to be largely about the loss and recapture of personal agency. Of special interest in the film is the frequent use of emotional expressions suggesting lack of control. Expressions revealing active control with protagonists in grammatical focus (af) appear only as directives received by them and as uncharacteristically assertive outbursts occurring in moments of crisis.

Jerry is ambitious and spends a lot of time with his attractive boss, Cynthia, often leaving Agnes alone.
Agnes’s mother wants her to come to the United States, where the mother is living. Jerry’s cousin Nick arrives from America, looking rich and important. After a difficult interview with an immigration officer, Agnes is moping about the house, dreading the thought of leaving Jerry. Her aunt tells her to take control of her life. She perceives Agnes as allowing the heart to rule. Here Agnes is the tacit actor for p-in-a-i-iral ‘let prevail’ (< ni- + pa- + i- + iral), but she shows a lack of agency by letting the heart prevail (12). Rather than Agnes being focused as actor, it is the heart, itself a metaphor for lack of control, that is focused as grammatical patient. Pa- is the gerund form of non-control focus ma-. Here it has the sense of ‘let’.

(12) hindi puweding puro puso ang p-in-a-i-iral
    neg can.be-lg pure heart ref rl-ger-incm-prevail
    “You can’t allow the heart to rule” ∼ “you cannot let the heart prevail.”

When departure seems imminent, Agnes says, “Don’t let me go; I don’t want to go.” Jerry says “Remember, you are loved, loved, loved (by me)” (13). Mahal kita is usually translated as ‘I love you’, but it is non-control focus, meaning that the person loved is given grammatical focus.19 Because focus is on the patient, this construction does not highlight the agency of either participant. Kita is a portmanteau form that conflates second person singular experiencer and first person singular actor. At the denouement of this story, we will hear Jerry use the active form mag-mahal, highlighting the role of human agency.

(13) mahal na mahal na mahal kita.
    irr:ncf-loved lg irr:ncf-loved lg irr:ncf-loved 2s:ref
    “You are loved, loved, loved (by me).”

Agnes goes to San Francisco, where she becomes ‘stoned’ with boredom, using the non-control prefix na- (14). Her non-agency is salient.
(14) na-ba-bato ako
    rl:sf-incm-stone 1s:ref
    “I am stoned [turned to stone].”

Agnes’s brother and sister mistreat her. Her mother, urging a more active role on her, tells her she has to use her brain: gamitin mo ang utak mo. Agnes says,

(15) Ayoko dito.
    dislike:1s:gn prox:loc
    “I don’t like it here.”

(16) Wala akong ka-kampi, ma-ma-matay ako sa lungkot.
    lack 1s:ref-lg incm-take.side ncf:irr-incm-die 1s:ref drc melancholy
    “I’m not taking sides, I’m dying of home sickness.”

In (15), ayoko is a contraction of ayaw ko, so this is an instance of Agnes taking the role of agent, as indicated by genitive ko, but ko is not a focus-pronoun, so her agency is non-salient. In (16), Agnes is again the agent of taking sides, but she denies her agency. In the next clause, the metaphor mamamatay ‘dying’, is non-control and Agnes is the focal participant, so her non-agency is salient.

Back in Manila, Jerry’s mother interferes with their phone calls. Due to lack of communication and miscommunication, their relationship is starting to get blurry (nag-ka-halabu-an < labo ‘blur’). The prefix is active, but neither Agnes nor Jerry is the agent. Jerry is torn over his relationship with his boss. As Jerry and the boss sit in his car, he speaks of his feelings, using the term pa-ki-ramdam ‘feelings’, a conventional form based on the gerund form of non-control ma- (17). The term is used for transient feelings caused by outside events that affect the body or the emotions. Jerry speaks of hearing the crying of Agnes, using the non-control na-ri-rinig (18), and realizing that she has resentment towards him (19). His non-agency is salient. The word tampo predicates a feeling of anger and hurt, a sense of sulkiness, often felt between two people who are close or love each other. Tampo is presented as a bare root, suggesting a nominal interpretation, which is reinforced by the referential preposition ang.
Agnes has some agency here, as her feeling is directed towards Jerry, but the pronoun referring to Agnes is the genitive niya. She is not in grammatical focus, so her agency has low salience. If there is a focus at all in this clause, it is the resentment itself, which is preceded by ang, the referential preposition used with focal participants.

(17) Ang pa-ki-ramdam ko tama ang g-in-a-gawa ko pag ikaw ka-usap ko.
    ref ger-soc-feel 1s:gn right ref rl:ug-incm-do 1s:gn when 2s:ref st-talk 1s:gn
    “My feeling is that what I am doing is right when you talk to me.”

(18) Pero pag na-ri-rinig ko ang iyak ni Agnes,
    but when rl:sf-incm-hear 1sg:gn ref cry pr:gn prnm
    “But when I hear Agnes’s crying,”

(19) Malaki na nga ang tampo niya sa akin.
    great now emph spc hurt∼anger 3s:gn drc 1s:drc
    “She is really feeling hurt and resentful towards me.”

Jerry won’t let Agnes come home. She is then attacked in an alley. She escapes and tries to call him, but he is at Cynthia’s place. Agnes goes crazy (na-ba-baliw, another non-control form in which non-agency is salient). She characterizes herself as stupid, using the nominal root tanga ‘stupidity’ plus the genitive first person pronoun (20). She uses a non-control form of believe, suggesting that she was caused to believe, and she places herself in undergoer-focus to talk about being fooled.20

(20) Ang tanga ko, ang tanga ko, ang tanga tanga ko.
    ref stupidity 1s:gn ref stupidity 1s:gn ref stupidity stupidity 1s:gn
    “My stupidity, my stupidity, my great stupidity.”

    Bakit, bakit ganoon?
    why why like.this
    “Why, why like this?”

    Na-niwal. Bakit niya ako ni-loko?
    rl:sf-believe why 3s:gn 1s:ref rl:ug-fool
    “I believed him. Why did he fool me?”

The film continues in this vein, until it reaches a crisis. Jerry now realizes that he has been a passive participant.
He uses a flurry of realis forms with default grammatical undergoers to speak of swallowing pride (21), sacrificing principles (22, 23), and enduring (24). These all give Jerry agency, but Jerry as agent is not in focus, so his agency has low salience. Rather, his rantings place pride, principles, and hardship in focus.

(21) ni-lunok ko ang pride ko
    rl-swallow 1s:gn ref pride 1s:gn
    “I have swallowed my pride.”

(22) S-in-akripisyo ko ang magandang kinabukasan ko sa Pilipinas.
    rl-sacrifice 1s:gn ref beautiful future 1s:gn drc prnm
    “I sacrificed a beautiful future in the Philippines.”

(23) Ni-lamon ko ang prinsipyo ko.
    rl-eat.big.piece 1s:gn ref principle 1s:gn
    “I ate a big piece of my principles.”

(24) T-in-iis ko ang hirap ng buhay dito.
    rl-endure 1s:gn ref difficult gn life prox:drc
    “I endured a hard life here.”

This passage reaches a climax with Jerry’s use of two active forms describing his attempts to overcome the oppression of his circumstances: nag-su-s-um-igaw ‘shouting out’ and nag-babakasakali ‘hoping to repeat the past’ (25, 26). The former is doubly active, in that it uses two active affixes, nag- and -um-. But even here, Jerry is apparently not the active grammatical agent, or he would be represented with the first person pronoun ako. The sense is that his internal feelings, presented in the ang-phrase, are actively impelling (nag-) active (-um-) shouting out. A consultant said:

    Nagsusumigaw ang kalooban ko does not necessarily mean that the person is ‘literally’ shouting or letting out his feelings to a person(s). It just means that the person has this (intense) feeling, clamoring/bursting inside of him, wanting to be let go. Now, the person has a choice whether to let it (the feeling) out or not but he doesn’t have to “shout” it out. The shouting was inside of him.
The explanation suggests that the underlying scenario involves force dynamics in which the will is striving against the inner feelings (Talmy 1988). Agency involves both motivation and choice, which may act in opposition or synergistically. In this instance, Jerry is choosing to suppress the motivation. It will be interesting to search for further instances of the affix combination mag-__-um___ to discover whether it always predicates a force-dynamic scenario.

(25) Kahit nag-su-s-um-igaw ang kalooban ko, dahil mahal kita,
    in.spite.of rl:af-af-incm-shout ref inner.feeling 1s:gn because love 2s:ref
    “In spite of this I am shouting out my inner feelings, because I love you,”

(26) dahil nag-ba-baka-sakali ako-ng maulit yung dati.
    because rl:af-incm-perhaps-in.case 1s:ref-lg ncf:irr-repeat rem:ref-lg former
    “because I am perhaps hoping to repeat the past.”

Near the end of this saga, Jerry realizes that his capacity to love is something over which he should have control, even though he feels himself losing it. He uses the agent focus form mag-mahal (27).

(27) Alam mong hindi ko kayang mag-mahal nang hindi buo ang pagkatao ko.
    know 2s:gn-lg neg 1s:gn ex-lg irr:af-love when neg whole ref humanity 1s:gn
    “You know I can’t love when my humanity is not whole.”

Jerry doesn’t want to lose Agnes and her respect for him, so he returns to Manila, where he takes up his old role as an assertive advertising man. One day, Agnes shows up in Makati, the upscale business district of Manila. The film ends on their encounter, leading the viewer to conclude that the couple resumes their relationship. Perhaps their language also takes a more active turn.

I hope my analysis has demonstrated that grammatical voice in Tagalog emotion language is sensitive to very abstract social scenarios. In the scenarios played out in the melodrama Sana’y Maulit Muli, personal agency of the participants is a major element.
The repertoire of Tagalog verbal affixes provides ample resources for predicating nuances of personal agency. Voicing affixes may focus agents, experiencers, goals, or patients. By focusing an agent with -um- or mag-, an affix may also focus the agency involved, so long as the actor is a human participant. Lack of personal agency may be expressed directly with a focused experiencer (ma-) or patient (-in, i-, -an). Lack of personal agency may be expressed indirectly by denial of the agency implied by an active form. Occasionally, some component of identity, such as inner feelings, takes the grammatical role of focal participant in an active construction. I do not feel sufficiently conversant with Tagalog theory of agency to offer a judgement as to whether or not this construction in fact highlights personal agency. Controlled surrender of personal agency is expressed metaphorically with an unfocused actor in a genitive phrase, as in ni-lunok ko ang pride ko ‘I have swallowed my pride’ (21) (ko is the genitive form of the first person pronoun).

Case 2: Shona noun classifiers as polycentric categories

Many languages have gender classifiers that segregate nouns. There are, for example, the genders of German and Latin, the numeral classifiers of Chinese, Japanese, Maya, Ojibway and many languages of southeast Asia, the verbal classifiers of Navajo, and the 20 or more classes of the Bantu languages. Other languages have substantive affixes that can function as classifiers. These would include, for example, the anatomical suffixes of Tarascan and Coeur d’Alene (Friedrich 1979: 394–395; Palmer 1996: 60, 145–146).21 For decades linguists have struggled to make semantic sense of classifiers. Most commonly they have concluded that the assignment of lexemes to classes is arbitrary or that the classes center on such basic physical qualities as shape, texture, number, and animacy. While there is some explanatory value in the physical prototype approach, it has ultimately proven to be limited, leaving unexplained such interesting phenomena as the occurrence in some Bantu languages of the human term chief in the same class as wild animals (Guthrie’s 9/10; Guthrie 1967). Another approach was needed.

As early as 1959, the famous paleontologist Louis S. B. Leakey proposed in his Kikuyu lesson book that the noun classes are ranked on a hierarchy of spiritual value. For example, humans appear in Leakey’s class I (Guthrie’s 1/2), the highest in spiritual value; class II (Guthrie’s 3/4) is for “second class spirits;” and class III (Guthrie’s 9/10) is for all other living creatures. Regarding Guthrie’s class 5/6, Leakey (1955: 13) asserted that “every single word in this class is an object which is used, or has been used until recently, in connection with religion, magic or ritual or some other form of ceremonial.” To my knowledge, Leakey’s proposal was never consciously followed up by linguists.

The year 1987 saw a breakthrough in the understanding of classifiers. The key to their explanation was most widely publicized by George Lakoff in the book that drew its title, Women, Fire, and Dangerous Things, from a noun class of the Dyirbal language of Queensland. Lakoff was actually reshaping a middle-level theory proposed by Dixon (1982). Lakoff held that each noun class had a central member and that other members were linked to the central member by category chaining. The basis of the chaining was a common domain of experience, which was culture-specific. The Dyirbal classifier balan (one of four) marks a category whose central member is human females. In Dyirbal mythology, the sun was a woman.
Other members of the class were birds (mythical females) and plants and animals who either appeared in the myth or were seen as somehow similar to fire (they were hot or they had stingers). Fire belongs to the class because it belongs to the same domain of experience as the sun. Thus, with some exceptions, category membership seems neatly explained by this approach. Problems with the approach have been raised by Mylne (1995), whose critique was previously discussed.

In the same year, Debra Spitulnik (1987) published a study of Chewa (Bantu) classifiers.22 Her approach leaned heavily on highly abstract schemas, which she called “central notional values”, but she also proposed that some nouns belong in their classes by virtue of cultural associations. “The [ChiBemba] noun ímfumu ‘chief’ occurs in the class dominated by nouns for wild animals (Cl. 9/10) because of the cultural association of the chief with the animal world” (Spitulnik 1987: 110) [italics added]. She did not lean heavily on the cultural approach, because in her view, grammatical factors compete for control over the classifiers. At about the same time, Ellen Contini-Morava proposed in a paper made available on the internet that the Swahili (Bantu) noun classes were dominated by “super-schemas” that were linked by schematicity and extension to spatial, supernatural, and psychological features and schemas.23

To sum up these approaches to understanding classifiers, Leakey described classification by spiritual hierarchy, Dixon and Lakoff showed clear mythical motivations for Dyirbal classifiers, Spitulnik presented a plausible cultural explanation for the apparently anomalous classification of Chewa chiefs, and Contini-Morava saw supernatural schemas underlying Swahili classes.
These observations suggest that it might be worthwhile to apply a cultural approach to the Bantu classifiers with special attention to the supernatural and to apply the approach more systematically than had been previously attempted. That is what I and students Dorothea Neal Arin, Claudia Woodman, and Russell Rader have begun to do for the Shona language of Zimbabwe. But before discussing those findings, I will present a brief description of the classifier system involved:

Bantu noun classifiers are defined by characteristic prefixes on the nouns and concordial affixes on adjectives, verbs, and deictics. The classes are usually designated by numbers from 1 to 22. In classes 1 to 13, odd numbers are singulars, even numbers are plurals. Thus, for Shona singular class 1, mu-, the plural is class 2, va-, and for singular class 3, mu-, the plural is class 4, mi-. Of the first 15 classes identified by Guthrie (1967), the only ones to which he attributed clear semantic correlates are 1/2 (persons) and 9/10 (animals). He observed that parts of the body appeared more frequently in 3/4 and 5/6, but otherwise found no definite correlations of meanings to classes. Fortune (1955) observed that “class 3 contains nouns indicating trees, parts of the body, atmospheric phenomena, things characterized by length, and miscellanea” [emphasis added]. The only atmospheric phenomena that he listed are m]ando ‘breeze, wet weather’ and possibly m]ea ‘air, soul’ and cando ‘cold.’ (Palmer & Woodman 1999)

Specifically, Palmer (1996) and Palmer and Arin (1999) proposed that the semantics of classifiers in Shona and other Bantu systems are governed by salient ritual scenarios that are more culturally specific and richer than the stereotypes and features proposed by Spitulnik (1987, 1989) and Contini-Morava (1994).
After reading all available ethnographies of Shona culture and society, Palmer and Arin identified nine specific and two general scenarios that might govern the distribution of Shona noun classes. Close reading of Shona ethnography was the only systematic method used to identify these scenarios. Therefore, we cannot guarantee that Shona speakers would agree with us on their salience or structure them in the same way. It would be preferable to conduct interviews and make correlated observations in the field.24 Scenarios 1, 2, 10, and 11 are listed below. The numbers of these scenarios do not correspond to the numbers used by Bantuists to identify the noun classes.

1. The spirits of ancestral chiefs live in the bodies of lions (mhondoro).
2. The chiefly ancestral spirits (mhondoro) reign over both the things of the wild and human affairs. They are the protectors of the land and the wild animals.
10. There is a scenario of protection in which the central participants are dominating protectors, protected ones, and the victims of domination.
11. There is ritual danger, stemming mainly from foreign ancestors with grievances or from contact with the paraphernalia of mediums.

Palmer and Arin (1999) proposed that Guthrie’s class 9/10 is governed by scenario 10 (which also subsumes 1 and 2), and that Guthrie’s 5/6 might be governed by scenario 11. Subsequent research by Rader (1998) suggests that class 5/6 is more directly governed by the imagery and mythology of fertility.25 Palmer and Woodman (1999) examined Guthrie’s class 3/4, finding that its central members involve an important domestic scenario and an ethno-ecological model as well as mythical and ritual scenarios. Central physical items in this class are those used in ritual and domestic activities.
There is a network of salient categories and chains of extension, which justify using the term “central” for the salient categories. We concluded that a noun class is more than a radial category centering on a prototypical member or a single domain of experience. It is more like a network of radial categories based on a cross-section of the cosmos, including physical experience, domestic scenarios, ritual scenarios, and world view. We proposed that a classifier organized like this be termed a polycentric category. Shona noun class 3/4 grammaticizes and lexicalizes four scenarios and one ethno-ecological model which are salient themes of Shona culture. Scenario 3 was among the 11 previously defined. Three new ones include two new ritual scenarios (12, 14) and a domestic scenario (13). Item 15 is an ethno-ecological model.

3. The spirits of ancestral chiefs bring rain, thunder, and lightning.
12. People pray to the ancestors.
13. Grain is pounded daily with a mortar and pestle.
14. Doctors cure with herbal medicines that are ground in a mortar and pestle.
15. Trees, shrubs, and herbs are associated with coolness, moisture, and medicine.

The conceptual elements provided by these models find lexical expression in many of the members of Shona class 3/4 – see Figure 9. Those lexemes in the class that do not predicate any of the major elements in the five models are semantically linked in various ways as described in Table 2 (in Appendix). The more inclusive cognitive model of a noun class that emerges from inspection of the semantics of the lexical members and their associative links to the ethnographic models is what I refer to as a polycentric category. The general structure of such a category is summarized with example terms in Table 2 (in Appendix).
Figure 9. Shona class 3/4 as a polycentric category. Diagram labels: MOULT; SCATTER; PEOPLE PRAY TO ANCESTORS; DAILY POUNDING OF GRAIN; SCATTERED MEAL; NOISE; MORTAR AND PESTLE; POLES; BAD HABITS; REPETITION; DURATION; LENGTH; EXTENSION; END-POINT TRANSFORMATION; WITCHCRAFT; CRUSHING; GRINDING OR POUNDING; MEDICINES; CURING PRACTICE; GROUND MEAL; ANCESTORS GIVE RAIN; WAYS OF SPEAKING; LANGUAGE; FOOD OFFERINGS; RAIN; TREES, SHRUBS, HERBS; MOISTURE; LIQUIDS. Key: elaboration, extension, metonymy.

A polycentric category has more complexity than a radial category, but it does not seem to display unnatural or excessive complexity for the semantic system of a natural spoken language. It is natural for people to have salient ideas based on rituals and daily domestic tasks, and it is natural for them to model their environmental surroundings. It is natural to identify clusters of models that are functionally related and to regard them as a cultural unit. It is natural to abstract schemas from the elements of those models and to discover similarities and metaphors across conceptual domains. And it is natural to recursively apply such thought processes to the derived categories. Finally, it is natural for a lexeme to be polysemous within the domains of a polycentric category. When such a complex is grammaticized, the result is culture-specific and based on models that can be discovered by the methods of ethnography, but dependent upon mental processes that have been best described in the literature of cognitive linguistics. This approach explains the numerous instances of nouns which appear to satisfy the criteria for more than one class but characteristically appear in only one class.
The archetypal example in Bantu studies is the classification of chiefs with wild animals, rather than with humans (Creider 1975). Many terms do in fact satisfy the criteria for multiple classes, but they are judged by their speakers to fit one better than another. Each class has multiple criteria, and these may be activated by the context of a discourse. The selection and classification of a term is the product of multiple competing and synergistic activations. In Bantu, some nominal roots have more than one common classification. It is likely that some classifications are well-entrenched, while others are more subject to reassignment.

This approach raises a question of boundaries. Where are the boundaries between classes, if any? If every class has multiple criteria and nominal participants are sufficiently complex in their semantics to satisfy multiple criteria, then classes will necessarily compete for members in an ecology of classification. In fact, there are no fixed boundaries between classes. The overriding criterion is cultural salience, which varies with situations, but how can cultural salience be evaluated by the linguist? How can one predict which classifiers will be used with Bantu nominal roots? Currently, conclusions regarding the motivations for particular classifications are largely a matter of interpretation based on familiarity with the culture gained through participant observation or reading of ethnographies. One could devise tests that would manipulate the salience of criteria and observe the assignments of nominal participants to categories, but such tests may not reproduce the motivations presented by naturally occurring discourse. Nevertheless, in the event that such tests are undertaken, two hypotheses are suggested:

1. Reassignments will be more likely to occur where a domain which is inherent in both the semantics of the nominal root and in an alternative classifier is saliently evoked by the discourse situation.
2. It will be more difficult to elicit reassignments to more entrenched category members, where entrenchment is independently measured by frequency of usage or infrequent reassignment in natural discourse.

We must ask also how one can evaluate this analysis in comparison to other possibilities. Are there other analyses that would be just as convincing? Can our analysis predict which nouns will be classified together? There are a number of possible criteria that could be used to evaluate competing analyses. They do not entirely solve the problem of arriving at an analysis that is both replicable by others and true to native-speaker thinking, because they remain subject to judgement and interpretation, but if taken seriously, I think they are better than having no criteria. The criteria are as follows:

1. An analysis should be based upon thorough and comprehensive ethnography with attention to salient cultural scenarios.
2. Given an adequate description of the cultural scenarios, an analysis should be plausible, that is, it should consist of obvious connections. Non-obvious connections may be adduced only where they are supported by native speaker attestations.
3. A plausible analysis that is supported by native speaker attestation and reasoning is to be preferred over one that is not supported.
4. A plausible analysis which explains the largest number of terms in a class is to be preferred.
5. A plausible analysis of a classifier which excludes terms normally found in other classes is to be preferred, though even in a correct analysis many terms will not be excluded, only preferred more strongly by their canonical classifier.

Finally, we must ask whether the cultural approach with polycentric categories can predict the emergence and structure of classifier systems cross-linguistically.
The theory predicts that some kind of classifier system can emerge wherever there are salient and stable cultural practices and institutions. These are the necessary conditions. Certainly, many of the languages around the world have classifier systems, though some are hardly recognized as such. For example, the anatomical suffixes of the Salish languages are usually not regarded as constituting classifier systems, yet they function in much the same way as they take on abstract values of shape (Palmer 1996). Also marginal to our notion of noun classifiers are the click classifiers of the Khoisan and the verbal classifiers of Apache, but they have similar functions (Bernárdez n.d.; Basso 1990). One might even regard a finite paradigm of honorifics, as in Japanese or Korean, as a classifier system in the social domain. The approach does not currently specify the conditions that are sufficient to motivate the emergence of classifiers. Further cross-linguistic studies along these lines are needed.

Conclusions

Many lexical domains and grammatical constructions link directly or indirectly to significant cultural models, notably including scenarios. Understanding the grammar and lexicon of a language requires a grasp of cultural models and culturally defined imagery. The most appropriate term for this approach is cultural linguistics. Application of this approach to voice in Tagalog emotion-verbs shows that the semantics of voice affixes can be described in terms of elemental scenarios that variously profile agents, experiencers, or objects.
Analysis of the grammar in the emotional language of a Tagalog melodrama reveals that choice of voicing affix causes the agency of emotional participants to be profiled (given grammatical focus, either as agents or experiencers) or relegated to the base of predication with reduced prominence (actors appear in genitive or oblique phrases, or not at all). Thus, cultural linguistics helps to elucidate the emotional semantics of very dynamic discourse situations as portrayed in a popular medium. The grammar is seen operating in its socio-cultural context. The scenarios of voice presented here in diagrams 4–8 provide a basis for graphic comparisons across languages and domains, so comparable cross-linguistic studies are needed. Video melodramas are particularly useful for the study of grammatical voice, because emotional speakers are attuned to the nuances of semantic agency and because melodrama reveals and highlights the agency of participants in other ways, such as the presentation of facial expressions and the portrayal of scenarios of fortune and misfortune. The perspective of cultural linguistics shows obvious utility compared to a more narrowly cognitive approach in its application to the problem of Bantu noun classifiers, where the use of ethnographic methods to identify salient cultural models and scenarios is a necessary step in the research. In this application, it was possible to show how cognitive processes of complex category formation and category chaining operate within culturally specific models to create the polycentric categories that we know as Bantu noun classifiers. The polycentric category introduced by Palmer and Woodman (1999) has multiple central scenarios and prototypes, from which radiate category chains and complex categories as defined, respectively, by Lakoff (1987) and Langacker (1987). The approach of cultural linguistics and its theory of polycentric categories improves on previous accounts of classifier systems in a number of ways. 
It makes extensive use of ethnography, which enables the content of categories to be related to a variety of salient scenarios of domestic and ritual life. The attention to ethnography reduces the risk of forcing native terms into non-native categories. The approach avoids reducing each classifier category to a few features, or even to a single radial category based on a single domain of experience. Instead, it posits a number of functionally related scenarios, each of which provides a rich semantic field of linkage for dozens of nouns. It is more complex than previous approaches, but appropriately so, because classifiers are motivated by metonymies and metaphors that are often explicit in ethnographic descriptions, in the construction of terms, or in multiple definitions of a single term. Finally, this approach highlights a number of interesting scientific questions pertaining to how one may establish the cultural validity and psychological reality of polycentric categories.

Notes

1. The research on Tagalog was supported by a Site grant and a sabbatical leave from the University of Nevada, Las Vegas for a study of “Popular Discourse in Manila and Las Vegas”, by the Department of Anthropology, and by grants from the Faculty Travel Committee. My understanding of Tagalog linguistics has benefited from discussions and correspondence with Ricardo Nolasco, Videa P. de Guzman, Lawrence Reid, and Stanley Starosta, though none would necessarily subscribe to this analysis of Tagalog voice. I am indebted to Nikolaus Himmelmann for generously sending me his papers in progress and to Eric Pederson for many constructive comments. All correspondence concerning this article should be sent to Gary Palmer at University of Nevada at Las Vegas, USA.
2. The construal of schematic processes at different stages has been termed image-schema transformation (Lakoff 1987: 440–444, 1988: 144–149).
3.
For an example case study of a discursive term, see my discussion of Japanese yo in Palmer (1996: 206–212).
4. For examples of conceptions of discourse in various cultures, see Kuipers (1998), Scollon and Scollon (1995: 94–121), Feld (1982), Kochman (1981), and Basso (1979).
5. This case study of Tagalog emotion language draws heavily upon my paper “Sana’y Maulit Muli: The Grammar of Agency and Emotion in a Tagalog Transnational Video Melodrama,” which is a revision of a paper presented to the Linguistics Colloquium, University of the Philippines, Diliman Campus, February 11, 1999 and the Annual Meeting of the American Anthropological Association, Philadelphia, December 2–8, 1998. That paper contains additional examples and more discussion of cultural and historical dimensions. The glosses of abbreviations are as follows: 1, 2, person; af, agent focus; drc, directional; emph, emphasis; ex, existential; gn, genitive; ger, gerund; incm, incompletive reduplication; irr, irrealis; lf, locative focus; lg, ligature; loc, locative; ncf, non-control focus; neg, negative; pr, pronoun; prnm, proper name; prox, proximate; r2, augmentative reduplication; rl, realis; ref, referential; rem, remote; s, singular; sf, stative focus; st, stative (not involved in focus); soc, social; uf, undergoer focus.
6. “In Austronesian languages generally, agency and possession are marked in the same way. In other words, the agent of non-actor focus verbs co-occurs with the genitive marker, usually a reflex of PAn *ni ‘genitive of human nouns; agent of non-actor focus verbs’” (Blust 2002: 67).
7. Henceforth, the term focus will refer only to grammatical focus as defined for Tagalog.
8. Banal na Aso, Santong Kabayo ‘Holy Dog, Horse Saint’ by YANO. YANO. 1994. Yano. Produced by Yano & Poch Concepcion. Alpha Records Corporation (audiotape).
9.
Maniwala Ka Sana ‘Your Belief Is Hope’ by Parokya Ni Edgar. KHANGKHUN GKHERRNITZ THE ALBUM. Parokya Ni Edgar: Backbeat. Pasig, Metro Manila (audiotape).
10. Banal na Aso, Santong Kabayo ‘Holy Dog, Horse Saint’ by YANO. YANO. 1994. Yano. Produced by Yano & Poch Concepcion. Alpha Records Corporation (audiotape).
11. Kaka, ‘Joe,’ by YANO.
12. Senti ‘Sentimental’ (from YANO).
13. Sana’y Maulit Muli. Regal Films. Star Cinema Productions, Inc. (videotape).
14. Kaka, ‘Joe,’ by YANO.
15. Sana’y Maulit Muli. Regal Films. Star Cinema Productions, Inc. (videotape).
16. Sana’y Maulit Muli. Regal Films. Star Cinema Productions, Inc. (videotape).
17. For a more detailed analysis of the semantics of the mag- forms, see Palmer (2003).
18. Sana’y Maulit Muli. Regal Films. Star Cinema Productions, Inc. (videotape). In the film, Lea Salonga plays Agnes, a young middle-class woman who has a boyfriend Jerry, who is in advertising. Jerry is played by Aga Mulach.
19. One could also say minamahal kita. If we think of mahal as the base, then the only senses explicitly added by morphology are the incompletive, by means of reduplication of ma, and realis, by means of the infix -in. Thus, it is probably best to think of this form as predicating the default lack of control on the part of the one who is loved, i.e. the referential focal participant, kita. It is usually translated with the more active English expression I love you.
20. My consultants translated the expression as ‘Why did he fool me?’, but since niya is genitive, a more structure-preserving translation would be ‘Why was I fooled by him?’
21. The figure of 20 for the Bantu classes includes singular and plural forms. If these are not counted separately, the figure would be ten. Classes 1 and 2 (or 1/2), for example, label the singular and plural of the class that includes most terms for humans.
22. See also, Spitulnik (1989).
23. The paper was eventually published in Contini-Morava (1994).
24.
Palmer was also able to draw on memories of eight months of field experience with several Bantu ethnic groups in a rural community in Kenya in 1969 and extensive reading in Bantu ethnographies in preparation for that work.
25. In spite of the earlier date of publication, Rader’s paper was published after the Palmer and Arin paper.

References

Basso, Keith (1990). Western Apache Language and Culture: Essays in Linguistic Anthropology. Tucson: University of Arizona Press.
Bernárdez, Enrique (n.d.). Categorization through phonetic symbolism: Radial categories based on the clicks in the San languages. Unpublished Ms. in possession of the author.
Blust, Robert (2002). Notes on the history of ‘focus’ in Austronesian languages. In Fay Wouk & Malcolm Ross (Eds.), The History and Typology of Western Austronesian Voice Systems (pp. 63–78). Canberra: Pacific Linguistics, Research School of Pacific and Asian Studies, The Australian National University.
Clark, Herbert (1996). Using Language. Cambridge: Cambridge University Press.
Contini-Morava, Ellen (1994). Noun Classification in Swahili. Publications of the Institute for Advanced Technology in the Humanities, University of Virginia. Research Reports, Second Series.
Cooreman, Ann, Barbara Fox, & Talmy Givón (1984). The discourse definition of ergativity. Studies in Language, 8, 1–34.
Creider, Chet (1975). The semantic system of noun classes in Proto-Bantu. Anthropological Linguistics, 17, 127–138.
Dixon, R. M. W. (1982). Where Have All the Adjectives Gone? Berlin: Walter de Gruyter.
Duranti, Alessandro (1994). From Grammar to Politics. Linguistic Anthropology in a Western Samoan Village. Berkeley: University of California Press.
Feld, Steven (1982). Sound and Sentiment: Birds, Weeping, Poetics, and Song in Kaluli Expression (2nd ed.). Philadelphia: University of Pennsylvania Press.
Fortune, G. (1955). An Analytical Grammar of SHONA.
London: Longmans, Green and Company.
Friedrich, Paul (1979). Language, Context, and the Imagination: Essays by Paul Friedrich. Stanford: Stanford University Press.
Guthrie, Malcolm (1967). Comparative Bantu: An Introduction to the Comparative Linguistics and Prehistory of the Bantu Languages. Amersham, England: Gregg Press, LTD.
Hannan, M. (1984). Standard Shona Dictionary. Revised. Harare, Zimbabwe: The College Press.
Hutchins, Edwin (1996). Cognition in the Wild. MIT Press.
Kochman, Thomas (1981). Black and White Styles in Conflict. Chicago: University of Chicago Press.
Kuipers, Joel C. (1998). Language, Identity, and Marginality in Indonesia: The Changing Nature of Ritual Speech on the Island of Sumba. Cambridge: C.U.P.
Lakoff, George (1987). Women, Fire, and Dangerous Things: What Categories Reveal about the Mind. Chicago: University of Chicago Press.
Lakoff, George (1988). Cognitive semantics. In Umberto Eco, Marco Santambrogio, & Patrizia Violi (Eds.), Meaning and Mental Representations (pp. 119–154). Bloomington and Indianapolis: Indiana University Press.
Langacker, Ronald (1987). Foundations of Cognitive Grammar, Vol. 1: Theoretical Prerequisites. Stanford: Stanford University Press.
Langacker, Ronald (1990). Subjectification. Cognitive Linguistics, 1, 5–38.
Langacker, Ronald (1991a). Foundations of Cognitive Grammar, Vol. 2: Descriptive Application. Stanford: Stanford University Press.
Langacker, Ronald (1991b). Concept, Image, and Symbol. Berlin/New York: Mouton de Gruyter.
Langacker, Ronald (1999). Assessing the cognitive linguistic enterprise. In Theo Janssen & Gisela Redeker (Eds.), Cognitive Linguistics: Foundations, Scope, and Methodology (pp. 13–59). Berlin and New York: Mouton de Gruyter.
Langacker, Ronald (2001). Discourse in cognitive grammar. Cognitive Linguistics, 12(2), 143–188.
Leakey, Louis S. B. (1955). First Lessons in Kikuyu. Nairobi: The Eagle Press.
Mylne, Tom (1995).
Grammatical category and world view: Western colonization of the Dyirbal language. Cognitive Linguistics, 6(4), 379–404.
Palmer, Gary (1996). Toward a Theory of Cultural Linguistics. Austin: University of Texas Press.
Palmer, Gary (2000). Review of Anna Wierzbicka, Understanding Cultures Through Their Key Words: English, Russian, Polish, German, and Japanese (New York: Oxford University Press, 1997) and Semantics: Primes and Universals. (New York: Oxford University Press, 1996). Journal of Linguistic Anthropology, 10, 279–284.
Palmer, Gary (2003). Metonymy and polysemy in the Tagalog voicing prefix PAG-. In Gene Casad & Gary B. Palmer (Eds.), Cognitive Linguistics and Non-Indo-European languages (pp. 193–222). Berlin: Mouton de Gruyter.
Palmer, Gary & Dorothea Neal Arin (1999). The domain of ancestral spirits in Bantu Noun Classification. In Masako Hiraga, Chris Sinha, & Sherman Wilcox (Eds.), Cultural Typological and Psycholinguistic Issues: Selected Papers of the Bi-annual ICLA Meeting in Albuquerque, July 1995 (pp. 25–45). Amsterdam: John Benjamins.
Palmer, Gary & Claudia Woodman (1999). Ontological Classifiers as Polycentric Categories, as Seen in Shona Class 3 Nouns. In Martin Puetz & Marjolijn Verspoor (Eds.), Explorations in Linguistic Relativity (pp. 225–249). Amsterdam and Philadelphia: John Benjamins.
Rader, Russell (1998). Life and land-ownership: the autochthonous nature of Shona noun class 5 and 6. California Anthropologist, 25, 8–17.
Scollon, Ron & Suzanne Wong Scollon (1995). Intercultural Communication: A Discourse Approach. Cambridge/Oxford: Blackwell Publishers, Inc.
Silverstein, Michael (1976). Shifters, linguistic categories, and cultural description. In Keith Basso & Henry Selby (Eds.), Meaning in Anthropology (pp. 11–55). Albuquerque: University of New Mexico Press.
Spitulnik, Debra A. (1987).
Semantic Superstructuring and Infrastructuring: Nominal Class Struggle in ChiBemba. Bloomington, Indiana: Indiana University Linguistics Club.
Spitulnik, Debra A. (1989). Levels of Semantic Structuring in Bantu Noun Classification. In R. Botne & P. Newman (Eds.), Current Approaches to African Linguistics, Volume 5 (pp. 207–220). Dordrecht, The Netherlands: Foris.
Stubbs, Michael (1983). Discourse Analysis: The Sociolinguistic Analysis of Natural Language. Chicago: University of Chicago Press.
Talmy, Leonard (1988). Force dynamics in language and cognition. Cognitive Science, 12, 49–100.
Wierzbicka, Anna (1996). Semantics: Primes and Universals. New York: Oxford University Press.
Wierzbicka, Anna (1997). Understanding Cultures Through Their Key Words: English, Russian, Polish, German, and Japanese. New York: Oxford University Press.

Appendix

Table 2. The structure of a polycentric category: Shona class 3/4ᵃ

(1) Multiple Central Models: A class may be governed by one, two, or more salient cultural models and/or scenarios that are different from those governing other classes. The central models of Shona class 3/4 are: The spirits of ancestral chiefs bring rain, thunder, and lightning. People pray to the ancestors. Grain is pounded daily with a mortar and pestle. Doctors cure with herbal medicines that are ground in a mortar and pestle. Trees, shrubs, and herbs are associated with coolness, moisture, and medicine.
(2) Multiple Prototypes: A central model may be sufficiently complex to offer more than one prototype concept. For example, trees provide large poles and sticks, shrubs provide small poles and sticks. All provide medicinal leaves and fruits. The term for tree, muti, also means ‘medicine.’ Any of these items may serve as prototypes.
The scenario of pounding grain with the pestle and mortar presents pounding, grinding, crushing, and grain as salient elements from which abstractions and extensions can be derived. The grain itself assumes the form of piles of grain, piles of finely ground meal, and scattered grains. These provide additional prototypes for spatial distribution of dry granular or powdery solids. The ancestral scenarios of curing and rain-making offer component scenarios of propitiation of ancestors and grinding and giving of medicines. They also offer physical models of cool liquids. Lexemes for all these elements appear in Shona class 3/4.
Examples: muhwi ‘pestle’, musi ‘pestle’, mutsi ‘pestle’, muti ‘tree, medicine’, mudzukwa ‘tall, straight object (e.g. tree; skyscraper)’, mudzvurwa, mutwiwa ‘meal ground in duri (mortar)’, muchaka ‘meal from green mealies’, muchinjwa ‘mealie meal ground by engine-driven grinding mill’, mubvau ‘young, green mealie’, mudede ‘green mealies’, muguri ‘mealie cob (with the grains on it)’, munyuchu ‘mealie-rice’, mubukirwa ‘green maize cob’, mudakunanzva ‘sweet-tasting liquid’, mudzamba ‘porridge made with milk as the liquid’, mujururu ‘any liquid thinner than it should be’, muchenga muchenga ‘abundance of grain’, muchenganherera ‘general rain <-chenga’, munakamwe ‘springtime (beginning of rainy season)’, mutsatsatire ‘gusty rain’, muzhandwa ‘crops, animals or people struck down in large numbers. <-zhanda; act of crushing (e.g. as heavy object does when it falls)’, muchito ‘sound of footsteps, hoofbeats, etc.’.
(3) Chaining of central models by metonymy: The themes that provide the backbone of a class are closely related, not by similarity, but by function or metonymy. For example, the pestle, a kind of stick or pole, provides the conceptual link from the originating model of trees, shrubs, and herbs to the scenario of pounding grain with a pestle. Medicines for curing are made from plant leaves and bark.
One cures with herbal medicines, but also by appeal to ancestors who bring the rain associated with cool, moist forests and good plant cover.
Examples: mukwerera ‘ceremony to pray for rain’, munamato ‘prayer (act of praying; words of prayer)’, musumo ‘small pot of beer offered to husband to notify him that beer has been prepared and is now ready; amount of any prepared food or drink brought to head of family so that he may say the polite words of welcome to a guest; opening words of prayer to mudzimu [ancestor]’, mukwerera ‘ceremony to pray for rain’.
(4) Radial Categories: Non-central terms are linked and chained to central members by metonymy and metaphor. For example, witchcraft, which appears in this class, is a kind of pounding and crushing.
Examples: muzhandwa ‘crops, animals or people struck down in large numbers [as by sickness]; act of crushing (e.g. as heavy object does when it falls)’, mupfuku ‘trampled grain or grass, peaceful place, case of witchcraft, fee for such a case’, muchapo ‘paddle, medicine for killing witches’, mushinhiriro ‘spell; act of bewitching’.
(5) Primary Schematization: Spatial and temporal schemas may be abstracted from any substantive concept. The pole or stick provides the abstraction of a solid cylinder or extended solid object. From pounding of the pestle it is an easy step to repetition, and to duration of time.
Examples: mudhadhadha ‘long object (e.g. low building, letter to someone); cursive writing’, mugavhanyu-gavhanyu ‘repetition of an action without interruption’, muchimbo ‘index finger. <-chimba’, mudhidhi ‘penis (polite expr)’, mutambwi ‘time since’, musanya ‘period of time (gen the present)’, mukore ‘era, period of history’.
(6) Secondary Schematization and Extension: Spatial schemas are subject to various abstractions and extensions.
The end-point transformation of an extended spatial object or time is a common extension, yielding ends of paths, beginnings, last times, and worn-out objects. Examples: muvambo ‘commencement, action of beginning’, mutangiro ‘beginning, way of beginning’, mugumo ‘end (of action, extent, etc.)’, mufika ‘tapered end of axe or hoe blade’, mugumegume ‘last time, occasion, etc.’, mudemo ‘useless, worn-out axe’.
(7) Extension of concepts to human behavior: The schema of repetition is extended to repetitive behaviors, mostly bad habits and propensities. Spatial and physical concepts are extended as well. For example, in Shona, theft is a narrow passage between two objects. Language is a metaphorical scattering, the feathers of a moulting bird. Examples: mubo ‘way of stealing’, mukoto ‘narrow passage between two objects, pass, act of stealing something in order to sell it, object stolen in order to be sold, act of stealing’, mutauro ‘language, discussion of a misdemeanour gen leading to legal case’ < tau ‘speak, molt’, mubwereketero ‘way of speaking’, mukafamwera ‘foolish, thoughtless way of speaking’, mukanya ‘peremptory, emphatic way of speaking’, muririro ‘call; characteristic cry or way of speaking’.
a. This table is based on the framework presented in Palmer and Woodman (1999). Principle (7) from that listing has been subsumed into principle (6). All examples are from Hannan (1984).
Purple persuasion
Deliberative rhetoric and conceptual blending
Seana Coulson and Todd Oakley
University of California San Diego / Case Western Reserve University
Conceptual blending, or conceptual integration, is a set of general cognitive processes used to combine conceptual structure in mental spaces. We analyze how speakers exploit these blending processes in two examples of persuasive discourse: one a widely distributed email message urging recipients to vote for Democratic candidates in the 1998 U.S.
congressional election; the other, a solicitation for monetary donations from the St. Matthew’s Church Ministry. Both examples use discourse to prompt very specific actions in the world. We show here how blending theory accounts for the mental operations necessary for readers to metamorphose into activists.
Keywords: conceptual blending, conceptual integration, mental spaces, discourse, usage-based data
Introduction
Flipping through a magazine, you come across a photograph of a martini glass against a blue satin background. The glass contains a clear liquid, an olive, and a car key in place of the swizzle stick. The caption reads, “Killer Cocktail”, and the message is clear. Though there is no explicit mention of either drinking or driving, this bizarre picture functions as a powerful argument against the combination of the two activities. Apparently, the picture of the martini is enough to activate the concept of drinking, the car key is sufficient to activate the concept of driving, and the array of image and caption serves to activate background knowledge about the dangers of drinking and driving. Comprehension of this simple public service message results largely from the processes of conceptual blending: a set of general cognitive processes used to combine conceptual structure in mental spaces (Fauconnier & Turner 1998). Mental spaces are very partial representations of the entities and relations of a particular scenario as perceived, imagined, remembered, or otherwise understood by a speaker (Fauconnier 1994). Blending takes place in a conceptual integration network, an array of mental spaces that typically includes at least two input spaces and a blended space. Input spaces represent information from discrete cognitive domains, and the blended space contains structure from both inputs, as well as its own emergent structure.
For example, in the killer cocktail blend, one input includes conceptual structure related to drinking alcoholic beverages, and the other input includes conceptual structure related to driving automobiles. The blended space gets partial projections from both inputs and can develop emergent structure of its own. The human agent behaves in such a way that the act of drinking alcoholic beverages impinges on the act of driving a car. Emergent structure arises out of the imaginative processes of blending. The first process is called composition, and involves the juxtaposition of information from different spaces, as in conjunction and role-filling. For example, in the killer cocktail blend, an element from the driving domain (the car key) has been composed with structure from the cocktail domain, such that it fills the swizzle stick role. Completion, as in pattern completion, occurs when part of a cognitive model is activated and results in the activation of the rest of the frame. In the killer cocktail blend, the martini frame activated by the picture is completed with a frame for drinking alcoholic beverages. Similarly, the car key leads to the activation of a frame for driving. Finally, elaboration is an extended version of completion that results from mental simulation, or various sorts of physical and social interaction with the world as construed with blended concepts. In this example, simulating the possible unfortunate effects of drunk driving constitutes the elaboration of the blend. We shall argue that acts of deliberation depend on this elaboration process. Below we analyze how blending is recruited in two examples of persuasive discourse: one a widely distributed email message urging recipients to vote for Democratic candidates in the 1998 U.S. congressional election; the other, a solicitation for monetary donations from the St. Matthew’s Church Ministry. Both examples use discourse to prompt very specific actions in the world. 
We show here how blending theory accounts for the mental operations necessary for readers to metamorphose into activists.
Voting
This section addresses blending in an email message sent by documentary filmmaker and political activist Michael Moore to left-wing, third-party American voters like Greens, Communists, and Socialists. The letter, dated October 8, 1998, urges its recipients to vote the Democratic ticket in the November 1998 midterm elections. Because the intended audience is unlikely to vote for Democratic candidates (and, indeed, in many cases, unlikely to vote at all), Moore’s letter is aimed at reconstruing the act of voting so that it is more consistent with the values and goals of political progressives. He does so by framing the act of voting as a “legal act of civil disobedience”, and, relatedly, as “sending Congress a message” to cease impeachment proceedings against U.S. President Bill Clinton. Moore begins his letter with the following proposal:
Dear Friends. . . Ok, I’ve had it. The right wing is trying to overturn a national election because. . . they didn’t like the results! This must be stopped. I would like to propose a legal act of civil disobedience that could send the Right into near oblivion.
With this Moore introduces the oxymoronic concept of a legal act of civil disobedience, prompting the reader to wonder both about what a legal act of civil disobedience might be, as well as what particular action Moore has in mind. Only later do we learn:
The act of civil disobedience I am calling for is for each and every American to go to the polls on November 3 and vote for the Democratic candidate for Congress on your ballot.
However, Moore does not advocate voting for Democrats because he supports their policies. Rather, he opposes the policies of their chief political adversaries, the Republicans.
Consequently, Moore’s first rhetorical goal is to counter the default interpretation of the act he advocates. Because voting Democrat usually signals support for Democratic policies, Moore makes several remarks that serve to distance himself from the Democrats. For example, Moore writes: “I am not a member of the Democratic party”; “To me they are a barely tolerable version of the Republicans”; “I did not vote for Clinton in 1996”; and even, “Yes, most Democrats suck”. Here, as in many places in the letter, Moore’s rhetoric is meant to appeal to the values and goals of his target audience. In particular, he is forced to contend with the implicit tension in being a participant in third-party politics while advocating a particular political action that inherently acknowledges its impotence in current American politics. By recruiting conceptual blending processes, Moore invites readers to construct models which allow them to maintain these incompatible goals. Below we analyze five distinct instances of blending that shape Moore’s argument.
Palatable candidates
For example, Moore begins his discussion of the 1996 Presidential election by bemoaning the absence of viable progressive candidates on the ballot. Recounting how he himself voted for Clinton in 1992, but not in 1996, Moore cites a list of Clinton’s policies that signalled an abandonment of liberal ideals. Nonetheless, Moore argues, Clinton was elected in a fair and democratic election and should be permitted to serve as President of the United States for the remainder of his second term. With the following excerpt, Moore presents his readers with a blend that acknowledges both the limited choice in American politics, and Clinton’s status as the legitimate winner of the election. Capitalizing on the entrenched mapping between ideas and food (see also Lakoff & Johnson 1980), Moore writes: . . .
the majority who could stomach that pathetic choice on the ballot went and voted for Bill Clinton. One input, perhaps structured by a model of ordering food in a restaurant, involves a scenario in which the agent imagines the palatability of menu items and makes her decision on this basis. The other input contains a model of voting in which citizens evaluate the political platforms of candidates on the ballot. In the blend, we are invited to imagine citizens evaluating the ballot in the way one might evaluate a menu, such that candidates are chosen based on how tasty their ideas are. On this construal, people who don’t vote correspond to people who will not eat in a particular restaurant because they don’t like the menu. However, note that in the restaurant case, the diner doesn’t typically know the details of the menu until after he has been seated. But, because the contents of the ballot are widely publicized ahead of time, people like Moore can actually avoid the polling booth if they don’t like the list of candidates. So, rather than relying on prototypical domain knowledge, the stomach blend recruits a slightly less prototypical model, which better matches the topic input. The restaurant space is thus structured by a model in which both the contents of the menu and the taste of the food are so well-known that people might well use this knowledge to choose whether or not to dine there. In America, the menu at a place like Denny’s or McDonald’s might serve as a potential counterpart for the ballot in Moore’s blend. As noted above, this blend capitalizes on entrenched mappings between ideas and food, exemplified in sentences such as “I devour books”, and “She won’t swallow your proposal”. Indeed, the use of the verb “stomach” to refer to tolerance for unpleasant things is entrenched enough to be listed in many dictionaries. 
As argued in Coulson and Oakley (2005), conceptual blending is often involved in conventional metaphoric expressions, although the mappings are not elaborated in the same way they are in more spectacular blends (such as the “Killer Cocktail” blend discussed in the introduction). For both novel and entrenched metaphors, conceptual structure from both input domains is activated as well as the structure in the blended space. But, because entrenchment often leads to automaticity, the mappings in conventional metaphors are established via an automatic process of retrieval rather than via analogical reasoning, and the “emergent” inferences can simply be retrieved rather than being actively computed. Depending on their linguistic experience, readers differ in the extent to which they utilize retrieval over more effortful interpretive strategies, and differ in their awareness of the different domains activated by a blend such as the palatable candidates one discussed here. The domain of food consumption implicitly evoked by the verb “stomach” is available for integration with concepts from the domain of political choice evoked by “ballot” and “voted”. While the food domain is likely to be more salient for some speakers than others, the visceral sense of being nauseated by the candidates is what makes this text potentially compelling. The rhetorical efficacy of the text, then, depends in part on the reader’s willingness to construct the blend.
Stinky candidates
In suggesting that readers “hold their nose” while voting, Moore again evokes the unpalatable candidate blend while simultaneously signalling his sympathy with third-party politics. He writes:
If you want Congress to stop this witch hunt, if you want Congress to start focussing on the real problems facing this country and the world . . . get out and vote November 3. Hold your nose if you have to.
Since the writer and his audience dislike the policies of Democrats as well as Republicans, Moore must frame the act of voting with the proper “attitude”. Thus Moore’s ‘hold your nose while voting’ blend is aimed at describing the manner of the prescribed action. The inputs to this blend include voting, and holding one’s nose while acting. The act of voting entails going to a designated space and making a choice among several candidates. Holding one’s nose while acting calls up a different frame, that of completing an unpleasant task. Consistent with the unpalatable candidate blend discussed above, one might hold one’s nose while eating something that tastes bad. Similarly, one might hold one’s nose while doing a task that involves a foul stench, such as changing a diaper, cleaning a toilet, or taking out the trash. Composing voting and holding one’s nose results in framing the act of voting as an unpleasant but necessary chore, much like some of the tasks mentioned above. Moreover, the entrenched meaning of the ‘stinks’ metaphor allows speakers to understand the text as acknowledging the limited political options available to progressive voters. The distinct nature of these acts emerges when one considers that the ‘holding your nose while voting’ blend produces inferences not usually attributable to either voting proper or to unpleasant stench-ridden tasks. In voting, one makes a choice among several possibilities, some more desirable than others. By contrast, if one’s task is to change a baby’s diaper, one does not normally go into a room and make a choice about whose diaper to change. Nor does one choose the lesser of two stinky diapers. In the blend, however, the voter is performing an unpleasant task in a stench-ridden environment, and that task is to choose the thing that stinks the least.
Thus the voter should choose Democratic candidates because they stink less than the Republican candidates.
Public conversation
After noting that Bill Clinton won the 1996 Presidential election, Moore continues:
That was the will of the people. And that is the will the Republicans are trying to subvert.
In the passage above (which precedes the actual proposal), Moore frames his as yet undefined act of civil disobedience as preventing the Republicans (construed as a unified entity) from subverting the will of the people (also construed as unified). Thus Moore advocates neither Democratic congressional candidates, nor their party leader President Clinton. Rather, he advocates the “will of the people”. Though he hasn’t yet revealed how the Republicans are trying to subvert the will of the people, we know that it has to do with Clinton being elected President in a fair and democratic election, and that the Republicans did not like the results. Immediately after his discussion of Clinton’s (re)election in 1996, Moore moves to the related, but non-identical, issue of impeachment proceedings:
All the public opinion polls – New York Times, Wall Street Journal, CNN – have said the same thing over and over: The American public does not want impeachment. Yet, Congress has decided to tell the public to take a flying %$#@& and has moved ahead with the impeachment process anyway.
Although it is easy to construe impeachment as tantamount to overturning an election, each is a distinct concept. Strictly speaking, impeachment involves accusing a public official of high crimes; and while this may result in removing the accused official from office, it need not. Overturning an election, on the other hand, usually occurs when there is evidence that the voting process was unfair. But, because both can result in removal of an official from office, it is easy to set up cross-space mappings between the two concepts.
Moore’s task is also supported by models set up earlier in the letter: because Clinton’s 1996 election has been construed as the will of the people, impeachment (and removal from office) is subverting that will. Thus Moore relies on conceptual integration to construct a simplified model of the relationship between electoral politics, political ideology, and the impeachment proceedings against Bill Clinton. First, public opinion polls are personified in a metonymic way so that the American public can speak with one voice. For example, the reader is invited to blend the results of various opinion polls (NYT, WSJ, CNN) with statements uttered by individual citizens. In the larger picture, the story of a conversation between individual people, or representatives of different groups, is being blended with the more abstract communication (or miscommunication) between politicians and citizens. Moore’s blend exemplifies a key phenomenon in conceptual integration theory: compression to achieve human scale. Compression is a tendency for objects from multiple related spaces to be represented in a single blended space (Fauconnier & Turner 2002). For example, the same person can be viewed in different stages of his life, as in a cartoon where the former basketball star Michael Jordan plays a game against himself at an earlier stage in his career (see Coulson 2003). Fauconnier and Turner discuss many different sorts of compression, and note that this phenomenon often allows us to represent abstract concepts with more familiar frames. In Moore’s example, the opinions of many different people in the opinion poll are mapped onto a single person in the blend so as to facilitate the application of the “human scale” conversation frame in the blended space.
For the most part, Moore’s blends are quite standard: the construal of polls as the voice of the people, election results as the will of the people, and Clinton’s impeachment as the subversion of the will of the people were all publicly available at the time he composed the letter. However, his description of Congress members telling their constituents to “take a flying %$#@&” represents a novel extension. There is, of course, no actual town meeting in which Congress members hurl expletives at their constituents. Rather, Moore prompts the reader to construe two independent sets of occurrences – one involving the release of opinion polls which reveal public opposition to impeachment; and the other, the decision by the House Judiciary Committee to proceed with impeachment – as an integrated event scenario. The compression here is used to construct a conversational frame with potential motivational properties. Moore’s blend has desirable rhetorical characteristics from both a cognitive and an affective standpoint. Cognitively, the event integration simplifies reasoning about a complex series of events. Moreover, the integration of the construal of the political process with that of an interpersonal argument invites the reader to complete the blend with knowledge from her own argumentative experiences. Because Congress has already proceeded against the will of the public, Congress maps onto the winner of the argument, and the reader (who also corresponds to the public) maps onto the loser. If the reader truly integrates knowledge about the political process with her own personal experience with losing arguments, it can evoke the sorts of emotions that accompany the latter. This, in turn, helps motivate the revenge frames that support Moore’s ultimate call to action. 
Sending a message
Having framed the political act of impeachment as a defiant act of disobedience on the part of Congress, Moore invokes a salient counterfactual in which the House Judiciary Committee behaves in a manner more consistent with the ‘message’ in the polls. In fact, Moore later draws on this scenario in his attempts to convince people to vote. Voting is framed as a poll that Congress will listen to. He writes:
The act of civil disobedience I am calling for is for each and every American to go to the polls on November 3 and vote for the Democratic candidate for Congress on your ballot. That’s right, my fellow cynics and progressives – the only way to send a true message to the right wing is to throw every Republican out of office.
Here he capitalizes on a mapping between polling and voting. In both models, individual members of the public express their opinions and the results are tabulated in order to express collective opinion. And, while both influence the political sphere of events, only voting has explicit political consequences. Winning an election is constitutive of assuming a political role in a way that favourable poll results are not. Moore elaborates on the public conversation blend by scripting what the citizenry should “say” in reply to Congress’ recent actions, thus framing voting for Democrats as the citizenry’s turn in conversation:
Imagine if the Democrats are voted in by overwhelming numbers (when all the pundits are predicting a Republican landslide). The message would be loud and clear to all these new Democrats – the American public wants the agenda of the (so-called) Christian right removed from the halls of our United States Congress!
Here Moore describes the message as being “loud and clear”, adjectives appropriate for verbal communication, but not for the abstract information presumably conveyed by the results of an election.
Their use here is licensed by a conceptual blend between voting and speaking. Pascual (2002) suggests that due to the centrality of talk in human social life, many situations that involve information exchange – from perception to abstract instances of communication – are metaphorically construed as verbal communication (see also Turner 2002). Pascual calls this phenomenon fictive interaction and shows how the blend is common in rhetorical situations that occur in the courtroom. As noted in our discussion of the Public Conversation blend, fictive interaction can be seen as an attempt to construe abstract situations with more motivating “human scale” frames. Moore’s blend between voting and speaking is facilitated by their shared frame structure as communicative acts. In the conceptual integration network, these commonalities are represented in a generic space that contains a communicating agent, a communicative action, a message, and a recipient. In the blend, voting sends a message, which (unlike the vote in the politics space) is audible. Interestingly, the number of votes maps onto the loudness of the reply in an adversarial conversation. Moreover, as in a conversation, the louder the message the more conviction we attribute to the speaker. Moore suggests that if enough readers follow his advice, the message will be so forceful as to end the public debate. Framed this way, Moore can assert another consequence of speaking: the end of the right wing’s political agenda. Interpretation is supported by the configuration of mental spaces needed to represent the complex conditional in this excerpt. Besides embedding the counterfactual Democratic landslide in a scenario that includes the prediction of a Republican landslide, the excerpt above sets up two sorts of contingencies dubbed content-level and epistemic-level by Sweetser (1990, 1996).
At the content level, the antecedent is the Democrats being elected (in the case where pundits predict a Republican landslide), and is (in some sense) causally related to the consequent space where the message is clear. At the epistemic level, the antecedent remains the same, and the epistemic consequent is that people oppose the Republicans. Thus the election of Democrats licenses the inference that voters oppose the Republicans. Interestingly, given the structure Moore has set up, a Democratic victory will be interpreted quite differently from a Republican victory. Because votes are generally interpreted as an endorsement of the elected candidates’ policies, a Republican victory would presumably be interpreted as support for the right-wing agenda. However, by this point, Moore has clearly framed voting for a Democrat as voting against the right-wing policies embodied by Republican candidates. Indeed, Moore even goes so far as to propose that the act of voting for a Democrat is an act of civil disobedience.
Legal act of civil disobedience
In many ways, Moore’s portrayal of voting as an act of civil disobedience is the most striking aspect of the piece. Civil disobedience, by its very definition, involves the violation of the law. In contrast, voting is not only legal, but strongly encouraged by law. However, by recruiting peripheral aspects of structure from the concept of civil disobedience, and blending it with structure in his own ‘sending a message’ blend, Moore directs his readers to integrate two concepts that appear to be contradictory. First, Moore relies on the fact that the concept of civil disobedience is itself a blend between spaces which detail two different components of law: the moral justification for law; and the workings of the law.
In the former space, which we might call the Spirit of the Law, is a construal of the law as being enacted to promote the common good. In the latter space, which we might dub the Letter of the Law, an act of disobedience is defined as an act that violates the law. The blended space composes the act of disobedience with the justification for law. Civil disobedience is thus an act that violates the law to promote the common good. Elaborating this blend produces the inference that the law in question is unjust, and that acts of civil disobedience are meant to bring public attention to the unjustness of the law. Further, just as acts of civil disobedience are aimed at sending a message that the law is unjust and should be repealed, Moore suggests that his prescribed action is aimed at sending the message that the impeachment proceedings (and, indeed, right-wing policies more generally construed) are unjust and should be stopped. Thus Moore’s legal act of civil disobedience represents a keying of emergent structure in the more standard concept of civil disobedience. In short, what is a violation of the law in the civil disobedience space corresponds to a violation of a general principle not to vote for either Democrats or Republicans in the progressive politics space. In this way, the legal act of voting has been construed as an act of civil disobedience in the blend. Rather than doing something illegal for the greater good, Moore suggests his readers do something politically distasteful. Further, by capitalizing on the parallels he has set up between disobeying an unjust law and signalling disagreement with unjust Republican policies, Moore is able to appeal to an ethic – that of civil disobedience – that is likely to arouse a sympathetic response in his target audience of disgruntled progressives.
Summary
This section has shown how blending can be used to compress and combine a number of simplified models in order to form integrated event scenarios. Among other things, Moore’s blends frame voting as speaking in a larger political argument, voting as an unpleasant but necessary task, and voting as a form of protest. As discussed above, the correspondences between domains are animated in the blend to produce emergent structure. Although analyzable, it is their emergence as blends that makes them potentially persuasive. Thus the success or failure of Moore’s letter does not simply depend on being able to establish the appropriate mappings – for example, understanding the intended correspondences between personal dialogue and the political process. The mappings are necessary, but not sufficient for persuasion. The rhetorical efficacy of the text depends on the reader’s willingness to integrate and elaborate the models in a way that yields the desired emergent structure and affective responses. The result of blending in these cases is to encourage readers to construe events with cognitive models that are both easily understood and appropriately motivating. Moore’s letter is a call for a particular action from readers which has been successively framed and reframed so as to make it palatable to its intended audience. The persuasive element of the letter is not aimed at changing the reader’s goals, but changing her construal of one particular action – that of voting for a Democrat – so that it is consistent with presumably extant goals. These observations are consistent with other research on argumentative discourse that suggests people attempt to exploit conceptual blending to reframe a particular scenario, but not to restructure their opponents’ value systems (Coulson 2001).
Purple point of contact
This section concerns an elaborate invitation to support a church group which one of the authors actually received via the U.S. postal service. It is a very complicated message that includes a letter, a ‘prayer page’ to send with donations, a return envelope for the prayer page, and a purple sealed envelope bearing a message from Jesus Christ. The letter urges its recipient to perform a number of concrete actions in order to show her faith, and be blessed by Jesus. In particular, the reader is instructed to:
1. Place the purple sealed envelope under his or her pillow
2. Sleep on this “purple point of contact just like the children of Israel did when God instructed them to do so (Numbers 15: 38, 39)”
3. Mail back the prayer page with a donation to the Ministry
4. Open the purple sealed envelope to receive the “purple point of contact blessing”.
This package is a rich piece of persuasion, the success of which depends on the reader’s willingness to construct a number of blends outlined below. In particular, we focus on blending involved in the metaphoric construal of making a donation as sowing a seed, and on how the reader is invited to construe her own actions as fulfilling the purple point of contact. Analysis points to an important role for blending in understanding commonalities between performative aspects of language and the social construction of reality. In performative language (as when a justice of the peace pronounces a couple “man and wife”), and ritual (as when parents in a particular Italian village carry their child up a set of stairs to ensure his success in life), actions in one space, or domain, serve to effect changes in another (Sweetser 1998, 2000). However, performativity only occurs when the scenario fulfils particular sociocultural conditions that license conceptual integration. For example, the metaphoric significance of the act of carrying the child up the stairs is confined to the execution of the ritual. Though entrenched connections between vertical
Though entrenched connections between vertical ascent and success are always available, the everyday act of taking the child upstairs to bed is not construed as contributing to the child’s success in life. The import of the action in the course of the ritual thus stems from its status as an entrenched blend in which the action and the metaphor have been integrated such that the physical actions are construed as causing metaphoric effects.

In the case of “I now pronounce you man and wife”, the utterance is fully integrated with the marriage frame only when it is uttered by an individual with the proper social authority (a judge, a minister, a priest, etc.), preceded by the appropriate sequence of utterances, and, perhaps, followed by a kiss. Similarly, in the purple point of contact letter, the solicitation succeeds only if the reader believes that her actions of putting the purple envelope under her pillow and mailing in the donation will result in a blessing. In the case of marriage, the integration is licensed (largely) by the social authority of the utterer. In the present case, the integration is licensed by the extent to which the St. Matthew’s Church Ministry is construed as acting with the authority of God. Consequently, much of the text of the letter is aimed at establishing the religious legitimacy of the Ministry, framing the act of donation as an act of faith, and constructing a blend in which the act of donation (and fulfilling the other instructions contained in the letter) can be conceived of as causally connected to the receipt of the blessing.

Let’s have church here in your home

A number of aspects of the letter seem to be aimed at promoting the religious authority of St. Matthew’s Church Ministry, and the construal of reading the letter and following its instructions as religious acts. For example, the fact that the organization’s name (“St.
Matthew’s Church Ministry”) contains the words “church”, “ministry”, and the name of a New Testament saint suggests a legitimate connection to Christianity. The letter is peppered with quotations from the Bible and accompanying citations of chapter and verse. Moreover, on the first page of the letter we find the following invocation:

Our dearly beloved in Christ, turn to page two and let’s have church here in your home.

The reader is thus invited to integrate her activity of reading the letter with her conception of attending church. Normally, reading a letter (particularly a solicitation from an unknown organization) is construed as a secular, and, often private, activity. Moreover, attending church involves leaving one’s home to go to a place of worship with others in a public space. Aspects of each input domain are selectively projected into the blend, so that reading the letter is construed as a religious activity, and the church service is construed as occurring in the home. The letter-church blend is helped along by strategic modes of address (e.g., “our dearly beloved in Christ”) that one might expect to hear at a religious ceremony.

In this blend, the minister does not speak to the congregation from the pulpit. Rather, the Ministry communicates with the reader via the letter. Constructing the blend thus involves establishing cross-space mappings between the Minister in a church and the writers of the letter (viz. the St. Matthew’s Church Ministry), and between the members of a congregation and the reader of the letter. In turn, completion from background knowledge about church yields inferences about the relationship between the reader and the writers of the letter. In particular, the letter writers in the blend are construed as possessing a Minister’s knowledge and wisdom, as well as his moral authority over his Congregation.
Testimony

One of the interesting facets of this communication is the extent to which it functions generically as a blend between an epistle and a chain letter, where the reader is entreated to send some small amount of money to various people on a list, with the expectation that it will lead to exponential returns when subsequent recipients send money to the reader. In the purple point of contact letter, we learn almost from the outset that the blessing God will give us for fulfilling the instructions in the letter has a distinct financial component. The letter starts with the following testimony from a woman named Priscilla:

I was a sinner and drank real heavy and had a lot on my mind. I remember some of the scriptures that you had written to me and . . . I felt God speaking to my heart saying, “My daughter, your sins are forgiven.” I felt so good inside, for I knew God had saved my [soul]. Rev., I haven’t drank another drop from that day. I wrote you a letter and joined the Gold Book [Seed Harvest Prosperity] Plan, and it seemed like heaven just opened up my life. I didn’t have transportation, but now since I have been a member of the . . . Plan God has really been blessing [me]. I have a new Ford and Cadillac. Not only that, but I have never been broke.

Note that the persuasive character of this testimony depends crucially on the congruity of the reader’s worldview and that advocated by the St. Matthew’s Church Ministry. For example, the writer presumes that the biblical faith is a part, or, at least, a potential part of the reader’s construal of reality. In other words, the writer presumes that the reader believes in God, as well as in the divinity of Jesus. Rhetoricians have argued that all arguments ultimately rest on shared facts, beliefs, presumptions, and values, which they call ‘objects of agreement’ (Perelman & Olbrechts-Tyteca 1969).
If the reader does not share the presumption of religious faith, and appreciate the value of the proposed blessing, persuasion will simply not occur.

Given these objects of agreement, Priscilla’s testimony is aimed at promoting a conception of God as an entity willing to grant monetary favours. Moreover, readers are invited to map sister Priscilla’s speedy transformation from a poor sinner to a prosperous disciple onto our own case – provided, of course, that we are willing to see ourselves as downtrodden sinners. In Perelmanian terms, this is also an object of agreement, as we will not do what the letter bids unless we see ourselves as sinners who might potentially benefit from the blessing.

The inputs to the blend involve two sets of spaces: one set to represent the scenario described by Priscilla (first, a troubled past; second, joining the plan; and finally, the resolution of her problems), and another set to represent the reader’s own troubled present and desired future. The blend inherits its causal structure from the Priscilla domain, and its elements from the reader’s domain. Thus the reader imagines herself joining the plan, and construes this act as causally mediating a transformation from her own troubled present to her own desired future. Persuasion, then, depends on both sharing the objects of agreement that enable the reader to believe Priscilla’s story, and the reader’s willingness to blend her own situation with aspects of Priscilla’s.

Sowing the seed of $5, $10, or $20

The letter repeatedly appeals to a metaphoric construal of making a monetary donation as sowing a seed. For example, towards the end of the letter proper, that is, the part of the letter addressed to the reader (rather than the part of the letter addressed directly to the Lord), we read:

We believe you are going to sow a seed so God can bless you with a harvest.
God said, “Give and it shall be given unto you . . .” Luke 6:38. We pray that you will sow $5.00, $10.00, $20.00, or more. Let God lead you. Our prayer is that, by faith, what you sow will start being returned to you before the seventh day of next month, as God sees fit. He knows best how and when to let it begin. Let us pray over this last page and purple sealed word. Let us bow our heads in prayer – shall we? [all emphasis in the original]

Broadly, sowing a seed maps onto sending a donation, and the harvest maps onto the money that the sender receives in return. Mappings in the network are set up by a conventional metaphoric connection between agriculture and investment, which maps the metamorphosis of a seed into crops for harvest onto the difference between the initial investment and its return. The inputs to the seed-sowing blend thus include one space we might call the Agriculture space, and another we might call the Material space. The mapping between the seed and the money is cued explicitly by the statement, “We pray that you will sow $5.00, $10.00 . . .” in which the object of “sow” is not a type of seed (as in the agriculture input), but a unit of currency (that originates in the material input). Linguistic prompts also help the reader identify the mapping between the harvest and the monetary returns, in “Our prayer is that, by faith, what you sow will start being returned to you . . . [emphasis ours]” Since the letter reader will presumably sow money, she can expect money to be returned to her.

The structure in the blend differs from conventional conceptions of agriculture in several ways, especially in its recruitment of structure from a third input which we might dub the Spiritual space.
For example, on the prayer page, which the reader sends in with her donation, is written, “I am sowing [followed by a list of potential dollar amounts] as my seed unto the Lord, in faith”. Thus unlike real seeds, the seed of $5 is not planted in the earth; and, unlike a conventional investment, it has not been used for its purchasing power. The example here involves a prototypical case of conceptual integration in which the blended concept involves partial structure from each of its inputs as well as novel structure of its own. In the context of the blend, the $5 has some of the properties of conventional money (it can be used to buy things) and some of the properties of a seed (it will undergo a transformation).

Further, unlike most agricultural endeavours, the relationship between the initial sowing of the seed and the final harvest is not mediated by farming activity. In contrast to default knowledge about managing investments, the transformation from seed to harvest here occurs “by faith”. Because it is a seed of faith, the coming harvest depends on receiving a blessing from the Lord. Moreover, receiving the blessing depends in turn on following the instructions to achieve the purple point of contact: mailing in the donation, sleeping on the purple envelope, and opening the purple envelope after sunset on the following day.

The purple envelope please

Inside the envelope is an image of Jesus from religious art, His hand raised in a generic blessing gesture. At the top of the picture is a quote from the New Testament, “. . . If two of you shall agree . . . it shall be done . . .” Matthew 18:19. At the bottom of the picture, the caption reads “Jesus, my letter is in the mail on its way to the people of God who will pray over it for me.” But perhaps most striking, is that this text is divided by a line drawing of a woman’s hand, holding a letter up towards Jesus – as if for Him to bless it.
The image prompts the reader to unpack the blend (i.e., reconstitute the roles, relations, and inferences of each input space), mapping the picture of Jesus onto the saviour, the unidentified hand onto the reader, and the envelope onto the one the reader presumably mailed to the St. Matthew’s Church Ministry. Importantly, in the picture, although the reader holds the envelope in her hand, the stamp on the envelope has already been cancelled. This suggests it is no longer in the reader’s actual possession, but is being processed by the postal system. Thus in one input (derived from the original piece of religious art), Jesus issues a generic blessing with no specific target. In the other input, metonymically evoked by the envelope with the cancelled stamp, the reader sends in her prayer page with donation.

The information represented in the two input spaces constitutes two separate events, which need not be construed as integrated. People mail letters every day and rarely consider the spiritual implications of such an act. Similarly, Jesus can be construed as blessing any number of objects and actions in the world, with no preference given to the transactions of the U.S. postal system. However, given the background knowledge set up by excerpts such as “Lord, keep Your eyes upon this very envelope until. . . it is returned back to this little 47 year old church ministry. Lord, bless this dear one as they open this purple Sealed Word after sunset and after they have mailed their prayer page back to us”, the visual image prompts the reader to construe the disparate input spaces as a unified event structure. Jesus blesses the prayer page as it passes through the postal system, and blesses its sender as she opens the sealed purple envelope. The picture epitomizes the set of actions, reinforcing the spiritual import of her donation.
It is, in fact, a rhetorical technique Aristotle termed energia or bringing-before-the-eyes (Aristotle 1994), in which the reader witnesses in the present all that is supposed to have occurred up to this point. Energia is an example of compression in which structure from a number of spaces, each representing events occurring at different points in time, is integrated into a single scene in the blended space. Moore, in his description of the argument between the American people (as expressed in the polls) and Congress (as expressed by their impeachment of Clinton), exploits compression in a similar way to construe a complex scenario with a single frame that evokes emotions and other associations consistent with his rhetorical goals.

Summary

The desired rhetorical effect of this letter depends on the existence of systematic correspondences between the three input spaces displayed in Table 1. Besides conventional agricultural metaphors for investment (e.g. investments that grow), the letter authors are exploiting conventional agricultural metaphors for spirituality (e.g. spiritual growth). The former play into the readers’ greed, while the latter are reminiscent of the Bible and bolster the legitimacy of the St. Matthew’s Church Ministry. The integration of these three domains results in a scenario where the reader can satisfy her greed in a virtuous way. Thus the inputs in this blend are being exploited not only for their inferential possibilities, but also for their sociocultural significance.

Table 1. 3-input blend

Material                          Spiritual             Agriculture
Reading Letter                    Attending Church
Mail Prayer Page w/ $5 donation   Make Offering         Sow Seed
Sleep on Purple Envelope          Commit Act of Faith   Cultivate Seed
Receive Money                     Receive Blessing      Reap Harvest
Further, while the letter clearly establishes the mappings between sending the money, sowing a seed, and making an offering to God, establishing the blend goes further. Without the blend (or, at least, without some sort of a blend), there is no way that anyone would believe that sending off $5, $10, or $20 could ever result in a new car. Similarly, the reader will not carry around the purple envelope or sleep on it unless she or he believes the action will have the spiritual and/or the monetary results implied in the blend. So, to reiterate, anyone who performs the actions described in the letter will do so because they have adopted the blend in which mailing $5 is sowing a faith seed, sleeping on the envelope is an act of faith, and the ultimate result of these actions will be a monetary blessing from God. Moreover, the difference between someone who does and someone who does not carry out the instructions has little to do with the mappings (presumably anyone can figure out what one is supposed to do and why), and everything to do with integrating and elaborating the structure in the blend until it becomes a motivating frame.

Conclusions

Deliberative rhetoric is the primary means of getting human beings to think and act according to the expectations of others without recourse to violent coercion. We have suggested that, as an interpretive model capable of describing the strategic and tactical ways human beings frame situations, conceptual integration theory provides a means of addressing this fundamental area of human cognition. Moreover, in the analyses above we have attempted to demonstrate the importance of blending for understanding specific, attested instances of human deliberation.

In sum, deliberation recruits elaboration as blends animate mappings in a way that makes them compelling. Because persuasion depends crucially on objects of agreement, rhetorical blends are aimed at promoting the perception of this agreement.
Thus, Moore does not recruit the stomach blend because of a preponderance of shared relational structure in our understandings of the choice of political candidates and the choice of what to eat for lunch. Nor does he make reference to holding one’s nose while voting purely because of its analogical potential. These blends were recruited because of the way they frame the topic space of American politics for a disenchanted third-party citizen. Such a citizen may discard a political letter couched in language designed to appeal to a mainstream voter, but be willing to consider a plea which establishes initial agreement between writer and reader that both consider Democratic candidates to be too conservative.

Conceptual blending is used to integrate concepts with different affective valences, often so that the desired course of action is seen as consistent with the audience’s value system. Further, compression is used to simplify complex causal relationships, both so they can be more readily understood, and so that they can be construed with motivational “human scale” frames. This suggests our concepts have abstract, inferential, as well as affective and motivational properties. Moreover, neither is set in stone as speakers frequently employ conceptual blending processes to reconstrue a particular action to alter its inferential, affective, sociocultural, and even spiritual significance.

We have also seen that the binding force of blends-we-act-on depends as much on the ontology supported by our cultural values and practices as on the structural correspondences between the representations in the different domains. For example, we have argued that the possibility of interpreting polling data as the voice of the people depends on our cognitive capacity for conceptual integration.
But so, too, does the possibility of construing the beliefs of the 270-odd million American citizens as the will of a unified American people depend on the existence of polling practices, voting practices, and standard procedures for interpreting the results. Relatedly, the success of rhetorical efforts to reify a blend like sowing a faith seed will depend in a complex way on the character of their appeal to social roles and previously established cultural practices. While conceptual integration does indeed account for the mental operations necessary to incite action, these examples suggest that the roots of action extend beyond the individual’s nervous system as conceptual blends are intimately intertwined with human doings.

Acknowledgments

Seana Coulson was supported by National Research Service Award DC00355. Thanks also to Mark Turner and Cyma Van Petten for comments on an earlier draft of this chapter.

References

Aristotle (1994). On Rhetoric: A Theory of Civil Discourse. Book III. (Translated by George Kennedy). Oxford and New York: Oxford University Press.
Coulson, Seana (2001). Semantic Leaps: Frame-shifting and Conceptual Blending in Meaning Construction. Cambridge and New York: Cambridge University Press.
Coulson, Seana (2003). Reasoning and rhetoric: Conceptual blending in political and religious rhetoric. In Elzbieta Oleksy & Barbara Lewandowska-Tomaszczyk (Eds.), Research and Scholarship in Integration Processes (pp. 59–88). Lodz, Poland: Lodz University Press.
Coulson, Seana & Todd Oakley (2005). Blending and Coded Meaning: Literal and Figurative Meaning in Cognitive Semantics. Journal of Pragmatics, 37, 1510–1511.
Fauconnier, Gilles (1994). Mental Spaces. Cambridge and New York: Cambridge University Press.
Fauconnier, G. & Mark Turner (1998). Conceptual Integration Networks. Cognitive Science, 22, 133–187.
Fauconnier, G. & Mark Turner (2002). The Way We Think. New York: Basic Books.
Lakoff, G. & Mark Johnson (1980). Metaphors We Live By. Chicago: U. Chicago Press.
Pascual, Esther (2002). Imaginary Trialogues: Conceptual Blending and Fictive Interaction in Criminal Courts. Utrecht, Netherlands: LOT.
Perelman, C. & L. Olbrechts-Tyteca (1969). The New Rhetoric: A Treatise on Argumentation. Notre Dame & London: University of Notre Dame Press.
Sweetser, Eve (1990). From Etymology to Pragmatics. Cambridge: Cambridge U. Press.
Sweetser, Eve (1996). Spaces, Worlds, and Grammar. In Gilles Fauconnier & Eve Sweetser (Eds., pp. 318–333). Cambridge and New York: Cambridge University Press.
Sweetser, Eve (1998). Performativity and blended spaces. Paper presented at the 4th conference on Conceptual Structure, Discourse, and Language, Atlanta, GA.
Sweetser, Eve (2000). Blended Spaces and Performativity. Cognitive Linguistics, 3(4), 305–334.
Turner, Mark (2002). The Cognitive Study of Art, Language and Literature. Poetics Today, 23, 9–20.

Depicting fictive motion in drawings*

Teenie Matlock
Stanford University

This chapter examines fictive motion sentences such as The road goes along the coast. These constructions, which contain a motion verb but describe no motion, have been argued to involve dynamic construal, whereby motion or scanning occurs along a path or linear object (Langacker 1986; Talmy 1983, 1996). Three experimental studies tested this idea with novel drawing tasks aimed at the underlying conceptual structure of these constructions, especially the trajector (e.g., road). Participants drew pictures to demonstrate their understanding of fictive motion sentences. They drew longer trajectors when conceptualizing fictive motion sentences versus comparable non-fictive motion sentences (e.g., The road is next to the coast), and longer trajectors when conceptualizing fictive motion sentences with fast verbs (e.g., race) versus slow verbs (e.g., creep).
Together, the results suggest that fictive motion sentences include dynamic construal as mentally simulated motion or linear extension.

Keywords: fictive motion, spatial language, motion verbs, psycholinguistics, mental imagery

Introduction

Motion verbs are pervasive. Found in all languages and all levels of discourse (Miller 1972; Miller & Johnson-Laird 1976), they are highly polysemous, affording a range of interpretations and occurring in a wide variety of grammatical constructions. When interpreted literally, motion verbs express movement along a trajectory, as in Bob goes down the walkway and The stray cat runs across the alley. In such cases, the subject noun phrase referent (e.g., Bob) is animate and capable of traveling through space. When interpreted figuratively, motion verbs often express no physically perceivable movement, as in Weekends go by fast and The tone went from morose to ecstatic. In these cases, motion information metaphorically maps onto relatively abstract conceptual domains, such as change and time, and spatial information is transformed or backgrounded (see Boroditsky 2000; Lakoff 1987; Lakoff & Johnson 1980, 1999; Radden 1997, for discussion of CHANGE IS MOTION, TIME IS SPACE, and related conceptual metaphors).

Another pervasive figurative motion verb use is shown in (1a) and (1b). It too describes a static scene, but in this case, spatial information is highlighted, especially spatial information relating to the trajector (here, subject noun phrase).

(1) a. The road goes along the coast
    b. A lake runs between the golf course and the train tracks

In (1a), the trajector (road) is close to and parallel with a landmark (coastline). In (1b), it extends between two landmarks (golf course and train tracks). In both, the trajector is linear, occupying a relatively long space.
Though the construction shown in (1a) and (1b) is ubiquitous in everyday language and has received considerable attention in cognitive linguistics, its conceptual structure is not yet well understood. The goal of this chapter is to gain a better understanding of the representation underlying these figurative uses of motion verbs. First, I provide an overview of relevant cognitive linguistic research, including discussion of fictive motion. Then, I discuss the results of three novel drawing tasks designed to investigate the way these constructions are conceptualized and in turn externally represented.

A commonly held assumption among cognitive linguists is that some linguistic forms and constructions tacitly include fictive motion, mentally simulated motion that transpires from one part of a scene to another (see Talmy 1996, 2000).1 On this view, upon hearing a spatial description such as The road goes along the coast the listener “moves” along some portion of a road, and upon hearing a sentence such as The lake runs between the golf course and the train tracks the listener “scans” a lake. Fictive motion is thought to be analogous in some respects to real motion in that it takes time to “go” from one imagined point in space and time to another. It is also believed to provide language users a way to compute information about the layout of the scene, especially the configuration of the trajector and its position relative to other entities (Matsumoto 1996). For instance, A table runs along the wall immediately signals that a table is adjacent to the wall and not simply in the proximity of the wall. Fictive motion is also thought to be subjectively experienced in that the language user enacts “motion” in the absence of an explicitly coded animate agent (see Langacker 1986).

Fictive motion is not limited to constructions with motion verbs.
It is present in a broad range of spatial expressions, including sentences such as There’s a cottage every now and then in the woods, evoking “movement” along a line of cottages (see Talmy 2000), or Ed is across the room from John, which involves “scanning” from Ed to John. Fictive motion is subsumed under virtual motion, which covers a broad range of dynamic construal, including temporal scanning, such as the “replay” of events in the historical present (see Langacker 1999).

Linguistic observations provide some insights into the conceptual structure of fictive motion sentences (also referred to as FM sentences) such as (1a) and (1b). One observation concerns tense and aspect. FM sentences often appear in the simple present tense, as shown in (1a), but not in the progressive, as exemplified in ??The road is running along the coast. Because FM sentences “already” express an on-going situation with an implicit state change (scanning from one point on the road to another), there is no need to make them more “on-going” by imposing progressive aspect (see Langacker 1987, 2000). (Note that this utterance would be fine with sufficient context. For instance, Person A asks Person B about the status of a new road, and Person B, who works on the road crew, responds with The road is running along the coast, highlighting the evolving, changing state of the road.) A second observation is that temporal modifiers often occur with FM sentences, as in The road goes along the coast for two hours. The same phrase could also indicate how long it took to actually move along the coast, as in Bob drove along the coast for two hours. A third observation is that directional phrases often occur with FM sentences, as in The road goes north or The road goes left. The same phrases describe direction of actual movement, as in The train goes north or The taxi turned left.
Such linguistic observations are informative and useful, but conducting experiments can lead to deeper insights into language representation, comprehension, and use (see Gibbs 1991). Doing on-line experiments is one way to investigate conceptual structure, including that of FM sentences. In one project I did a series of decision-time experiments that tested how long it took participants to read and make decisions about FM sentences in a variety of contexts (Matlock 2004). The rationale was that if people simulate motion or visual scanning while attempting to understand fictive motion language, it should be possible to manipulate that simulation by varying contextual information about motion, for instance, placing an FM sentence in the context of a story about fast motion versus slow motion. Overall, participants were quicker to process FM sentences after reading stories about fast travel versus slow travel, short-distance versus long-distance, and with easy terrains versus difficult terrains. Together, the results suggested that understanding an FM sentence required participants to tap into information about the actual motion they had read about and imagined while reading the story (for supporting arguments, see Barsalou 1999 and Glenberg 1999). Critically, control experiments showed that participants were no faster or slower when reading comparable spatial descriptions that did not include fictive motion, for instance, The road is next to the coast. Doing experiments with drawings is another way to investigate conceptual structure. Drawings are external representations of people’s conceptions of the world, and they provide insights into how they conceptualize objects, states, and actions (Tversky 1999, 2001). They can also reveal aspects of conceptual understanding that may otherwise be impossible to express in words alone. 
This is JB[v.20020404] Prn:13/02/2006; 13:16 F: HCP1504.tex / p.4 (233-283) Teenie Matlock evident in advertisements that use pictorial metaphor (see Forceville 1997). It is also seen in the way illustrators draw lines trailing behind a figure or an elongated figure to depict motion (McCloud 1993; Tversky 1999), and in the way people use lines and arrows to specify direction and other motion information in maps (Tversky & Lee 1998, 1999). Inferring motion from lines is so natural that even blind individuals “see” motion in raised curved lines and draw lines to indicate motion, for instance, lines emanating from a person (Kennedy 1997). In what follows, I discuss three drawing studies designed to get at the conceptual structure of fictive motion sentences. If mental simulation of movement or scanning is part of the conceptual structure of sentences with fictive motion, then that information may be observable in the way people externally represent salient spatial elements described by FM sentences. In particular, they may spatially extend or elongate trajectors in spatial depictions. If so, we might expect a long narrow rectangle to represent a carpet (trajector) in the FM spatial description The carpet runs between the wall and the counter, but not necessarily in the comparable non-FM (non-fictive motion) spatial description The carpet is between the wall and the carpet. In all three studies, participants read a sentence that described a spatial scene, and drew an image to represent their understanding of that sentence. In Study 1, they generated depictions of FM sentences, such as The pond runs between the barn and the corral, and non-FM sentences, such The pond is between the barn and the corral – sentences judged as having similar meanings and as having trajectors that may or may not be long in the world (e.g., pond). 
In Study 2, participants drew pictures of sentences such as The trail goes along the road and The trail is next to the road – sentences with inherently long trajectors. In Study 3, participants drew arrows to represent traversable trajectors in FM sentences that featured slow, neutral, or fast manner verbs (e.g., race, go, crawl), for instance, The frontage road races through the countryside and The road crawls from one vista point to another.

Study 1

The goal of Study 1 was to examine how people would depict sentences that did and did not include fictive motion. Of interest was how trajectors that may or may not be construed as long would be drawn. Would they be longer in depictions of FM sentences than in depictions of non-FM sentences? If the trajector (hereafter, TR) is generally longer in depictions of FM sentences than in depictions of non-FM sentences, it could suggest differences in conceptual structure due to motion simulation or some kind of elongation or linear extension.

Method

Participants

Fourteen UCSC undergraduates participated for credit in a psychology course. All were native speakers of English or learned the language before the age of 7.

Stimuli and design

Stimuli included 128 English sentences. Each sentence described a spatial scene that was (a) outdoors (e.g., farm), (b) indoors (e.g., classroom), or (c) on the human body (e.g., leg). Primary stimuli included 32 sentence-pairs. Sentences in each pair were nearly identical. The FM sentence featured the motion verb run, and the non-FM sentence featured the copula verb be. In addition, half the pairs featured the prepositional phrase between X and Y (both FM and non-FM) (e.g., A birthmark runs between her ankle and knee, A birthmark is between her ankle and knee), and the other half featured along X (FM) and next to X (non-FM) (e.g., The tattoo runs along his spine, The tattoo is next to his spine).
The sentences in each pair varied only minimally to lessen the influence of other factors.2 Sample stimuli are shown in Appendix 1. All sentences had subject noun phrases that referred to objects of variable length in the real world. For instance, an object such as a table may or may not be long (e.g., small round coffee table or long rectangular dining room table). A norming study before the experiment ensured that experimental sentences would include only trajectors that were conceptually “flexible” in length. Twelve UCSC undergraduates rated 195 concrete (tangible, visible) nouns on how long they were. To make their judgments, participants used a scale of 1 to 7, in which “1” was “never long”, and “7” was “always long”. The list included a wide range of items, including lake, tattoo, parking lot, and blackboard. In the end, only the items with mean ratings in the middle range were recruited as TR’s for sentential stimuli in the experiment (3 to 5). This was important to determining whether TR’s that are neutral to length would be linearly extended when they appeared in depictions of FM sentences. Prior to the experiment it was also important to establish that the two types of stimuli – FM sentences and non-FM sentences – would be as semantically similar as possible. In a separate norming study, 10 UCSC undergraduates rated sentences in every pair on semantic equivalence. Participants were told to think about the meaning of each sentence in a pair and provide a similarity rating. Using a scale where “1” indicated “not at all the same meaning” and “7”, “the same meaning”, participants rated 50 pairs, including items such as A birthmark runs between her ankle and knee and A birthmark is between her ankle and knee. Only the pairs with mean ratings of 5 or higher were retained as stimuli for the experiment. 
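The selection rule from the length norming study (keep only nouns whose mean rating falls in the middle of the 1 to 7 scale) can be illustrated with a short sketch. The ratings below are invented for illustration and are not the study's data, the function name is hypothetical, and treating the bounds 3 and 5 as inclusive is an assumption.

```python
from statistics import mean

# Hypothetical ratings (1 = "never long", 7 = "always long"); not the study's data.
ratings = {
    "lake": [4, 5, 3, 4],
    "tattoo": [3, 4, 4, 3],
    "highway": [7, 6, 7, 7],
    "coin": [1, 1, 2, 1],
}

def flexible_length_nouns(ratings, low=3.0, high=5.0):
    """Return nouns whose mean length rating lies in the middle range.

    Inclusive bounds are an assumption; the text only says "3 to 5".
    """
    return sorted(
        noun for noun, scores in ratings.items()
        if low <= mean(scores) <= high
    )

selected = flexible_length_nouns(ratings)
print(selected)
```

The same pattern, with a single lower bound of 5, would cover the retention rule used for the semantic similarity and sensibility ratings.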
Finally, it was important to ensure that all sentences in the experiment were semantically sensible. Using a scale of 1 to 7, in which “1” was “makes no sense” and “7” was “makes perfect sense,” 15 UCSC undergraduates judged all sentences on semantic sensibility. All sentences in this study had mean ratings of 5 or higher.

The stimuli also included 32 filler pairs of spatial sentences, such as The rocking chair sits on the back porch and The rocking chair is on the back porch. All sentence pairs, including fillers, were put into two lists so no participant would see both sentences in a pair. One contained 16 FM sentences, 16 non-FM sentences, and 32 filler sentences, and the other, the remaining 16 FM sentences, 16 non-FM sentences, and 32 fillers. Sentences in each list were randomly ordered and put in a booklet. In both booklets, each sentence appeared at the top of an otherwise blank, vertically oriented 8.5 by 11 inch page.

Procedure

After filling out a survey about language background and visual impairments, each participant was given a booklet and instructed to (1) read each sentence carefully, (2) imagine what it meant, and (3) quickly sketch the image below the sentence. The participant was told not to be overly concerned with detail because no sketch would be analyzed on artistic merit.

Results and discussion

Only the drawings for the non-filler sentences were analyzed. Length scores were calculated by first measuring the length and width of every TR (e.g., birthmark) in centimeters, and then dividing length by width. (Two coders, who were blind to the study, measured the scores here and in Study 2 and agreed 92 percent of the time.) The length scores were averaged across all drawings for FM sentences and non-FM sentences. Overall, TR’s were longer in depictions of FM sentences (M = 2.73) than in depictions of non-FM sentences (M = 1.84), t(12) = 4.91, p < .001.
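As a minimal sketch of the length score described above (length divided by width, then averaged by sentence type), the following uses made-up measurements. The numbers, the dictionary layout, and the function name are assumptions for illustration only and do not reproduce the reported means.

```python
from statistics import mean

def length_score(length_cm, width_cm):
    """Length-to-width ratio of a drawn trajector (both in centimeters)."""
    if width_cm <= 0:
        raise ValueError("width must be positive")
    return length_cm / width_cm

# Hypothetical (length_cm, width_cm) measurements; not data from the study.
drawings = {
    "FM": [(6.0, 2.0), (5.0, 2.5)],
    "non-FM": [(4.0, 2.0), (3.0, 2.0)],
}

# Average the ratio over all drawings of each sentence type.
mean_scores = {
    condition: mean(length_score(l, w) for l, w in measurements)
    for condition, measurements in drawings.items()
}
print(mean_scores)
```

A ratio rather than a raw length keeps the score independent of how large a participant happened to draw the object overall.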
See Appendix 2 for examples of drawings. To see whether the overall difference in TR length was primarily driven by any one sentence type, two additional t-tests were run. One compared only the FM and non-FM sentences with the preposition between, yielding a reliable difference, t(12) = 3.05, p < .01 (FM = 2.44, non-FM = 1.94). The other compared only the FM and non-FM sentences with the preposition along/next to, showing a reliable difference, t(12) = 5.10, p < .001 (FM = 2.99, non-FM = 1.75). Thus, the difference in TR length was not driven by differences in prepositions.

The results suggest the TR is conceptualized differently for FM sentences than it is for non-FM sentences, even though the two types of sentences are judged to be highly similar in semantic content. One possibility for greater TR length in depictions of FM sentences is that people naturally simulate motion or tap into motion information when processing fictive motion language. If so, this could encourage them to conceptually elongate the TR – either through scanning along it and later representing it in static form, or through spatially extending it and building it up over time. Another possibility, however, is that the mere presence of the motion verb in the FM sentence led to differences in TR length.

Study 2

The second study further investigated the conceptual structure of fictive motion using the drawing task from Study 1. Here participants were given only FM and non-FM sentences that contained inherently long TR’s (e.g., road in A road goes along a mountain range and The road is next to the mountain range). Of interest again was how the two types of sentences would be depicted in drawings. Specifically, would inherently long TR’s be even longer in depictions of sentences with fictive motion?

Method

Participants

Nineteen UCSC undergraduates participated for credit in a psychology course.
All were native speakers of English or learned the language before 7 years of age.

Stimuli and design

Primary stimuli included 16 pairs of sentences that described outdoor settings. Each pair contained an FM sentence and a non-FM sentence. The FM sentence featured a motion verb (go, run), and the non-FM sentence featured a copula verb (be). The FM sentence also included the preposition along, as in A road goes along a mountain range, and the non-FM sentence included the prepositional phrase next to, as in A road is next to a mountain range.3 Sample stimuli are shown in Appendix 1.

A norming study ensured all FM and non-FM sentences were highly semantically similar. Using a scale where “1” indicated “not at all the same” and “7” indicated “the same”, 10 UCSC undergraduates rated sentence pairs such as A sidewalk goes along a canal and A sidewalk is next to a canal. In the end, only highly similar pairs (mean rating of 5 or higher) were used in the study. Those same sentences had also been rated as semantically sensible (mean rating of 5 or higher) by 21 UCSC undergraduates. They also included TR’s judged as relatively long (mean rating of 5 or higher) in the norming study mentioned in Study 1.

All pairs of sentences, including the 16 filler pairs, were put into two booklets. One set contained 16 FM sentences, 16 non-FM sentences, and 32 filler sentences, and the other, the remaining 16 FM sentences, 16 non-FM sentences, and 32 filler sentences. Sentences in both booklets were randomly ordered and each sentence appeared at the top of an otherwise blank horizontal 8.5 by 11 inch page.

Procedure

Each participant followed the same procedure used in Study 1.

Results and discussion

Only the depictions of non-filler items were coded and analyzed. Length scores were measured using the method in Study 1.
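The two-booklet design used in Studies 1 and 2, in which no participant sees both members of a sentence pair, can be sketched as follows. The alternation scheme, the fixed random seed, and the function name are assumptions made for illustration; the chapter does not say how pairs were assigned to lists, and fillers are left out here.

```python
import random

def counterbalance(pairs, seed=0):
    """Split (FM, non-FM) pairs into two lists, one member of each pair per list."""
    list_a, list_b = [], []
    for i, (fm, non_fm) in enumerate(pairs):
        # Alternate which list receives the FM member (an assumed scheme).
        first, second = (fm, non_fm) if i % 2 == 0 else (non_fm, fm)
        list_a.append(first)
        list_b.append(second)
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    rng.shuffle(list_a)        # sentences in each booklet are randomly ordered
    rng.shuffle(list_b)
    return list_a, list_b

# Sentence pairs taken from the sample stimuli in Appendix 1.
pairs = [
    ("The highway runs along the coast", "The highway is next to the coast"),
    ("The footpath goes along the creek", "The footpath is next to the creek"),
    ("The sidewalk goes along the canal", "The sidewalk is next to the canal"),
    ("A road runs along a mountain range", "A road is next to a mountain range"),
]
booklet_a, booklet_b = counterbalance(pairs)
print(booklet_a)
print(booklet_b)
```

With an even number of pairs, each booklet ends up with equal numbers of FM and non-FM sentences.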
Overall, people drew longer TR’s when drawing FM sentences (M = 10.13) than when drawing non-FM sentences (M = 6.79), t(18) = 3.51, p < .01. See Appendix 2 for examples. The results, consistent with those of Study 1, show differences in the way people conceptualized the TR in understanding and drawing FM and non-FM sentences. One explanation for longer TR’s in depictions of FM sentences is that people simulated motion or tapped into conceptual structure about actual motion in making sense of the sentence and forming a mental image. If so, this may have led them to conceptually elongate the TR and draw a longer object in the picture. Another possibility is that the motion verb alone led to longer TR’s.

Study 3

The third study further investigated the conceptual structure of fictive motion. In this case, a slightly different task was used, one with more attention on the trajector and one that used only FM sentences. Participants were given FM sentences with manner verbs that expressed varying rates of speed in their literal uses, such as race (fast), creep (slow), and go (neutral). For each sentence, participants drew an arrow to represent the TR (e.g., road in The road jets from one vista point to another). Of interest was whether manner of movement alone would lead to differences in how arrows were drawn, especially length, thickness, and crookedness. If FM sentences include motion as part of their conceptual structure, and if this is reflected in a spatial depiction, we would expect TR’s to be longer, thinner, and less crooked for FM sentences with fast motion verbs, even though nothing is actually moving in the description.

Method

Participants

Sixteen UCSC undergraduates participated for credit in a psychology course. All were native speakers of English or had learned the language before 7 years of age.
Stimuli and design

The stimuli included 24 FM sentences and 48 fillers that described spatial scenes. Every FM sentence featured an underlined TR that represented a travel route (e.g., road, highway) and a motion verb that expressed (in its literal interpretation) a fast, slow, or neutral travel rate. The 6 slow verbs were jog, crawl, creep, plod, meander, and ramble. The 4 fast verbs (some used twice) were jet, fly, race, and speed, and the neutral verb was go. Verbs were categorized on rate of speed determined by a survey in which 18 UCSC undergraduates rated 45 action words (e.g., slide, run, race, creep, jump) on how fast they imagined doing the action and how long the actions took (see Matlock 2001). See Appendix 1 for stimuli. All sentences were randomly ordered and put into a booklet. Under every sentence there was a space for drawing the arrow. The space was 2 inches high and 8.5 inches wide.

Procedure

Every participant was instructed to (1) read each sentence, (2) focus on the underlined word in the sentence, (3) quickly draw an arrow to represent it, and (4) not erase.

Results and discussion

Three research assistants who were blind to the experimental manipulation rated all arrows on length, crookedness, and thickness. A high degree of inter-rater reliability was obtained (95 to 98 percent). (All p-values are < .05 unless specified otherwise.)

Length

To calibrate themselves, the coders first examined all arrows produced by a single individual. Then they rated every arrow on how it compared to all others drawn by that individual. A length rating of “1” specified “very short”, and a rating of “7” specified “very long”. All scores were then averaged according to the rate of speed expressed by the verb (fast, neutral, slow). The mean length rating for fast verbs (FV) was 4.95, for slow verbs (SV), 4.07, and for the neutral verb (NV), 3.99.
A within-subjects analysis of variance showed a main effect for verb, F(2,45) = 11.1, p < .001, suggesting that manner influenced arrow length. Closer inspection showed a reliable difference between FV and SV, t(30) = 4.26, and between FV and NV, t(30) = 4.04, but not between NV and SV.

Crookedness

Coders surveyed all arrows for a single participant, and later rated every arrow on how crooked it was compared to all other arrows drawn by that individual. A rating of “1” meant “not at all crooked”, and a rating of “7” meant “very crooked”. Average crookedness scores were 1.59 for FV, 2.37 for SV, and 2.63 for NV. A within-subjects ANOVA then revealed a main effect of verb, F(2,45) = 9.51, p < .001, indicating that arrow crookedness was affected by the information expressed by the verb.4 Closer inspection yielded a reliable difference between FV and SV, t(30) = 2.88, and FV and NV, t(30) = 4.8, but not between NV and SV.

Thickness

Coders first examined all arrows per individual. Then they obtained a thickness score for every arrow by comparing it to all other arrows drawn by that individual. A rating of “1” meant “not at all thick”, and “7” meant “very thick”. The average ratings were 1.04 for FV, 1.2 for NV, and 1.41 for SV. A within-subjects ANOVA showed a main effect for verb, F(2,45) = 5.65, indicating that manner information in the verb influenced arrow thickness. Closer analysis showed a reliable difference between FV and SV, t(30) = 2.67, and between NV and SV, t(30) = 2.19, but no difference was observed between FV and NV.

Together, the results show that arrows that depict TR’s in FM sentences with fast motion verbs (e.g., race) are longer, thinner, and less crooked than arrows that depict TR’s in FM sentences with slow motion verbs (e.g., creep).
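For readers who want to see how a pairwise comparison of this kind is computed, here is a minimal sketch of a pooled-variance two-sample t statistic. It is offered under assumptions: the chapter does not state which t-test was run, although 30 degrees of freedom is what two sets of 16 scores give under this formula (16 + 16 - 2). The ratings below are invented and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled estimate of the common variance (sample variances, n - 1 denominators).
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, df

# Hypothetical mean length ratings for 16 participants per verb type; not the study's data.
fast_verb = [5, 4, 6, 5] * 4
slow_verb = [4, 3, 5, 4] * 4

t_value, df = pooled_t(fast_verb, slow_verb)
print(f"t({df}) = {t_value:.2f}")
```

Because every participant in Study 3 contributed ratings in all three verb conditions, a paired test (with 15 degrees of freedom for 16 participants) would be another option; the sketch above only mirrors the degrees of freedom reported.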
One possibility is that people mentally simulated motion or tapped into motion information when thinking about and forming an image of fictive motion sentences. This would mean that fast verbs caused people to simulate movement quickly and slow verbs caused people to simulate movement slowly. If so, these conceptual differences could have led to differences in how drawings were executed, for instance, slower pen stroke and shorter arrow for slow manner verbs. Another possibility is that nothing more than the type of manner specified in the motion verb drove the results.

General discussion

Three studies investigated the comprehension of sentences such as The road runs along the coast, believed by cognitive linguists to evoke mentally simulated traversal or scanning. Study 1 and Study 2 used free-style drawing tasks to investigate how trajectors would be drawn in depictions of FM sentences and depictions of non-FM sentences. The results revealed that depictions of trajectors were longer for FM sentences than for non-FM sentences even though the sentences were judged as being similar in meaning. Study 3 used a drawing task to investigate how trajectors would be depicted by arrows in FM sentences. Of interest was whether manner information (slow, fast, or neutral verb) would influence the way arrows were drawn. The results showed that arrows were longer, thinner, and straighter with fast verbs than with slow verbs.

The results of the studies reported here lend support to cognitive linguists’ claims about fictive motion and its role in the understanding of FM sentences. As shown in Study 1, objects that are not necessarily long, such as birthmarks, are longer in depictions of FM sentences, such as The birthmark runs between her knee and her ankle, than in depictions of non-FM sentences, such as The birthmark is between her knee and her ankle.
Because drawings reflect people’s conceptions about space (Tversky 1999), it is not unreasonable to assume that longer trajectors in depictions of FM sentences are the end result of (a greater degree of) simulated motion or scanning. The thinking is that conceptually elongating or scanning along a linear entity takes time and that in a static depiction, time maps onto space. The same explanation applies to the results of Study 2. In that case, trajectors that were already long (e.g., road) became longer in depictions of FM sentences than they were in depictions of non-FM sentences. Study 3 offers further support, as depictions of trajectors were longest with fast verbs and shortest with slow or neutral verbs, suggesting that the speed of the verb interacts with and structures the construal of the noun phrase.

One explanation is that the semantic velocity expressed by the verb mapped onto the velocity of the hand during drawing. Support for this comes from recent work on haptic perception and visual memory. Kerzel (2001), for instance, found a connection between hand speed and perceived velocity of moving objects. Participants in his study first watched a fast- or slow-moving visual stimulus. After that, they moved their hands either slowly or quickly (as per verbal or non-verbal instruction). Next they were asked to specify how quickly or slowly the visual stimulus moved. The results, that participants’ velocity of hand movement influenced their retention of visual velocity, suggested that visual perception and somatosensory perception are tightly coupled. Thus, based on Kerzel’s findings, it is reasonable to entertain the idea that in drawing a sentence such as The road jets from one vista point to another, participants in the studies presented here mapped verb velocity onto manual velocity, that is, faster hand movement for drawing trajectors associated with fast verbs. Future research that measures velocity of hand movements could be informative.
The idea that figurative uses of motion verbs include mental simulation may seem odd to language theorists who do not appeal to dynamic representations. However, scores of psychological studies have shown that mental imagery figures into all sorts of reasoning and problem solving. For instance, people are able to generate and mentally rotate three-dimensional images (e.g., Cooper & Shepard 1984; Shepard & Metzler 1971). What’s more, people are able to imagine moving through an imagined environment and to shift position in the environment with non-visual input (see Denis 1996; Denis & Cocude 1989; Kosslyn 1994). People are so good at imagining motion that the time taken to mentally “move” across an imaginary region mirrors the times one would expect from actual movement through an actual region in space (see Kosslyn, Ball, & Reiser 1978). Thus, it is plausible that people mentally simulate motion or scanning along a trajector when understanding FM sentences. For instance, it is not unreasonable to assume that in Study 1, participants elongated items such as lake in drawings because they mentally scanned the lake during the processing of sentences such as A lake runs between the golf course and the train tracks.

What is most intriguing about the results reported in this chapter is that none of the stimuli conveyed actual motion through physical space. In all three studies, only figurative interpretations of motion verbs were available. If the sentences had expressed explicit motion through physical space, the results would be less interesting. For instance, if Study 3 had used literal uses of motion verbs, we would expect long arrows for a sentence such as John races through the park and short arrows for John crawls through the park.
That differences arise even though there is no physical motion conveyed in the figurative uses of motion verbs provides compelling evidence to support cognitive linguists’ claims that FM sentences involve simulation or scanning of movement along a trajectory.

These results challenge standard psycholinguistic accounts of how words are represented and processed. Regardless of how motion or scanning was simulated while people did the task, there was a strong interdependence of verb and subject noun phrase. In every study, the depiction of the subject noun phrase varied according to a difference in the verb: a motion verb or copula verb in Studies 1 and 2, and a slow verb or fast verb in Study 3. That the same noun phrase was depicted differently lends support to the idea that lexical meaning is emergent and interactive (e.g., Elman, Bates, Johnson, Karmiloff-Smith, Parisi, & Plunkett 1996; MacWhinney 1999; Tomasello 1998). The results are also problematic for the view that comprehending polysemous verbs involves a dictionary look-up (for discussion, see Gibbs & Matlock 1999). That would not explain how a verb such as go would influence the way another constituent in the sentence was depicted in the end. The results also call into question approaches that assume a hard and fast distinction between figurative and literal language (for discussion, see Coulson & Matlock 2001; Gibbs 1994). In some respects, the meaning evoked with fictive motion language is not unlike that of actual motion, even though nothing is described as moving. This is especially clear in Study 3 (e.g., long arrow with fast verb).

The possibility that simulated motion figures into the use and understanding of language, including sentences such as The road runs along the coast, is not all that mysterious.
Thinking about motion and space during language comprehension is natural, and involves tapping into and assimilating knowledge acquired from direct embodied experience and interaction with the world (Clark 1973; Lakoff 1987; Glenberg 1999). Understanding FM sentences involves knowing things like how long movement generally takes and knowing that it occurs along a trajector “contained” by a spatial region (Matlock 2004). Much of this knowledge is probably tacit and structured by basic image schemata, such as source-path-goal and container (see Gibbs & Colston 1995; Johnson 1987; Mandler 1992, 1996), although some of it may be conscious, for instance, remembering the lake you used to swim in or your local golf course upon hearing The lake runs between the golf course and the train tracks.

The precise mechanisms underlying fictive motion need to be mapped out before we can fully understand how people process figurative uses of motion verbs in sentences such as The road runs along the coast. But for now, we can say that figurative uses of motion verbs appear to evoke conceptual structure that is dynamic and reflective of the way we perceive and enact motion in the world.

Acknowledgments

Thanks to Frank Brisard, Ravid Aisenmann, Raymond Gibbs, Jr., and an anonymous reviewer for comments on an early draft. Thanks also to Herbert Clark, Rachel Giora, Art Glenberg, Paul Lee, Leonard Talmy, and Barbara Tversky for sharing insights related to this work, and to Nicole Albert, Jeremy Elman, Kat Firme, Sydney Gould, Krysta Hays, and John Nolte for collecting and coding data.

Notes

* Some of the work in this paper was presented at RAAM-4 (Researching and Applying Metaphor), Tunis, Tunisia, April, 2001. All correspondence concerning this article should be sent to Teenie Matlock, Social & Cognitive Sciences, University of California, Merced, CA 95344. Email: [email protected]
1. Talmy (1983) originally used the term virtual motion to refer to this phenomenon. Fictive motion is akin to Langacker’s (1986) abstract motion and Matsumoto’s (1996) subjective motion. Here I address only one type of fictive motion, Talmy’s (2000) co-extension path fictive motion.
2. Along could not be used for both FM and non-FM sentences because it could have resulted in a few semantically odd non-FM sentences, for instance, ?The city park is along the financial district.
3. See Note 2.
4. This result is statistically reliable even when the verbs meander and ramble (inherently crooked or curved) are excluded from the analysis.

References

Barsalou, Lawrence W. (1999). Language comprehension: Archival memory or preparation for situated action? Discourse Processes, 28, 61–80.
Boroditsky, Lera (2000). Metaphoric structuring: Understanding time through spatial metaphors. Cognition, 75, 1–28.
Clark, Herbert H. (1973). Space, time, semantics, and the child. In T. E. Moore (Ed.), Cognitive development and the acquisition of language. San Diego: Academic Press.
Cooper, Lynn A. & Roger N. Shepard (1984). Turning something over in the mind. Scientific American, 251, 106–114.
Coulson, Seana & Teenie Matlock (2001). Metaphor and the space structuring model. Metaphor and Symbol, 16, 295–316.
Denis, Michel (1996). Imagery and the description of spatial configurations. In M. de Vega, M. J. Intons-Peterson, P. N. Johnson-Laird, M. Denis, & M. Marschark (Eds.), Models of visuospatial cognition (pp. 128–197). New York, NY: Oxford University Press.
Denis, Michel & Marguerite Cocude (1989). Scanning visual images generated from verbal descriptions. European Journal of Cognitive Psychology, 1, 293–307.
Elman, Jeffrey L., Elizabeth A. Bates, Mark H. Johnson, Annette Karmiloff-Smith, Domenico Parisi, & Kim Plunkett (1996). Rethinking innateness. A cognitive perspective on development.
Cambridge, MA: MIT Press.
Forceville, Charles (1997). Pictorial metaphor in advertising. London: Routledge.
Gibbs, Raymond W. Jr. (1991). What’s cognitive about cognitive linguistics? In Eugene Casad (Ed.), Cognitive linguistics in the redwoods: The expansion of a new paradigm in linguistics (pp. 27–53). The Hague: Mouton.
Gibbs, Raymond W. Jr. (1994). Figurative thought and figurative language. In Morton A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 411–446). San Diego, CA: Academic Press.
Gibbs, Raymond W. Jr. & Herbert Colston (1995). The cognitive psychological reality of image schemas and their transformations. Cognitive Linguistics, 6, 347–378.
Gibbs, Raymond W. & Teenie Matlock (1999). Psycholinguistics and mental representations. Cognitive Linguistics, 10, 263–269.
Glenberg, Arthur M. (1997). What memory is for. Behavioral and Brain Sciences, 20, 1–55.
Glenberg, Arthur M. (1999). Why mental models must be embodied. In Gert Rickheit & Christopher Habel (Eds.), Mental models in discourse processing and reasoning. New York, NY: North-Holland.
Johnson, Mark (1987). The body in the mind: The bodily basis of meaning. Chicago, IL: University of Chicago Press.
Kennedy, John M. (1997). How the blind draw. Scientific American, 276, 60–65.
Kerzel, Dirk (2001). Visual short-term memory is influenced by haptic perception. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 1101–1109.
Kosslyn, Stephen M. (1994). Image and brain. The resolution of the imagery debate. Cambridge, MA: MIT Press.
Kosslyn, Stephen M., T. M. Ball, & B. J. Reiser (1978). Visual images preserve metric spatial information: Evidence from studies of image scanning. Journal of Experimental Psychology: Human Perception and Performance, 4, 47–60.
Lakoff, George (1987). Women, fire, and dangerous things: What categories reveal about the mind. Chicago, IL: University of Chicago Press.
Lakoff, George & Mark Johnson (1980). Metaphors we live by. Chicago, IL: University of Chicago Press.
Lakoff, George & Mark Johnson (1999). Philosophy in the flesh: The embodied mind and its challenge to Western thought. New York, NY: Basic Books.
Langacker, Ronald W. (1986). Abstract motion. Proceedings of the Twelfth Annual Meeting of the Berkeley Linguistics Society, 455–471.
Langacker, Ronald W. (1987). Foundations of cognitive grammar, Vol. 1: Theoretical Prerequisites. Stanford, CA: Stanford University Press.
Langacker, Ronald W. (1999). Virtual reality. Studies in the Linguistic Sciences, 29, 77–103.
Langacker, Ronald W. (2000). Grammar and conceptualization. Berlin: Mouton de Gruyter.
MacWhinney, Brian (1999). The emergence of language. Mahwah, NJ: Lawrence Erlbaum.
Mandler, Jean M. (1992). How to build a baby: II. Conceptual primitives. Psychological Review, 99, 587–604.
Mandler, Jean M. (1996). Preverbal representation and language. In P. Bloom, M. A. Peterson, L. Nadel, & M. F. Garrett (Eds.), Language and space (pp. 365–384). Cambridge, MA: MIT Press.
Matlock, T. (2001). How real is fictive motion? Doctoral dissertation. University of California, Santa Cruz.
Matlock, T. (2004). Fictive motion as cognitive simulation. Memory & Cognition, 32, 1389–1400.
Matsumoto, Yo (1996). Subjective motion and English and Japanese verbs. Cognitive Linguistics, 7, 183–226.
McCloud, Scott (1993). Understanding comics: The invisible art. HarperPerennial.
Miller, George A. (1972). English verbs of motion: A case study in semantics and lexical memory. In A. W. Melton & E. Martin (Eds.), Coding processes in human memory (pp. 335–372). New York, NY: John Wiley & Sons.
Miller, George A. & Philip N. Johnson-Laird (1976). Language and perception. Cambridge, MA: Harvard University Press.
Nuyts, Jan & Eric Pederson (1997). Language and conceptualization. New York: Cambridge University Press.
Radden, Gunter (1997). Time is space. In Birgit Smieja & Meike Tasch (Eds.), Human contact through language and linguistics (pp. 147–166). Frankfurt/Main: Peter Lang.
Shepard, Roger N. & J. Metzler (1971). Mental rotation of three-dimensional objects. Science, 171, 701–703.
Talmy, Leonard (1983). How language structures space. In H. Pick & L. P. Acredolo (Eds.), Spatial orientation: Theory, research, and application (pp. 225–282). New York: Plenum Press.
Talmy, Leonard (1996). Fictive motion in language and “ception”. In Paul Bloom, Mary A. Peterson, Lynn Nadel, & M. F. Garrett (Eds.), Language and space (pp. 211–276). Cambridge, MA: MIT Press.
Talmy, Leonard (2000). Toward a Cognitive Semantics, Volume I: Conceptual Structuring Systems. Cambridge: MIT Press.
Tomasello, Michael (1998). The new psychology of language: Cognitive and functional approaches to language structure. Mahwah, NJ: Lawrence Erlbaum.
Tversky, Barbara (1999). What does drawing reveal about thinking? In John S. Gero & Barbara Tversky (Eds.), Visual and spatial reasoning in design (pp. 93–101). Sydney, Australia: Key Centre of Design Computing and Cognition.
Tversky, Barbara (2001). Spatial schemas in depictions. In M. Gattis (Ed.), Spatial schemas and abstract thought (pp. 79–112). Cambridge, MA: MIT Press.
Tversky, Barbara & Paul U. Lee (1998). How space structures language. In Christian Freksa, Christopher Habel, & Karl Friedrich Wender (Eds.), Spatial cognition: An interdisciplinary approach to representation and processing of spatial knowledge (pp. 157–175). Berlin: Springer-Verlag.
Tversky, Barbara & Paul U. Lee (1999). Pictorial and verbal tools for conveying routes. Conference on Spatial Information Theory (COSIT ‘99). Hamburg, Germany.
Appendix 1

Experiment 1 Sample Stimuli
The military base runs between the two mountain ranges
The military base is between the two mountain ranges
A lake runs between the golf course and the train tracks
A lake is between the golf course and the train tracks
The pond runs between the barn and the corral
The pond is between the barn and the corral
The swimming pool runs between the patio and the garage
The swimming pool is between the patio and the garage
The blackboard runs between the water fountain and the door
The blackboard is between the water fountain and the door
The birthmark runs between her knee and ankle
The birthmark is between her knee and ankle
The university parking lot runs along the edge of the lagoon
The university parking lot is next to the edge of the lagoon
The city park runs along the financial district
The city park is next to the financial district
The pig pen runs along the side of the barn
The pig pen is next to the side of the barn
The lake runs along the golf course
The lake is next to the golf course
The tattoo runs along his spine
The tattoo is next to his spine

Experiment 2 Sample Stimuli
The highway runs along the coast
The highway is next to the coast
A toll road runs along the coastline
A toll road is next to the coastline
The bike path runs along the railroad tracks
The bike path is next to the railroad tracks
The trail runs along a road
The trail is next to the road
A road runs along a mountain range
A road is next to a mountain range
The trail goes along the road
The trail is next to the road
A freeway goes along the mountain range
A freeway is next to the mountain range
A frontage road goes along the freeway
A frontage road is next to the freeway
The footpath goes along the creek
The footpath is next to the creek
The sidewalk goes along the canal
The sidewalk is next to the canal
Some huts run along the edge of the lake
Some huts are next to the edge of the lake
Some trees run along the river
Some trees are near the river

Experiment 3 Sample Stimuli
Fast-manner verbs
The frontage road speeds alongside the freeway
The road jets from one vista point to another
The toll road races through the countryside
The highway races through the grasslands
The road flies through the countryside
Neutral-manner verbs
The road goes through the desert
The footpath goes through the hills
The trail goes through the valley
The street goes through farmland
The freeway goes through the forest
Slow-manner verbs
The toll road meanders through the countryside
The road crawls from one vista point to another
The highway crawls through the grasslands
The sidewalk jogs from one house to another
The road plods through the countryside

Appendix 2

Examples of drawings from Experiment 1
Figure 1. The birthmark is between her knee and her ankle (non-FM)
Figure 2. The birthmark runs between her knee and her ankle (FM)

Examples of drawings from Experiment 2
Figure 3. A road is next to a mountain range (non-FM)
Figure 4. A road runs along a mountain range (FM)

chapter

Discourse, gesture, and mental spaces manoeuvers
Inside versus outside F-space*

June Luchjenbroers
University of Wales, Bangor

During discourse, conversational gestures tend to occur in the physical area in front of the speaker. This area, referred to as the ‘comfort zone’, is where the bulk of a speaker’s gestures are produced. This paper is an investigation into the relationship between the parameters of that physical space and the mental spaces required for discourse processing.
It is argued that the boundaries of this space (called the ‘F-space’) provide additional aspects of speaker meaning in the form of clues about speaker cognition. The examples provided in this paper give evidence of the conceptual mappings needed in discourse processing.

Keywords: ‘F-space’, comfort zone, iconicity, mental spaces

Introduction

This paper is an exploration into the dynamics of the physical, gestural space used by speakers during discourse. This exploration furthers earlier research into how lexical, prosodic, and gestural information may combine to provide discourse participants with the appropriate cues needed to set up and structure mental spaces (cf. Luchjenbroers 2001, 2002, 2004). It is the aim of this paper to expand on what has been referred to in earlier work as ‘F-space’ and how the dimensions of this physical space may associate with the necessary navigations around and between any number of mental spaces required during discourse. This exploration requires some theoretical and observational groundwork, including an overview of the relevant features of Mental Spaces Theory, and a review of the main types of gesture. These gesture types are then considered in terms of how they manifest inside or outside a speaker’s ‘comfort zone’ (or ‘F-space’). The range of examples used in this exploration illustrates how gesture, together with the physical properties of ‘F-space’, can provide discourse participants with a potentially rich strategy for navigating conceptual space, as well as enhancing information conveyed in the lexical component of talk.

Discourse processing theory

The philosophical approach generally embraced in the field sees discourse as a process of ‘mutual ground’ construction.
This label is meant to convey that discourse participants aim to achieve a mutual understanding of what they are talking about, and appear to work toward that goal (cf. Grice 1975, 1978). This is the basis of the ‘cooperative discourse’ approach inherent in the work of many theorists working in this field (e.g., Clark 1993, 1996, 1997; Chafe 1994; Lambrecht 1994; Tomlin 1987, 1997; Tomlin et al. 1997; Luchjenbroers 1993, 2000). This cooperation involves speakers giving addressees adequate cues to derive their speaker-intended meaning, and addressees making a determined search for that meaning. How participants manage to do this is thought to be the product of ‘shared’ knowledge (called ‘mutual’ or ‘common’ ground) – i.e., a speaker can produce the appropriate bite-sized pieces for their particular addressee(s) because they know, or believe they know, the conceptual context in which that information will be integrated; and similarly their addressee(s) can properly derive the speaker-intended meanings because they too know, or believe they know, the conceptual context in which each speaker’s contribution is being made. All versions of ‘mutual’ or ‘common’ ground have since had to deal with logical objections to the notion of ‘shared’ information, stemming from the cognitive fact that each person only has access to their own conceptual processes. Consistent with these logical objections, this research embraces the view that during discourse each speaker actively creates and manipulates a model of discourse that is unique to that participant’s understanding of the discourse content and their expectations of everyone else’s (cf. Luchjenbroers ms.). Hence, during discourse each speaker actively creates and manipulates a singular model of discourse (i.e., a representation of discourse in the speaker’s own mind) thought to capture the information s/he thinks is ‘mutual’.
Thus speakers will construct their own model according to the expected discourse needs of their addressee(s), and they project this model into the discourse space between interlocutors, as though it can be mutually observed and manipulated. In this sense, the speaker’s representation of discourse is their version of a ‘public’ model of discourse, even though it is no more public than it is mutual. The result of this version of mutual ground (that isn’t mutual) is that each speaker’s judgments about how to distribute semantic and functional information in talk are not based on shared conceptual representations with their addressee(s), as suggested by the terms ‘mutual’ or ‘common’ ground, but on the perceived similarities and discrepancies between the content of their own ‘public’ model of discourse and all or any information they can glean from their addressee’s linguistic and gestural outputs as well as any other opportunistic sources of information that may present themselves. To this end the role of skill is paramount. The structuring of each discourse contribution into comprehensible chunks for a hearer’s benefit is thus of key importance, which directs the analyst to a more basic level of complexity: creating and maintaining a coherent discourse structure. This involves not only locating each proposition in the speaker-intended, pin-point context (or ‘mental space’) to be processed and appropriately understood; it also requires recognising the relationship between that mental space and any others that may be relevant to the subject-matter being discussed. In effect, all participants will need to create, manage, and navigate any number of mental spaces required during discourse.

Mental Spaces Theory [‘MST’]

Mental Spaces Theory is fundamentally based on the view that linguistic form under-specifies speaker meaning (cf.
Fauconnier 1985; Fauconnier & Sweetser 1996), and that meaning construction takes place at a conceptual level. This conceptual level involves the construction of appropriate mental spaces in which to process the propositions attributed to them. Mental spaces are like mini contexts in which propositions are processed and can be measured as True or False. For example, if a speaker were to say the utterance given in (1), I was in here with a girl from Sarawak, the hearer would need to discern the proposition being conveyed [I + a girl, together], and the mental spaces in which to process it – in this case an undefined temporal space in the past, triggered by the use of the past tense, which is further defined by a physical location: in here (see Figure 1). The act of separating the spatial definition from the proposition means that there are a number of ways in which that proposition can be measured as true or false. For example, different aspects of the triggered mental spaces can be rejected as false, such as “it wasn’t in this room”, or “it hasn’t happened yet”. This is quite distinct from the more traditional ways in which a speaker’s utterances are measured as true or false, such as when aspects of the proposition are rejected – e.g., reference failure (‘it wasn’t him’, or, ‘he didn’t meet a girl’, or, ‘she wasn’t from Sarawak’).1 Particularly relevant to this discussion however, is that if a speaker were to have used a different spatial definition (e.g., in the future, or in another building), both speaker and hearer would process the same proposition, but use a different mental space.

(1) Dennis: I was in here with a girl from Sarawak

Figure 1. Embedded spaces [diagram labels: PAST TIME; IN HERE; I + a girl (together); Sarawak]

Sentence meaning therefore depends on constructing the appropriate mental spaces, and within any stretch of discourse there can be several mental spaces simultaneously active, which both speakers and hearers need to navigate during discourse. These spaces are often interrelated and discourse participants need to make these interconnections in cognitive space for discourse to be coherent. For example, Figure 2 is an attempt to illustrate many of the interconnections required in discourse, such as those required in data extract (2). Talk is about plagiarism and more specifically what actions the participants deem appropriate. However, in the discussion of that general topic, talk involves references to: a university handbook; a past reference to the supposed reading of that handbook (links Past time and an expected action, to the hearer and the handbook); a hypothetical scenario, would you mark yourself down (links a hypothetical time frame to appropriate actions and the hearer); and then two other hypothetical scenarios: one located in the future, if I was really unsure I would (links a hypothetical time frame to the Here-&-Now, the supposed reading of the handbook, the hearer and the handbook – not included in Figure 2); and the other located in the past, if I were a first-year I would, which again links a hypothetical (counterfactual) time frame to the supposed reading of the handbook, the Here-&-Now, the hearer and the handbook. In each case, new mental spaces are created, as needed, for the purposes of discourse, and the interconnections between them and others active in talk must be recognised for discourse to be coherent. There is also evidence that once activated, a mental space can be reaccessed at any time during discourse, and potentially also between discourses. For example, the two references to a boy from Hong Kong, given in (3a) and (3b), were produced roughly 15 minutes apart.
Reference to this boy from Hong Kong (line 256) is specific and requires retrieving the referent from an earlier mention, and yet the only earlier reference to him was in lines 22–23.2

(2) Gwen: it’s clearly defined in the hand book
    Dana: yeah? .. so wha- what, do you know what it says?
    Gwen: nup (laugh) I don’t .. but I know it’s there
    Dana: you you’ve not read the handbook? you would mark mark yourself down (laugh)
    Gwen: not for many years (both laugh) . . . um.. before doing an essay if I really, [was] unsure then I’d go but I’ve been writing a lot of essays so I I’m pretty clear on what it should be and isn’t by now.. but if I were a first year I certainly would

Figure 2. Interconnected spaces [diagram labels, in extraction order: Topic = Plagiarism; Task = Decide action; HERE-&-NOW; IN HANDBOOK; Rules about plagiarism; Line 163-6: You know what it says?; Line 171: You’ve not read it?; PAST TIME; HANDBOOK; Is read by students; Line 172: You would mark yourself down?; YOU ARE TUTOR & YOU ARE STUDENT; Plagiarism is punished; Line 173: Not for many years; Line 184: but if I were a 1st year, I would; HYPOTHETICAL; First Year STUDENT (unclear about rules); I am First year student]

(3) a. Ellen: there’s a fellow there from Hong Kong you know.. [22]
              he was re-eally ba-ad .. [23]
    b. Ellen: one of the assignments this boy from Hong Kong had to do was . . . [256]

There was no evidence of reference failure by the hearer at the second mention, and therefore even though both mentions were very brief, they were still sufficient for discourse purposes. Research published elsewhere (Luchjenbroers 2001, 2003; Carroll et al. 2003) has argued that the discourse building process is facilitated by prosodic and/or gestural information that provides additional cues to derive speaker-intended meanings.
These works have also illustrated how the informational load of gestures may substantially enrich the semantic content of the lexical component. In the following I will consider the pertinent gesture types, taking into account how the dynamics of a speaker’s F-space are utilized to convey aspects of meaning not always conveyed verbally.

Data

The body of examples used in this discussion has been drawn from a larger, video-taped study into negotiated talk involving 36 Australian and non-Australian, Male and Female university students. These subjects were given the task of devising guidelines (to be given to faculty) about how new students should avoid the pitfalls associated with either cheating or plagiarism. They were recorded in a sound-proof room, positioned diagonally across from each other to enhance the view of the interaction for both the analyst (sitting in the next room, behind a large tinted window) and the video-recorder, which was placed back from the dyad in a triangulated position to the interaction. The total body of video data includes 36 conversational dyads (approximately 18 hours of data).

Gestures in discourse

What counts as gesture?

Much work on gesture has been devoted to outlining the different types of bodily movements a speaker can make during discourse, as well as which of these count as meaningful and thus worthy of linguistic analysis (e.g., McNeill 1992, 2000; Kendon 1981; Krauss 1998). The spectrum is often divided into three categories: (i) movements that have no apparent discourse meaning – i.e., what Krauss (1998) calls ‘Motor gestures’, or ‘beats’. These may coordinate with speech but have
no apparent meaning; (ii) ‘symbolic’ gestures, or ‘emblems’, that carry conventionalized meaning, such as a thumbs up meaning ‘good’;3 and (iii) conversational gestures which are spontaneous and may have some conventionality but are totally optional – i.e., some speakers use them but discourse is not dependent on their presence. It is this last class of gestures that is the object of this research.

The ‘comfort zone’

One of the first observations made of the data is that speakers vary in how much they gesture during discourse, as well as the proportion of physical space they use to gesture in. Although a number of subjects gestured very little (mainly Australian males and Asian females, but also some Australian females), most subjects made a variety of gestures during talk, and utilized a variable quantity of the physical space between their bodies and half-way to their interlocutor (where the task instructions were also taped to the desk). Social dynamics also made an impact on the amount and size of subjects’ gestures. For example, Australian Females, in Female+Female dyads, were typically high users of gesture; while Australian Males, in Male+Male dyads, showed noticeably fewer gestures. In fact, Australian Male participants often appeared completely inert.4 The interesting result of mixed gender dyads (Australian Female+Australian Male) is that it was typically the Australian women who gestured less instead of Australian Males gesturing noticeably more. The only exception to this Australian male conduct was when the participant female was clearly foreign (such as an Asian woman).
In previous papers I have described the physical area in which most gestures occur as the ‘comfort zone’: the area in which a speaker produces most gestures and which is in easy reach of the posture s/he has taken during discourse (Luchjenbroers 2001, 2004). The general dimensions of the comfort zone in these data are roughly the shape of a cube that runs from shoulder to waist in height, from the elbow (at the waist or in these data, the table) to the hand in depth, and has body width.5 As discussed above, the actual size of a speaker’s comfort zone, and the proportion of gesture to speech, varies from speaker to speaker, and culture to culture. Therefore for some, the gesture space is a much smaller cube, sometimes involving maybe no more than the speaker’s hands, and in some cases, just movement of the thumbs from a clasped hands position. In general, speakers who are less animated in gesture use a smaller gestural cube, and those who are more animated use a larger cube that is more consistent with the dimensions mentioned above.6 In addition to the comfort zone, speakers also make use of the physical space that either borders or is clearly outside these general boundaries, often involving a full, physical stretch. I suggest that these general vs. extreme boundaries are consistent with ‘inside’ and ‘outside’ a speaker’s gestural ‘F-space’, and that these boundaries add an extra dimension to the meaning conveyed lexically by speakers during discourse. Hence, where speakers produce most gestures is where they are most comfortable (cf. comfort zone), which defines their F-space, and gestures that cost more physical energy are ‘outside’ that space.
Although it is tempting to refer to this gestural comfort zone as the ‘Focus’ space, this would be misleading as a speaker can refer to more than one mental space with gestures inside their F-space, each enjoying a certain degree of focus. Similarly, if the comfort zone were intimately linked to discourse focus, then one would expect the focus space to always be located inside F-space; however the data also provide examples where the mental space in focus is gesturally located outside the speaker’s F-space (e.g., talk about foreign practices). As will become evident in the following section, gestures within F-space are primarily relevant to ‘Me’ (i.e., the speaker) and gestures outside F-space to ‘Not Me’. In this sense, gestures can function like contrastive stress, in that pointing to a physical location in front of the speaker amplifies not only ‘Here’ (where ‘I’ am) but also ‘Not there’, or ‘This’ (what ‘I’ have) and ‘Not That’; while deictic gestures to physical locations outside F-space amplify the opposite. There is also some evidence of an association between what is topical in discourse and inside F-space, although it is often hard to separate these features from aspects of the speaker (i.e., it’s not so much what is topical as the speaker’s argument/view of that topic, which again is relevant to the speaker). In the following examples I will consider the relationship between F-space and relevance to the speaker’s location (i.e., here); the speaker (i.e., important to ‘me’); and the subject-matter being discussed.

Gesture types

Conversational gestures have been identified as the object of this research; however within these spontaneous gesticulations, a number of different types are also discernible: (i) Deictic gestures, which relate to ‘here’ vs. ‘there’ [also called Indexical (cf.
index finger) gestures];7 (ii) simple gestures, which iconically (and often metonymically) illustrate features of talk; and (iii) complex gestures, which add to the information conveyed in the lexical component of talk.

Deictic gestures

Indexicals are the most basic form of gesture and are presumably the first to be used by children learning language. This strategy involves an instruction to the hearer to direct their view (from the speaker’s finger) to a specific item that both speaker and hearer can see. Consider example (4) below.8

(4) Graham:
    (1)↓
    I’ve had a quick look at that. . . particular list
    (2)↓
    and I’ve resummarised it into fewer words
    1. L hand/index finger points to the Task sheet taped to the table (= real referent)
    2. L hand/index finger points to the writing pad in front of S (contains a written list of points, or his summary).

In this example the first gesture, to that, involves a long extension of the left arm to outside the speaker’s comfort zone and hence, his F-space, to the typed task that both participants can see. Similarly the second gesture, to the summary, also points to the true referent which is directly in front of the speaker and thus within his F-space. However, because both point to the fixed location of the true referent, the dimensions of F-space are not relevant to the interpretation of the deictic relations involved. The second gesture is however richer than the deictic relation alone because it goes beyond the information conveyed by the lexical component: it informs the addressee that the written text in front of him is a summary. The most common way in which gestures enrich the lexical component is through iconicity. For example, the indexical here can mean, this room, this building, this university, this city, or this country.
Each location can be serviced by the lexical and gestural indexicals here, and for each extension of the original here, the indexical bears an iconic (albeit metonymic) relationship to the full dimensions actually referred to. In each case the chosen gesture will point to the physical location in front of the speaker (inside their F-space), and references to a place not relevant to here would be paired with a gesture clearly outside F-space. Cases such as these show the most basic way in which indexicals can serve an iconic function. Later examples will show how indexical gestures may serve a more overt iconic function that enriches the lexical component of talk.

Simple (iconic) gestures

Earlier work has put forward the view that some gestures may be described as ‘simple’ in that they convey a straightforward semantic relationship between the essential message carried by the gesture and the lexical component it accompanies; whereas others are described as ‘complex’ in that they do substantially more than just clarify lexical meaning (Luchjenbroers 2001, 2004). Those described as complex complement the lexical component by providing meaning not articulated. A frequent, simple gesture example in these data is the take gesture (= one hand scoops an unseen substance or object and draws it to the body), which co-occurred with talk about taking, stealing, plagiarizing, and cheating throughout the data. This gesture is simple because it is consistent with the verbal component.9 Similarly, indexicals may be used to convey simple iconic relations that go beyond the basic deictic relationship. In such cases the relationship between gesture location and F-space can be additionally informative.
For example, in (5) the allocation of yours and mine relates to a hypothetical event: both speaker and hearer are surrogates for the roles needed in an event that was invented for discourse purposes only (cf. Liddle 2002).

(5) Jake:
    (1)↓
    it’s say handing in some work ..
    (2)↓ (3)↓
    saying that this is yours . . . this is mine
    (4)↓-------------------------------------------
    in actual fact it should really come from somebody else
    1. R hand clumped, palm down, fingers touching table in front of Speaker’s left chest (= inside F-space)
    2. R hand (same shape) extends toward Hearer in centre field (= border F-space)
    3. R hand (same shape) moves back and collides with centre of Speaker’s chest (= inside F-space)
    4. R hand flattens and moves away from S, palm down (dismissive) to Speaker’s R, past the desk boundary (= outside F-space)

In this example, the first gesture relates to the general topic, some work, which is focal and clearly located inside the speaker’s F-space. The location of this gesture inside F-space may also suggest that the speaker puts himself in the protagonist role. The second and third gestures illustrate more overtly the surrogate roles played by the hearer, your work, and the speaker, my work, in this hypothetical discourse event. These gestures are indexical in that they point to the different characters in this event, but are more meaningful because they simultaneously allocate roles to the discourse participants. Contrastive stress is also relevant here as the two locations occur at opposite boundaries of the speaker’s F-space. The yours gesture is reflected from the speaker (= not me), while the mine gesture is not just inside the speaker’s F-space, but is attached to the speaker (his hand is clutching his chest). Hence both gestures outline the boundaries of the speaker’s F-space; while the next gesture, to somebody else, is distinctly thrust outside that square.
It is reflected away from both surrogates and the full dimensions of the speaker’s F-space. In fact, the fourth gesture exceeds the table surface area, which emphasizes its complete removal from the scene. Similarly in (6) below, the references to this author and that author, as in example (5), are again in diametrically opposite locations to each other, although the discourse participants are not the surrogates for these fictional roles. The ‘author’ roles are removed from both speaker and hearer, and again it is no coincidence that these gestures involve physical thrusts as far away from the real participants as possible (i.e., outside F-space).

(6) Lenard:
    (1)↓ ----------------
    and they just write their ideas
    (2)↓ (3)↓
    even if they’re using them from this author and that author
    and they do think it’s quite ah bizarre
    (4)↓
    when they look at an essay that I’ve written
    1. both hands, flat, fingers meet at Speaker’s chest (= inside F-space)
    2. R hand + arm extends to Speaker’s far right (= outside F-space)
    3. R hand + arm extends across F-space to Speaker’s left (= outside F-space)
    4. R hand flicks back and hits Speaker’s right shoulder (= border F-space)

Notably (6) also shows that the relationship between here and F-space is more complex than initial discussions have suggested, because references to here or this do not necessarily correlate with inside F-space. In (6) the speaker also refers to a third person, they, while touching his own shoulder (inside F-space), when one might expect an outside F-space gesture location, like the someone else gesture in (5). Here the speaker is talking about practices in Germany, and they refers to German students; his use of a gesture that puts himself in centre stage suggests that he identifies with that protagonist role, despite the use of the third person pronoun.
In (7), the basic deictic process is further abstracted to illustrate a transference of fictional matter from one fictional location to another. The first gesture location, outside F-space, is dictated by the second clause reference to your own piece [work]. These gestures involve two components: (i) deictic references to an object in two different locations (even if fictional); and (ii) a pantomime consistent with the verb, take. The series of actions for take involves one hand clasping an unseen substance or object, picking it up, transporting it over an unseen obstacle and depositing it into the speaker’s zone (= ‘make mine’). Consequently, from whence it came must be ‘not mine’ (= outside F-space). It is no coincidence that this would be gesticulated as the migration of something from outside F-space to a point squarely inside F-space; and similarly it is no coincidence that reference to a chunk would be realized as the gesticulation of clutching and moving an object. However, in this example, like in (6), the lexical strategy is meaningfully different from the chosen gestures, in that a distancing (‘not me’) lexical pronoun is used (your), but a ‘make mine’ gesture complements it.

(7) Iris:
    (1)↓
    like taking a big . . . chunk out of a book
    (2)↓
    and putting it into your own piece and not referencing
    1. R hand (palm down; fingers touching desk) to the right of speaker’s F-space – clutching an imaginary mass (= Outside F-space)
    2. clutched mass is picked up & put down inside F-space (S centre front)

The examples given in (4–7) illustrate how the basic deictic process of directing the hearer’s attention to specific, visual entities has been extended to a role-play between visual participants in a hypothetical scenario, and then to a role-play of non-visual entities in a hypothetical scenario.
For this second extension of the deictic function, a speaker designates points in physical space to refer to referents in talk, as is also grammatically correct in sign languages. However, the location of those designated points is not arbitrary: those located closer to the speaker tend to be those with which s/he associates (or makes ‘mine’), and those with which the speaker does not identify or wish to be associated are typically located further away from the speaker – often as far as physically possible. Of particular interest therefore is when the lexical strategies used by a speaker (often diverting responsibility away from themselves) are paired with gestural strategies that place the speaker in the protagonist’s role. The example given in (8) similarly illustrates how speakers can distance themselves from the practices they describe by the location of the associated gesture; in this case, what ‘we’ don’t do is located outside F-space. Notably the next gesture, to them and what ‘they are required to do’, requires further movement away from F-space, to amplify the contrast between ‘our practices’ and ‘their practices’.

(8) Jake:
    (1)↓
    we’ve been given strict instruct strict instructions that we
    (2)↓
    we’re not allowed to correct it for them
    (3)↓
    they have to fix it up themselves
    1. R index finger on desk-top, centre field, outlines the top boundary of an imagined source document (= inside F-space)
    2. R hand (palm + fingers down, touching the desk) moves to the far R of desk-top (= outside F-space)
    3. both hands carry over to beyond S’s R: L hand touches S’s R side and R hand extends in the same direction (= outside F-space)

The above examples illustrate how the semantics of the chosen gestures resonate with the lexical component they accompany.10 However even in these simple cases, more complex features of added meaning may come into play, such as speaker attitude toward the content of discussion, inferable by the location of the associated mental space. In such cases, gesture complements speaker meaning in ways not captured by the lexical component alone. In the following section I will expand on some of the types of meaning ‘complementation’ that have occurred in these data.

Complex gestures

The data have revealed a range of gestures that expand on the information provided by the lexical component. This type of ‘complementation’ may include instructions regarding mental spaces and discourse structure. The gesture examples given above have component features that go beyond, and in some cases contradict, the semantics of what is said lexically (such as using a gesture to indicate ‘mine’ while using the pronoun ‘your’). However, the type of complexity this category is meant to capture is that these gestures are both iconic and go beyond the meaning of the uttered sentence to which the gesture is paired. For example, if a speaker were to utter, Brilliant observation! but tap their forehead at the time of speaking (= you’re nuts!) then the total meaning of the uttered sentence would not only be enhanced, it would be very different. Another notable example of this was observed in a television program where an investigator, with obvious interest in a particular woman about to leave the country, suggests a hypothetical scenario using an unknown character, although his gesture makes clear that he chooses to fill that role himself.
(9) Investigator: What if you were to get involved. . . . ?
    ↓
    with some American. . . ?
    1. both hands, fingers splayed dramatically grasp the S’s chest (inside F-space)

Similarly in (10), the gesture associated with undergraduate conveys a depth of meaning not necessarily conveyed lexically. Even though Iris’s interpretation of Fran’s reference to some little snotty kid is undergraduate, the fact that her gesturing hand extends beyond the desk-top, away from her F-space and both discourse participants, conveys ‘not us’; and the fact that her hand is lower than the desk-top itself (pointing downwards) conveys her contempt for that category of person (i.e., a small person = lesser than ‘us’).

(10) Fran: if I had spent you know at least two years of my life devoted to writing this book and some little snotty kid
     (1)↓ (2)↓
     Iris: undergraduate . . . comes in an does . . .
     1. R arm fully extended to R, over desk edge & pointing down (diagonal to exchange), slightly lower than the desk = lower status (outside F-space)
     2. finger flicks back toward Speaker & across F-space (inside F-space)

Examples such as this give evidence that there are more dimensions at play than just the horizontal plane, making other metaphors also of interest to a full interpretation of gesture (cf. Lakoff & Johnson 1980: up is good, down is bad). In this case, the location of the speaker’s gestures specifies characteristics of the mental spaces in which the speaker attributes and processes the information about undergraduates. A similar case is given in (11), where again the location of the gesture, so far away from the speaker’s F-space that he must turn in his seat to make it, maximally contrasts the speaker with the group he is talking about.

     ↓-----------------------------------
(11) Jake: I feel a great deal of um empathy for them
     1. both hands move to the S’s extreme R: L hand touches S’s R side and R hand moves from centre chest to far R (= outside F-space)

Example (12) is also particularly interesting because it captures dimensions that are not conveyed lexically and have not been previously noted in the literature. Each gesture point refers to an illegal act (such as plagiarism) and each gesture point illustrates the fictional location of such acts in a hypothetical student paper. The fact that these gestures are produced on the diagonal captures the frequency of those acts in this hypothetical paper – i.e., plagiarisms (etc.) would be unlikely to be confined to a single location, but would presumably be distributed throughout a piece of work; and because these gestures also move from closer to farther from the speaker, they also capture that these illegal acts occur throughout the referent piece of work. Thus the diagonal iconically captures both the shape of the printed page (from top to bottom) and the width of the piece of work (from beginning to end).

     ↓ ↓ ↓
(12) Hariette: so we ’say if you do.. this.. and this and this then.. you-’re ah.. breaking the rules (laugh)
     1–3. R hand, pinched (all 4 fingers on top of thumb), pointing to equidistant points in space, forming an oblique row, from just above F-space (height of L eye), into F-space (below R shoulder); these points also perceptibly move from closer to the speaker to less close to the speaker.

In sum, gestural complexity involves additions to the lexical component of discourse that may have a direct bearing on the interpretation of speaker meaning.
Unlike the lexical component, which can generally be unambiguously assigned one or other mental space role (i.e., space builder or proposition), gestures often contain components with multiple roles: some relating to content and others to the speaker’s F-space and its relation to mental spaces functions.

Mental spaces manoeuvers

The examples above have illustrated how basic strategies are used iconically to amplify (= simple gestures) if not complement (= complex gestures) information conveyed in the lexical component. Similarly, and quite distinct from the mental spaces these gestures involve, the above examples also illustrated the relevance of F-space in deriving additional aspects of speaker-meaning. Now I will focus on how simple and complex gestures, together with F-space, are employed to service navigations around mental spaces in talk. For example, in example (7) above, like taking a big chunk out of a book and putting it into your own piece. . . , the first gesture locates the source of the theft, the book (outside F-space), and the second gesture locates the target of the stolen material, your own piece (inside F-space). These locations are relevant to mental spaces navigation because, lexically, a book is neither a positive nor a negative reference: the phrase says nothing of ownership or location in a physical sense. It is only the location of the gesture relative to the speaker, outside F-space, that serves to disambiguate what book (or what nature of book) is being referred to – i.e., the one being plagiarised (= ‘not mine’). Similarly, the second reference lexically attributes the piece to another (possibly abstract) person, but the ‘make mine’ path of the second gesture in relation to the first (landing inside F-space) associates the deed with the speaker. Gesturally, the speaker has played out a role to which lexically she is only a hypothetical spectator.
However, the contrast in the two physical locations of these gestures also amplifies the two mental spaces required here: one for the source (and its associated attributes) and one for the target. Examples such as (7) reveal the importance of F-space in helping to construct, navigate and disambiguate mental spaces, because sentence meaning depends on the appropriate space(s) being accessed in which to properly associate incoming information with its content (referent).

Similarly, example (13) reveals how complex iconic gestures can help clarify the appropriate mental space(s) for comprehension. Reference to inside is paired with a flipping pages gesture (above F-space), which conveys that the speaker is talking about a book; the gesture is indicative of the size of the referent, and helps to clarify the mental space that is needed here (i.e., a thesis, and not just a page or a short paper). In the next clause, she includes the proposition they cited, which is complemented by a writing gesture (in the air, also above F-space). The height and directionality of the writing gesture convey that it is not a short citation, but a full-page-length declaration. The full import of these two gestures is both to clarify the mental space needed and to give qualitative detail about the proposition to be processed within it.

     ↓(1) ↓(2)
(13) Fran: um.. and then inside they they’ve they ‘cited..
     1. Right hand in the air, flipping pages, temple height (outside F-space)
     2. Right hand, writing in the air, from centre forehead to shoulder height (outside F-space).

Example (14) also shows mental spaces management in that gestures to different physical spaces are attributed to (i) the plagiarized material and (ii) the source from which the plagiarized material was taken (both referents are focal and within F-space).
In cases such as this, once a speaker has attributed a referent to a particular location in (physical) gesture space, s/he will continue to point to the same locations upon further references to those referents. This gestural strategy also helps disambiguate when multiple referents are simultaneously ‘on stage’. In this way, gestures serve as a reference tracking device that is available to all participants in discourse: a strategy also used in sign language.

     ↓(1) ↓(2)
(14) Gwen: like, if you know they’ve sort of taken this out of this book. . .
     ↓(3) ↓(4)
     because they’ve referenced this and you’ve read this book .. what do you do?
     1. R hand, across L hand but centre field (inside F-space = plagiarised material)
     2. R hand, across L hand & further to Left (inside F-space = source text)
     3. R hand points again to ‘source text’ space
     4. R hand points again to ‘source text’ space

In sum, these examples provide clear illustration of how those speakers who choose to gesture may also reveal strong clues about their attitudes toward the subject-matter being discussed by where in the gestural space available to them (in terms of F-space) they choose to designate a particular referent. The location of the referent thus indicates the mental spaces in which the speaker conceptually locates those referents. This level of speaker meaning involves mappings between a physical location in the speaker’s gestural space, and specific mental spaces activated in speech. Considering again the assumed projected conceptual models of discourse information that each speaker produces, and for whose comprehension by hearers the speaker is responsible, it seems entirely plausible (if not in some cases crucial) that speakers make use of any ploys available to them to keep track of their own argument, together with those put forward by other speakers.
Speakers clearly make use of these mappings between conceptual space and physical space; however, it remains to be seen if hearers also make full use of this information. Given the potentially huge cognitive load each participant deals with in discourse, it is plausible that many aspects of speaker-meaning discussed above may be missed by hearers. This does not diminish the information they may be able to derive (if they know what to look for and are paying adequate attention); nor does it diminish the value of these strategies that conceivably assist speakers in making their contributions as comprehensible as possible for their audience.

Conclusions

In this paper most energy was devoted to illustrating how a speaker’s choice of gesture, as well as where to locate those gestures, not only serves to amplify the lexicalized information presented to hearers, but also serves to enrich that information by adding dimensions of meaning that might not otherwise be conveyed. This extra dimension in some cases illustrates the mental spaces in which propositions are to be processed, such as the flipping pages gesture that denotes a book (in which a declaration was made), while in other cases it is revealed by the strategic use of F-space that conveys the relevance of the subject-matter or the referent in talk to the speaker (or the arguments that they put forward). The examples included in this paper reveal that those speakers who make full use of conversational gesture also make productive use of Inside versus Outside F-space and thus amplify the relevance of these referents to (primarily) themselves, in that a speaker’s F-space has as its referential centre the ego. The dynamics of F-space have been shown to be an important source for discourse participants to navigate the many mental spaces that may be required during discourse.
The conceptual integration of these sources of discourse information is important if a hearer is to fully comprehend all the information that speakers place before them in talk.

Notes

* The research drawn upon in this paper was supported by a postdoctoral fellowship and a New Staff grant to the author from the University of Queensland (Australia). Many thanks to Roland Sussex and Shannon Dougherty. I’d also like to thank Adam Glanz for his very helpful comments on an earlier draft of this paper. Thanks also to Pat Carroll and Simon Parker, with whom many of these issues have been discussed and developed, and the helpful comments from an anonymous reviewer. All oversights are of course my own. All correspondence concerning this article should be sent to: Dr J. Luchjenbroers, c/- Dept of Linguistics, University of Wales, Bangor, GWYNEDD, LL57 2DG, Wales, U.K. Fax: 44+ 1248–38 2928; Email: <[email protected]>.

Cf. the negation test (Luchjenbroers 1993). If this sentence were an interrogative form (instead of declarative), a simple ‘no’ answer most specifically rejects the proposition; a rejection of the spatial definition(s) requires more linguistic effort.

The complete mental spaces dynamics for this example would be: Focus Mental Space = In foundation year (‘there’) + [proposition = a boy (from Hong Kong) was really bad (at plagiarism)]. The modifier ‘from Hong Kong’ points to an external, not focal Mental Space.

McNeill (2000) separates these into (a) emblems and (b) pantomimes, which play out a scene in more detail.
During a recent presentation (2002) I showed a silent clip (that lasted several minutes) from an Australian male-male dyad to illustrate this observation, but before I could admit to the audience that gestural analysis on such data is somewhat challenging, members of the audience accused me of showing a ‘still’, and when they did finally notice movement it was welcomed with applause.

I have noticed in other less formal conversations that speakers may reveal a very different comfort zone. For example, a speaker whose arm is flung over a chair will display a very different F-space from those discussed in this paper: seemingly disjointed spaces instead of a single space. Notably, the distinction between inside and outside F-space is still defined by the amount of overt effort a gesture costs the speaker.

In some cases, where a speaker may be described as ‘voluminous’, the cube may also rise from the table, as though speaking to a person positioned higher than (or further away from) the speaker.

Deixis (sometimes called ‘shifters’ because their specific reference shifts from speaker to speaker) refers to lexical and gestural items that depend on context for meaning – e.g., sitting here at my desk, my here is simultaneously everyone else’s there. Hence the words here and there have no objective meaning apart from indicating the speaker’s orientation toward phenomena around him/her.

Arrows above the utterance example indicate the onset of a gesture (not the target), although in some cases a line from that arrow is an attempt to indicate how long it took the speaker to get from gesture onset to target.

‘Simple’ does not refer to the length or detail of a gesture, only to whether that gesture correlates with the meaning conveyed lexically.

The metalinguistic term ‘semantics’ is used in reference to all forms in which a ‘word’ may present in discourse.
Thus, whether words, such as TAKE or THIS or HERE, are verbalized, or conveyed gesturally or in print is irrelevant.

References

Carroll, Pat, June Luchjenbroers, & Simon Parker (2003). Sounds, Signs and Rapport: On the methodological importance of including audio-visual data in an analysis of discourse. In Grant Malcom (Ed.), Multidisciplinary Studies of Visual Representations and Interpretations. Elsevier Science.
Chafe, Wallace (1994). Discourse Consciousness and Time: The flow and displacement of conscious experience in speaking and writing. Chicago & London: Univ. Chicago Press.
Clark, Herbert H. (1993). Arenas of Language Use. Chicago: Chicago University Press.
Clark, Herbert H. (1996). Using Language. Cambridge U. Press.
Clark, Herbert H. (1997). Dogmas of Understanding. Discourse Processes, 23, 567–598.
Fauconnier, Gilles (1985). Mental Spaces: Aspects of Meaning Construction in Natural Language. Cambridge, MA: MIT Press. [Rev. Ed. New York: Cambridge U. Press, 1994].
Fauconnier, Gilles & Eve Sweetser (Eds.). (1996). Spaces, Worlds, and Grammar. Chicago: University of Chicago Press.
Grice, Paul (1978). Further Notes on Logic and Conversation. In Peter Cole (Ed.), Syntax and Semantics, 9: Pragmatics (pp. 113–127). New York: Academic Press.
Grice, Paul (1975). Logic and Conversation. In P. Cole & J. L. Morgan (Eds.), Syntax and Semantics, 3: Speech Acts. New York: Academic Press.
Kendon, Adam (1981). Nonverbal communication, Interaction and Gesture. The Hague: Mouton.
Krauss, R. M. (1998). Why do we gesture when we speak? Current directions in Psychological Science, 7, 54–59.
Lakoff, George & Mark Johnson (1980). Metaphors we live by. Chicago: Chicago University Press.
Lambrecht, Knud (1994). Information structure and sentence form. Cambridge Univ. Press.
Liddle, Scott (2002). Blended spaces and deixis in sign language discourse. In D. McNeill (Ed.), Language and Gesture (pp. 331–357). Cambridge Univ. Press.
Luchjenbroers, June (1993). Pragmatic inference in language processing. Unpublished doctoral dissertation, La Trobe University, Melbourne Australia.
Luchjenbroers, June (2000). Cognitive strategies for mutual ground construction. Paper presented at the Language & Cognition Conference, Leiden University, Netherlands.
Luchjenbroers, June (2001). Prosodic and Gestural cues for Navigations around Mental Space. BLS 27: Language and Gesture. Univ. of California Press (to appear).
Luchjenbroers, June (2002). Flick ’o the wrist or deliberate action: how gestural information makes face-to-face conversation information-rich. Fourth Annual Meeting of Child Language Group. University of Wales, Gregynog, UK.
Luchjenbroers, June (2004). Verbal & Visual Cues For Navigating Mental Space. In Grant Malcom (Ed.), Multidisciplinary Studies of Visual Representations and Interpretations. Elsevier Science.
Luchjenbroers, June ms. Cognitive Discourse: Theory meets Practice. (in progress).
McNeill, David (1992). Hand and Mind: What gestures reveal about thought. U. Chicago Press.
McNeill, David (Ed.). (2000). Language and Gesture. Cambridge Univ. Press.
Tomlin, Russell (Ed.). (1987). Coherence and grounding in discourse. Amsterdam: Benjamins.
Tomlin, Russell (Ed.). (2001). Mapping conceptual representations into linguistic representations: the role of attention in grammar. In J. Nuyts & E. Pederson (Eds.), Language and Conceptualization (pp. 162–189). Cambridge: C.U.P.
Tomlin, Russell, L. Forest, M.-M. Pu, & M. H. Kim (1997). Discourse Semantics. In Teun van Dijk (Ed.), Discourse: A multidisciplinary introduction. London: Sage.
Computational models and conceptual mappings

In search of meaning
The acquisition of semantic structures and morphological systems*

Ping Li
University of Richmond

The acquisition of meaning has been an intensely debated issue in the field of child language in the last thirty years. Recently, computational approaches that rely on connectionist networks and statistical learning models provide new insights into this issue. These models advocate that semantic representations are best viewed as emerging out of a continuously developing and adapting dynamical system. In this chapter, I show that connectionist networks can capture the emergence and representation of semantic structures. Moreover, such representations can serve to trigger productive morphological uses such as overgeneralizations in language acquisition. Our modeling results suggest that structured semantic representations emerge from statistical computations of the various form-form and form-meaning constraints, and the evolution and development of semantic representations as acquired by children are due to simple probabilistic procedures as embodied in connectionist networks or similar statistical learning mechanisms.

Keywords: connectionist networks, computational approaches, language acquisition, corpus analysis, statistical learning

Introduction

The representation of language has been traditionally considered as a construction out of basic structural building blocks in the form of symbols and rules. This approach in general looks at linguistic representations statically. A contrasting approach, in the spirit of recent developments in connectionist networks and statistical learning, attempts to capture linguistic representations dynamically.
It considers linguistic representations as emergent properties that evolve out of a continuously developing and adapting system. A shortcut to the understanding of this approach might come from the following example. Structured, rule-like representations in a connectionist network can emerge in much the same way as a hexagonal structure emerges from the honeycomb: every honeybee packs a given amount of honey to the honeycomb from multiple directions, but no honeybee has a grand plan for the hexagonal structure (Bates 1984). In this paper, I provide such an account of the emergence of semantic representations, in connection with morphological learning in language acquisition.

Lexical semantics and its acquisition by children has been a hotly debated issue in the last thirty years. Until recently, most researchers in this domain have thought that there is a fixed set of conceptual and semantic properties associated with each lexical item, and that the child’s task is to acquire the necessary conceptual frameworks and the semantic properties. Recent computational models of language processing suggest that lexical semantics may be emergent properties, in particular, that lexical categories can be acquired by the computation of statistical regularities inherent in the input data. These models are in many ways consistent with the empirical approach of distributional analysis (dating back to structural linguistics; Saussure 1916) that emphasizes the child’s ability to analyze the linguistic input (e.g., Maratsos & Chalkley 1980). They can be classified roughly into two categories. First, proposals from statistical analyses of large-scale text corpora indicate that lexical-semantic representations may emerge from multiple contextual and lexical co-occurrence constraints in a high-dimensional space.
Second, connectionist (or neural network) models indicate that lexical-semantic structures can emerge from statistical learning of form-form and form-meaning mappings. In what follows, I will briefly consider both types of models, but the focus of this chapter will be on the second.1

High-dimensional semantic space and lexical representation

There have been a number of proposals that high-dimensional semantic space can provide accurate and faithful representations of lexical semantics through multiple contextual or lexical co-occurrence constraints in large text corpora. Two models have emerged most prominently in the last few years: the HAL model (Hyperspace Analogue to Language), advocated by Burgess and Lund (1997), and Lund and Burgess (1996); and the LSA model (Latent Semantic Analysis), developed by Landauer and Dumais (1997), and Landauer, Foltz, and Laham (1998). These two models are highly compatible with each other, although the specific methods used are different. In the following, I will focus on the HAL model as our research has linked this model specifically to children’s acquisition of lexical semantics.

According to HAL, the meaning and function of a given word are determined by lexical co-occurrence constraints in a high-dimensional input space, that is, by what items may precede a word and what may follow it, and how often they do so. HAL focuses on global rather than local lexical co-occurrences: A word is anchored with reference not only to other words immediately preceding or following it, but also to words that are further away from it in a variable co-occurrence window, with each slot (occurrence of a word) in the window acting as a constraint dimension to define the meaning and function of the target word.

Table 1. Global Co-occurrence Matrix for the Sentence, The horse raced past the barn. The values in the matrix rows represent co-occurrence values for words that preceded the word (row label). Columns represent co-occurrence values for words following the word (column label). Cells containing zeroes were left empty in this table. See Burgess and Lund (1997). Reproduced with authors’ permission.

         barn   horse   past   raced   the
barn             2       4      3       6
horse                                   5
past             4              5       3
raced            5                      4
the              3       5      4       2

The example in Table 1 illustrates the notion of global lexical co-occurrence more clearly. It shows a matrix using a 5-word moving window for just one sentence (the horse raced past the barn). Within this five-word window, co-occurrence values are inversely proportional to the number of words separating a specific pair of words. A word pair separated by a four-word gap, for instance, would gain a co-occurrence strength of 1, while the same pair appearing adjacently would receive an increment of 5. The product of this procedure is an N-by-N matrix, where N is the number of words in the vocabulary being considered. This table illustrates how the matrix acquires information about meaning. Consider, for example, the word barn. The word barn is the last word of the sentence and is preceded by the word the twice. The row for barn encodes preceding information that co-occurs with barn. The occurrence of the word the just prior to the word barn gets a co-occurrence weight of 5 since there are no intervening items. The first occurrence of the in the sentence gets a co-occurrence weight of 1 since there are four intervening words. Adding the 5 and the 1 results in a value of 6 recorded in that cell. A word meaning vector is formed by concatenating the row and column values for the lexical item. Of course, not all vector values or elements contribute equally to the meaning representation.
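The windowed counting procedure described for Table 1 can be sketched in a few lines of Python. This is an illustrative sketch only, not Burgess and Lund's implementation: the function names `hal_matrix` and `meaning_vector` are hypothetical, and the weighting (window size minus distance plus one) is an assumption chosen to agree with the worked values in the text, namely 5 for adjacent words and 1 for a pair with four intervening words.

```python
from collections import defaultdict


def hal_matrix(tokens, window=5):
    """Build a row-word -> preceding-word -> weight table.

    Assumption: a preceding word at distance d (1 = adjacent) adds
    window - d + 1 to the cell, so adjacent words add 5 and words
    with four intervening items add 1 when window is 5.
    """
    matrix = defaultdict(lambda: defaultdict(int))
    for i, target in enumerate(tokens):
        for distance in range(1, window + 1):
            j = i - distance
            if j < 0:
                break  # ran off the start of the sentence
            matrix[target][tokens[j]] += window - distance + 1
    return matrix


def meaning_vector(matrix, word, vocab):
    """Concatenate the row (preceding words) and column (following words)."""
    row = [matrix[word][other] for other in vocab]
    column = [matrix[other][word] for other in vocab]
    return row + column


tokens = "the horse raced past the barn".split()
vocab = sorted(set(tokens))  # ['barn', 'horse', 'past', 'raced', 'the']
matrix = hal_matrix(tokens)

# Row for "barn": the nearer "the" adds 5 and the sentence-initial "the" adds 1.
print(matrix["barn"]["the"])  # 6
print(meaning_vector(matrix, "barn", vocab))
```

For a corpus rather than a single sentence, the same loop would simply run over the whole token stream, and the resulting vectors could then be compared with any standard distance measure.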
The most appropriate elements are those that contribute most to the contextual meaning and this is determined by identifying which vector elements have the greatest contextual diversity (see Lund & Burgess 1996, for details). It is this more complex pattern of co-occurrence, which is referred to as global lexical co-occurrence, that contributes to the richness of meaning. In short, global lexical co-occurrence is a measure of a word’s total experience in the context of other words. The meanings of a word, in this perspective, emerge from multiple constraints in a high-dimensional space of language use.

Although models like HAL were not originally designed for language acquisition, they have significant implications for the acquisition of word meanings. Redington, Chater, and Finch (1998) used a method similar to HAL to capture lexical syntactic categories in child language. In another study, Li, Burgess, and Lund (2000) applied the HAL method to the analysis of parental speech in the CHILDES database (Child Language Data Exchange System; see MacWhinney 2000, for a description of the database). We analyzed 3.8 million words from the speech of parents and caregivers addressed to children, and found that a speech corpus of reasonable size (e.g., 3.8 million words) with a reasonable number of co-occurrence constraints (e.g., 50 co-occurrence elements) can yield accurate and faithful semantic representations of English words.2 Our results suggest that young children can learn word meanings by exploiting the considerable amount of contextual information in the input to compute multiple higher-order lexical constraints. This approach relies on a few simple assumptions about what the learner does.
One important assumption is that the learner has the ability to track continuous speech with some limitation on working memory, which can be modeled with a weighted moving window of a variable size; another assumption is that the learner is sensitive to lexical co-occurrences during language processing. Such statistical abilities seem to be readily available to the child at a very early age, as studies of statistical learning in infants have revealed (Saffran, Aslin, & Newport 1996). In short, global lexical co-occurrences can provide useful and powerful cues to the young child in the acquisition of word meanings.

Emergent semantic structures in connectionist networks

A second set of models, consistent with and complementary to the computational approach discussed above, are the connectionist models of language processing and language learning. Recent years have seen rapidly growing interest in the application of connectionist models to the study of language acquisition (see Elman, Bates, Johnson, Karmiloff-Smith, Parisi, & Plunkett 1996; Klahr & MacWhinney 1998 for overviews). This interest dates back to Rumelhart and McClelland’s (1986) connectionist model of the learning of the English past tense and the debates thereafter (MacWhinney & Leinbach 1991; Pinker 1991, 1999; Pinker & Prince 1988; Plunkett & Marchman 1991, 1993; Seidenberg 1997).
Connectionist models rely on the use of a large number of connected micro-processing units (called ‘nodes’ or ‘neurons’) that activate in parallel and adjust weights of connections between one another through learning and processing.3 Two key assumptions of these networks have to do with (a) representation – knowledge is represented as patterns of activation distributed across the processing units, and (b) learning – new knowledge is formed through the adaptation of the strengths or weights of connections that hold among the processing units. These assumptions differ from traditional cognitive assumptions about knowledge representation that involve discrete symbolic representations of concepts, categories, and grammatical rules. With regard to language acquisition, advocates of connectionism argue that linguistic representations (of the lexicon, morphology, and grammar) are “emergent properties” due to the interaction of the processing units with the linguistic environment in the form-meaning mapping process. This view contrasts with the traditional psycholinguistic approaches that emphasize the mental representation of rules and the innateness of grammatical and semantic categories.

Connectionist principles of distributed representation, weight adjustment, and nonlinear learning provide a mechanistic account of how syntactic and semantic structures can emerge out of learning. For example, Elman (1990, 1995) showed that a simple recurrent network is able to derive internal representations of semantic as well as syntactic categories in a task of predicting the next word in the sentence. Lexical categories such as nouns and verbs, animate and inanimate, and humans and animals emerge clearly in the network’s hidden-unit representations after the network has been trained to map the current word in the input stream to the next word.
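To make the architecture behind the next-word prediction task more concrete, the following is a minimal sketch of the forward pass of a simple recurrent network, written in plain Python. It is not Elman's implementation: the class name `SimpleRecurrentNetwork`, the logistic hidden units, the softmax output, the hidden-layer size and the random initial weights are all assumptions made for illustration, and training (the weight adjustment that would let categories emerge) is deliberately left out. The sketch only shows how the hidden-unit pattern for a word depends on the preceding context.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(values):
    """Turn output activations into a probability distribution."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


class SimpleRecurrentNetwork:
    """Untrained sketch: one-hot input, hidden layer, copied-back context."""

    def __init__(self, vocab_size, hidden_size, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.vocab_size = vocab_size
        self.w_in = init(hidden_size, vocab_size)        # input -> hidden
        self.w_context = init(hidden_size, hidden_size)  # context -> hidden
        self.w_out = init(vocab_size, hidden_size)       # hidden -> output
        self.context = [0.0] * hidden_size               # previous hidden state

    def step(self, word_index):
        """Process one word; return (hidden pattern, next-word probabilities)."""
        one_hot = [1.0 if i == word_index else 0.0
                   for i in range(self.vocab_size)]
        net_input = [a + b for a, b in zip(matvec(self.w_in, one_hot),
                                           matvec(self.w_context, self.context))]
        hidden = [1.0 / (1.0 + math.exp(-x)) for x in net_input]
        self.context = hidden  # copy hidden state back for the next word
        return hidden, softmax(matvec(self.w_out, hidden))


tokens = "the horse raced past the barn".split()
vocab = sorted(set(tokens))
index = {word: i for i, word in enumerate(vocab)}

net = SimpleRecurrentNetwork(len(vocab), hidden_size=4)
hidden_states, predictions = [], []
for word in tokens:
    hidden, probs = net.step(index[word])
    hidden_states.append(hidden)
    predictions.append(probs)

# The two occurrences of "the" receive different hidden patterns
# because the copied-back context differs at each position.
print(hidden_states[0] != hidden_states[4])
```

In a trained network of this kind, the hidden patterns collected for each word would be the material on which a clustering analysis of emergent categories is performed; here the weights are random, so the patterns carry no such structure.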
What the network does is similar to the process of detecting lexical co-occurrence constraints in the input (as does the HAL model). Note that both Elman’s network and the HAL method can be likened to the “distributional analysis” technique used by structural linguistics (Bensch 1991), although structural linguistics did not have today’s powerful statistical machinery and computational tools. Li (1993) and Li and MacWhinney (1996) discussed more explicitly how a connectionist network can develop internal representations of semantic structures. Using the acquisition of the English reversive prefix un- as an example, they examined the role of cryptotypes in determining overgeneralization patterns, competition principles, and plasticity of learning. In three simulations, they showed that structured semantic representations can emerge from connectionist learning: the network formed internal representations of semantic categories that correspond to Whorf’s cryptotypes, on the basis of learning limited semantic features of verbs and morphological classes. More important, the network produced overgeneralization errors similar to those reported by Bowerman (1982) and Clark, Carpenter, and Deutsch (1995), and to those observed in the CHILDES database, indicating that emergent semantic structures underlie patterns of productivity in child language.

In this paper, I take a more in-depth look at the issue of the acquisition of semantic structure along with the acquisition of morphological systems. I will focus on the second set of models discussed above, the connectionist approach to language acquisition, summarizing results from our studies. Our results indicate how semantic structures can emerge from the learning of probabilistic associations that hold between lexical items and morphological markers.
Moreover, understanding gained from connectionist semantic acquisition directly helps us to identify psycholinguistic and computational mechanisms of generalization and overgeneralization in language acquisition.

2. Cryptotype as an emergent category and as a trigger for overgeneralization

Whorf’s cryptotype

In one of the classic papers of early cognitive linguistics, Whorf (1956) presented the following puzzle. In English, the reversive prefix un- can be used productively with many verbs to indicate the reversal of an action, for example, as in uncoil, uncover, undress, unfasten, unfold, unlock, untie, or untangle (the meaning of reversal can also be expressed by other prefixes such as dis- or de-). However, many seemingly parallel forms are not allowed, such as *unbury, *unfill, *ungrip, *unhang, *unpress, *unspill, *unsqueeze, or *untighten. Why is un- prefixation allowed with some verbs but not others? None of the standard categories of Latin grammar can be used as a basis for a rule to tell us when we can use un- and when we cannot. Whorf’s puzzle was deeper than this simple discrepancy. He reminded us that un- is a productive device in English morphology, and that despite the difficulties that linguists have in characterizing its use, native speakers do have an intuitive feel for which verbs can be prefixed with un- and which cannot. He presented the following thought experiment: if a new verb, flimmick, is coined to mean “to tie a tin can to something”, then native speakers are willing to accept the sentence “He unflimmicked the dog” as expressing the reversal of the “flimmicking” action; if flimmick means “to take apart”, then they will not accept “He unflimmicked the puzzle” as describing the act of putting a puzzle back together. The constrained productivity of un- led Whorf to propose that there was some underlying or covert semantic category, a cryptotype, that governs the productive use of un-.
According to Whorf, cryptotypes only make their presence known by the restrictions they place on the possible combinations of overt forms. When the overt prefix un- is combined with the overt verb tie, there is a covert cryptotype that licenses the combination untie. This same cryptotype also blocks a combination such as *unmove. To Whorf, the deep puzzle was that while the use of the prefix un- is productive, the cryptotype that governs its productivity is unclear: “we have no single word in the language which can give us a proper clue to its meaning or into which we can compress this meaning; hence the meaning is subtle, intangible, as is typical of cryptotypic meanings.”

Although the cryptotype seemed puzzling, Whorf did propose that there was “a covering, enclosing, and surface-attaching meaning” (Whorf 1956: 71) that could be the basis of the cryptotype for un-. Whorf was correct in noting that verbs that take un- usually have one or more of the covering, enclosing, or surface-attaching meanings. But it is not clear whether we should view this cryptotype as a single unit, three separate meanings, or a cluster of related meanings. Nor is it clear whether these notions of attachment and covering fully exhaust the subcomponents of the cryptotype. Subsequent analyses have suggested certain additional components not initially considered by Whorf. For example, Marchand (1969) and Clark et al. (1995) argue that verbs that license un- all involve a change of state, usually expressing a transitive action. This transitive action typically reaches a terminal point in time (encoded by a telic verb; Comrie 1976), or some end state or result (an accomplishment verb; Vendler 1967). When the meaning of a verb does not involve a change of state or does not indicate telicity or accomplishment, the verb cannot take un-; hence the ill-formedness of verbs like *unswim, *unplay, and *unsnore.
Cryptotype and morphological productivity in child language

Whorf’s discussion shows clearly how cryptotype is important to the use of un- in the adult language. Bowerman was the first to point out that the notion of cryptotype might also play an important role in children’s acquisition of un-. According to Bowerman (1982, 1983, 1988), children’s acquisition of un- tends to follow a U-shaped pattern, a pattern that children display in other areas of morphological acquisition as well, such as the acquisition of the English past tense (Brown 1973; Kuczaj 1977). Children initially produce un- verbs in appropriate contexts, treating un- and its base verb as an unanalyzed whole. This initial stage of rote control is analogous to the child’s saying went without realizing that it is the past-tense form of go. Productivity of un- comes at the next stage, when children realize that un- is independent of the verb and indicates the reversal of an action. This next stage in the acquisition of un- begins at around age 3. At this stage, children start to produce overgeneralizations in spontaneous speech such as *unarrange, *unbreak, *unblow, *unbury, *unget, *unhang, *unhate, *unopen, *unpress, *unspill, *unsqueeze, or *untake (Bowerman 1982). These overgeneralizations have also been observed in Clark et al. (1995) in both experimental and naturalistic data with children from ages 3 to 5, for example, *unbend, *unbury, *uncrush, *ungrow, *unstick, and *unsqueeze. Similar examples can also be found in the CHILDES database, such as *unblow, *unbuild, *uncatch, *uncuff, *unhand, *unlight, *unpull, *unstick, and *unzipper (see Li & MacWhinney 1996, for a more complete list of examples of overgeneralization errors). During this period, children also make certain ‘overmarking’ errors. For example, the child might say *unopen when meaning only open, or unloosen to mean loosen.
In such cases, the base forms open and loosen have a reversive meaning that triggers the attachment of the prefix, even when the action of the base meaning is not actually being reversed. These errors are analogous to redundant past-tense marking as in *camed and redundant plural marking as in *feets (Brown 1973). As children grow older, overgeneralization or overmarking errors gradually disappear. A traditional explanation of the U-shaped pattern in children’s morphological acquisition goes like this: initially they rely on rote learning, then they develop a general rule and apply it productively (and overgeneralize it), and finally they recover from productive errors (this is much like what has been argued for the acquisition of the English past tense). For productivity to take place at the second stage, Bowerman correctly pointed out that cryptotype plays an important role. But how could the child extract the cryptotype and use it as a basis for morphological generalization or recovery, when the cryptotype is intangible even to linguists like Whorf (see Whorf’s comments on the subtle and intangible nature of the cryptotype as discussed earlier)?

A connectionist account of cryptotype and its acquisition

A connectionist perspective provides us with a natural way of capturing, in a formal mechanism, Whorf’s insights about the cryptotype as well as its acquisition. In our view, there can be several ‘mini-cryptotypes’ that work together as interactive ‘gangs’ (McClelland & Rumelhart 1981). For example, “enclosing” verbs, such as coil, curl, fold, reel, roll, screw, twist, and wind, all seem to share a meaning of circular movement. Similarly, “attaching” verbs, such as clasp, fasten, hook, link, plug, and tie, all involve hand movement. Other verbs such as bind, buckle, fasten, latch, leash, lock, strap, tie, and zip form a mini-cryptotype that shares a “binding” or “locking” meaning.
Still another cluster of verbs such as cover, dress, mask, pack, veil, and wrap forms the “covering” mini-cryptotype. These mini-cryptotypes or mini-gangs interact collaboratively to support the formation of the larger cryptotype that licenses the use of un-, in terms of summed activation, as illustrated in Figure 1. The mini-gangs collaborate rather than compete because their members are closely related by the overlap of semantic features. For example, the verb screw in unscrew may be viewed as having both a meaning of circular movement and a meaning of binding or locking; zip in unzip may be viewed as sharing both the “binding/locking” meaning and the “covering” meaning, and both screw and zip involve hand movements. Moreover, a feature may also vary in the strength with which it is represented in different verbs. For example, circular movement is an essential part of the meaning of the verb screw, but less so for wrap (one can wrap a small ball with a soft tissue paper without turning around either the object or the wrapping paper). These properties of feature overlap and degraded featural composition lend themselves naturally to properties of connectionist models.

[Figure 1 diagram: seven feature nodes with arrows pointing to a central circle labeled “UN- Cryptotype”: covering (“dress, mask, wrap”); binding (“buckle, fasten, strap”); enclosing (“fold, reel, wind”); change of location (“load, pack, plug”); circular movement (“coil, curl, roll”); change of state (“scramble, tangle, twist”); attaching (“hook, link, tie”).]

Figure 1. Multiple features support the formation of the un- cryptotype. Arrows represent the feature-to-category connections; the weights or strengths of connections are omitted. Dots in the center of the circle represent words that fit the core of the category, while dots near the border of the circle represent borderline cases.
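The idea of summed activation in Figure 1 can be illustrated with a small sketch. This is only an illustration under stated assumptions: the connection weights and the feature strengths of each verb are hypothetical values chosen for the example (Figure 1 omits the actual weights), and the function name `cryptotype_activation` is introduced here for convenience.

```python
# Hypothetical feature-to-category weights; Figure 1 omits the real values.
FEATURE_WEIGHTS = {
    "covering": 0.8,
    "binding": 0.9,
    "enclosing": 0.8,
    "attaching": 0.9,
    "circular_movement": 0.7,
    "change_of_state": 0.5,
    "change_of_location": 0.4,
}


def cryptotype_activation(verb_features):
    """Sum (feature strength x connection weight) over a verb's features."""
    return sum(FEATURE_WEIGHTS[name] * strength
               for name, strength in verb_features.items())


# Graded, overlapping feature strengths (illustrative values only).
screw = {"circular_movement": 0.9, "binding": 0.7, "attaching": 0.5}
wrap = {"covering": 0.9, "circular_movement": 0.2}
swim = {"change_of_location": 0.3}

for verb, features in [("screw", screw), ("wrap", wrap), ("swim", swim)]:
    print(verb, round(cryptotype_activation(features), 2))
```

In this toy setting a verb that participates in several mini-cryptotypes at once (screw) receives more support for un- than a verb with a single strong feature (wrap), and a verb outside the cryptotype (swim) receives very little.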
Distributed patterns, weighted connections, and nonlinear learning as embodied in connectionist networks seem to be ideal for handling the elusiveness and gradience of these semantic structures. In the last few years, our laboratory has carried out connectionist simulations to study the issue of semantic structure and overgeneralization, using the acquisition of un- as an example. In the following sections, I will discuss two major models in this endeavor. The first model uses a standard feed-forward network to simulate the acquisition of cryptotypes and prefixes. The second model uses a self-organizing neural network, which has also been recently applied to the acquisition of semantic and grammatical structures in children and in bilingualism. Readers who are interested in the technical details of these models should consult Li and MacWhinney (1996), Li and Farkas (2002), Li (2003), and Li, Farkas, and MacWhinney (2004).

3. A feed-forward network that learns to map semantic features of verbs to prefixes

Method

Connectionist networks that use the back-propagation algorithm (henceforth ‘backpropagation networks’) are perhaps the most popular class of networks and are most widely applied in studies dealing with language. A standard backpropagation network consists of three layers of processing units (Rumelhart, Hinton, & Williams 1986). In this type of network, information is first encoded at the input layer, then it funnels through the hidden layer, where an internal representation is formed, and finally results are produced at the output layer (hence the nickname of ‘feed-forward networks’). Each layer consists of different units, representing different states/processes of information processing (from input to output). Learning in this case is a function of adjusting the weights of the connections between units across the layers.
The adjustment is done through the back-propagation algorithm, according to which the network discovers a discrepancy between its actual output and the desired output, and then an error signal is propagated back through the system, so that weights are adjusted in such a way that the next time the same input will lead to an output that matches the desired output more closely (for technical details of the algorithm, see Rumelhart, Hinton, & Williams 1986). In our simulation, we used 160 verbs as input to our network. They consisted of 49 verbs that can take the prefix un-, 19 verbs that can take the competing prefix dis- (see Li & MacWhinney 1996, for the rationale of including dis- verbs, and the competition between un- and dis- in both child and adult languages), and 92 randomly selected verbs that can take neither prefix (henceforth ‘zero verbs’). Each verb was represented by a semantic pattern (a vector) that consists of 20 semantic features. These features were selected in an attempt to capture basic linguistic and functional properties inherent in the semantic range of these verbs. In order to objectively determine the values of each semantic feature, we presented 15 native English speakers with the 160 verbs along with the 20 semantic features, and asked them to judge the semantic relevance of each feature to each verb. A feature-by-verb relevance matrix was derived for each subject, and the final input vectors were derived by averaging the matrices from all subjects. A hierarchical clustering analysis on these vectors attests to the validity of our method, as distance metrics in this analysis reflected the similarities and differences between words. The task of the network was to take the semantic vectors of English verbs as input, and map them onto different prefixation patterns in the output: un-, dis-, and zero. Figure 2 shows the network architecture and examples.
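The mapping task just described can be sketched as a small three-layer network trained with back-propagation. This is a minimal sketch, not the simulation reported in Li and MacWhinney (1996): the hidden-layer size, learning rate, number of epochs, and sigmoid output units are assumptions made for illustration, and the training set is reduced to the three example verbs of Figure 2, with their 20-unit vectors as they can be read from that figure. The prefix labels follow the text and Figure 4 (link takes un-, connect takes dis-, turn takes neither).

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

N_IN, N_HID, N_OUT = 20, 5, 3  # hidden-layer size is an assumption
LABELS = ["un", "dis", "zero"]

# Example input vectors as read from Figure 2 (20 semantic features each).
DATA = [
    ("connect", [.7, .6, .5, .6, .1, .2, .1, .9, .3, .5,
                 .2, .0, .3, .3, .6, .7, .0, .1, .0, .0], "dis"),
    ("link", [.9, .5, .6, .7, .0, .3, .1, .9, .4, .7,
              .3, .0, .5, .3, .4, .8, .1, .2, .0, .1], "un"),
    ("turn", [.6, .0, .0, .0, .3, .5, .5, .1, .3, .1,
              .1, .6, .0, .1, .1, .0, .1, .1, .0, .0], "zero"),
]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)]
            for _ in range(rows)]


w_ih = init(N_HID, N_IN + 1)   # input-to-hidden weights (+1 for bias)
w_ho = init(N_OUT, N_HID + 1)  # hidden-to-output weights (+1 for bias)


def forward(x):
    """Return hidden and output activations for input vector x."""
    xb = x + [1.0]
    h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_ih]
    hb = h + [1.0]
    o = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in w_ho]
    return h, o


def one_hot(label):
    return [1.0 if name == label else 0.0 for name in LABELS]


def train_step(x, target, lr=0.5):
    """One back-propagation update for a single verb."""
    h, o = forward(x)
    xb, hb = x + [1.0], h + [1.0]
    # Output error signal: discrepancy times the sigmoid derivative.
    d_out = [(t - y) * y * (1 - y) for t, y in zip(target, o)]
    # Error propagated back through the hidden-to-output weights.
    d_hid = [h[j] * (1 - h[j]) *
             sum(d_out[k] * w_ho[k][j] for k in range(N_OUT))
             for j in range(N_HID)]
    for k in range(N_OUT):
        for j in range(N_HID + 1):
            w_ho[k][j] += lr * d_out[k] * hb[j]
    for j in range(N_HID):
        for i in range(N_IN + 1):
            w_ih[j][i] += lr * d_hid[j] * xb[i]


def total_error():
    return sum(sum((t - y) ** 2
                   for t, y in zip(one_hot(label), forward(x)[1]))
               for _, x, label in DATA)


def predict(x):
    o = forward(x)[1]
    return LABELS[o.index(max(o))]


error_before = total_error()
for _ in range(2000):
    for _, x, label in DATA:
        train_step(x, one_hot(label))
error_after = total_error()
print("summed squared error:", error_before, "->", error_after)
```

The hidden activations returned by `forward` play the role of the internal representation that is probed by cluster analysis in the results below.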
[Figure 2 diagram: a three-layer network. Output layer: UN-, DIS-, Ø. Hidden layer: internal representation. Input layer: 20-unit semantic feature vectors for the 160 verbs, for example:]

connect  .7 .6 .5 .6 .1 .2 .1 .9 .3 .5 .2 .0 .3 .3 .6 .7 .0 .1 .0 .0
link     .9 .5 .6 .7 .0 .3 .1 .9 .4 .7 .3 .0 .5 .3 .4 .8 .1 .2 .0 .1
turn     .6 .0 .0 .0 .3 .5 .5 .1 .3 .1 .1 .6 .0 .1 .1 .0 .1 .1 .0 .0
...      (160 verbs)

Figure 2. The feed-forward network that learns to map semantic features of verbs to prefixation patterns (un-, dis-, Ø).

Results and discussion

Connectionist networks are dynamic systems that explore the regularities in the input-output mapping processes through the activation of the hidden units and the adjustment of connection weights (to and from the hidden units). To analyze how our network developed internal representations, we used hierarchical cluster analysis to probe into the activation of the hidden units at various points in time during the network’s learning (see Elman 1990, for an application of this method). Figure 3 (in Appendix) presents such an analysis at three time points, the early (3a), intermediate (3b), and late (3c) stages of learning, respectively. Focusing here on the verbs that share the enclosing-rotating meaning (most of which can be prefixed with un-), we can see how the network developed structured semantic representations. These cluster trees indicate that early on with little learning, there was not much meaningful structure in the data, and thus, the enclosing-rotating verbs were scattered all over the cluster tree. Gradually as learning progressed, these verbs started to form smaller groups at several levels. Finally when learning reached a stable situation, they were all grouped under one cluster. These snapshots provide a picture of the developmental trajectories in the network’s integration of semantic structures during the meaning-form mapping process.
They illustrate how a mini-cryptotype, such as the enclosing-rotating category, which supports the use of un-, can emerge from learning the mapping of verb semantics to prefixation. In the studies reported by Li and MacWhinney (1996), we used an incremental learning procedure, in which the network took in the input gradually, verb by verb. Learning with this procedure also gave us insights into the formation of cryptotype in the network. Figure 4 shows a cluster tree of the network’s hidden-unit representation when the network learned 50 verbs. In this graph, we can observe two general clusters: one for the un- verbs, and the other for the zero verbs, that is, verbs that cannot be prefixed with un- or dis-. Our interpretation of these clusters is that the network acquired a distinct representation for the un- verbs by identifying the mini-cryptotypes inherent in these verbs. For example, most of the verbs in the un- cluster share the cryptotypic meaning of binding or locking: bind, chain, fasten, hitch, hook, latch, etc. However, not all mini-cryptotypes were identified at this time, and they emerged at different stages as discussed above.

[Figure 4 cluster tree; leaves in order, with prefix labels: arrange (DIS), connect (DIS), put (ZERO), ravel (UN), hold (ZERO), wind (UN), hook (UN), mount (DIS), lace (UN), coil (UN), plug (UN), cork (UN), hitch (UN), bind (UN), fasten (UN), latch (UN), braid (UN), chain (UN), make (ZERO), learn (ZERO), turn (ZERO), stop (ZERO), roll (UN), keep (ZERO), call (ZERO), believe (ZERO), wait (ZERO), help (ZERO), come (ZERO), get (ZERO), take (ZERO), walk (ZERO), run (ZERO), give (ZERO), ask (ZERO), tell (ZERO), say (ZERO), see (ZERO), talk (ZERO), hear (ZERO), like (ZERO), start (ZERO), go (ZERO), work (ZERO), look (ZERO), show (ZERO), use (ZERO), charge (DIS), allow (ZERO), reach (ZERO).]

Figure 4. A hierarchical cluster analysis of the network’s hidden-unit representations after the network has learned 50 verbs. The labels after the verbs were not provided to the network during training.
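The kind of hierarchical cluster analysis used to produce such trees can be sketched as follows. This is a minimal sketch under stated assumptions: it uses Euclidean distance with average linkage, which may differ from the settings of the original analyses, and the three-unit hidden activation vectors are hypothetical values invented for the example rather than activations from our network.

```python
import math


def distance(a, b):
    """Euclidean distance between two activation vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, vectors):
    """Mean pairwise distance between the members of two clusters."""
    pairs = [(i, j) for i in c1 for j in c2]
    return sum(distance(vectors[i], vectors[j]) for i, j in pairs) / len(pairs)


def cluster(names, vectors):
    """Agglomerative clustering; returns the tree as nested tuples."""
    clusters = [([i], names[i]) for i in range(len(names))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a][0], clusters[b][0], vectors)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = (clusters[a][0] + clusters[b][0],
                  (clusters[a][1], clusters[b][1]))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0][1]


def leaves(tree):
    """Collect the verb names under one branch of the tree."""
    if isinstance(tree, str):
        return {tree}
    return leaves(tree[0]) | leaves(tree[1])


# Hypothetical hidden-unit activations for six verbs (illustration only).
names = ["bind", "fasten", "latch", "walk", "run", "say"]
hidden = [
    [0.90, 0.80, 0.10],
    [0.85, 0.90, 0.15],
    [0.80, 0.75, 0.20],
    [0.10, 0.20, 0.90],
    [0.15, 0.10, 0.85],
    [0.20, 0.30, 0.70],
]

tree = cluster(names, hidden)
print(tree)
```

With these toy activations the top-level split separates the binding/locking verbs from the zero verbs, which is the pattern of two general clusters described for Figure 4.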
Figure 4 also shows, for example, that the network had not yet developed a clear representation for the enclosing verbs: the verbs ravel and coil were correctly categorized into the un- cluster, but the verb roll was incorrectly treated as a zero verb. Note that our network received no discrete label of the semantic category associated with un-, nor was there a single categorical feature that tells which verb should take which prefix (hence Whorf’s problem). All that the network received was semantic featural information distributed over different input patterns. Over time, however, the network was able to identify the regularities that hold between distributed semantic patterns and patterns of prefixation, and developed a structured representation in the mapping process. The structured representations in the network thus emerged as a function of its learning of the association between form and meaning, not as a property that was given ad hoc to the network by the modeler. The emerging representations also clearly capture Whorf’s notion of cryptotype. The meaning of a cryptotype constitutes a complex semantic network, in which verbs differ from one another with respect to (a) how many features each verb contains, (b) how strongly each feature is represented in the verb, and (c) how strongly features overlap with one another within a verb (all true of the input to our network). It is these complex relationships that give rise to the notion of cryptotype. The emergence of cryptotype representations in our network can be viewed as a replacement for the traditional analytic frameworks of categories and rules (Lakoff 1987; MacWhinney 1989).
In this perspective, children’s learning of un- is not simply the learning of a symbolic rule for the use of the prefix with a class of verbs (given that it is not even clear what the rule is), but the accumulation of the connection strengths that hold between a particular prefix and a set of semantic features distributed across verbs. The learner groups together those verbs that share the largest number of features and take the same prefix. Over time, the verbs gradually form clustered patterns, with respect to both meaning and prefixation pattern. This learning process can best be described as a statistical procedure in which the child implicitly tallies and registers the frequencies of co-occurrence of semantic features, lexical items, and morphological devices. Bowerman (1982, 1983) suggested that there are two possible roles for cryptotypes in the learning of un-. (a) “Recovery via cryptotype”: cryptotypes help the child to overcome overgeneralizations made at an earlier stage, if these overgeneralizations involve verbs that fall outside the cryptotype, such as *uncome, *unhate, and *untake (Bowerman 1982); (b) “Generalization via cryptotype”: cryptotypes trigger productivity and lead to overgeneralizations. This occurs because, once children have identified the cryptotype, they will overgeneralize un- to all verbs that fit the cryptotype, irrespective of whether the adult language actually allows un- with these verbs. Our simulation results provide support for the second role of cryptotype in inducing overgeneralizations that fall within the realm of the cryptotype. Figure 4 showed how the network included hold and mount in the un- category. These verbs were included apparently because of their semantic similarity with members of the cryptotype, most of which can take un- (e.g., bind, chain, fasten, hitch, hook, latch).
Examining the output patterns of hold and mount in the network, we found that un- was overgeneralized on these verbs. Similar overgeneralization errors produced by the network included *unbury, *uncapture, *unfill, *unfreeze, *ungrip, *unhold, *unloosen, *unmelt, *unpeel, *unplant, *unpress, *unsplit, *unsqueeze, *unstrip, *untack, and *untighten, most of which fit the cryptotype meaning. Our network produced few simulated errors that were flagrant violations of the cryptotype meaning, such as forms like *uncome reported by Bowerman (1982); thus our results provide no direct evidence for the first role of cryptotype as hypothesized by Bowerman. In our simulations, overgeneralizations occurred typically after the network had developed structured cryptotype representation, indicating that cryptotype served as a trigger for morphological overgeneralization. These results match up well with available empirical data. For example, one child in Bowerman’s study produced errors such as *uncapture, *unpeel, *unpress, *unsplit, *unsqueeze, and *untighten, similar to those in our network. The overgeneralizations that the child produced all fell within the cryptotype, and her acquisition of un- as a reversive prefix went hand in hand with her discovery of the cryptotype meanings of the verbs. In Clark et al.’s (1995) naturalistic data, the child’s innovative uses of un- also respected the cryptotype from the beginning. Clark et al. noted that the child’s use of un- matched the semantic characteristics of the cryptotype even when the conventional meanings of the verb in the adult language did not: *unbuild was used to describe the action of detaching Lego blocks, and *undisappear was used to describe the releasing of the child’s thumbs from inside his fists.4 Thus, once the learner (child and network alike) formed a structured representation that corresponds to the cryptotype for un-, the representation guides the learner’s behavior in productive morphological use.
In subsequent simulations, our network also displayed a limited amount of recovery from overgeneralization errors. Typically, recovery was best when the network had developed only partial or unstable semantic structures at relatively early stages of learning, and it became increasingly difficult when a fixed structure had emerged at later stages of learning (Li & MacWhinney 1996). This is because the back-propagation learning algorithm proceeds in such a way that early on, the network’s weight configurations are not fully committed and are more flexible to change, but later on as the network learns more and more words, it settles on a more stable weight space that makes adjustment difficult if not impossible (see Elman 1993: 91–93 for a detailed discussion of how the learning algorithm determines weight adjustment over time). This situation does not seem to match what we know about child language: most children eventually recover from all overgeneralization errors, no matter how late. Even though plasticity might be particularly characteristic of early learning (Spitzer 1999), older children and adults are still able to change, adapt, and recover from errors, unlike the network studied here (Bownds 1999). This mismatch, along with other considerations discussed below, prompted us to study another type of connectionist model, the self-organizing neural network, to account for lexical acquisition.

4. A self-organizing network that learns to map semantic features to prefixes

Although most previous connectionist models of language acquisition have relied on the use of feed-forward networks with back-propagation, researchers have started to see their limitations. In addition to its limited ability to recover from overgeneralizations, there were two other major limitations to the network that we used.
First, our network, like most previous models, received semantic input features selected on the basis of linguistic analyses on the part of the modeler. Input representation of this kind is subject to the criticism that the network worked (e.g., displayed cryptotype representation) precisely because of the use of certain semantic features (cf. Lachter & Bever 1988). To overcome potential limitations associated with this problem, in the new simulations we used semantic representations that are based on analyses of global lexical co-occurrences from a large text corpus (see previous discussion of HAL, and Method below). Second, backpropagation relies on a gradient-descent weight adjustment process to reduce the error between desired and actual outputs, but this type of adjustment seems unrealistic for child language learning. According to the well-known “no negative evidence” argument (Baker 1979; Bowerman 1988; Pinker 1989), children do not receive constant feedback about what is incorrect in their speech, nor do they receive the kind of word-by-word error corrections that are provided to a backpropagation network. Thus, back-propagation networks would seem to be poor candidates as models of language acquisition on grounds of their psychological or biological plausibility. Considerations of these problems led us to self-organizing neural networks. Self-organizing networks are biologically more plausible because one could conceive of the human cerebral cortex as essentially a self-organizing map (or multiple maps) that compresses information on a two-dimensional space (Spitzer 1999). They are computationally more relevant because one could argue that child language acquisition in the natural setting (especially organization and reorganization of the lexicon) is largely a self-organizing process that proceeds without explicit teaching (MacWhinney 1998, 2001).
Method

In contrast to standard feed-forward networks, self-organizing networks use unsupervised learning that requires no supervisor or explicit teacher; learning is achieved entirely by the system’s self-organization in response to the input (Kohonen 1982, 1989, 2001). Self-organization in these networks typically occurs in a two-dimensional map (self-organizing map), where each unit is a location on the map that can uniquely represent one or several input patterns. At the beginning of learning, an input pattern randomly activates one of the many units on the map. Once a unit becomes active in response to a given input, the weights to the unit and its neighboring units are adjusted so that they become more similar to the input and will therefore respond to the same or similar inputs more strongly the next time. In this way, the network gradually develops concentrated areas of units on the map (like the activity “bubbles”) that respond to particular inputs. This process continues until all the inputs can elicit specific response patterns in the network. As a result of this self-organizing process, the statistical structures implicit in the multi-dimensional space of the input are represented in the two-dimensional space of the map. Here we used the hierarchical feature map model of Miikkulainen (1993, 1997) in our simulations, because it combines multiple self-organizing maps in a single network. In this model, there is a semantic map that processes semantic information of the words, and there is a phonological map that processes phonological information of words (for more details of the application of the model, see Li 2003). The two maps are connected via associative links trained by Hebbian learning, a well-established biologically plausible learning principle, according to which the associative strength between two units (semantic and phonological) is increased if the units are both active at the same time (Hebb 1949).
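The self-organizing process described above can be illustrated with a single small map. This is a minimal sketch of a generic Kohonen-style map, not the hierarchical feature map model used in our simulations: the map size, the three-dimensional inputs, the learning-rate and neighborhood schedules, and the toy verb vectors are all assumptions made for the example.

```python
import math
import random

random.seed(1)  # fixed seed so the sketch is reproducible

ROWS, COLS, DIM = 4, 4, 3  # hypothetical map and input sizes

# Each map unit starts with random weights, so early responses are arbitrary.
weights = {(r, c): [random.random() for _ in range(DIM)]
           for r in range(ROWS) for c in range(COLS)}


def best_matching_unit(x):
    """Return the map location whose weight vector is closest to input x."""
    return min(weights,
               key=lambda u: sum((w - v) ** 2 for w, v in zip(weights[u], x)))


def train(inputs, epochs=200, lr0=0.5, radius0=2.0):
    """Pull the winning unit and its map neighbors toward each input."""
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs        # decays from 1 toward 0
        lr = lr0 * frac                    # learning rate shrinks over time
        radius = max(radius0 * frac, 0.5)  # so does the neighborhood
        for x in inputs:
            br, bc = best_matching_unit(x)
            for (r, c), w in weights.items():
                grid_dist2 = (r - br) ** 2 + (c - bc) ** 2
                influence = math.exp(-grid_dist2 / (2 * radius ** 2))
                for i in range(DIM):
                    w[i] += lr * influence * (x[i] - w[i])


# Toy semantic vectors (illustration only): two groups of similar verbs.
VERBS = {
    "bind": [0.90, 0.80, 0.10],
    "fasten": [0.85, 0.90, 0.15],
    "latch": [0.80, 0.75, 0.20],
    "walk": [0.10, 0.20, 0.90],
    "run": [0.15, 0.10, 0.85],
    "say": [0.20, 0.30, 0.70],
}

train(list(VERBS.values()))
for verb, vec in VERBS.items():
    print(verb, best_matching_unit(vec))
```

No target output is supplied at any point: the map is shaped only by the similarity structure of the inputs, which is the sense in which learning here is unsupervised.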
The same set of verbs described in §3 was used as the input, but they were represented differently from the way they were represented in the previous simulations. The semantics of these words were encoded as patterns of global lexical co-occurrence constraints (Burgess & Lund 1997; see §1), rather than patterns of semantic features selected on the basis of our own linguistic analyses. Each verb was represented as a pattern of 100 units, and the values of these units reflected the degree of a lexical co-occurrence constraint (on a continuous scale from 0 to 1). We also derived a phonological representation for each verb and the prefixes un- and dis-, according to MacWhinney and Leinbach (1991). In this representation scheme, each verb was encoded by 168 units in a syllabic template to represent the combinatorial constraints of phonology (see also Li & MacWhinney 2002, for details). Upon training of the network, a phonological representation of the verb was presented to the network, and simultaneously, the semantic representation of the same verb was also presented to the network. By way of self-organization, the network formed an activity on the phonological map in response to the phonological input, and an activity on the semantic map in response to the semantic input. Depending on whether the verb is prefixable with un- or dis-, the phonological representation of un- or dis- may also be co-activated with the phonological and the semantic representations of the verb stem. At the same time, through Hebbian learning the network formed associations between the two maps for all the active units that responded to the input. The network’s task was to create new representations in the corresponding maps for all the input words and to be able to map the semantic properties of a verb to its phonological shape and its morphological pattern.
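The Hebbian association between the two maps can be sketched in a few lines. This is a minimal sketch under stated assumptions: each map is reduced to a short vector of unit activities, the learning rate and the number of presentations are arbitrary, and the assignment of one phonological unit to the prefix un- is a hypothetical simplification of the co-activation described above.

```python
N_SEM, N_PHON = 3, 4  # hypothetical numbers of units on each map
UN_UNIT = 3           # phonological unit standing in for the prefix un-


def hebbian_update(links, sem_activity, phon_activity, lr=0.1):
    """Strengthen each link in proportion to the co-activity of its two units."""
    for i, s in enumerate(sem_activity):
        for j, p in enumerate(phon_activity):
            links[i][j] += lr * s * p
    return links


def recall(links, sem_activity):
    """Propagate semantic activity through the links to the phonological map."""
    return [sum(links[i][j] * s for i, s in enumerate(sem_activity))
            for j in range(len(links[0]))]


links = [[0.0] * N_PHON for _ in range(N_SEM)]

# Toy map activities: 'tie' co-activates its stem unit and the un- unit,
# 'walk' activates only its stem unit.
tie_sem, tie_phon = [1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 1.0]
walk_sem, walk_phon = [0.0, 0.0, 1.0], [0.0, 1.0, 0.0, 0.0]

for _ in range(10):
    hebbian_update(links, tie_sem, tie_phon)
    hebbian_update(links, walk_sem, walk_phon)

print("tie  ->", recall(links, tie_sem))
print("walk ->", recall(links, walk_sem))
```

After training, semantic activity for tie evokes both its stem and the un- unit on the phonological side, whereas walk evokes only its stem, because the un- unit was never co-active with it.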
Results and discussion

In our network, the self-organizing process extracted and compressed the high-dimensional information from the HAL semantic vectors and expressed the semantic similarities on the two-dimensional space as localized patterns of activity. Figure 5 presents a snapshot of the network’s self-organization of 120 verbs after the network was trained for 600 epochs. An examination of the semantic map shows that the network has clearly developed forms of representation that correspond to cryptotype categories. Earlier we suggested that a connectionist model provides a formal mechanism to capture Whorf’s notion of cryptotype, in that there can be several mini-cryptotypes that work collaboratively as interactive gangs to support the formation of the larger cryptotype. The idea of ‘mini-cryptotype’ is reflected most clearly in the emerging structure of the self-organizing map. Our network, without the use of ad hoc semantic features, formed clear mini-cryptotypes by mapping similar words onto nearby regions of the map. For example, towards the lower right-hand corner, verbs like lock, clasp, latch, lease, and button are mapped to the same region of the map, and these verbs all share the “binding/locking” meaning. A similar mini-cryptotype also occurs towards the lower left-hand corner, including verbs like snap, mantle, tangle, ravel, twist, tie, and bolt. Still a third mini-cryptotype can be found in the upper left-hand corner, including hear, say, speak, see, and tell, verbs of perception and audition. Finally, one can observe that embark, engage, integrate, assemble, and unite are mapped toward the upper right-hand corner of the map; these verbs all seem to share the “connecting” or “putting-together” meaning (interestingly, these are the verbs that can take the prefix dis-). Of course, the network’s representation at this point is still incomplete, as self-organization is moving from diffuse to more focused patterns of activity; for example, the verb show, which shares similarity with none of the above mini-cryptotypes, is grouped with the binding/locking verbs. What is crucial, however, is that these mini-cryptotypes form the semantic basis for the larger cryptotype of un- verbs. As shown in Figure 5, the network has mapped most verbs in the cryptotype to the bottom layer of the semantic map, and these are the verbs that can take the prefix un-.

Figure 5. A self-organizing map model that shows the organization of 120 verbs after the network was trained on these verbs for 600 epochs. The upper panel is the lexical phonological map (indicated by capital letters), and the lower panel the semantic map (indicated by lower-case letters). Words longer than four letters are truncated.

Moreover, our network was not only able to capture the elusive cryptotype by way of self-organization, but also able to generalize on the basis of its representation of the cryptotype. During testing of the network’s productive ability, overgeneralization occurred with 50% of the testing words. For example, the network produced overgeneralization errors that match up with empirical data and our previous simulation results (see §3), including *unbreak, *uncapture, *unconnect, *unfreeze, *ungrip, *unpeel, *unplant, *unpress, *unspill, *unstick, *untighten, etc. These overgeneralizations were based both on the network’s representation of the meaning of verbs and on the associative connections that the network formed through Hebbian learning in the semantics-phonology mapping process. Again, as in our previous simulations, most of these overgeneralizations involve verbs that fall within the un- cryptotype.
Thus, the results here are again consistent with the “generalization via cryptotype” hypothesis, that is, the representation of cryptotype leads to overly general uses of un- (see also discussion of the clench example below) rather than the narrowing down of its uses (as predicted by the “recovery via cryptotype” hypothesis).

One of the advantages of the self-organizing model is its ability to simulate comprehension and production through associative connections. The associative connections formed via Hebbian learning provide the basis for the production of overgeneralization errors. For example, the semantic properties of tighten and clench are similar and they were mapped onto nearby regions of the semantic map. During learning, the semantics of clench and unclench were co-activated, and the phonology of clench, unclench, and un- were also co-activated. When the semantics and the phonology of these items were associated through Hebbian learning, the network linked the semantics of tighten with the prefix un- because of clench, even though the network learned only the association for un-clench and not un-tighten (when tighten was withheld from training at an earlier stage). This associative process of correlating semantic features, lexical forms, and morphological devices simulates the process of learning and generalization in children’s productive speech, and shows that overgeneralizations can naturally result from the semantic structure in the lexical representations (which in turn is a result of self-organization), and from the associative learning of semantics and phonology.

In §3 we discussed the failure of a feed-forward network in recovering from overgeneralization errors. We attributed that failure to the gradient-descent error-adjustment process used in the back-propagation algorithm.
In self-organizing networks, recovery is a function of the adjustment of associative connections via Hebbian learning, proportional to how strongly the units in the associated maps (phonological and semantic maps in this case) are co-activated. When a given phonological unit and a given semantic unit have fewer chances to become co-activated, the strengths of their associative links are correspondingly decreased. We could compare this to a situation in which the learner receives no auditory support about the specific meaning-form co-occurrences that he or she expects in the production (MacWhinney 1997). Given that the learning system is input-sensitive, over time the meaning-to-form connections will weaken and will therefore be less likely to surface in its production. Indeed, our network displayed significant ability to recover from generalization errors. When tested for recovery with additional new learning (500 epochs), the network recovered from the majority of the overgeneralizations (75% recovery). Recovery in this case is a process of restructuring of the mapping between phonological, semantic, and morphological patterns, and the restructuring is based on the network’s ability to reconfigure the associative links through Hebbian learning, in particular, the ability to form new associations between prefixes and verbs and the ability to eliminate old associations that were the basis of erroneous generalizations. For example, un- was overgeneralized to tighten because of clench earlier on; when tested for recovery, only un- and clench continue to be co-activated. Hebbian learning determines that the associative connection between un- and clench remains strong, but that between un- and tighten weakens and gradually decreases to zero. This simulates the situation in which the child receives no support in the input about the relationship between un- and tighten.
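The weakening of an unsupported link can be illustrated with a small Python sketch. The rule below, passive decay plus Hebbian growth, is an assumption chosen for simplicity and is not necessarily the update rule of the model; the starting strengths, `rate`, and `decay` are likewise invented for illustration, and the 500 iterations merely echo the number of recovery epochs mentioned above.

```python
def hebbian_with_decay(link, pre, post, rate=0.1, decay=0.05):
    """One update of a single associative link: passive decay plus
    Hebbian growth proportional to the co-activation pre * post."""
    return (1.0 - decay) * link + rate * pre * post

un_clench = 0.5   # correct link, supported by the input
un_tighten = 0.5  # erroneous link left over from overgeneralization

for _ in range(500):
    # un- and clench are co-activated in every epoch ...
    un_clench = hebbian_with_decay(un_clench, pre=1.0, post=1.0)
    # ... whereas tighten is never presented together with un-.
    un_tighten = hebbian_with_decay(un_tighten, pre=1.0, post=0.0)
```

Under this assumed rule the unsupported link shrinks by a factor of `1 - decay` on every epoch and approaches zero, whereas the supported link settles near `rate / decay`. The pattern is the one described in the text: the link between un- and clench stays strong while the link between un- and tighten fades.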
Of course, in the real learning situation, the strength of the connection between un- and tighten may also be reduced by a competing form such as loosen that functions to express the meaning of *untighten, whereby principles of contrast or competition help to eliminate the erroneous combination (e.g., Clark 1987; MacWhinney 1987). Note that the restructuring of associative connections often goes hand-in-hand with the reorganization of the corresponding maps. For example, as the associative strengths of clench and tighten to un- varied, the verbs’ representations also became more distinct. This result is consistent with Pinker’s (1989) criteria proposal that children recover from generalizations by recognizing fine and subtle semantic and phonological properties of verbs. In the few cases in which our network did not recover from overgeneralizations, the network was unable to make the fine semantic distinctions between verbs.

General discussion and conclusions

In this chapter I attempt to provide a computational perspective on a developmental issue. I started with two types of approaches to the problem of the acquisition of word meanings. I then gave a connectionist account of the acquisition of semantic structures and morphological systems, presenting modeling results from both a feed-forward network and a self-organizing network. I have chosen to examine a classical puzzle that Whorf presented some 70 years ago, the issue of cryptotype in connection with the use and acquisition of the English reversive prefix un-. This problem differs from many of the currently debated topics, for example, the acquisition of the English past tense, where the patterns of use largely depend on phonological constraints and where the focus of debate has been on the competition between regular rules and exceptions.
The un- problem examined here is essentially semantic, and there seems to be no regular rule that governs the use of this prefix (hence “intangible”, as Whorf named it). Our connectionist models provide some insights into the understanding of Whorf’s puzzle, in particular, the understanding of the emergence of complex semantic structures in language acquisition and the role of a structured semantic representation in morphological productivity (e.g., overgeneralization). The simulation results suggest a dynamic learning picture in which the network extracts shared semantic information, develops representations of the cryptotype, and overgeneralizes morphological devices. Such results allow us to understand the processes underlying important phenomena such as the U-shaped behavior in language acquisition.

Current debates in cognitive science and psycholinguistics revolve around the issue of the nature of linguistic representation. Symbolic theories construe linguistic representations in terms of rules in physical symbol systems. A child is said to have a general rule in her mental representation, “adding -ed to make the past tense”, at some stage of language acquisition. This kind of description seems intuitively clear, and the rule offers a powerful mechanism for productivity. Connectionist models provide alternative explanations to this perspective, explanations that place emphasis on the statistical learning processes that lead to rule-like behaviors. In this chapter I have demonstrated that the acquisition of linguistic patterns, such as the prefixation of un-, can be construed as emerging out of basic processing capacities, that is, the processing of the intricate relationships among phonological and semantic features, lexical items, and morphological devices in a natural language. This perspective seems to be especially suited for the problem that we have at hand, the cryptotype problem that was once thought “subtle” and “intangible” in a symbolic framework.
In our view, the reason for the intangibility of the cryptotype is probably that the semantic features that unite different members of a cryptotype are represented in a complex distributed fashion (e.g., feature overlaps across categories; see discussion on page 121), such that they are not easily subject to traditional symbolic analysis, but are accessible to native intuition (according to Whorf). Native intuitions are clearly implicit representations of the complex semantic relationships among verbs and morphological markers, and connectionist networks provide mechanisms to capture these intuitions through weighted connections, distributed representations, and nonlinear dynamics.

Virtually the same story could be told about many other linguistic domains in which the problem is primarily semantically motivated. For example, the use of classifiers is one of the hardest problems for second language learners of Chinese, as well as a major challenge to linguistic theories (cf. Chao 1968; Lakoff 1987; Li & Thompson 1981). Each noun in Chinese has to be preceded by a classifier that categorizes the object of the noun in terms of its shape, orientation, dimension, texture, countability, and animacy. The appropriate use of most classifiers by native speakers is largely automatic, yet it is difficult for linguists to come up with a clear description of symbolic rules that govern their uses. We can probably assume that native speakers have acquired a representation by a connectionist cryptotype-like mechanism in which multiple weighted semantic features in a network jointly support the use of classifiers. We have recently successfully applied this type of mechanism and explanation to the study of the acquisition of inherent verb aspect and tense-aspect morphology in Chinese, English, and Japanese (see Li & Bowerman 1998; Li 2000, 2003; Li & Shirai 2000).
Following this line of research, we have further developed the DevLex model, a self-organizing neural network model for the development of the lexicon. We have applied DevLex to the modeling of monolingual and bilingual lexicon acquisition, simulating the formation of categorical representations, the confusion of competing lexical items in early speech, and the spurt of vocabulary in early word production (see details in Farkas & Li 2002; Hernandez, Li, & MacWhinney 2005; Li & Farkas 2002; Li, Farkas, & MacWhinney 2004).

In sum, we can start to understand some of the most difficult problems in language acquisition, for example, the acquisition of semantic structures such as cryptotypes, when we take a computational approach of the type discussed here. Structured semantic representations can emerge from statistical computations of the various constraints among lexical items, semantic features, and morphological markers in a high-dimensional space of language use, as they dynamically evolve and develop. The evolution and development of semantic representations as acquired by children may be due to simple probabilistic procedures of the sort embodied in connectionist networks or statistical learning mechanisms for form-to-form and form-to-meaning mappings.

Acknowledgments

Preparation of this article was supported by grants from the National Science Foundation (#BCS-9975249; #BCS-0131829), and a Faculty Research Grant from the University of Richmond. I would like to thank Elizabeth Bates, Melissa Bowerman, and Jeffrey Elman for their discussions on the feed-forward network, Brian MacWhinney and Risto Miikkulainen for their comments and discussions on the self-organizing network, and Curt Burgess and Kevin Lund for making available the HAL semantic vectors for our modeling.
Notes

* Correspondence concerning this article should be addressed to Ping Li, Department of Psychology, University of Richmond, Virginia, VA 23173, U.S.A. E-mail: [email protected]

For some readers, these two sets of models may simply be viewed as the same kind of models, given that they both rely on statistical patterns and are in many ways closely related.

Note that the 3.8 million words represent only a small portion of what the child is exposed to in the learning environment. According to one estimate, an average three-year-old has been exposed to 10–30 million words (Hart & Risley 1995).

Readers who are interested in details of connectionist theory and methods should read Rumelhart, McClelland, and the PDP Research Group (1986). For a non-technical introduction to connectionism, read Bechtel and Abrahamsen (1991) and Spitzer (1999). For technical discussions, read (progressively more technical) Dayhoff (1990), Fausett (1994), Anderson (1995), and Hertz, Krogh, and Palmer (1991). For its relevance to developmental theories, read Elman et al. (1996) and Klahr and MacWhinney (1998). For a comprehensive review of all major fields in neural networks, consult Arbib (1995).

Diary notes of my daughter’s speech also include similar uses: “unbuild the snowman” was used to refer to the detachment of decorative pieces from the snowman, and untape to refer to the removal of tape from a piece of paper that has been taped (child was 6 years and 9 months).

References

Anderson, James (1995). An introduction to neural networks. Cambridge, MA: MIT Press.
Arbib, Michael (1995). Handbook of brain theory and neural networks. Cambridge, MA: MIT Press.
Baker, Carl (1979). Syntactic theory and the projection problem. Linguistic Inquiry, 10, 533–581.
Bates, Elizabeth (1984). Bioprograms and the innateness hypothesis: Commentary on Bickerton. Behavioral and Brain Sciences, 7, 188–190.
Bechtel, William & Adele Abrahamsen (1991). Connectionism and the mind. Cambridge, MA: Blackwell.
Bensch, Peter A. (1991). Neo-structuralism: A commentary on the correlations between the work of Zelig Harris and Jeffrey Elman. Center for Research in Language Newsletter, 5(2).
Bowerman, Melissa (1982). Reorganizational processes in lexical and syntactic development. In E. Wanner & L. Gleitman (Eds.), Language acquisition: The state of the art. Cambridge: Cambridge University Press.
Bowerman, Melissa (1983). Hidden meanings: the role of covert conceptual structures in children’s development of language. In D. Rogers & J. Sloboda (Eds.), The acquisition of symbolic skills. New York: Plenum.
Bowerman, Melissa (1988). The “no negative evidence” problem: How do children avoid constructing an overly general grammar? In J. Hawkins (Ed.), Explaining language universals. New York: Basil Blackwell.
Bownds, M. Deric (1999). The biology of mind: Origins and structures of mind, brain, and consciousness. Bethesda, MD: Fitzgerald Science Press.
Brown, Roger (1973). A first language. Cambridge, MA: Harvard University Press.
Burgess, Curt & Kevin Lund (1997). Modelling parsing constraints with high-dimensional context space. Language and Cognitive Processes, 12, 1–34.
Chao, Yen-Ren (1968). A grammar of spoken Chinese. Berkeley: University of California Press.
Clark, Eve V. (1987). The principle of contrast: A constraint on language acquisition. In B. MacWhinney (Ed.), Mechanisms of language acquisition. Hillsdale, NJ: Erlbaum.
Clark, Eve, K. Carpenter, & W. Deutsch (1995). Reference states and reversals: Undoing actions with verbs. Journal of Child Language, 22, 633–662.
Comrie, Bernard (1976). Aspect: An introduction to the study of verbal aspect and related problems. Cambridge, England: Cambridge University Press.
Dayhoff, Judith (1990). Neural network architecture: An introduction. New York: Van Nostrand Reinhold.
Elman, Jeffrey L. (1990). Finding structure in time. Cognitive Science, 14, 179–211.
Elman, Jeffrey L. (1993). Learning and development in neural networks: The importance of starting small. Cognition, 48, 71–99.
Elman, Jeffrey L. (1995). Language as a dynamic system. In R. Port & T. van Gelder (Eds.), Mind as motion. Cambridge, MA: MIT Press.
Elman, Jeffrey L., E. Bates, M. Johnson, A. Karmiloff-Smith, D. Parisi, & K. Plunkett (1996). Rethinking innateness: A connectionist perspective on development. Cambridge, MA: MIT Press.
Farkas, I. & Ping Li (2002). Modeling the development of lexicon with a growing self-organizing map. In H. J. Caulfield et al. (Eds.), Proceedings of the Sixth Joint Conference on Information Science (pp. 553–556). Durham, NC: Association for Intelligent Machinery, Inc.
Fausett, Laurene (1994). Fundamentals of neural networks. Englewood Cliffs, NJ: Prentice Hall.
Hart, B. & T. Risley (1995). Meaningful differences in the everyday experiences of young American children. Baltimore, MD: Paul H. Brookes Publishing Co.
Hebb, Donald (1949). The organization of behavior: A neuropsychological theory. New York, NY: Wiley.
Hernandez, Arturo, Ping Li, & Brian MacWhinney (2005). The emergence of competing modules in bilingualism. Trends in Cognitive Sciences, 9, 220–225.
Hertz, John, Anders Krogh, & Richard G. Palmer (1991). Introduction to the theory of neural computation. Redwood City, CA: Addison-Wesley.
Klahr, David & Brian MacWhinney (1998). Information processing. In W. Damon, D. Kuhn, & R. Siegler (Eds.), Manual of Child Psychology (Vol. 2). New York: Wiley.
Kohonen, Teuvo (1982). Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43, 59–69.
Kohonen, Teuvo (1989). Self-organization and associative memory. Heidelberg: Springer-Verlag.
Kohonen, Teuvo (1997). Self-organizing maps. Heidelberg: Springer-Verlag.
Kohonen, Teuvo (2001). The self-organizing maps (3rd ed.). Berlin: Springer.
Kuczaj, Stanley (1977). The acquisition of regular and irregular past tense forms. Journal of Verbal Learning and Verbal Behavior, 16, 589–600.
Lachter, Joel & Thomas Bever (1988). The relation between linguistic structure and associative theories of language learning: A constructive critique of some connectionist learning models. Cognition, 28, 195–247.
Lakoff, George (1987). Women, fire, and dangerous things. Chicago: The University of Chicago Press.
Landauer, Thomas & Susan Dumais (1997). A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction and representation of knowledge. Psychological Review, 104, 211–240.
Landauer, Thomas, Peter Foltz, & Darrell Laham (1998). Introduction to Latent Semantic Analysis. Discourse Processes, 25, 309–336.
Li, Charles & Sandra Thompson (1981). Mandarin Chinese: A functional reference grammar. Berkeley: University of California Press.
Li, Ping (1993). Cryptotypes, form-meaning mappings, and overgeneralizations. In E. V. Clark (Ed.), The Proceedings of the 24th Child Language Research Forum, Center for the Study of Language and Information Publications, Stanford University.
Li, Ping (2003). Language acquisition in a self-organising neural network model. In P. Quinlan (Ed.), Connectionist models of development: Developmental processes in real and artificial neural networks. Philadelphia & Brighton: Psychology Press.
Li, Ping & Melissa Bowerman (1998). The acquisition of lexical and grammatical aspect in Chinese. First Language, 18, 311–350.
Li, Ping, Curt Burgess, & Kevin Lund (2000). The acquisition of word meaning through global lexical co-occurrences. In E. V. Clark (Ed.), Proceedings of the 30th Child Language Research Forum. Cambridge, MA: Cambridge University Press.
Li, Ping & Igor Farkas (2002). A self-organizing connectionist model of bilingual processing. In R. Heredia & J. Altarriba (Eds.), Bilingual sentence processing. North Holland: Elsevier Science Publisher.
Li, Ping, Igor Farkas, & Brian MacWhinney (2004). Early lexical development in a self-organizing neural network. Neural Networks, 17, 1345–1367.
Li, Ping & Brian MacWhinney (1996). Cryptotype, overgeneralization, and competition: A connectionist model of the learning of English reversive prefixes. Connection Science, 8, 1–28.
Li, Ping & Brian MacWhinney (2002). PatPho: A phonological pattern generator for neural networks. Behavior Research Methods, Instruments, and Computers, 34, 408–415.
Li, Ping & Yasuhiro Shirai (2000). The acquisition of lexical and grammatical aspect. Berlin and New York: Mouton de Gruyter.
Lund, Kevin & Curt Burgess (1996). Producing high-dimensional semantic space from lexical co-occurrence. Behavior Research Methods, Instruments, & Computers, 28, 203–208.
MacWhinney, Brian (1987). The competition model. In B. MacWhinney (Ed.), Mechanisms of language acquisition. Hillsdale, NJ: Erlbaum.
MacWhinney, Brian (1989). Competition and lexical categorization. In R. Corrigan, F. Eckman, & M. Noonan (Eds.), Linguistic categorization. New York: Benjamins.
MacWhinney, Brian (1998). Models of the emergence of language. Annual Review of Psychology, 49, 199–227.
MacWhinney, Brian (2000). The CHILDES project: Tools for analyzing talk. Hillsdale, NJ: Lawrence Erlbaum.
MacWhinney, Brian (2001). Lexicalist connectionism. In P. Broeder & J. M. Murre (Eds.), Models of language acquisition: Inductive and deductive approaches. Oxford, UK: Oxford University Press.
MacWhinney, Brian & Jared Leinbach (1991). Implementations are not conceptualizations: Revising the verb learning model. Cognition, 40, 121–157.
Maratsos, Michael & Mary Chalkley (1980). The internal language of children’s syntax: The ontogenesis and representation of syntactic categories. In K. Nelson (Ed.), Children’s language (Vol. 2). New York: Gardner Press.
Marchand, Hans (1969). The categories and types of present-day English word-formation: a synchronic-diachronic approach. Munich: C.H. Beck’sche Verlagsbuchhandlung.
McClelland, James & David Rumelhart (1981). An interactive activation model of context effects in letter perception: Part 1. An account of the basic findings. Psychological Review, 88, 375–402.
Miikkulainen, Risto (1993). Subsymbolic natural language processing: An integrated model of scripts, lexicon, and memory. Cambridge, MA: MIT Press.
Miikkulainen, Risto (1997). Dyslexic and category-specific aphasic impairments in a self-organizing feature map model of the lexicon. Brain and Language, 59, 334–366.
Pinker, Steven (1989). Learnability and cognition: The acquisition of argument structure. Cambridge, MA: MIT Press.
Pinker, Steven (1991). Rules of language. Science, 253, 530–535.
Pinker, Steven (1999). Out of the minds of babes. Science, 283, 40–41.
Pinker, Steven & Alan Prince (1988). On language and connectionism: Analysis of a parallel distributed processing model of language acquisition. Cognition, 28, 73–193.
Plunkett, Kim & Virginia Marchman (1991). U-shaped learning and frequency effects in a multilayer perceptron: Implications for child language acquisition. Cognition, 38, 43–102.
Plunkett, Kim & Virginia Marchman (1993). From rote learning to system building: Acquiring verb morphology in children and connectionist nets. Cognition, 48, 21–69.
Redington, Martin, Nick Chater, & Steven Finch (1998). Distributional information: A powerful cue for acquiring syntactic categories. Cognitive Science, 22, 425–470.
Rumelhart, David & James McClelland (1986). On learning the past tenses of English verbs. In James L. McClelland, David E. Rumelhart, & the PDP Research Group (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition (Vol. 1). Cambridge, MA: MIT Press.
Rumelhart, David, Geoffrey Hinton, & Ronald Williams (1986). Learning internal representations by error propagation. In James McClelland, David Rumelhart, & the PDP Research Group (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition (Vol. 1). The MIT Press.
Saffran, Jenny, Richard Aslin, & Elissa Newport (1996). Statistical learning by 8-month-old infants. Science, 274, 1926–1928.
Saussure, Ferdinand de (1916). Cours de linguistique générale. Paris: Payot. (English translation: A course in general linguistics. New York: Philosophical Library; Chinese translation: Putong Yuyanxue Gangyao. Beijing: Commercial Press).
Seidenberg, Mark (1997). Language acquisition and use: Learning and applying probabilistic constraints. Science, 275, 1599–1603.
Spitzer, Manfred (1999). The mind within the net: Models of learning, thinking, and acting. The MIT Press.
Vendler, Zeno (1967). Linguistics in philosophy. Ithaca: Cornell University Press.
Whorf, Benjamin L. (1956). Thinking in primitive communities. In J. B. Carroll (Ed.), Language, thought, and reality. The MIT Press.

Appendix

arrange DIS, integrate DIS, make ZERO, write ZERO, roll UN, load UN, place DIS, grow ZERO, wind UN, affiliate DIS, work ZERO, believe DIS, move ZERO, possess DIS, like DIS, settle UN, start ZERO, sit ZERO, turn ZERO, array DIS, stop ZERO, real UN, put ZERO, arm UN, aggregate DIS, engage DIS

Figure 3. Part A.

screw UN, bind UN, entangle DIS, lock UN, braid UN, buckle UN, fasten UN, clasp UN, latch UN, tie UN, clog UN, fold UN, bolt UN, strap UN, bandage UN, wrap UN, chain UN, hitch UN, close DIS, lace UN, tangle UN, dress UN, hinge UN, zip UN, curl UN, wind UN, veil UN, hook UN, cork UN, mask UN, sheathe UN, coil UN, twist UN, crumple UN, ravel UN, scramble UN, cover UN, plug UN, snap UN, button UN, leash UN

Figure 3. Part B.
settle UN, do UN, make ZERO, write ZERO, load UN, rol UN, crumple UN, ravel UN, screw UN, reel UN, braid UN, wind UN, twist UN, fold UN, tie UN, coil UN, curl UN, mask UN, scramble UN

Figure 3. Part C.

Grammar and language production
Where do function words come from?*

Joost Schilperoord and Arie Verhagen
Tilburg University / Leiden University

Most psycholinguistic models of language production start from a strict division between computation and memorization. Individual content words are retrieved from the lexicon, and assembled into larger structures by means of grammatical computation. Because function words are considered grammatical elements, their insertion into these structures results from computation, rather than retrieval. We argue that this view may be incorrect, or at least incomplete. Our case rests on an analysis of the distribution of production pauses relative to function words in a corpus of production data. We demonstrate that the data are better accounted for when we assume that the cognitive status of many of the linguistic structures people produce is that of schemata, with function words serving to retrieve them from memory.

Keywords: language production, storage vs. computation, function words, grammatical schemata

Introduction

In this paper, we want to bring evidence from linguistic processing, in particular from language production, to bear on the issue of the proper characterization of linguistic knowledge – i.e., on views about the organization of the mental lexicon and mental grammar. The specific topic we will focus on is the question: in exactly what way are grammatical words, or ‘function words’, selected in the process of spontaneous language production, and what does this imply for theories of linguistic knowledge?
Both the theoretical issue and the evidence we present are actually quite straightforward, which in our opinion makes the conclusions all the more inevitable, but to our knowledge this particular connection between theory and data has so far escaped the attention of linguists and psycholinguists alike. Cognitive linguistics has so far not really developed any serious attempt to relate theoretical ideas to processes of production, but it should – given the cognitive commitment – and we show that it actually has considerable insights to offer, especially concerning the question whether grammar and lexicon function as distinct ‘modules’ in production.

The roles of lexicon and grammar in a theory of language production

Any approach to what a process of language production looks like naturally assumes that such a process starts with a communicative intention – i.e., the intention to convey the content of a message rather than the intention to produce some sounds, marks on paper, or whatever (cf. Levelt 1989: 108–110). There is also a tradition, especially among linguists but also embraced by many psycholinguists, to make a distinction between so-called content words and function words. Typical examples of the former are nouns and verbs, while typical examples of the latter comprise articles, conjunctions, prepositions, and the like. As the labels ‘content’ and ‘function words’ suggest, the former are supposed to carry the (conceptual) content of what is said, while the latter are indicators of some (grammatical) function of the elements that they are attached to – i.e., (at least in their most pure form) markers of structure rather than content.
To give an example, in a phrase such as the hunt for the escaped prisoners, the element for does not in itself contribute a particular meaning, but serves to mark the phrase the escaped prisoners as the object of the predicate hunt; similarly, the definite articles serve to mark the status of the syntactic category (‘noun phrase’) of the phrases they belong to, rather than to convey some independent aspect of content.

Now this combination of ideas immediately gives rise to a question. If the language production process starts from conceptual content, and if function words do not carry semantic content themselves – as are indeed the main assumptions underlying many current theories of language production – then it cannot be content that triggers the production of function words; so what is it that gives rise to the production of function words? The natural answer that immediately suggests itself is, of course: the structural position for a specific function word becomes available at some point in the production process, and this is what triggers its production. This in turn leads to a new question: how does this structural position become available? Again, an answer seems to be readily available: the relevant structure can be produced through the application of certain grammatical rules invoked by the elements that do have an immediate connection to conceptual content: the content words.

It is precisely this view that has been implemented in an influential model of language production: one that may safely be said to represent the received view of the role of grammatical rules in language production (cf. Carroll 1999: 208/209) – i.e., the model of Incremental Procedural Grammar (IPG), proposed by Kempen and Hoenkamp (1987), and adopted by Levelt (1989, 1999).
Informally, the model assumes the following major subsystems in the overall language production process: (1) Conceptualization → Formulation → Articulation. In principle, the later subsystems (Formulation, Articulation) are dependent on the ones preceding them. However, the model allows each of these processes, and possibly ‘smaller’ subprocesses, to operate in parallel to a large extent. That is, while the routines controlling the articulatory organs are doing their work for one piece of an utterance, the routines for formulation may be working on the following piece, and the conceptualizer is in fact already planning what to say next. The part that we are interested in here (as most researchers of language production are) is the Formulator. This subsystem converts conceptual structures into linguistic structures. The input to the Formulator is formed by a thought from the Conceptualizer; we do not have to be concerned here with the precise format of this input, and we will simply assume some system for representing propositions. This thought contains concepts, and it is these that set the formulation process in IPG in motion. This consists of the steps listed in Table 1 below. First of all, we want to stress the importance of step 4 in the model: the inherent limitations of working memory (Baddeley 1990). It is an essential factor in the explanation of a very general feature of normal language production, viz. the fact that it is incremental (hence the name of the model) in that it proceeds ‘in spurts’, with pauses reflecting the workings of the production system in between. If it were not for the limitations of working memory, language production would not proceed in spurts at all – i.e., it would not be incremental, as it actually is. 
After all, if speakers had unlimited processing capacity at their disposal, then utterances – or even entire texts for that matter – could be prepared in advance, and the language production process would be continuous, guided by an all encompassing production plan (Kempen & Hoenkamp 1987: 203). However, since both the empirical phenomenon of pausing and the theoretical assumption of limited space in working memory are quite robust, we will also adopt this assumption; in fact, the consequent tendency of releasing working memory as soon as possible will play an important role in our argument for an alternative analysis.

Table 1. Overview of formulation in IPG
1. The mental lexicon is accessed, with the concept as address, to retrieve the linguistic element expressing it. The routines performing this task (mapping non-linguistic concepts to linguistic units) are called lexicalization procedures. Retrieval of the lexical element normally activates the entire entry, not just the phonological shape of the word but also information about its syntactic category, and its sub-categorization frame, and perhaps other information.
2. Given the lexical element, especially the information about its syntactic category, the appropriate phrase structures are built by means of ‘syntactic procedures’ (if the element retrieved from the lexicon is a noun, a noun phrase is built according to the grammatical rules for noun phrases in the language, etc.).
3. The output of these syntactic procedures (i.e., syntactic phrase markers / ‘tree structures’) contains functional positions; these are filled in with the appropriate bound morphemes, inflections, auxiliaries, determiners, etcetera, by means of ‘functorization procedures’.
4. Results of step (3) are put out to the Articulation routines as soon as possible in order to release working memory – i.e., the limited space is made available for another formulation process as quickly as possible.

Secondly, the model is maximally ‘structure building’ and ‘lexically driven’, and these two features are strongly related. The model is structure building in the sense that it assumes that all of the structure of grammatical strings is computed, built ‘on the fly’, and none of it is directly retrieved from (long term) memory. This is also directly related to the next point, the ‘lexical hypothesis’: there is never a direct link between the conceptual structure and a rule of grammar (and hence a piece of grammatical structure), since a call to a grammatical rule (be it a syntactic procedure or a functorization procedure) is mediated by at least one lexical item; only the latter are directly linked to the conceptual structure in the process of language production. As Levelt put it:

The lexical hypothesis entails, in particular, that nothing in the speaker’s message will by itself trigger a particular syntactic form, such as a passive or a dative construction. There will always be mediating lexical items, triggered by the message, which by their grammatical properties and their order of activation cause the Grammatical Encoder to generate a particular syntactic structure. (Levelt 1989: 181)

These features of the model may be said to express a purely “formal” view of grammar; it specifies structural properties of linguistic utterances without considering them meaningful. Let us illustrate these characteristics of IPG by means of some simple examples. How does the production of a simple noun phrase such as the circumstance proceed?
By assumption, the conceptual structure contains a specification of the concept CIRCUMSTANCE and the first step in the formulation process consists of matching this non-linguistic concept with an element in the mental lexicon. The specification of the information found there (meaning, phonological shape, syntactic category, possibly other relevant information) is given partly in:

(2) [CIRCUMSTANCE, circumstance, N, . . . ]

Subsequently, the information that the element expressing the concept is a noun triggers the syntactic procedure for building a noun phrase:1

(3) a. N2 → det, N1

The rule produces a structure consisting of two elements, one of which (N1) is itself a trigger for a syntactic subprocedure, (3b):

(3) b. N1 → . . . N . . .

The output of this rule does not contain triggers for calling further syntactic procedures, and the lexical node (N) provides a point to attach the lexical item to. At this point – i.e., after lexicalization and syntactic specification but before functorization (the end of stage 2, in Table 1), working memory contains the following partially specified structure:

(4) [N2 det [N1 [N circumstance]]]

This structure contains a node for a functional element, in this case a determiner position, which functions as a trigger for a functorization procedure (stage 3, in Table 1). This procedure inspects the conceptual structure for the specification of the ‘accessibility’ (cf. Ariel 1988) or some equivalent notion of the concept involved, in order to decide between inserting either the, a or ø; supposing that the value found is +accessible, the element the will be inserted. As Kempen and Hoenkamp (1987: 218) argue, the insertion of function words is “chiefly motivated on syntactic grounds, so they cannot be supposed to originate simply from lexicalization”.
In this case, for example, it may be supposed that the realization of a determiner, such as the, is dependent on the presence of a Noun Phrase node in the structure being produced, and not only on the feature +accessible in the conceptual structure. An accessible concept expressed by an adjective or a verb should not be marked by the, so the determiner cannot be seen as arising directly from the conceptual structure by lexicalization of +accessible, in the same way as circumstance originates from lexicalization of CIRCUMSTANCE. The consequence of the strict separation of functorization from lexicalization and syntactic procedures (in two distinct production stages) is thus that structures of the type (4), with all of the content words and none of the function words specified, have to be taken as representing a particular and necessary stage in the production of a linguistic utterance.

To take a slightly more complicated example, consider the production of the phrase, the start of the program, according to this model. The relevant portion of the underlying conceptual structure will look like (5):

(5) [START (PROGRAM)]

After lexicalization and syntactic specification, the intermediate representation of the expression being produced will look something like:2

(6) [N2 det [N1 [N start] Subject: [N2 det [N1 [N program]]]]]

It is on the basis of this representation, containing all content words and a complete specification of the phrase structure, that the production process enters stage 3, in which functorization results in the addition of the, of, and again the to the representation, which can then be passed on to the articulation procedures (stage 4).
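The stage ordering walked through above for the circumstance can be made concrete with a small sketch. It is only an illustration under stated assumptions: the toy lexicon, the function names and the `accessible` flag are hypothetical and do not come from Kempen and Hoenkamp's implementation, and the choice among the, a and ø is reduced to the versus a. What the sketch does show is the property that matters for the argument: the determiner is filled in only after the head noun has been retrieved and its phrase has been built.

```python
from dataclasses import dataclass
from typing import Optional

# Toy lexicon addressed by concept (illustrative assumption, cf. entry (2)).
LEXICON = {"CIRCUMSTANCE": {"form": "circumstance", "category": "N"}}


@dataclass
class NounPhrase:
    """Partially specified structure as in (4): the det position starts empty."""
    noun: str
    det: Optional[str] = None


def lexicalize(concept: str) -> dict:
    """Stage 1: retrieve the lexical entry that expresses the concept."""
    return LEXICON[concept]


def build_phrase(entry: dict) -> NounPhrase:
    """Stage 2: apply N2 -> det N1 and N1 -> N, triggered by category N."""
    if entry["category"] != "N":
        raise ValueError("this sketch only handles nouns")
    return NounPhrase(noun=entry["form"])


def functorize(phrase: NounPhrase, accessible: bool) -> NounPhrase:
    """Stage 3: fill the det position after checking the conceptual feature."""
    phrase.det = "the" if accessible else "a"
    return phrase


def formulate(concept: str, accessible: bool) -> list:
    """Run stages 1 to 3 in order and release the result (stage 4)."""
    phrase = functorize(build_phrase(lexicalize(concept)), accessible)
    return [phrase.det, phrase.noun]


print(formulate("CIRCUMSTANCE", accessible=True))  # ['the', 'circumstance']
```

Because `functorize` needs the output of `build_phrase`, there is no state in this sketch in which a determiner exists while its noun is still missing.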
The reason why we presented the workings of IPG in this respect in some detail, is that this view on the different status of content words and function words in production gives rise to a very specific prediction about the temporal structure of the production of utterances in languages like Dutch and English (for which IPG was designed), in which most function words precede the lexical heads of phrases. As we explained earlier, an empirical argument for the incremental nature of production consists in the occurrence of pauses. However, the model not only predicts that pauses occur at all, but also where they should normally occur. In Dutch, English, and similar languages, pauses are not to be expected between a function word and the related content word, but only at the phrase boundaries. The reason is that because of the assumed order of stages 2 and 3, whenever a function word (output of stage 3) is present, the associated lexical head (output of stage 2) is necessarily present as well. If it were not, the relevant functorization procedures could not have been called, so when the output of stage 3 is ready to be articulated, all related material that was produced in stage 2 is equally available. IPG does not seem to be committed to a particular prediction in this respect for functional elements that follow content words or for languages in which most function words occur to the right of a lexical head, since in such cases the linear order to be produced is parallel to the assumed order of formulation processes (stage 2 for heads, stage 3 for function words). The claims in this paper concern only (languages with) function words preceding lexical heads. In this situation the assumption about the limited capacity of working memory comes into play: since information in working memory is released as soon as possible (cf.
stage 4), and since lexical heads are available in working memory when function words are, it follows that when function words are produced and thus released from working memory the related lexical head should be uttered as well. Hence normally no pauses are to be expected after a function word, whereas they are expected to be quite normal before a function word.

This is a clear and straightforward empirical prediction, related directly to a central assumption of the IPG model as incorporating a specific view on the relation between grammatical structure and the lexicon, viz. one that is maximally structure building, with no direct link whatsoever between the grammatical structure and the conceptual content of utterances (cf. above). It is also a prediction that we believe to be highly problematic. In ordinary language production, as we will see, pauses immediately following function words are so frequent that they must be taken as a quite normal phenomenon, not an exception. The next section is devoted to a demonstration of this claim. Following this demonstration, we will try to sketch an alternative view, incorporating the idea that function words mark grammatical constructions, or schemas, as structured symbolic units that may be retrieved from long term memory, just as so-called content words are.

Pause patterns relative to function words

Some quantitative data

The previous section discussed what may be considered the ‘received’ view in psycholinguistics on the interaction between processing and grammar. Function words are essentially markers of structure. They enter the production process by means of functorization procedures which are activated as soon as the lexical head of the phrase marker is activated.
Producing determiners, for example, depends on features of the activated lemma (its syntactic category, for instance), while the functorization procedure checks the conceptual structure for the presence of features in order to decide whether a definite or an indefinite article is to be produced. Therefore, function words have no independently represented correlate at the level of conceptual structure.

The empirical phenomenon to be analysed in this section consists of pause patterns relative to function words. We will show that during the (oral) production of routine business letters, text producers tend to pause predominantly after function words. Data were collected by audio taping six Dutch lawyers in their offices while they were dictating routine daily correspondence, using a dictation machine. The data were naturalistic, i.e. all letters were actually sent to business associates or clients. The statistical data to be reported in this section are based on 120 such letters. Together, these letters contained about 23,000 words, and over 7,800 pauses. Dictating can be taken to be a way of producing written texts (Schilperoord 1996: 19–23; see also Schilperoord 2001). All tapes were transcribed verbatim, including all pauses, errors, restarts and the like. Dictation was chosen because it makes the job of detecting, locating and measuring pauses fairly easy. Moreover, because of the monologic situation we may assume that pauses will not occur for interactional reasons, so that they may in general be considered to reflect cognitive processes. We only considered ‘silent’ pauses, not so-called filled ones (e.g., uh. . . ) to increase the validity of this assumption further, as there are suggestions in the literature that different sorts of filled pauses may have specific functions (cf. Clark 1996).
It should be noted, though, that our conclusions do not depend in any way on how specific these pause patterns are for dictation.3 We tested general predictions about the relationship between language production and grammatical structure, using dictation as material, and using pauses between increments of production as evidence for the status of the segments involved.

There are two possible causes for a pause or hesitation to pop up in the normal stream of speech:4 it may occur, firstly, because the language producer has some difficulty, or at least needs some time, in working out the conceptual specifications of his message, or secondly, because matching a concept with a lexical item leads to some delay. In both cases, however, we may expect pauses – allegedly reflecting these cognitive activities – to occur before a function word, and not after it. In other words, lexically driven models predict pauses to respect the phrasal structure of the message. Another way of putting this would be that by their very nature lexically driven models of language production deny that there might be any cognitive reason for a pause to occur after a function word. With this in mind, let us now have a look at the following transcript, taken from a dictation session of a Dutch lawyer, producing a routine judicial letter – see example (7).

(7) (. . . )
    → 1. deel ik u mede dat de /
         inform I you that the
    → 2. door /
         by
    → 3. mij op de /
         me at the
    → 4. zitting van /
         session of
    → 5. DATUM bij /
         DATE at
      6. de NAAM overhandigde /
         the NAME delivered
      7. pleitnotities /
         oral petitions
    (. . . )
    “. . . I inform you that the oral petitions delivered at NAME at the session of DATE. . . ”

All numbered lines represent the increments by which this stretch of discourse came about.
That is, slashes after each line indicate a pause of at least .3 seconds.5 As can be seen in lines 1 to 5, pauses occur right after a function word – determiners in 1 and 3, and prepositions in 2, 4 and 5. Obviously, the pattern shown in (7) is not what is expected on account of lexically driven models of speech production. Phrasal boundaries are often violated, indicating that at these locations there is ‘structure’ with no apparent content (a situation that is ruled out by lexically driven models). If indeed these pauses reflect conceptualization or lexicalization processes, then where do the function words originate from? For example, if the presence of the determiner in line 3 depends on the presence of the lexical head zitting (“session”), as lexically driven models have it, then how can de (“the”) have been produced already whereas zitting is still underway, or may not even have been retrieved from memory? Phrased differently, how can we account for the fact that an NP is already ‘there’, so to speak, whereas its lexical head is not?

To anticipate the conclusion, it is our conviction that data such as these force us to seriously consider the possibility, first, that functional elements such as articles might have an independent correlate at the level of conceptual structure, and second, that structured phrases, such as noun phrases, may be activated during language production as relatively underspecified templates, or ‘constructions’. That is, ‘bare’ phrasal units may very well result from retrieval processes, with a complete structural unit being accessed holistically, in a ‘Gestalt’-like manner, rather than from computational processes that build them out of elementary parts. However, in order to substantiate such a (far reaching) claim, we have to show that we are in fact dealing with a regular pattern in language production. That is, we have to show that what we see in (7) is not exceptional.
To this end, we will provide information concerning the proportions of pauses relative to function words, such as articles and conjunctions. In brief, the question is: are we dealing with a phenomenon that occurs frequently enough to be theoretically interesting?

For the proportional analysis, we used the data-base described above. Each transition between every pair of words in the 120 texts in the corpus was scored for the syntactic category of the word preceding the transition and the syntactic category of the word following the transition. A gross distinction was made between function words, such as articles, prepositions, and conjunctions on the one hand, and content words such as nouns, adjectives, adverbs and verbs, on the other. In addition, transitions between words were scored for the presence or absence of a pause. This information allows us to analyse pause occurrences in strings such as those in (8). Each slash marks a potential pause location:

(8) /a/garden/
    /an/English/garden/
    /in/an/English/garden/
    /in/the/garden/of/Monet/
    /that/I/visited/an/English/garden/

The following set of patterns was selected for statistical analysis:

(9) 1. det – (adjective) – N
    2. prep – NP
    3. conj – subordinate clause

The category “det” in (9) included the definite (de and het) and indefinite (een) articles, not demonstratives occupying a pre-nominal position. The category “prep” includes all prepositions, and pronouns were included as members of “NP”. Finally, “conj” consisted of the words dat and om (i.e., the elements that can introduce complement clauses, finite and infinitival clauses, respectively), which are therefore often considered purely grammatical elements, devoid of meaning. These strings allow for the following set of possible locations for pauses:

(10) 1. a. pause – det    and/or:  b. det – pause – (adj.) – N
     2. a. pause – prep   and/or:  b. prep – pause – NP
     3. a. pause – conj   and/or:  b. conj – pause – clause

We first estimated pause proportions for each possible location with regard to these three kinds of function words. Then, in order to put these proportions into perspective, comparisons were made between the proportions of pauses preceding and those following function words (the a- and b-columns in (10)). In order to produce interpretable comparisons for the first two categories (det – N, prep – NP), all sentence-initial occurrences of these phrasal types were omitted, as other analyses had revealed that pauses occur at almost every sentence (or paragraph) transition (Schilperoord 1996). As such pauses presumably serve widely different cognitive purposes, including them in the data set would lead to an overestimation of pause proportions before function words. Both proportionate data and comparisons are summarized in Table 2.

Table 2. Proportions of pauses and comparisons for three opposite pairs of pause locations (P = pause occurrence)

string types   proportions   χ2
det – P        .53           93.91*
P – det        .39
prep – P       .25           14.24*
P – prep       .30
conj – P       .59           25.89*
P – conj       .47

* = significant at p ≤ .05

The data show that 53% of all determiners produced were followed by a pause, whereas 39% were preceded by a pause; similarly, 25% of the prepositions and 59% of conjunctions were followed by pauses. What is particularly noteworthy is that the proportions of pauses after function words are quite high, given the prediction that no pauses should occur at these locations. In the case of determiners and conjunctions, these proportions even exceed the ones of pauses preceding these functional categories (but the situation is the reverse for prepositions).6 A chi-square analysis proved these differences to be significant.
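The scoring procedure described above (each word transition coded for the category of the preceding word, the category of the following word, and the presence of a pause) can be sketched on the short transcript in (7). This is a minimal illustration, not the authors' analysis: the function-word set is a hypothetical coding of this one excerpt, the sketch pools determiners, prepositions and conjunctions, and it does not apply the exclusion of sentence-initial phrases, so its output is not comparable to Table 2.

```python
# Increments of transcript (7); a pause follows the last word of each increment.
INCREMENTS = [
    ["deel", "ik", "u", "mede", "dat", "de"],
    ["door"],
    ["mij", "op", "de"],
    ["zitting", "van"],
    ["DATUM", "bij"],
    ["de", "NAAM", "overhandigde"],
    ["pleitnotities"],
]

# Hypothetical coding for this excerpt only: articles, prepositions, conjunction.
FUNCTION_WORDS = {"de", "dat", "door", "op", "van", "bij"}


def score_transitions(increments):
    """Return (preceding word, following word, pause?) for each adjacent pair."""
    words, pause_after = [], []
    for increment in increments:
        for position, word in enumerate(increment):
            words.append(word)
            pause_after.append(position == len(increment) - 1)
    return [(words[k], words[k + 1], pause_after[k]) for k in range(len(words) - 1)]


def pause_proportions(transitions, function_words):
    """Proportion of function words followed by, and preceded by, a pause."""
    after = [pause for prev, _, pause in transitions if prev in function_words]
    before = [pause for _, nxt, pause in transitions if nxt in function_words]
    return sum(after) / len(after), sum(before) / len(before)


transitions = score_transitions(INCREMENTS)
followed, preceded = pause_proportions(transitions, FUNCTION_WORDS)
print(followed, preceded)  # 0.625 0.25
```

In this single excerpt, five of the eight function words are followed by a pause and two are preceded by one, which is the kind of asymmetry the corpus analysis quantifies.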
So, to conclude this section, pause occurrences after function words are a highly regular phenomenon; in fact, the post function word location even seems to be the favourite one in the case of determiners and conjunctions.

Constructions: The case of determiners

The empirical evidence presented in the previous section indicates that pauses predominantly occur after ‘meaningless’ function words such as determiners and conjunctions. Given the processing assumptions discussed in the first section, these data are difficult to account for by lexically driven models of speech production. This section will (briefly) introduce an alternative view on production, based on the notion of constructions (cf. Langacker 1990; Goldberg 1995; Jackendoff 1995, 1997, 2002; Kay & Fillmore 1999). The basic tenet of our proposal is that phrasal categories are involved in the process of production as underspecified constructions or schemas, which, being stored in long term memory, are on a par with words – i.e., they are all contained in the mental lexicon.

Indeed, the relevant distinction between lexically driven models of speech production and a construction based view primarily concerns the relation between ‘lexicon’ and ‘grammar’, and the interplay between what is stored knowledge and what is computed ‘on the fly’. In order to avoid redundancy, lexically driven models tend to identify the grammatical component of the production system as computational, and to reduce it to the smallest possible set of rules required to account for the facts of language. Consequently, if a certain grammatical structure (say, that of a noun phrase) can be computed by some set of rules, then noun phrase templates cannot be part of the declarative mental lexicon.
Construction based models, on the other hand, allow for redundancy: the ‘rules’ themselves are viewed as ‘constructional idioms’ (Jackendoff 1995: 155; Jackendoff 2002: Chapter 6) that may vary as to their degree of phonological specifications. This means that the outcomes of a certain set of rules coexist freely together with the rules. Redundancy is built in, so to speak, rather than an exception. With regard to noun phrases, the maximally underspecified or basic construction for languages such as English or Dutch is (11):

(11) NP: [det + . . . + N]

This construction has a number of elaborations, inheriting the features of the basic constructions, as shown in (12).

(12) NP: [de/het/een + . . . + N]

A construction such as (12) thus consists of a fixed element (the determiner), a ‘slot’ for the obligatory element (usually the lexical head) and (in some cases) some optional slots (indicated by dots). In addition, some expressions that are licensed by (11) may be fully specified, constituting a ‘fixed’ or ‘prefabricated’ construction (cf. Erman & Warren 2000), as in (13).

(13) een kop koffie (“a cup of coffee”)
     het toilet (“the bathroom”)7

The essential property of basic schemas/constructions and their elaborations is that they are lexical items, stored in long term memory, despite the fact that they can be computed by phrase structure rules.

Now, how do such constructions allow us to account for the kind of pause patterns observed? Our discussion of this issue will first be confined to noun phrase constructions – later on, we will discuss prepositions and conjunctions. First, look at the transcript example in (14).

(14) (. . . )
     1. de /
        the
     2. omstandigheid /
        circumstance
     (. . . )

Let us suppose that the pause after the determiner de indeed signals some cognitive activity, aimed either at specifying the concept to be expressed, or at retrieving a lexical item that serves to express an already activated conceptual structure [CIRCUMSTANCE]. Since according to lexically driven models, the production of the determiner is ruled out in both situations, we have to look for ways in which the determiner nevertheless can be produced independently from its ultimate lexical head (the noun omstandigheid). How can this be accomplished?

Our proposal is that in the course of producing the noun phrase de omstandigheid, in fact two independent structures are activated: one ‘schematic’ construction [de + . . . + N], and one lexical element (omstandigheid, “circumstance”), and for some reason a more or less brief delay may occur between the activation of the two elements. Possible reasons for a delay of activation can be taken to be of the standard type (see also the discussion in Note 4); they may involve conceptualization (deciding on exactly what concept is to be expressed) or lexicalization (retrieving a lemma from the mental lexicon).

In a lexically driven model, there is just one possible alternative cause for a pause to occur in such a location. This has to do with the fact that a lexical entry is assumed to be split up into two parts: lemma and form information, where the former is used in the grammatical encoding stage of the Formulator, and the latter (specifying the word’s morphology and phonology, and ‘pointed’ to by the lemma) is used in the phonological encoding phase. This makes it possible in principle that the following situation arises: the lemma (e.g., [CIRCUMSTANCE, circumstance, N, . . . ]) is retrieved, followed by grammatical processing and functorization leading to the utterance of an article (e.g.
definite the), and then something goes wrong with retrieving the word’s phonological shape pointed to by the lemma. The resulting situation is one in which the speaker knows exactly what the word is he wants to say, with all kinds of relevant properties, except its full phonological shape; this is usually referred to as the tip-of-the-tongue phenomenon. Thus, theoretically there is a way in an IPG-type model to account for pauses following a function word, while maintaining that the model, including the production of function words, is lexically driven.

The question is, however, to what extent this can be considered a serious alternative to the hypothesis that such pauses reflect genuine cognitive processes (conceptualization or lexical retrieval). First of all, as Levelt points out, little is known about whether lexical retrieval is a one stage or two stage process – i.e., whether in general an entry’s lemma and form properties are retrieved simultaneously or successively: “The distinction should not be overstated. In particular, we should not conclude that a lexical entry cannot be retrieved as whole [. . . ]” (Levelt 1989: 188). Secondly, as we all know from experience, the tip-of-the-tongue phenomenon is quite rare. If it were to account for the number of observed pauses after determiners, we would be forced to assume this phenomenon to have occurred in over 50% of all noun phrases produced. This seems highly implausible. The safest thing to assume is therefore that the proportions presented in Table 2 are in fact marginally over-estimated. The large majority of cases, however, must have been produced by ordinary cognitive processes. We therefore feel justified in taking these data as strong support for our construction based proposal.
This account of (most of) the observed pause patterns immediately raises the question: What conceptual specification is required in order to activate the construction [de + . . . + N] (cf. Section 1)? In other words, if indeed [de + . . . + N] is a lexical item, what does it ‘mean’? Actually, we think an answer is readily available. From a conceptual point of view, a determiner such as de (“the”) indicates that an instance of the category named by the noun with which it combines is part of the body of knowledge that is shared in the communicative situation. The communicative situation is called the ‘ground’ of a linguistic usage event, and determiners (among other elements) in English, Dutch, and other languages are said to have the function of specifying if and how concepts are instantiated in the ground – i.e., of ‘grounding’ the concepts that they are applied to. In Langacker’s words:

In the case of (. . . ) nominals, grounding is effected by articles, demonstratives and certain quantifiers. Whereas a simple noun (. . . ) merely names a ‘type’ of thing, a full nominal (. . . ) designates an ‘instance’ of that type (. . . ). (Langacker 1990: 321)

Langacker goes on to state that “only ‘grammaticalized’ (as opposed to ‘lexical’) elements can serve as true grounding predications” (1990: 322). Since speakers usually talk about ‘instances’ of things, rather than ‘types’, grounding is a necessary element of any speech act. So, to answer the question “What does a determiner mean?” we may say that it “means” [grounded entity], a conceptual structure that, as such, is associated with the construction [de + . . . + N] in Dutch. Therefore, the notion of grounding constitutes the necessary conceptual motivation for determiners to pop up in the stream of language being produced.
In addition, however, it accounts for their appearance independently from the conceptual ‘type’ designated by the noun, and it is this property that we need in order to account for pauses occurring after function words. If the two lexical elements can be activated independently, rather than the activation of one being dependent on the activation of another, then nothing prohibits a ‘cognitive’ pause intervening between them. If Langacker’s grounding theory is essentially adequate, the meaning of this schema (or construction) and its activation can be usefully phrased in terms of Jackendoff’s triple-theory of lexical items.8 The determiner represents a grounding function, taking an entity type as its argument, together constituting a [grounded entity]. This conceptual function can be represented as in (15):

(15) [entity ground [entity type ( ) ]]

Let us further assume that the Conceptual Structure is associated with Syntactic and Phonological Structures (CS, SS and PS, respectively) in the full lexical entry, as represented in (16). The associations are indicated by subscripts a, b, and c (written here as _a, _b, _c):

(16) CS: [entity ground_a [entity type ( ) ]_b ]_c
     SS: [NP [det de ]_a N_b ]_c
     PS: [[CL {de} ]_a [WORD { } ]_b ]_c

According to this conception, a noun phrase, such as “the circumstance”, comes about as a result of a process of ‘unerging’9 of independently retrieved lexical items: (16), and the one shown in (17).

(17) CS: [entity circumstance]_x
     SS: [N ]_x
     PS: [WORD {circumstance}]_x

The information in (17) tells us what type of entity is grounded; where it is to be inserted within the noun construction; and how it is to be pronounced. To summarize our proposal, the production process underlying noun phrases such as the one in (14) consists of retrieving two independent structures: (16) and (17), respectively.
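The idea of two separately stored triples can be made concrete with a small data-structure sketch. This is only an illustration under stated assumptions, not part of the proposal itself: the names LexicalEntry, combine and SLOT are hypothetical, the bracket notation is simplified, and string substitution merely stands in for whatever process actually joins the two entries. The point it shows is that the determiner schema and the noun are distinct objects, each pairing conceptual, syntactic and phonological structure, and that the noun phrase exists only once the open slot of the schema is filled.

```python
from dataclasses import dataclass

SLOT = "( )"  # hypothetical marker for the open position in a schema


@dataclass(frozen=True)
class LexicalEntry:
    """A triple of conceptual (CS), syntactic (SS) and phonological (PS) structure."""
    cs: str
    ss: str
    ps: str


# Simplified rendering of entry (16): the determiner schema with an open noun slot.
DE_SCHEMA = LexicalEntry(
    cs=f"[entity ground [entity type {SLOT}]]",
    ss=f"[NP [det de] {SLOT}]",
    ps=f"de {SLOT}",
)

# Simplified rendering of entry (17): the noun, stored independently of (16).
CIRCUMSTANCE = LexicalEntry(cs="circumstance", ss="N", ps="circumstance")


def combine(schema: LexicalEntry, filler: LexicalEntry) -> LexicalEntry:
    """Fill the open slot of the schema with the filler on all three levels."""
    return LexicalEntry(
        cs=schema.cs.replace(SLOT, filler.cs),
        ss=schema.ss.replace(SLOT, filler.ss),
        ps=schema.ps.replace(SLOT, filler.ps),
    )


noun_phrase = combine(DE_SCHEMA, CIRCUMSTANCE)
print(noun_phrase.ps)  # mixes Dutch "de" with the English gloss, as (16) and (17) do
```

Because the two entries are separate objects, nothing in this representation requires them to be retrieved at the same moment.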
As the retrieval of these structures may well be separated in time, this allows a pause to occur after a determiner as a result of either a process of working out the conceptual specifications of the entity, or of a lexical search. This assumption of two independently retrieved structures, as an assumption about language processing, is directly tied up with assumptions about the structure of a person’s linguistic knowledge. First, the cognitive status of determiners is not inherently different from that of lexical nouns, whereas IPG considers the former as output of a computational process and only the latter as retrieved from memory. Second, there can be immediate connections between aspects of conceptual structure and determiners; the latter are essentially meaningful. In brief: as far as determiners and nouns are concerned, there is no essential difference between grammar and lexicon, and structure may be retrieved from memory on the basis of conceptual content.10 This is not to say that this is the only possible route for the production of noun phrases; we would rather see this as an entirely empirical issue, not precluding the possibility that similar products (linguistic utterances) may in actuality result from multiple and variably used cognitive resources. In the present context, however, the crucial point is that the idea of grammatical schemas (partly specified by determiners) finds strong support in processing phenomena, viz. pauses in language production.

Infinitival conjunctions and prepositions

We will now turn to two other types of function words, in order to see whether the ideas put forth in the previous section may be generalized. This section is split up into two parts: the first deals with the production of a special type of conjunction in Dutch, the infinitival conjunction om; the second with the functional category of prepositions.
Om-clauses

The sentences in (18) both contain a non-finite clause introduced by the (infinitival) conjunction om (“(for) to”, “in order to”).

(18) a. Ik moge u dan ook thans verzoeken om deze notas met de grootste spoed aan de dienst over te leggen.
        I may you therefore now request for these invoices with the utmost speed to the service over to put
        “I therefore want to ask you now to hand these invoices over to the department with the utmost speed.”
     b. Misschien is het goed wanneer u een dezer dagen telefonisch contact met mij opneemt om hierover nader te overleggen
        Maybe is it good when you one these-gen days telephone-adj contact with me takes-up for here-about further to consult
        “It may be a good idea that you call me one of these days in order to discuss the matter further.”

The examples given here illustrate the most common uses of om-clauses in Dutch. (a) contains a complement clause om . . . over te leggen, while (b) contains an adjunct om . . . te overleggen. The ‘canonical’ grammatical construction of om-clauses can be captured as follows:

(19) [om + . . . + te + Vinf ]

Conceptually speaking, however, there are some important differences between the two types of clauses. As we will show later in this section, a generalization can be made concerning the function of om itself in these two types (om is not homophonous); the way in which the clauses relate to their matrix clauses, however, is quite different, and relevant to processing. In the case of a complement om-clause, the contents of the clause specify some aspects of the matrix phrase it is attached to, usually a mental space predicate (noun or verb of cognition or communication, e.g. believe to, request to, promise to), sometimes a causal predicate (e.g. cause to, attempt to).
Adjuncts, on the other hand, are connected to the main clause by means of an adverbial relationship which is not itself predicated in the main clause, e.g. means-ends. Thus in (a), the om-clause specifies the object of the verb verzoeken (i.e., it gives the content of the request), whereas in (b), the relation of the om-clause to the main clause is interpreted such that its contents (“discussing the matter further”) constitute the goal of “getting in touch with me”, which is expressed by the matrix clause. Thus the relationship between a non-finite complement clause and its matrix is that of part-to-whole (conceptually, as well as syntactically) – i.e., a matter of constituency; the relationship between a matrix and an adjunct is that of two parts constituting a whole: it is a coherence relation creating a discourse unit. Structural differences between both types of om-clauses testify to this conceptual difference. First, in the case of complements, om may be omitted (under certain conditions, cf. Van Haaften 1991). It is, in other words, optional for many of these clauses; but in adjuncts, such as in (b), om may never be omitted. Another difference concerns the order of clauses. In the case of adjuncts, the om-clause may be put in front position, a possibility that is ruled out for om-complements. With this in mind, we may say that the construction is associated with two conceptualizations, as indicated in (20) (braces {. . . } mark the enclosed element, either om or the entire clause, as optional; ‘→’ indicates a constituency relation and ‘↔’ a coherence relation).11

(20) a. CS: [WHOLE_a →_x [PART]_b ]_c
        SS: [matrix phrase_a + [{om}_x + . . . + te + Vinf ]_b ]_c
     b. CS: [MEANS]_a ↔_x {[END]}_b
        SS: [[matrix phrase_a ] + {[om_x + . . . + te + Vinf ]_b }]

As can be gleaned from (20), there is yet another difference in characterizing these two kinds of om-clauses. This feature is treated in detail in Schilperoord and Verhagen (1998) under the heading of conceptual dependency. Put briefly, om-complements represent some obligatory element of the matrix phrase. We can only conceptualize the event referred to by the verb verzoeken (“request”) if we can in one way or another construe the contents of what is being requested. On the other hand, the optionality of om-adjuncts reflects the fact that a sentence describing a certain action is in itself not necessarily interpreted as an instrument for reaching a goal in an event or state described in another clause. Put simply: one cannot make requests without some content, whereas one can get in touch with someone without this having to be thought of as an instrument for reaching some goal. This distinction leads us to the idea that the relation between a matrix phrase and an om-complement is to be located on the level of clause structure, whereas the relation between the matrix clause and an om-adjunct is to be located at the level of discourse structure (cf. Verhagen 2001 for a discussion of finite complementation as opposed to adjunction in these terms). In other words, om in om-adjuncts signals a coherence relation holding between two discourse segments (cf. Sanders, Spooren, & Noordman 1992). Having discussed the two constructions om participates in, the question now is: What does the X in both CSs mean? In other words: What concept motivates the occurrence of om in both constructions? In principle one could assume that, since there are two constructions, there are two oms as well. However, that would miss an interesting generalization.
As we said, in the case of adjuncts om marks the relation between the matrix and the adjunct as one of means to ends; the fact that om specifically introduces the purpose clause is no coincidence: this clause represents the proposition that is not (yet) realized. It turns out that in this way, a generalization can be made to the role of om in complement clauses. Although the issue has been, and still is, much debated (cf. Pardoen 1998: 419ff. and the references cited there), most analysts agree that om in complements also indicates a notion of ‘potentiality’. The role of om, as marking a purpose in the case of adjuncts, provides a specific instance of this concept; after all, a goal is a potential state of affairs that is yet to be realized. In complements too, the notions of ‘goal’ and ‘potentiality’ can be quite close. In (21), for example, the complement (“to never read anything by Voskuil again”) may be said to just express something potential that is not necessarily someone’s purpose, but in (22) the realization of the potential state of affairs (“to come home early”) is probably also the purpose of the person asking the question:

(21) Dit deed mij besluiten (om) nooit meer iets van Voskuil te lezen.
     “This made me decide to never read anything by Voskuil again.”
(22) Hij vroeg mij (om) vroeg thuis te komen.
     “He asked me to come home early.”

The possibility of om in these examples contrasts with (23):

(23) Hij beweert (*om) ziek te zijn.
     “He claims to be ill.”

In such cases, om is prohibited. The explanation is precisely that om marks its complement as a potential, non-realized state of affairs, which conflicts in this case with the meaning of claim, imposing an interpretation as ‘real’ on its complement. To conclude this point, the meaning of om can be captured as construing the potentiality of the state of affairs represented in the complement clause.
Thus as far as its conceptual import is concerned, there is only one om. However, it is also part of the Dutch speaker’s linguistic knowledge that this element can conventionally participate in (at least) two different types of conceptual relations: one a part-whole relationship (complementation); and the other a relationship between two parts (coherence). This provides us with a basis for believing that the presence of om is tightly related to the conceptual structure underlying its production, and not the result of the presence of the lexical head of the non-finite clause, as lexically driven models would have it. With regard to the distribution of pauses with respect to om-clauses that inherit the properties of the schemas in (20a) and (20b) respectively, IPG would predict no differences: pauses would occur mainly before om, but no differences as to pause frequencies are to be expected. Our construction-based approach, however, predicts a substantial number of pauses, possibly even the majority, to occur after the production of om. However, the differences between the schemas in (20a) and (20b) even allow for a further refinement of this prediction, especially with regard to pauses occurring before om. To see why, consider again the notion of conceptual dependency. In Schilperoord and Verhagen (1998), Langacker’s definition of conceptual dependency was used:

    D is conceptually dependent on A to the extent that A elaborates a salient substructure of D. (Langacker 1991: 436)

Note that this definition only characterizes schema (20a), but not (20b). In (20a), the ‘whole’-concept is conceptually dependent upon the ‘part’-concept; that is, upon the non-finite complement clause, because the latter elaborates a salient, in fact an essential substructure of the ‘whole’-concept.
The main clause in (20b) and its corresponding conceptual import is, however, not conceptually dependent upon the adjunct clause. Its contents may be conceptualized independently from the contents of the non-finite clause. And since pausing between discourse segments is a fairly regular phenomenon (Schilperoord 1996), our specific expectation is that the proportion of pauses before om-adjuncts will surpass the proportion of pauses before om-complements. Hence, the predictions are:

I. Pause proportions after om ≥ Pause proportions before om
II. Pause proportions before om-adjuncts ≥ Pause proportions before om-complements

In order to test these predictions, all cases of om-clauses within the corpus were selected, and labelled for their conceptual import (that is, whether it represented an instance of either (20a) (whole-part) or (20b) (means-end)). All cases of om-adjuncts in sentence initial position were excluded from the database, for reasons mentioned earlier (see the discussion preceding Table 2). This resulted in 89 om-complements and 32 om-adjuncts. In addition, pauses occurring either before or after om were counted, and proportions were calculated, the results of which are presented in Table 3.

Table 3. Numbers and proportions (between brackets) of pauses before and after om in complements and adjuncts

              complements   adjuncts    Totals
              (N = 89)      (N = 32)    (N = 121)
  before om   28 (.29)      21 (.46)    49 (.35)
  after om    68 (.71)      25 (.54)    93 (.65)

In accordance with our first prediction, the total number of pauses after om by far exceeds the number of pauses before om (χ²(1) = 13.63, p < .001). However, this holds only for complements (χ²(1) = 16.67, p < .001), not for adjuncts (χ²(1) < 1). The second prediction concerned the (relative) number of pauses before om in case of om-complements and om-adjuncts.
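The statistics for the first prediction can be recomputed from the counts in Table 3. The sketch below assumes that each test is a one-degree-of-freedom goodness-of-fit test comparing the before-om and after-om counts against an equal split; the text does not state the procedure, but this assumption yields the reported values. The function name chi_square_equal_split and the dictionary PAUSES are introduced here for illustration only.

```python
def chi_square_equal_split(count_a: int, count_b: int) -> float:
    """Goodness-of-fit chi-square (1 df) of two counts against an equal split."""
    expected = (count_a + count_b) / 2
    return ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected


# Pause counts taken from Table 3, as (before om, after om).
PAUSES = {
    "complements": (28, 68),
    "adjuncts": (21, 25),
    "totals": (49, 93),
}

for label, (before, after) in PAUSES.items():
    share_after = after / (before + after)   # proportion of pauses after om
    stat = chi_square_equal_split(before, after)
    print(f"{label}: after-om share = {share_after:.2f}, chi-square(1) = {stat:.2f}")
```

Under this assumption the statistic is 16.67 for complements, 13.63 for the totals, and below 1 for adjuncts, in line with the values given in the text. The sketch also makes visible that the proportions in Table 3 are taken over the number of pauses in each column, not over the number of clauses (N).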
A chi-square test revealed that the proportion of pauses before om-adjuncts exceeds the one before om-complements (χ²(1) = 3.85, p = .05).12 These data seem to indicate that in the production of om-complements the usual pause pattern is om -{pause}- non-finite clause, while the pattern characterizing the production of om-adjuncts is {pause}- om -{pause}- non-finite clause. Note that this marked difference between both instances of om-clauses could in no way have been predicted on the basis of lexically driven models, since according to such models, the different conceptual structures in which om-clauses occur are not allowed to play any role as far as producing the ‘functional’ category om is concerned; om would enter the picture as the result of a functorization procedure, triggered by the clausal head alone (the V, being non-finite). In other words, lexically driven models would have predicted no proportionate differences with respect to the two types of om-clauses. However, as the schemas in (20) clearly show, it is not the verb of the non-finite clause that makes the difference between the two kinds of om-clauses. We have now shown that, just as for determiners, a conceptual motivation for the presence of the conjunction om can be provided. Om marks the potentiality of the proposition expressed by its complement. We also showed that om participates in different constructions, in such a way that processing differences can be deduced depending on the kind of construction, and that these differences actually show up in systematically different patterns of pauses for these constructions.

Prepositions

We have now discussed two types of function words with different kinds of functions.
In the sub-section Constructions, we analysed determiners as providing ‘grounding’ information for (roughly) ‘things’ under discussion in a discourse and as activating a noun phrase schema; and in the sub-section Om-clauses, we characterized the element om as activating a non-finite clause schema and marking the proposition as potential, either as a part of a complementation schema or as a marker of a coherence relation – a difference that was clearly reflected in the pause data. In the course of the discussion, it also became evident that pause patterns around function words may actually differ significantly depending on ‘details’ of the precise conceptual and linguistic relationship between a specific function word and its environment. To conclude our discussion of the relationship between linguistic knowledge and linguistic processing, we will now turn to prepositions. The implicit claim of a language production model such as IPG, making a categorical distinction between content words (independent entries in the mental lexicon) and function words, is that elements of each of the two classes share some crucial properties that are not shared with elements from the other class (cf. Slobin 2001 for a critical discussion from the point of view of acquisition). We have already put forward several arguments against such a claim, but prepositions provide a particularly strong case against it. That prepositions pose a challenge for such a view could actually have been clear from the very beginning of IPG. Prepositions are markers of some kind of relation.
Sometimes those relations seem to be purely ‘grammatical’; in a construct like the transfer of the documents to the judge by the lawyer, the prepositions of, to, and by apparently just mark the grammatical relations in the nominal phrase (direct and indirect object, and subject, respectively); whereas in something like staying under water during a whole day the prepositions under and during express conceptual content. On the basis of this observation, Kempen and Hoenkamp (1987) divided the class of prepositions into two types: ‘short’ and ‘frequent’ prepositions on the one hand; ‘long’, ‘infrequent’ ones on the other. Short prepositions, such as of, to, in, by, are believed to serve grammatical functions,13 and therefore belong to the class of function words, which are supposed to be produced through the application of functorization procedures, as we have seen. Longer and less frequent prepositions (beneath, during, despite, etc.) are assigned to the class of content words expressing conceptual content, and thus are produced by means of lexicalization, in the IPG-model. In view of the preceding discussion we may conclude that this version of IPG predicts systematic differences in the distribution of pauses around prepositions. No pauses are to be expected after the short, grammatical prepositions, precisely because they result from functorization which follows lexicalization; but pauses might very well occur after the longer, lexical prepositions. However, in transcript (7), pauses can occur right after the short prepositions door (“by”), van (“of”), and bij (“at”), and we have little reason to believe that this would be unnatural or uncommon. So as far as we can see, the proposed division of the class of prepositions into grammatical and lexical subclasses lacks empirical support. However, prepositions as a class might still be said to occupy a kind of intermediate position, but in a different sense.
Our view on the cognitive status of function words as developed in the previous sections implies that we attribute two distinct characteristics to them: one is their conceptual import (e.g. marking grounding, or potentiality); the other the fact that they activate a particular linguistic schema, a grammatical construction of some kind. Especially in the class of prepositions, the precise ‘balance’ between these features can differ greatly: whereas some elements serve more as schema activators than as indications of some specific conceptual content, others may specify the conceptual content of part of a message in a highly particular way. In the former kind of cases, the ‘meaning’ of the element in question may be felt to be so vague as to be virtually absent, which may lead people to conclude it serves ‘only’ a grammatical function. In our view, however, this represents just one extreme end of a scale of differences between the relative weights of conceptual content and schema activation, the other extreme end being the case of names – i.e., elements evoking a certain conceptual constellation but not activating any particular linguistic schema. On this scale, prepositions can occupy a wide range of positions, but there is no sharp dividing line between one class of (purely) grammatical elements, and another of (purely) lexical ones. This approach also provides a basis for understanding the difference in pause patterns between prepositions and other function words in Table 2: there are more pauses before prepositions than after them, whereas it is the other way around in the case of determiners and conjunctions.
This may very well be a statistical result of the fact that many prepositions have at least some specific conceptual content, so that the production of a preposition more often reflects a conceptual choice, which may require some time, than does the choice of, for instance, a determiner, where the function of schema activation is relatively more important. On the other hand, prepositions, especially if their meaning is highly schematic as in cases such as of, can participate in different grammatical schemas, and thus give rise to different pause patterns in production. An illustration of this phenomenon was provided in the previous section. There are fewer pauses before om when it is part of a complementation construction than when it introduces an adjunct. Thus we actually should not expect any direct relationship between a particular word and the distribution of pauses during the production of this word; rather what we should look at is the construction of which it is a part on a specific occasion of use. In the case of prepositions, a phenomenon that is especially relevant is that of the so-called ‘fixed prepositions’ (as in prepositional objects, but also in other kinds of expressions). Consider expressions of the type “reply to X”, “think of X”, “talk about X”, and the like. In a view of linguistic knowledge as consisting largely of schemas that may occur in any degree of abstractness, these expressions are no more than simple illustrations of the point; the schemas may be retrieved from memory in their entirety. But in a view that distinguishes sharply between lexicalization and functorization, these expressions are much more problematic. Kempen and Hoenkamp (1987) implicitly treat listen to as a single lexical item, but they do not elaborate the point generally.
One important point is, in our view, the fact that such units are still analysable.14 That is, think in the combination think of still means think, and of functions as an introduction of a PP-complement, in the same way as it does in the start of the program. It is not clear at all how an approach with a strict separation of lexicon and grammar would allow for this. Another point is that there are many cases where the choice of a head noun or verb (such as think or reply) may strongly constrain the choice of preposition, but does not determine it fully (consider think of and think about, for example); again, in a ‘maximalistic’ schema conception, this does not pose a problem, but it is not at all clear how this could be accounted for with a strict separation of lexicalization and functorization. Thus, the strict division of prepositions into two subclasses with completely different processing properties, as proposed by Kempen and Hoenkamp (1987), does not seem viable. In retrospect, this should perhaps not come as a surprise. After all, the fact that they are all called prepositions is based on similarities in linguistic behaviour, which becomes something of a riddle when some prepositions are assigned a fundamentally different linguistic and cognitive status than others. Furthermore, the whole idea that superficial properties such as length and frequency of prepositions would correlate directly with a specific kind of cognitive status, seems highly implausible, both from a language-internal and from a comparative perspective. For example, would in and into in English have to be produced by two crucially different components of the Formulator – i.e., as a result of functorization and lexicalization, respectively? Or, would the same be true for na (“after”) in Dutch and after in English?
Positive answers to both types of questions seem unlikely a priori, so that they would require substantive empirical and theoretical support. But they are precisely what IPG suggests, though without much independent support. All in all, it seems to us that when considered carefully, the treatment of prepositions in a lexically driven model gives rise to exactly the kind of problems that show that the distinction between lexicalization and functorization, as processes that are supposed to be temporally separated in a systematic way, is untenable.

Conclusion

What we have presented in this paper represents, as usual, to a large extent work in progress. We nevertheless believe we have established some points of general interest. Our explicit aim was to bring together cognitive linguistic views on the nature of linguistic knowledge on the one hand, and evidence from actual language processing on the other. In this way, we have been able to propose some reasonable theoretical accounts for empirical observations of language-in-use (viz. pause patterns relative to function words). Admittedly, some of the ideas presented are still somewhat vague, and as such they may seem to lack the formal elegance and rigour that constitute much of the attractiveness of models such as IPG, positing a strict division of labour between declarative and procedural components of linguistic knowledge (‘lexicon’ as opposed to ‘grammar’; ‘content words’ as opposed to ‘function words’). But elegance and rigour are not all that matters, of course, and especially not if such models leave data obtained from actual language use unexplained. Despite some vagueness, we therefore claim to have demonstrated the following points.

1. The production of linguistic elements marking grammatical constructions (so-called function words) does not have to depend on specification of other linguistic elements, assumed to express the conceptual content of a message (so-called content words, or lexical entries); conceptual motivation can be provided for the production of alleged function words independently from their ‘lexical heads’ or neighbours.
2. Therefore, linguistic knowledge, as put to use in spontaneous processes of language production, does not involve a principled distinction between ‘functional’ and ‘lexical’ words.
3. The view of language production that emerges from this is that of a person assembling an utterance by putting together a number of symbolic units retrieved from long term memory, some of which are more schematic than others, and each of which is relevant to at least some aspect of the message to be conveyed; constraints on the way the units are put together derive from information in the units themselves, at least to a large extent.

By themselves, these ideas are not new, as even a brief glance at the history of cognitive linguistics shows. However, showing that one can use data from spontaneous language use to support these ideas is relatively new. Although it may sometimes be convenient, for expository purposes, to make a distinction between the linguistic system and language use, we would like to stress the importance of combining these points of view in linguistic research if we want to avoid either developing empirically inadequate theories or collecting theoretically empty data. As a final theoretical point, we would like to explicate one general consequence of these ideas. We think our results actually call for a serious reconsideration of the role of abstract notions such as ‘function word’, and abstract categories such as ‘Noun’, ‘Verb’, or ‘Preposition’ in theories of linguistic processing, and consequently in actual linguistic knowledge.
Models such as IPG are obviously strongly inspired by formal theories of grammar, and therefore take great pains to model the role of abstract grammatical categories independently from concrete semantic and phonetic considerations. Levelt’s (1989) Formulator thus models the lexico-grammatical stage of the production process as a computational process of manipulating abstract, formal categories. Meaning is strictly separated from grammar, with a lexicon as mediator, and at the other side phonetic properties of an utterance are also separated from the grammar. Grammatical operations are not conceived as operations on units of meaning and form – i.e., symbolic elements. But does an abstract, formally defined notion of ‘function word’ ever play a separate role in processing, independently from the conceptual characterization of the specific element involved? IPG, formally inspired as it is, in fact embodies the claim that this abstract notion has direct relevance in processing; similarly, it implies that equally abstract notions such as NP and PP, defining functorization procedures that essentially mirror very general phrase structure rules, have direct processing relevance. Our results, however, suggest a rather different picture. As we have seen, there are statistical patterns in the distribution of pauses around function words that do indeed tell us something about their cognitive status. But it has been clear from the start, first of all, that these patterns do not set function words apart, and second, that they are not the same for all subtypes of function words: prepositions differ significantly from determiners and conjunctions. Thus the notion “function word” does not really seem to have a unitary status in processing.
Subsequently, we found that the more specific notion “infinitival conjunction” does not have some unitary processing relevance either: the way om is produced in complementation constructions differs from its production process in adjuncts. We also argued that we should in fact not expect specific prepositions to have exactly the same kind of processing properties as other ones. In IPG, prepositions are divided into two subclasses with different processing properties – i.e., a lexical and a grammatical one, in an apparent attempt to retain the idea of immediate processing relevance of such abstract notions. But the more details of actual language processing are taken into account, the more it becomes evident that ultimately each element has its own set of processing properties (which may vary with the constructions in which it participates). Some elements will be more similar in their processing properties than others; these relations of higher and lower degrees of similarity may provide a partial organization (in a kind of network) of the elements, and some of the nodes in this network may correspond to categories such as “Noun” or “Preposition”, which are essentially no more than sets of elements of, to some degree, similar linguistic behaviour, but without such an abstract notion in itself ever being directly relevant in processing.15 In fact, as we have seen, the best way to conceive of the activation of a grammatical schema, e.g. the “NP-schema”, is as the result of the activation of a function word – e.g., a determiner, the selection of which is itself directly motivated by some aspect of conceptual structure. What we process in linguistic communication are conceptual categories and relations which are conventionally associated with particular patterns of form; many of these categories and the relations between them are ‘frozen’ to varying degrees, into what may be analysed as ‘constructions’. 
These specific constructions are what we use when we produce language.
Notes
* We thank the audiences at different occasions where we had the opportunity of presenting previous versions of this material for their feedback. We would especially like to thank Gerard Kempen, Ray Jackendoff, Sieb Nooteboom, and other participants in the Utrecht Congress on Storage and Computation of October 1998, as well as June Luchjenbroers and two reviewers of the present volume. Their comments have led to several changes and refinements. Naturally, the responsibility for all claims and speculations in this paper remains entirely our own. All correspondence concerning this chapter should be sent to: Dr. J. Schilperoord, c/- Linguistics, University of Tilburg, P.O. Box 91053, 5000 LE Tilburg, The Netherlands. Email: [email protected], or Prof. A. Verhagen, Research Institute Linguistics Leiden, P.O. Box 9515, 2300 RA Leiden, The Netherlands. Email: [email protected]
1. The superscript indicates the number of levels of a category in the sense of the X-bar notation (“N2” = “N-double bar”).
2. Kempen and Hoenkamp’s notation of syntactic structures differs from standard generative tree structures in that they explicitly specify at least some of the grammatical functions. This functional information, even though it may strictly speaking be redundant, must be represented in the structure at some point anyhow in order to function as a trigger for the relevant functorization procedures, in this case for example the ones that ultimately result in the insertion of the preposition of.
3. This question was raised by one of the reviewers of this paper.
Although we are not aware of research into the distribution of pauses with respect to function words in spontaneous conversation, incidental observations, including some reported in the literature (e.g., Clark 1996: 268) do suggest that similar patterns at least occur in conversation as well. See Schilperoord (2001) for various methodological and empirical aspects of dictation research.
4. Of course, pauses may have various other sources than cognitive ones, and this may endanger the validity of both our data and the conclusions drawn from them. In our research, we consider a pause ‘cognitive’ if it reflects conceptualization processes or lexical retrieval (see Boomer 1965; Schilperoord 1996). But what about other sources of pausing? How can we be sure to have kept such pauses out of the corpus? We should first distinguish between pauses that are involuntary, and pauses that language producers willingly insert into the stream of speech. These latter pauses occur by intent and often serve rhetorical or communicative purposes, i.e. they are oriented towards an addressee. Clearly, such pauses could not be considered cognitive in the above sense. However, the possibility of such pauses being present in our corpus can safely be ruled out because of the strictly monologic nature of the production circumstances. All letters in our corpus were dictated to a machine, not to secretaries taking notes. Hence, pauses cannot even have resulted from a friendly employer pausing for the typist’s convenience. But even if pauses can be considered involuntary, they still can be caused by various factors.
In terms of the IPG-model, pauses may be caused by all main components of the model, and therefore they may reflect conceptualization processes (preparing what to say), lexical-grammatical processes (retrieving lexical items), morpho-phonological processes (accessing word forms), monitoring processes (monitoring one’s own production), or they may originate from the workings of the articulator. Let us briefly consider these factors in turn. Obviously, the first two factors do not pose any problem since these are the factors that we are interested in in the first place. Pauses caused by the articulator were excluded from the corpus on grounds of pause duration. Dechert and Raupauch (1980) have calculated that ‘breathing’ pauses last .3 seconds at most, so we simply excluded pauses up to that length from the corpus. One should keep in mind that pauses lasting longer than .3 seconds may reflect articulatory activity, but in those cases one can be sure that this is not the only factor causing these pauses. In other words, pauses lasting over .3 seconds at least also originate from cognitive processing (see also Note 5). Then there may be morpho-phonological factors causing pauses manifesting the so-called tip-of-the-tongue phenomenon. Clearly, such pauses are not cognitive. However, as we devote a lengthy discussion to this possibility in Section 3 (p. 14), we leave this issue aside here. In addition, pauses may originate from the workings of the monitor. While producing texts, text producers constantly monitor their own production. They attend to various aspects of their actions, such as content, choices of phrasing, and so on. Monitoring becomes apparent from various types of self-repairs that are produced ‘on the fly’, that is, while producing speech, but the monitoring process may also cause pauses itself.
Once again, monitoring pauses are not the type of pauses that we are interested in here, so how can we be sure that monitoring does not interfere with conceptualization and lexical retrieval? To be honest, we cannot in any strict sense. However, there is some circumstantial evidence that in dictation, monitoring predominantly occurs at pre-established locations: major text structural locations such as prior to paragraphs and sentences. While it is clear that in spontaneous speech, the orientation of monitoring is mainly backwards, under the far more controlled production circumstances that we are dealing with here, its orientation is mainly forwards. That is, while dictating letters, text producers devote quite some attention to conceptually planning pieces of text in advance. One factor suggesting this is the fact that in dictation self-repairs are almost totally absent. Another point is the fact that pausing between paragraphs or sentences lasts considerably longer than pausing within sentences and clauses (see Schilperoord 1996, 2001), suggesting that at these locations preplanning the content of text parts takes place. Since the pauses that we are interested in are all located around function words, there seem to be good reasons for assuming that such pauses reflect the processes of refining conceptualization or retrieving lexical items. To conclude, our considerations thus far suggest the pauses in our corpus to be mainly caused by cognitive factors (conceptualization, lexical retrieval). Admittedly, other factors can never be ruled out completely, but in the absence of any compelling evidence that such factors correlate structurally with the relevant location types that we consider in this chapter, we may safely assume that these other factors are randomly distributed, and hence do not jeopardize the validity of the data.
Finally, we would like to stress the fact that pausing in language production is an empirical phenomenon, and that pausing parameters, such as pause locations, can be analyzed independently from any pre-established theoretical point of view, be it computational psycholinguistics, or cognitive linguistics. What matters, in our view, is how to arrive at a proper account of this issue.
5. In psycholinguistics, .3 seconds is the generally accepted ‘cut-off’ value for a pause to be taken as reflecting some cognitive activity, rather than as resulting from muscular activities of the vocal tract. See for example Dechert and Raupauch (1980).
6. This may have something to do with the somewhat ambivalent status of prepositions with regard to their category status: lexical or functional. See Section 4.2 for further discussion, and also Schilperoord (1996), Schilperoord and Verhagen (1997).
7. In case a reader wonders what is ‘fixed’ about these expressions, compare them with the phrases this cup of coffee and a bathroom (e.g. in Would you like this cup of coffee? or Where can I find a bathroom, please?).
8. In Jackendoff’s (2002) theory, lexical items are viewed as correspondence rules between semantic, syntactic and phonological information. Moreover, a lexical entry may be both larger and smaller than an individual word. Idioms are a case in point, but a plural suffix, as an item licensing the formation of plural forms, is also a lexical entry. These assumptions are shared by all present construction based approaches to grammar. Croft (2001) may be seen as arguing against a separate level SS for syntactic information, essentially because there is no way to define the necessary global syntactic notions (“noun”, etc.) in a non-circular fashion, independently of (language) specific constructions.
In his view, the information usually considered syntactic reduces to schematic aspects of form and to the symbolic relation between form and meaning. On the other hand, Croft’s view seems to allow for language specific distributional classes to be included in the specification of the form of a construction. As this issue is not directly relevant for the present discussion, we use the more conservative notation here. To avoid misunderstanding: we use Jackendoff’s formalism only for reasons of convenience. As he has repeatedly and rightly pointed out himself, the formalism does not assume any particular theoretical point of view.
9. This is the term used in Jackendoff (2002); another term used for essentially the same concept is ‘unification’. See Goldberg (1995), among others, for discussion of the way this notion fits into the theory of Construction Grammar.
10. While the possibility of a direct relationship between a function word and conceptual structure is a necessary condition for language production as we see it, a reviewer suggested that it might be a sufficient condition. For example, the production of a determiner such as the could be motivated by the presence of the feature +accessible in the conceptual structure, but it need not activate the structure “det–N”, which might still come from the head noun. Being lexically driven or not and being structure building or not are in principle separate characteristics of a production model. On logical grounds, such a possibility cannot be foreclosed, obviously. However, it is first of all not a part of IPG, and second, we have explicitly based our proposal on the constructional approach. The analyses of function words that we are aware of, all share the view that precisely what makes these elements “grammatical”, is the fact that they do not function independently (they are “bound forms”), and are necessarily associated with other, variable linguistic material.
We thus continue to assume that activation of a function word by a feature of the conceptual structure also activates the associated schema.
11. For ease of exposition, we conflated the two formal representational levels S[yntactic] S[tructure] and P[honetic] S[tructure]. But see also Note 8.
12. The difference between the proportions of pauses after om failed to reach significance (χ²(1) = 3.31, p > .10).
13. Confusingly labelled ‘clitics’; they are not pronominal and they are also phonologically independent.
14. Recall that analyzability does not imply compositionality (in the sense of ‘having been composed’). If elements can be distinguished within a linguistic unit (analyzability), it does not follow that the unit has been constructed out of these elements. Even obvious idioms, necessarily stored as units, may exhibit analyzability: in spill the beans, the element spill corresponds to the semantic component divulge and the beans corresponds to information. For a recent discussion, moving in a somewhat different direction, cf. Croft (2001: 180–184).
15. This position resembles the one defended for linguistic theory in general on the basis of methodological, typological and analytic considerations in Croft (2001), and from the perspective of acquisition in Slobin (2001). In a sense, our analysis provides an additional argument from processing for the hypothesis that global structural notions do not really have explanatory power, and are not primitive but rather based on similarities between specific constructions (cf. Verhagen 2002: 420/421).
References
Ariel, Mira (1988). Referring and accessibility. Journal of Linguistics, 24, 65–87. Baddeley, Alan (1990). Human memory. Theory and practice. Hove: Lawrence Erlbaum Associates. Boomer, David S. (1965). Hesitation and grammatical encoding. Language and Speech, 8, 148–158. Carroll, David W. (1999).
Psychology of language (3rd ed.). Pacific Grove, CA: Brooks/Cole Publishers. Clark, Herbert H. (1996). Using language. Cambridge: Cambridge University Press. Croft, William (2001). Radical construction grammar. Syntactic theory in typological perspective. Oxford: Oxford University Press. Dechert, Herbert W. & Marius Raupauch (Eds.). (1980). Temporal variables in speech. Studies in honour of Frieda Goldman-Eisler. The Hague: Mouton. Erman, Britt & Beatrice Warren (2000). The idiom principle and the open choice principle. Text, 20, 29–62. Goldberg, Adele (1995). Constructions: A construction grammar approach to argument structure. Chicago/London: University of Chicago Press. Haaften, Ton van (1991). De interpretatie van verzwegen subjecten [The interpretation of understood subjects]. Diss. VU Amsterdam. Dordrecht: ICG Printing. Jackendoff, Ray (1990). Semantic structures. Cambridge, MA: MIT Press. Jackendoff, Ray (1995). The boundaries of the lexicon. In M. Everaert et al. (Eds.), Idioms: Structural and psychological perspectives (pp. 133–167). Hillsdale, NJ: Lawrence Erlbaum Associates. Jackendoff, Ray (1997). The architecture of the language faculty. Cambridge, MA: MIT Press. Jackendoff, Ray (2002). Foundations of language; Brain, meaning, grammar, evolution. Oxford: Oxford University Press. Kay, Paul & Charles J. Fillmore (1999). Grammatical constructions and linguistic generalizations: the What’s X doing Y? construction. Language, 75, 1–33. Kempen, Gerard & Edward Hoenkamp (1987). An incremental procedural grammar for sentence formulation. Cognitive Science, 11, 201–257. Langacker, Ronald W. (1990). Concept, image and symbol. The cognitive basis of grammar. Berlin/New York: Mouton de Gruyter. Langacker, Ronald W. (1991). Foundations of cognitive grammar, Volume II, Descriptive application. Stanford: Stanford University Press. Levelt, Willem J. M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press. Levelt, Willem J. M. (1999). 
Producing Spoken Language: a blueprint of the speaker. In P. Hagoort & C. W. Brown (Eds.), The Neuro-cognition of Language (pp. 94–122). Oxford: Oxford University Press. Pardoen, Justine A. (1998). Interpretatiestructuur. Een onderzoek naar de relatie tussen woordvolgorde en zinsbetekenis in het Nederlands. [Interpretation Structure. A study of the relation between word order and sentence meaning in Dutch.] Amsterdam/Münster: Stichting Neerlandistiek VU/Nodus Publikationen. Sanders, Ted, Wilbert Spooren, & Leo Noordman (1992). Towards a taxonomy of coherence relations. Discourse Processes, 15, 1–35. Schilperoord, Joost (1996). It’s about time. Temporal aspects of cognitive processes in text production. Amsterdam/Atlanta: Rodopi. Schilperoord, Joost (2001). On the cognitive status of pauses in discourse production. In T. Olive & M. C. Levy (Eds.), Contemporary tools and techniques for studying writing (pp. 60–89). Dordrecht, Boston, London: Kluwer Academic Publishers. Schilperoord, Joost & Arie Verhagen (1997). Functionele elementen in een cognitief perspectief. Evidentie uit taalproductie [Functional elements in a cognitive perspective. Evidence from language production]. Nederlandse Taalkunde, 3, 223–248. Schilperoord, Joost & Arie Verhagen (1998). Conceptual dependency and the clausal structure of discourse. In J. Koenig (Ed.), Discourse and cognition; Bridging the gap (pp. 141–165). Stanford, CA: CSLI Publications. Slobin, Dan I. (2001). Form-function relations: How do children find out what they are? In Melissa Bowerman & Stephen C. Levinson (Eds.), Language acquisition and conceptual development (pp. 406–449). Cambridge: CUP. Verhagen, Arie (2001). Subordination and discourse segmentation revisited, or: Why matrix clauses may be more dependent than complements. In Ted Sanders, Joost Schilperoord, & Wilbert Spooren (Eds.), Text representation.
Linguistic and psycholinguistic aspects (pp. 337–357). Amsterdam: John Benjamins. Verhagen, Arie (2002). From parts to wholes and back again. Cognitive Linguistics, 13, 403–439.
chapter
Word recognition and sound merger
Paul Warren
Victoria University of Wellington
Theories of spoken word recognition largely assume stability in the lexical representations onto which the input signal is mapped. Speaker-, style- or situation-dependent variability in the input is accounted for by an appropriately sensitive (or desensitized) pre-lexical analysis. In contrast to the primarily perceptual accounts provided for such variability, this chapter considers the need for a more cognitive account for variation arising from sound change. The particular case under consideration is the merger-in-progress of the front centering diphthongs in New Zealand English. This chapter reviews key research on the realization and comprehension of words containing these diphthongs, and discusses the theoretical implications for theories of lexical access and representation that derive from the current fluid state of these vowels.
Keywords: word recognition, sound change, vowel merger, New Zealand English
1. Introduction
Most psycholinguistic models of spoken word recognition assume that the process of recognizing a word normally involves the extraction of acoustic phonetic information from the speech signal and its utilization in some lexical search procedure, together with the exploitation of contextual information to constrain this search. This procedure has been the object of a variety of research questions, concerned with the processes and representations involved in analyzing the input (e.g. Klatt 1989; Marslen-Wilson & Warren 1994; Cutler 1990), as well as with the relationships between form-based and content-based access, or between ‘bottom-up’ and ‘top-down’ processing (Tyler 1990).
The central questions for this paper concern how the recognition system copes with variability in the form of the input, long acknowledged as a potential problem for the successful recognition of a word. While the relative lack of invariance has long plagued speech researchers and engineers (Stevens & Blumstein 1981), it is important to recognize that variation can be informative rather than a hindrance to recognition; differences in the articulation of sounds at different word positions can for instance be exploited by the recognition system as indicators of important phenomena such as word boundaries (cf. Church 1987), and of course differences between speakers are informative about aspects of their identity and status. For the process of spoken word recognition, however, it is important that variation in the input signal does not disrupt the correct identification of a word. The source of variability that we will be considering in this paper is the merger of the /iə/ and /eə/ diphthongs in New Zealand English (NZE), sometimes referred to as the ear/air merger. In the remainder of this introduction, I will outline possible consequences of sound mergers in terms of the neutralization of distinctions between words. I will then review relevant recent findings from the study of the ear/air merger, adding some new analyses that highlight issues for word recognition. Finally, I will consider some of the implications of the merger for the process of spoken word recognition.
Complete neutralization
If the distinction between two sounds disappears completely, in all environments, then a range of words, which differed previously only in that one of them contained one of the sounds where the other had the second, will become homophones. Such homophones may either contain one of the two phonemes previously distinguished in the language, or they may be collapsed onto some intermediate form.
In either case, once the relevant adjustments have been made to the phonemic system, the recognition mechanism will find itself in a familiar state, since homophony is already widespread throughout human language. Research on the recognition of homophones suggests that all meanings of a lexically ambiguous word such as bank are initially accessed when the word is heard, even in a strongly biasing context. However, selection of the intended meaning is then rapidly achieved as the various meanings are assessed against the context (e.g. Swinney 1979; Tanenhaus & Lucas 1987). Our experience of homophony is certainly such that it rarely causes difficulties in processing.
Predictable partial neutralization
A neutralization is partial if it occurs in some linguistic contexts but not in others. In many cases the neutralization is predictable, and is conditioned by the immediate phonetic context. Thus, in reasonably fast speech, the sequence bad girl may not be distinguishable from bag girl because of assimilation of the final /d/ of the first word to the initial /ɡ/ of the second word (though see Nolan 1992, who finds some residual articulatory and perceptual evidence for an alveolar gesture in such sequences). Gaskell and Marslen-Wilson (1998) provide evidence that the perceptual and word recognition system can tolerate such surface variations as long as they occur in phonologically viable contexts, so that processes of phonological inferencing can help ensure the successful recognition of the intended word.
Similarly, though they were not considering the issue of phonemic merger, Mann and Repp (1981) found that listeners will compensate for the effects of phonetic context, so that a stimulus ambiguous between /t/ and /k/ will be interpreted as the former after /ʃ/ but as the latter after /s/, taking into account the fact that /s/ but not /ʃ/ makes a following /k/ more /t/-like. In both of these examples, the conditioning context can be exploited by the listener in their interpretation of the acoustic input, as it were ‘unravelling’ the effects of phonetic context. In other cases, the partial neutralization of a distinction may result in greater dependency on higher-level contextual information for ambiguity resolution, just as with homophones. Thus in many dialects of English, /t/ and /d/ are both produced as an alveolar flap in intervocalic positions, so that latter and ladder may become auditorily indistinguishable (Wells 1982: 249). In situations like these, presumably, both latter and ladder will be activated on the basis of partial match with the input (or via a phonological rule that allows either to match the flapped form), and selection between them will be based on the additional contextual information, such as the overall meaning of the utterance.
Unpredictable partial neutralization
Sound mergers resulting in the types of neutralization discussed above will present no new problems for the word recognition system, which already needs strategies for dealing with homophony and predictable partial neutralizations. Our interest in this paper is in a rather different situation from these, in which a phonemic contrast appears currently to be undergoing merger so that the distinction is in a state of flux. Of course, a merger that is in the process of taking place could nevertheless result in some predictable distributions at a particular point in its progress.
For instance, an ongoing process of phonological merger might be reflected in a previous distinction being lost for certain word pairs, while still maintained for others. Through a process of lexical diffusion, the merger may subsequently affect all instances of the sounds in question. Before such a time, however, there are likely to be word pairs that are still distinct, and others that have become homophones, and may be treated as such by the word recognition process. In other situations, if two phonemes are merged in predictable phonetic contexts, these contexts will provide information that can be used in retrieving the underlying form. An investigation of the implications of sound merger for word recognition thus includes an examination of whether the merger results in new homophones, and of whether phonetic contexts can be seen to motivate the merger. But simply looking for homophones or for phonetic motivation of merger involves taking a somewhat static snapshot view of language change, which assumes that there are discrete stages at which merger has taken place completely for some forms and not at all for others. The reality is generally quite different, with changes taking place more gradually and with some communities or groups of speakers showing a greater tendency to merge than others. This being so, the word recognition system clearly needs to be sensitive to a wide range of variables, potentially including extra-linguistic factors such as speaker identity. The next section isolates some of the variables that are influencing the ear/air merger.
2. Production data on the ear/air merger
Regional differences
Various studies of this diphthong merger have been carried out in recent years in New Zealand, the most extensive being a long-term study of adolescents in Christchurch (e.g.
Gordon & Maclagan 2001, henceforward G&M) and Holmes and Bell’s social dialect study of Porirua near Wellington (Holmes & Bell 1992, henceforward H&B). Both studies agree that the distinction between ear and air diphthongs is becoming less clear. However, they differ in the claims they make about the pattern of the merger. H&B argue that the two diphthongs are now sharing the vowel space in which distinct forms used to be articulated, so that there is an eair vowel that ranges from /iə/-like to /eə/-like forms. The precise form of the merger appears to be in part dependent on speaker age. Thus H&B studied, amongst other groups in their Porirua survey, old (70–79 years), mid-aged (40–49) and young (20–29) Pakeha speakers (New Zealanders of European descent). Compared with the oldest group, they found that the mid-aged speakers show a shift towards /iə/, but the younger speakers show a movement in the opposite direction, towards /eə/. This is in apparent contradiction to G&M, who have surveyed Christchurch adolescents every five years since 1983, and find increasing evidence of a merger on /iə/, a conclusion that is supported by acoustic analyses by Watson et al. (1998). However, Maclagan and Gordon (1996) pointed out that their 1983 13–14 year-olds, who are near contemporaries of the 20–29 year-olds sampled by H&B in 1989–1990, showed the same preference for /eə/. They attribute this to a perceived stigmatisation of the /iə/ form for this cohort of speakers. In addition, it is clear that there are a number of methodological differences between these two studies, not least of which are the speech styles sampled. Thus G&M’s study focused on read materials, while H&B included a larger sample of conversational speech. Starks, Allan and Kitto (1998) present data from a large number of subjects taped in the Auckland area.
They also find a difference across age groups, apparently supporting H&B’s claims for increasing movement towards /eə/ amongst younger speakers. While /eə/ forms in their sample show little merger, /iə/ forms show movements towards /eə/ at rates that increase with decreasing age. However, Starks et al. only sampled the two words air and ear, and the first of these was actually given in the spoken question used to elicit the word. Thus their data may not be representative of what is happening to these diphthongs in the Auckland area.
Age differences
Given the sampling and methodological differences between the studies summarized above, the remainder of this section will focus on just one, H&B’s survey, for which the raw data were made available to the current author. It has already been mentioned that H&B found shifts to /iə/ for mid-aged and to /eə/ for young speakers. This finding was based on an analysis in which they collapsed an auditory scale covering [i], [i̞], [e̝] and [e] (i.e. /i/, lowered /i/, raised /e/ and /e/ respectively) into a binary ear ([i], [i̞]) / air ([e̝], [e]) distinction. Their analysis shows that mid-aged speakers pronounced 22% of air words with /iə/, but only 8% of ear words with /eə/, while younger speakers pronounced 15% of air words with /iə/, and 35% of ear words with /eə/. These figures also show an increasing tendency for instability. Using the auditory values from H&B’s raw data (i.e. using their initial four-point scale), I have derived median starting points for the diphthongs in ear and air words from a range of tasks from word list reading to conversation. The results, by age group for ear and air words, are shown in Figure 1.
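The two summaries described above, the binary ear/air collapse and the median starting point on the four-point scale, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the numeric coding of the scale steps, the function names and the tokens are all invented here and are not taken from H&B's data or procedure; only the ordering of the four steps comes from the text.

```python
from statistics import median

# Hypothetical numeric coding of the four-point auditory scale; only the
# ordering of the steps (i > lowered i > raised e > e) is taken from the text.
SCALE = {"i": 3, "lowered i": 2, "raised e": 1, "e": 0}

# Binary collapse: the two closer steps count as ear-like, the rest as air-like.
EAR_LIKE = {"i", "lowered i"}


def median_start(tokens):
    """Median starting point of a list of auditory labels, in scale units."""
    return median(SCALE[label] for label in tokens)


def share_ear_like(tokens):
    """Proportion of tokens that fall in the ear-like half of the scale."""
    return sum(label in EAR_LIKE for label in tokens) / len(tokens)


# Invented tokens for air words in two age groups (not real data).
air_words = {
    "old": ["e", "e", "raised e", "e"],
    "mid": ["e", "raised e", "lowered i", "raised e"],
}

for group, tokens in air_words.items():
    print(group, median_start(tokens), share_ear_like(tokens))
```

With an even number of tokens, `statistics.median` averages the two middle values, so a median can fall between two scale steps; whether that is appropriate for an ordinal scale is a design choice that the source does not discuss.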
Each value plotted here represents the median starting point for the ear or air diphthong for a particular age group, and is based on between 904 and 1191 data points (the variation in sample size being largely due to unequal frequencies of occurrence in the conversational data). If age group differences reflect change over time, then the figure shows a small early shift of /eə/ (air) tokens to a closer (higher) starting point (compare old and middle-aged groups), followed by a significant lowering of the /iə/ vowel (i.e. ear, from mid to young groups). In terms of the scale of the changes, there seems to be support for Starks et al.’s (1998) finding of a greater change over age groups for /iə/ than for /eə/. Although speaker age is the dominant factor in their analysis, H&B highlight other speaker-group differences in the distribution of the diphthongs, pointing out that Pakeha women appear to be in the vanguard of change. Thus, the Pakeha women in the middle-aged group show the greatest shift towards /iə/ for air words, and the Pakeha women in the young group show the greatest movement in the opposite direction for ear words. Watson et al. (1998) also report a more complete merger for women than for men in their acoustic analysis.
Figure 1. Median starting points for ear and air diphthongs, by speaker age, based on data from Holmes and Bell (1992). [Plot: starting point (i, i–, e+, e) against age group (old, mid, young), with separate lines for EAR and AIR.]
Linguistic constraints on the merger
Using median starting points computed for ear and air in H&B’s Porirua data, and the distance between these starting points, I consider now other factors that may condition or constrain the merger, and which might thereby influence the word recognition process.
The main points of interest are these. First, are there any word pairs that are so consistently merged that they are effectively homophones, and so are most likely to be processed in the same way as lexically ambiguous words like bank? Second, are there any further linguistic factors – such as phonetic context – that might be used by listeners as indicators that some process of change has taken place, thus helping them to recover the underlying form of a merged or merging diphthong?
Evidence for homophony
To look for evidence for homophony, consider H&B’s minimal pair words: ear/air, beer/bare, kea/care (kea is a native New Zealand parrot), cheer/chair, fear/fair, hear/hair, peer/pair, really/rarely, sheer/share and spear/spare. The /iə/-/eə/ distance data, for all age groups together, are shown in Figure 2. In this figure, one unit represents the distance between neighbouring steps on H&B’s auditory scale (e.g. between starting points for these diphthongs that were labeled as [i̞] and [e̝]). Each plotted value corresponds to the distance between medians based on at least 73 and up to 76 ear and air pairs, the variation in sample size being due to the exclusion of some tokens as being monophthongs.
Figure 2. Distance between median starting points for ear and air words, based on data from Holmes and Bell (1992). [Plot: /iə/-/eə/ distance, in scale units, for each minimal pair.]
Two word pairs in particular, cheer/chair and sheer/share, show closer values than the others – inspection of the medians for the members of these contrasts shows that this is largely due to a closer (higher) articulation for the air form (to which we will return in the discussion of phonetic influences below). At the other end of the scale, ear/air and really/rarely have greater ear/air distances than the other sets.
For each of these pairs, the greater distance between the two median values arises because more extreme pronunciations are kept for both forms. Interestingly, of the pairs of words examined, really/rarely are probably the most likely to appear in identical contexts, and with opposite meanings, as in I really/rarely like the ice-cream from that dairy. There may therefore be greater pressure on speakers to keep these words distinct.

What these data clearly show is that the word pairs in question are not homophones in current New Zealand English. In fact, median tests show that all pairs have reliably distinct starting points for ear and air (χ² at p < 0.01). Some pairs, however, are clearly less distinct than others. These word differences will be discussed further in later sections.

Homophony across age groups

Does the change over time reflected in the age group comparison affect some words more than others? Figure 3 below shows the distances between median values for the three groups of subjects, with each plotted point representing the difference between medians based on between 21 and 26 tokens.

[Figure 3. Distance between median starting points for ear and air words, by age group, based on data from Holmes and Bell (1992). Vertical axis: /iə/-/eə/ distance; horizontal axis: word pair; series: old, mid, young.]

The effect of speaker age is quite clear. Older speakers distinguish all word pairs, younger speakers hardly distinguish them at all, and mid-aged speakers are between these two. The two older groups are quite similar overall, but differ more obviously for word pairs in /ʃ-/, /tʃ-/ and /k-/, where the mid-aged group show a smaller, though still significant, ear/air distance. As with the overall picture, this difference is almost entirely the result of a higher starting point for the air tokens.
For the younger speakers, it seems that many of the word pairs may effectively be homophones; the distance between the starting points of most pairs, at less than one point on the scale, is smaller than that between, e.g. [e] and a raised [e̝]. In addition to the really/rarely pair discussed above, which is as distinct for these younger speakers as it is for the older groups, there is also a larger difference for the /k-/ pair, which is due to a higher starting point for the ear token, i.e. for the bird-name kea. It is possible that the fact that this word is of Māori origin results in a clearer distinction being maintained by this younger group, though G&M note that none of their surveys distinguished kea and care, which were both consistently pronounced with /iə/. These two pairs, really/rarely and kea/care, are the only pairs for which the younger group in H&B’s sample shows a significant ear/air difference in the median tests (χ² at p < 0.01).

These data suggest quite strongly that although the word pairs are distinct for most older speakers, they will become homophonous for the New Zealand population as the younger generation gets older, as long as following generations exhibit the same absence of a consistent distinction between the two diphthongs. And the data in Figure 2 suggest that cheer/chair and sheer/share are leading the way.

Linguistic context

There are a number of aspects to the question of whether the realisation of the diphthongs is in any way dependent on the linguistic context, involving sentence, lexical and phonetic levels. H&B’s survey included conversational data, word lists, prose reading and minimal pair lists. This range of tasks allows us to address the question of whether speakers are more likely to merge the vowels when there is a sentence context that could also serve to disambiguate, i.e.
where the information load of the vowel contrast is reduced. In fact, there is very little difference in the /iə/-/eə/ distances in the four tasks; the prose reading task, where there is a sentential context, produces a slightly lower median difference (2.03) than the minimal pair task (2.26), but does not differ from the word list task (2.00). The conversational data, involving a different speech style as well as including sentence contexts, have a somewhat larger difference (2.49). There would appear to be no clear evidence for sentential contexts resulting in a greater degree of merger.

Lexical frequency

A factor that may influence the incidence of merged forms is the frequency with which individual words are used. In particular, if the change is proceeding through the language by lexical diffusion, then high frequency words may be affected earlier than low frequency words. Conversely, the opposite prediction results from an assumption that words that are used more often have more stable pronunciations and so are less likely to change. The frequency values for words in the minimal pair set were compared with their median starting point values, for ear and air words separately. Frequency counts from the Wellington Corpus of Spoken New Zealand English were used in preference to published frequency norms, since these have been collected for other English varieties. Since the different age groups show different tendencies as far as the merger is concerned, correlation coefficients were computed separately for each group. These showed that neither of the two older groups shows any clear correlation of lexical frequency and sound change, while the younger group has higher starting points for higher frequency ear words (with a correlation of 0.518). However, this is due largely to the fact that really is a high frequency word in the corpus, and also has a high starting point for all speaker groups.
Since really, as we have seen, may be kept distinct because of the extent of potential confusion with rarely, there is little evidence of a causal or constraining relationship between lexical frequency and extent of merger.

Minimal contrast

A further lexical factor that may influence the merger is whether a word stands in minimal opposition to another word, distinguished only by the ear/air diphthong. Gordon and Maclagan conjecture (1990: 140) that sound change may be able to proceed more quickly when a word is not perceived as part of a minimal pair. However, the median starting points in H&B’s data show no evidence of this, the overall distance between ear and air being very similar for minimal pair words (2.25) and words not in minimal pairs (2.42).

Phonetic context

The analysis of the minimal pair set in Figures 2 and 3 showed that there may be an effect of phonetic context, in the form of the consonant immediately preceding the diphthong. The data presented there suggest on closer inspection that the relevant phonetic context might be the place of articulation of the preceding consonant. It was noted that for the mid-age group in particular the distance between the ear and air vowels was smallest for words beginning in postalveolar /ʃ/ and /tʃ/ and velar /k/, mainly because of a higher starting point for air words. In other words, for these speakers the air vowel is higher after a consonant with a high front(ish) tongue articulation. Since /k/ is likely to be fronted before these front vowels, these three consonants can all be characterized as having the phonetic place feature [+coronal]. To examine the influence of coronal place of articulation on the ear/air vowels, the words for the minimal pair set were grouped according to whether they involved a coronal consonant (which also included /s/ in spear/spare).
This grouping was carried out for each age group of speakers, but excluded really/rarely, which were discussed in the context of other factors above. Figure 4 shows the effect of consonant place of articulation on the height of the air vowel. Each data point is based on a median starting point for between 75 and 84 tokens (for the coronal contexts) or between 145 and 168 tokens (for the non-coronals; the variation in sample size is again due to the exclusion of some tokens for some speakers on the basis of their being monophthongs). A median test including all age groups showed the effect of coronality to be significant at p < 0.001. In separate median tests for each age group, coronality had a significant effect on air height for old speakers (p < 0.01), a greater one for mid-age speakers (p < 0.001), but no effect at all for young speakers. Data for the ear vowel are not presented, since this was not influenced by the value of the [coronal] feature of the preceding consonant.

[Figure 4. Median starting points for air diphthongs, by age group and place of articulation of preceding consonant, based on data from Holmes and Bell (1992). Vertical axis: starting point (i, i–, e+, e); horizontal axis: age group (old, mid, young); series: coronal, non-coronal.]

Summary of production data

The data reviewed in the preceding sections show quite clearly that the ear/air merger is a change in progress, as reflected in the different patterns of realisation across the age groups in H&B’s survey (Figure 1). Speakers in the older group still distinguish the two diphthongs with reasonable consistency. Those in the mid-aged group also distinguish them, but exhibit a raising of air, particularly after coronal consonants (see also Figure 4). In marked contrast, the youngest speakers show a lowering of ear, in addition to a raising of air compared with the old group. However, data from G&M and from Watson et al.
suggest that this group may be exceptional, and that the overall trend is to /iə/ (see also Warren & Hay 2005; Hay et al. 2006).

For H&B’s mid-aged group, then, there is the suggestion of phonetic conditioning on air raising. The raising of the air vowel may be part of the general chain-shift raising of the New Zealand front vowels (Bauer 1992; Gordon & Deverson 1985), as pointed out by G&M. The re-analysis of H&B’s mid-aged data presented here suggests that the shift was initiated earlier in those environments in which it is phonetically conditioned.

The youngest speaker group in H&B’s survey shows significant ear/air differences only for two of the minimal pair sets, really/rarely and kea/care. However, the data show that for all pairs (except hear/hair) there is a residual difference, since the median value for the ear vowel is in each case higher than that for the air vowel, although their distributions clearly overlap considerably.

The review of other linguistic factors leads to the further conclusion that ear and air forms are distinct words for the older two groups, but that most are effectively homophonous for the youngest group. One constraint on homophony appeared to come from considerations of ambiguity – really and rarely, more than any other pair, have the potential to occur in identical environments with opposite meanings, and without being disambiguated by context. However, with the exception of this pairing, there was little evidence that merger is occurring less rapidly in minimal pairs than in other cases. In addition, lexical frequency and the general availability of a sentential context appeared to have no constraining or motivating effect on the extent of the merger.
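
The median-based comparisons used in the preceding sections can be made concrete with a small sketch. The Python code below is an illustration only: the integer coding of the auditory scale, the function names and the token values are all assumptions made for the example, and are not H&B's data or their procedure. The test shown is a plain two-group median test on a 2 x 2 table, without continuity correction.

```python
import math
from statistics import median

# Hypothetical coding of auditory starting points (an assumption for this
# sketch): a higher number means a closer (higher) starting point.
ear_tokens = [4, 4, 3, 4, 3, 4, 4, 3]   # invented values, not survey data
air_tokens = [1, 2, 1, 2, 2, 1, 3, 2]   # invented values, not survey data


def median_distance(ear, air):
    """Distance between the median starting points of two token sets."""
    return median(ear) - median(air)


def mood_median_test(a, b):
    """Two-group median test: chi-square (1 df) on counts above / not above
    the grand median. Returns (chi2, p). No continuity correction."""
    grand = median(list(a) + list(b))
    table = [
        [sum(x > grand for x in a), sum(x <= grand for x in a)],
        [sum(x > grand for x in b), sum(x <= grand for x in b)],
    ]
    row = [sum(r) for r in table]
    col = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row)
    if 0 in row or 0 in col:
        raise ValueError("median test undefined: an empty row or column")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Upper-tail probability of a chi-square variate with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


if __name__ == "__main__":
    print("distance between medians:", median_distance(ear_tokens, air_tokens))
    print("chi2, p:", mood_median_test(ear_tokens, air_tokens))
```

With real data, one such comparison would be run per word pair (and per age group), which is how a statement such as "reliably distinct at p < 0.01" for an individual pair could be checked.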
Word recognition and sound merger

The preceding sections have highlighted the variability of ear/air in NZE, and isolated a few factors, mainly phonetic context and speaker age, that appear to have the strongest influence on the merger. How do these observations relate to the process of word recognition?

Word recognition in the merger process

To what extent might the process of sound merger be affected by the requirements and mechanisms of word recognition? Clearly, an overriding objective of the process of word recognition is the correct identification of the intended word, and access to its further lexical properties. The merger of a phonemic distinction potentially inhibits this process, particularly for minimally distinct words. This danger is clearly reduced when the words concerned are unlikely to be found in comparable contexts. It is also reduced when phonetic context conditions a change and can be used in interpreting the result of that change, through processes of compensation for coarticulation (Mann & Repp 1981; Elman & McClelland 1988).

In Table 1 I set out a series of hypothesised ‘states’ in the progress of the ear/air merger, forming an approximate temporal sequence for the process of the change as reflected in the production studies. The phonetic values given are approximate. In the following description, I conjecture how the constraints of word recognition might influence the progress of the merger.

At state (1) in Table 1, ear and air words are distinct, representing the situation for H&B’s older group, as shown in Figure 1. At (2), phonetic conditioning raises the air vowel after coronal consonants (as represented in the table by /tʃ/). Since non-coronals do not show this raising, the overall effect is of a slight rise in the starting point of the air vowel (cf. the mid-aged data in Figures 1 and 4). The two /tʃ-/ words are still perceptually
The two /tw-/ words are still perceptually JB[v.20020404] Prn:9/02/2006; 12:57 F: HCP1508.tex / p.13 (700-724) Word recognition and sound merger Table 1. Hypothesised states in the process of merger (see text for details) state ear set (1) [twi6] [bi6] [twi6] [bi6] [twiœ6] [biœ6] [twi6] [bi6] (2) (3) (4) air set cheer beer cheer beer cheer/chair beer cheer/?chair? beer/?bear? [twe6] [be6] [twiœ6] [be6] [twe›6] [be6] [twiœ6] [biœ6] chair bear chair bear chair/cheer bear chair/cheer bear/beer distinct, despite the closeness of their phonetic realisation, because listeners compensate for coarticulation, and hear [iœ] as lower than it actually is. State (3) shows the young speakers’ reaction against a stigmatised /i6/. All /i6/ forms are lowered, including the ear set. If the mechanisms of compensation for coarticulation are still operative, the system shown at (3) potentially runs a greater risk of ear/air confusion than that at (2), since both forms following coronals are now lowered, to positions that could be explained perceptually as air forms resulting from coarticulation, while their intermediate pronunciations mean they could also be heard as ear forms. Once the ear form is no longer perceived as stigmatised, we get spreading of the raised air from coronal to non-coronal contexts (4), potentially assisted by perceptual confusions that might have arisen at state (3), but possibly also by a more general principle by which subsequent generations of speakers “forget” the reasons for coarticulation (Ohala 1992). The eventual merger on /i6/ is but a short step from here. Speaker age and phonological status As noted above, even the youngest speaker group in H&B’s survey shows a residual difference between ear and air forms. It is possible that whatever variation remains respects Nolan’s hypothesis that “differences in lexical phonological form will always result in distinct articulatory gestures” (1992: 272–274). 
If this is the case, then the difference, though slight, suggests that even the younger speakers have distinct phonological forms for /iə/ and /eə/. This, however, conflicts with self-reports from speakers in this age group. Similarly, my own observations of phonetics students agree with those reported by G&M, namely that young New Zealanders find it very difficult to distinguish /iə/ and /eə/, in other words that they appear to be losing the phonemic distinction between these diphthongs.

Such difficulty for younger speakers contrasts with the heightened awareness of the merger reflected in opinions voiced in New Zealand newspapers, which presumably originate from speakers in the older groups. One writer claims that the merger “could lead to a great deal of frustration, trouble and strife”, and continues:

    To use the [...] examples of “beer” and “bare” and “here” and “hair”, I go into this bar and say, “Beer, please” and the barmaid, being an obliging girl, takes off her top and bra. Because I am devoutly decent, I say indignantly, “Here! Here!” [sic] and the barmaid who knows when enough’s enough whacks me with a jug of Old Dark which starts a bloody brawl. You see, the potential for misunderstanding is substantial and the consequences may be horrendous.
    Alex Veysey – Opinion column in Evening Post, Wellington, 29/10/94

Other writers complain about hearing on radio that cars are to be fitted with earbags, or on television that stuffed beers are available. Such comments suggest that the phonemic distinction is still very much alive for these speakers. If these comments are typical, it is interesting that they also reflect the direction of the change towards /iə/, in that these more ‘conservative’ listeners criticise speakers for making air words sound like they contain /iə/.
Similarly, the opinion column gives a constructed example where the speaker (who maintains the phonemic distinction) produces /iə/ for ear words, but his productions are misinterpreted (by the presumably younger barmaid) as air words.

A conflict of two systems?

It is significant that our (mainly young) phonetics students claim not to be able to distinguish ear and air words, while the (presumably) older correspondents deplore the merger and its consequences. If these were two static and separate populations, then we could assume that the former would treat any vowel in the [iə]-[eə] range as a token of one vowel – i.e., there would be for these speakers complete neutralization, so that forms of fear and fair with vowels in this range would initially map onto both of these words, and contextual information would then be used to select the appropriate lexical form. The older speaker group, however, would hear [fiə] as fear and [feə] as fair (or fare).

However, these populations are neither static nor separate, and this raises further issues for processes and models of word recognition. The question of what a New Zealander does when hearing a form like [fiə] clearly depends on who that New Zealander is. But does it also depend on what the New Zealander knows about the speaker who uttered the form (Hay et al. 2006)? It is possible that listeners may be sensitive to the perceived age of the speaker. However, given the testimonials from young speakers attesting to their inability to distinguish reliably between the two vowels, this relationship between speaker and hearer may not be symmetrical. That is, as suggested above, young speakers may no longer have a reliable phonemic distinction between /iə/ and /eə/, and this will affect them both as speakers and as hearers. Let us assume, in line with most models of word recognition (though cf.
Marslen-Wilson & Warren 1994), that listeners need to recognize phonemes in the speech stream and use these to make contact with lexical representations. In this case, the younger listeners, for whom [iə] and [eə] are allophones of a single EAIR phoneme, will generally be unable to distinguish /iə/ and /eə/, for all ages of speakers. (The issue of whether their single phoneme is more like /iə/ or /eə/ is irrelevant. What is important is that a phonemic distinction has been lost.) Older listeners, on the other hand, may recognize that a merger is in progress for the younger generations, and consequently map a young speaker’s [iə]-[eə] forms indiscriminately onto both /iə/ and /eə/, while still expecting closer correspondences of [iə] to /iə/ and [eə] to /eə/ from other older speakers. So [fiə] may be interpreted by older listeners as fear if produced by a speaker from the same age group, but as ambiguous between fear and fair if from a young speaker.

Since such adjustments involve extra-linguistic knowledge, they are clearly different from the kind of normalization usually envisaged for other types of (idiosyncratic or allophonic) variation in the speech signal (e.g. template matching, distance metrics, etc. – cf. Klatt 1989). They also invoke interactions of knowledge types that are different in scope from even the lexical or sentential influences that are argued to have an effect on the outcomes of phonetic processing (Ganong 1980; Tyler 1990).

A further empirical issue concerns differences between coronal and non-coronal contexts in interaction with this age group difference. For instance, if compensatory strategies are operative in the interpretation of [tʃi̞ə] as chair amongst the mid-aged listeners in (2) in the table above, then maybe these listeners are more tolerant of [i̞ə] for air after coronals for all speaker groups.
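
The listener adjustments conjectured in this section can be summarised as a toy decision rule. The Python sketch below is purely illustrative: the function name, its parameters, the three-pair lexicon and the coding of the heard vowel as "ear" (an [iə]-like token) or "air" (an [eə]-like token) are hypothetical, and the rule encodes only the two assumptions discussed above, namely that a listener without the contrast treats every token as ambiguous, and that a listener with the contrast does the same when the speaker is perceived as young. It makes no claim about how such knowledge is actually represented or used.

```python
# Toy lexicon of minimal pairs, keyed by onset: (ear word, air word).
# The onsets and word pairs are examples taken from the discussion above.
PAIRS = {
    "f": ("fear", "fair"),
    "b": ("beer", "bear"),
    "ch": ("cheer", "chair"),
}


def lexical_candidates(onset, vowel, listener_has_contrast, speaker_is_young):
    """Return the set of words initially contacted by a heard form.

    vowel: "ear" for an [iə]-like token, "air" for an [eə]-like token.
    listener_has_contrast: False for listeners with a single merged phoneme.
    speaker_is_young: the listener's belief about the speaker's age group.
    """
    if vowel not in ("ear", "air"):
        raise ValueError("vowel must be 'ear' or 'air'")
    ear_word, air_word = PAIRS[onset]
    if not listener_has_contrast:
        # Merged system: both words are contacted, whoever is speaking.
        return {ear_word, air_word}
    if speaker_is_young:
        # Listener has the contrast but expects a merged speaker.
        return {ear_word, air_word}
    # Listener and speaker are both assumed to maintain the contrast.
    return {ear_word} if vowel == "ear" else {air_word}


if __name__ == "__main__":
    # An older listener hearing [fiə] from an older, then a younger, speaker.
    print(lexical_candidates("f", "ear", True, False))
    print(lexical_candidates("f", "ear", True, True))
```

The asymmetry between speaker and hearer shows up in the first branch: for a listener without the contrast, the speaker's age makes no difference to the candidate set, whereas for a listener with the contrast it does.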
Ambiguity in context

Some of the newspaper correspondence that deplores the ear/air merger argues that it is important to know whether the speaker is saying that something is fair or fear. What is interesting about such comments, as well as the opinion text and other letters to the editor cited above, is that the suggested confusion between members of word pairs is actually unlikely to arise in most contexts, since very few of the minimal pair words investigated in the production studies, and few of the examples cited in newspaper opinion and letter columns, are likely to be found in otherwise identical contexts. When they are, the confusion will probably be as short-lived and remain as unnoticed as ambiguities involving words like bank, thanks to the multiple access of word forms and rapid integration with context (Tyler 1990, but see also Schvaneveldt et al. 1976). An empirical question remains as to whether the recognition system is in any way disadvantaged by the merger, i.e. is there any processing delay or any greater likelihood of error, relative to processing in a non-merged system?

Closing comments

Whatever the answers to questions raised above may turn out to be, it is clear that the recognition system is able to cope with the type of variability that arises during an on-going process of sound merger, since the confusions reported in opinion and letter columns are rarely experienced. What remains to be seen, though, is just how it does cope with this variability. In addition to linguistic variables such as the phonetic context in which the diphthong is produced, the review and additional analysis of H&B’s data have shown that extralinguistic factors such as the age group of the speaker and potentially also of the listener are influential in determining the attested forms.
Including these factors in an account of word recognition will amount to an extension of such accounts beyond perceptual and linguistic considerations, invoking a consideration of possible interactions between the acoustic-phonetic analysis of the input and the listener’s awareness of speaker identity. Further empirical studies of the merger will also address issues such as the relative importance of sentential and phonetic contextual information in coping with such variability, as well as the role of the word recognition process itself in constraining or directing the process of sound change.

References

Bauer, Laurie (1992). The second great vowel shift revisited. English World-Wide, 13, 253–268.
Church, Kenneth W. (1987). Phonological parsing and lexical retrieval. Cognition, 25, 53–69.
Cutler, Anne (1990). Exploiting prosodic probabilities in speech segmentation. In Gerry T. M. Altmann (Ed.), Cognitive models of speech processing (pp. 105–121). Cambridge, MA: MIT Press.
Elman, Jeffrey L. & James L. McClelland (1988). Cognitive penetration of the mechanisms of perception: compensation for coarticulation of lexically restored phonemes. Journal of Memory and Language, 27, 143–165.
Ganong, William F. (1980). Phonetic categorization in auditory word perception. Journal of Experimental Psychology: Human Perception and Performance, 6, 110–125.
Gaskell, M. Gareth & William D. Marslen-Wilson (1998). Mechanisms of phonological inference in speech perception. Journal of Experimental Psychology: Human Perception and Performance.
Gordon, Elizabeth & Tony Deverson (1985). New Zealand English: An introduction to New Zealand speech and usage. Auckland: Heinemann.
Gordon, Elizabeth & Margaret A. Maclagan (1990). A longitudinal study of the ear/air contrast in New Zealand speech. In Allan Bell & Janet Holmes (Eds.), New Zealand ways of speaking English (pp. 129–148).
Clevedon, Avon: Multilingual Matters.
Gordon, Elizabeth & Margaret A. Maclagan (2001). Capturing a sound change: A real time study over 15 years of the NEAR/SQUARE diphthong merger in New Zealand English. Australian Journal of Linguistics, 21(2), 215–238.
Hay, Jennifer, Paul Warren, & Katie Drager (2006). Factors influencing speech perception in the context of a merger-in-progress. Submitted to special issue of Journal of Phonetics, Vol. 34, issue 1, to appear in 2006.
Holmes, Janet & Allan Bell (1992). On shear markets and sharing sheep: The merger of EAR and AIR diphthongs in New Zealand English. Language Variation and Change, 4, 251–273.
Klatt, Dennis H. (1989). Review of selected models of speech perception. In William D. Marslen-Wilson (Ed.), Lexical representation and process (pp. 169–226). Cambridge, MA: MIT Press.
Maclagan, Margaret A. & Elizabeth Gordon (1996). Out of the AIR and into the EAR: Another view of the New Zealand diphthong merger. Language Variation and Change, 8, 125–147.
Mann, V. A. & Bruno H. Repp (1981). Influence of preceding fricative on stop consonant perception. Journal of the Acoustical Society of America, 69, 548–558.
Marslen-Wilson, William D. (1987). Functional parallelism in spoken word recognition. Cognition, 25, 71–102.
Marslen-Wilson, William D. & Paul Warren (1994). Levels of perceptual representation and process in lexical access: words, phonemes, and features. Psychological Review, 101, 653–675.
Nolan, Francis J. (1992). The descriptive role of segments: evidence from assimilation. In Gerard J. Docherty & D. Robert Ladd (Eds.), Papers in laboratory phonology II: Gesture, segment, prosody (pp. 261–279). Cambridge, England: Cambridge University Press.
Ohala, John J. (1992). What’s cognitive, what’s not, in sound change. In G. Kellermann & M. D. Morrissey (Eds.), Diachrony within synchrony: Language history and cognition (pp. 309–355). Frankfurt am Main: Peter Lang Verlag.
Schvaneveldt, Roger W., David Meyer, & Curtis A.
Becker (1976). Lexical ambiguity, semantic context and visual word recognition. Journal of Experimental Psychology: Human Perception and Performance, 2, 243–246.
Starks, Donna, Scott Allan, & Catherine Kitto (1998). Why vernacular speech? Speech samples from the taped Auckland rapid and anonymous survey. Paper presented at the Sixth New Zealand Language and Society conference, Wellington, 28–30 June 1998.
Stevens, Kenneth N. & Sheila E. Blumstein (1981). The search for invariant acoustic correlates of phonetic features. In Peter D. Eimas & Joanne L. Miller (Eds.), Perspectives on the study of speech (pp. 1–38). Hillsdale, NJ: Erlbaum.
Swinney, David A. (1979). Lexical access during sentence comprehension: (Re)consideration of context effects. Journal of Verbal Learning and Verbal Behavior, 18, 645–659.
Tanenhaus, Michael K. & Margery M. Lucas (1987). Context effects in lexical processing. Cognition, 25, 213–234.
Tyler, Lorraine K. (1990). The relationship between sentential context and sensory input. In Gerry T. M. Altmann (Ed.), Cognitive models of speech processing (pp. 315–323). Cambridge, MA: MIT Press.
Warren, Paul & Jen Hay (2005). Using sound change to explore the mental lexicon. To appear in Claire Fletcher-Flinn & Gus Haberman (Eds.), Cognition and language: Perspectives from New Zealand. Bowen Hills: Australian Academic Press.
Watson, Catherine I., Jonathan Harrington, & Zoe Evans (1998). An acoustic comparison between New Zealand and Australian English vowels. Australian Journal of Linguistics, 18, 185–208.
Wells, John (1982). Accents of English, 3 vols. Cambridge: Cambridge University Press.
Linguistic components and conceptual mappings

Verbal explication and the place of NSM semantics in cognitive linguistics*

Cliff Goddard
University of New England, Australia

This paper argues that verbal explication has an indispensable role to play in semantic/conceptual representation. Cognitive linguistic diagrams are not semiotically self-contained and cannot be interpreted without overt or covert verbal support. Many also depend on culture-specific iconography. When verbal representation is employed in mainstream cognitive linguistics, as in work on prototypes, cultural models and conceptual metaphor, this is typically done in an under-theorised fashion without adequate attention to the complexity and culture-specificity of the representation. Abstract culture-laden vocabulary also demands a rich propositional style of representation, as shown with contrastive examples from Malay, Japanese and English. As the only stream of cognitive linguistics with a well-theorised and empirically grounded approach to verbal explication, the NSM (natural semantic metalanguage) framework has much to offer cognitive linguistics at large.

Keywords: Wierzbicka, semantic primes, diagrams, Malay, Japanese

    In natural language, meaning consists in human interpretation of the world. It is subjective, it is anthropocentric, it reflects predominant cultural concerns and culture-specific modes of social interaction as much as any objective features of the world ‘as such’. (Wierzbicka 1988: 2)

Friend, foe, or fellow traveller?

Cognitive linguists seem somewhat divided in their attitude towards Anna Wierzbicka and the distinctive semantic theory (the natural semantic metalanguage or NSM theory) originated by her (Wierzbicka 1972, 1988, 1992, 1996, 1999, and other works; cf. Goddard 1998a).
As Paul Werth (in Niemeier 1997) has pointed out, Wierzbicka anticipated important themes in cognitive linguistics by many years. As early as 1972, in her book Semantic Primitives, she was upholding a view of meaning as conceptualisation, as opposed to the structure-based or logic-based views that dominated (and still dominate) the linguistic mainstream. In her empirical descriptive work she was researching topics such as emotion, time, and the interaction between language and culture well before cognitive linguistics, and other late-breaking trends in twentieth century linguistics, brought these topics out of the shadow of Chomskyan generativism. Already in the 1970s, Wierzbicka was employing a version of prototype analysis, some years before Fillmore, Lakoff, and others (cf. McCawley 1983).

As noted by Peeters (1997a), Wierzbicka was at Duisburg in the spring of 1989 for the symposium organised by René Dirven which “marked the birth of cognitive linguistics as a broadly based, self-conscious intellectual movement” (Langacker 1990: 1), and she published in the first issue of the journal Cognitive Linguistics. Even if, as Peeters comments, she is best considered “a co-opted member rather than a founding member”, there is no doubt in the minds of many cognitive linguists that her work holds an honourable place within the broad movement of cognitive linguistics. For example, Athanasiadou and Tabakowska (1998: xxi) introduce a collective volume on emotions with the remark that it “represents a wide spectrum of cognitive trends, thereby testifying to the pluralism within the cognitive linguistic paradigm: the metaphorical-metonymical Lakovian approach (Kövecses), the semantic-primitives approach (Wierzbicka), and the semasiological-structure approach (Geeraerts/Grondelaers)”.
On the other hand, Wierzbicka’s approach has also been deemed incompatible with, or even inimical to, the core tenets and proper principles of cognitive linguistics. For example, Lakoff (1990), in the same issue of Cognitive Linguistics just referred to, spends some time differentiating his own approach from Wierzbicka’s “Leibnizian commitment”, which includes her semantic universalism and her use of reductive paraphrase in natural language as a vehicle for semantic explication. Geeraerts (1999) characterises present-day cognitive linguistics as having two methodological extremes: the ‘empiricist’ tendency (corpus analysis, psycholinguistic research, neurophysiological modelling) and the ‘idealistic’ tendency represented by Wierzbicka and her colleagues, with their dubious appeals to intuition and platonist views about universal conceptual primes.¹ Even if NSM semantics is not mentioned by name, one often finds cognitive linguistics characterised in terms which would seem to marginalise the role of propositional meaning and verbal explication – for example, when it is claimed to be a defining assumption of cognitive linguistics that meaning originates in experiential schemas, in visual-spatial templates, or in other pre-conceptual or embodied modes of understanding.

The main thesis of this chapter is that verbal explication is in fact indispensable in cognitive linguistics, and that the NSM approach, as the only well developed and empirically grounded theory of verbal explication, is well equipped to meet this need. Section 2 argues that familiar devices of cognitive linguistics, such as diagrams representing image schemas, and conceptual metaphors, cannot work effectively without a much improved theory of verbal explication.
Section 3 then takes up a domain in which the need for verbal explication is particularly clear, namely the domain of abstract, culture-laden vocabulary, illustrating with examples from English, Malay and Japanese. Section 4 contains concluding remarks touching on the relative merits of NSM and alternative approaches to verbal explication.

The natural semantic metalanguage approach

The general outline of the ‘natural semantic metalanguage’ approach is well known, so I will give an abbreviated version here (for a review of common misunderstandings, see Goddard 1998b). The initial assumption is that the meanings encoded in the linguistic forms of any language (at least, the propositional or symbolic meanings) can be adequately described within the resources of that language – i.e., that any natural language is adequate as its own semantic metalanguage. The approach began as an attempt to systematise the traditional definitional technique in lexical semantics – i.e., stating the meaning of a word (in a particular utterance) by means of an exact paraphrase in other words. As recognised by seventeenth century thinkers such as Arnauld, Descartes, Pascal, and, above all, Leibniz, this procedure can only succeed if the paraphrasing is done in terms which are semantically simpler (i.e., easier to understand) than the term being defined. Otherwise the analysis gets bogged down in circularity and terminological obscurity. Assuming it is possible to avoid circularity and infinite regress, it follows that every natural language must contain a non-arbitrary and irreducible ‘semantic core’ which would be left after all the decomposable expressions had been dealt with. This semantic core must have a language-like structure, with a lexicon of indefinable expressions (semantic primes) and a grammar; that is, some principles governing how the lexical elements can be combined. The semantic primes and their principles of combination constitute a kind of ‘mini-language’.
It is furthermore assumed, as a working hypothesis, that at this most basic level of semantic analysis there is substantial identity between the languages of the world:² in effect, the semantic primes and elementary combinatorial grammar of different natural languages coincide. This assumption is supported by a large and growing body of empirical cross-linguistic research.

After thirty years of trial-and-error experimentation in different semantic domains, and taking into account a number of careful cross-linguistic studies (Goddard & Wierzbicka 1994; Goddard 1997; Goddard & Wierzbicka 2002), the current inventory of proposed semantic primes numbers in the mid-sixties. It is listed in Appendix 1. Examples include substantive and determiner-like elements such as I, you, something/thing, someone, this, other, one, and two; predicate-like elements such as do, happen, move, think, know, want, and say; descriptive and evaluational elements like big, small, good and bad; spatial and temporal elements such as where/place, here, above, below, near, far, when/time, before, and after; and logical elements such as because, if, not, can, and maybe.

In the NSM system, one states the meaning of a word (grammatical construction, etc.) in terms of an extended paraphrase or ‘explication’ couched entirely within the natural semantic metalanguage. In this way it is hoped to achieve maximum granularity and transparency of semantic description, and at the same time minimise the problem of terminological ethnocentrism; that is, the danger of adopting a mode of semantic representation which is tied to one particular language (English), and which carries with it conceptual baggage from that language.

In some respects NSM semantics can be seen as a classical approach to semantics, especially in its commitment to representation in discrete, propositional terms.
However, it is quite unlike other so-called classical approaches to semantics, which have rightly attracted the ire of cognitive linguists. First, NSM explications are not bundles of semantic features. They are essentially texts composed in a specified minimal subset of ordinary language. Second, the proposed primes are not abstract in any sense, but are identified with word meanings of ordinary natural language. This means that they are grounded in everyday linguistic experience. Third, the NSM approach is not linked in any way with so-called Objectivism – i.e., the view that linguistic expressions get their meaning from correspondences with aspects of an objective, language-independent reality. On the contrary, the NSM metalanguage contains sundry elements which are inherently subjective, vague, and evaluational (such as, for example, like and good). Fourth, it is entirely possible to incorporate conceptual prototypes, scenarios, and so on, within NSM explications. From the exposition up to this point, it might appear that the NSM program is primarily about semantic universals, but it is equally about linguistic relativity and diversity. If the number of semantic primes is a mere 65 or so, it follows that the vast bulk of the vocabulary and syntax of any language is not language-universal, but language-specific. As Langacker has put it, in the context of acknowledging parallels between his own work and that of Wierzbicka: In positing her universal semantic metalanguage, Wierzbicka claims that all languages exhibit a fundamental commonality in their lexicogrammatical structure. At the same time, the limited array of elements in this metalanguage are combinable to form higher-order semantic structures of indefinite complexity and essentially infinite variability. This unity-in-diversity is not unlike the great profusion of life forms on earth, all governed by strands of DNA comprising different sequences of just four nucleotide bases. 
(Langacker 1999: 215)

The NSM program has produced numerous studies of language-specific semantics in a range of languages – studies of culture-specific lexical items (e.g., kin categories, colours, values, emotions, speech acts, natural kinds), of illocutionary devices (especially particles and conversational routines), and of morphosyntax (e.g., number marking, passives, causatives, case constructions, evidentials). The languages include English, French, Polish, Russian, Spanish, Chinese, Cree, Ewe, Lao, Hawaiian Creole English, Japanese, Korean, Malay, Mangaaba-Mbula, and Yankunytjatjara, among others. A selection of these studies is listed in Appendix 2. We will sample a small portion of this work in Section 3.

2. The indispensability of verbal explication

In this section I wish to argue that verbal explication has an indispensable role in cognitive linguistics – not in place of, but alongside other, more schematic modes of representation.

Diagrams are not enough

Even if one grants that diagrammatic, or other non-verbal, means are sufficient to depict certain kinds of concept (spatial and dimensional concepts, concrete objects, concrete part-whole relations, numbers perhaps), surely not all concepts are amenable to such a treatment. The reason that spatial and dimensional concepts (e.g., inside, above, below, big, small) lend themselves to diagrammatic representation is that one can rely on an analogue (iconic) relationship between the diagram and the modelled reality. For example, to depict the idea of inside (‘containment’) one can present one figure inside another; to depict one object as above or below another, one can rely on an analogous spatial relationship on a page (taking the top of the page to represent the ‘up’ dimension); to depict the contrast between big and small one can present two figures, one big and one small.
The Figure-Ground relationship can be conveyed visually by making the Figure visually ‘heavier’ (e.g., thicker, darker, shaded). Diagrams like those in Figure 1 are an everyday feature of cognitive linguistics. I will argue in a moment that these seemingly transparent visual depictions are not as semiotically ‘pure’ as they may seem; but for the time being, suppose we grant that they do the job they are intended to do. How can the diagrammatic mode be extended to abstract concepts – i.e., to concepts that lack physical or perceptual correlates? For example, how could we represent evaluational notions (good and bad) in a purely visual medium? How to depict the difference between ‘thinking that such-and-such’ and ‘knowing that such-and-such’? How to depict the relationship of similarity (like) or the notions of potentiality (can) or possibility (maybe)?

[Figure 1a. Containment. Figure 1b. X above Y. Figure 1c. She went out of the room]

I am assuming, of course, that it is necessary to depict such notions in some fashion if we are to faithfully model the conceptual content of language, but this seems a thoroughly reasonable assumption. It is commonly accepted that countless lexical items embody evaluational dimensions, and that many languages have special benefactive and adversative constructions (linked with the notions of good and bad, respectively). Similarly, the notions of thinking and knowing are implicit in numerous lexical items, at least in English (doubt, wonder, prove, believe, etc.), and are involved in evidential markers and constructions in many languages. Similarity, potentiality and possibility are recurrent and pervasive dimensions of conceptual structure in the world’s languages.

It is of course a simple matter to set up some symbolic conventions that could enable us to express such concepts in a visual mode.
Trivially, one could designate good by a tick (✓) and bad by a cross (x). One could depict the mental state of a person ‘thinking that . . . ’ by a thought balloon with a thin wavy line, but use a thick blocked line for ‘knowing that . . . ’. But clearly devices like these have a fundamentally different character to the analogue (iconic) representations given in Figure 1. Ticks, crosses, and so on, are symbolic in nature, bearing no particular iconic relationship with the intended meaning. Essentially they are just visual substitutes (codes) for the words whose meanings they represent. To know what is intended by a tick or by a thought balloon, one has to learn a particular culture-specific convention, and this learning itself depends on words. From a semiotic perspective, a tick or a cross is parasitic upon the verbal sign it represents.

Now it may be pointed out that schematic diagrams for certain abstract concepts have played a prominent role in cognitive linguistics. For example, Figure 2 below is from Mark Johnson’s (1987) ground-breaking book The Body in the Mind, and Figure 3 comes from Leonard Talmy’s (1988) influential article on force dynamics. Johnson and Talmy refer to these diagrams as depicting image-schematic gestalts or experiential schemas. My point is that they cannot do their intended job without the assistance of verbal captions and explanations.

[Figure 2. Compulsion]

[Figure 3a. Force-dynamic pattern for a sentence like, The ball kept rolling because of the wind blowing on it. Figure 3b. Force-dynamic pattern for a sentence like, The shed kept standing despite the gale wind blowing against it]

For example, without verbal explanation I doubt very much if anyone would interpret Figure 2 as depicting ‘compulsion’.
We need the caption and the accompanying explanation: “In such cases of compulsion, the force comes from somewhere, has a given magnitude, moves along a path, and has a direction. We can represent this image-schematic gestalt structure with the visual image below. Here the dark arrow represents an actual force vector and the broken arrow denotes a potential force vector or trajectory” (Johnson 1987: 45). Similarly, it is highly unlikely that anyone would understand Figure 3 (from Talmy 1988) without the accompanying legend, which tells us that the graphic elements of the circle and the shape with a concave side indicate the entities involved in the force dynamic scenario, the so-called Agonist and the Antagonist, respectively; that > and • represent the intrinsic force tendency towards action or towards rest, that the + and – signs indicate the “balance of strengths”, and the line at the bottom of each diagram gives the “resultant of the force interaction” (either action or rest, as in (a) and (b), respectively). In short, the diagrams are not semiotically “self-contained”. Even apparently self-explanatory diagrams such as Figure 1a for ‘containment’ are not necessarily as simple as they look. Compare that Figure with the very similar Figure 4a below. This was employed by Hawkins (1984) to designate not in (or ‘containment’), but on. Hawkins’ drawing makes sense, however, within his own set of conventions, which include him having adopted the ellipse shape as representing surface, as in Figure 4b. Once again, the point is that the captions play a vital interpretive role. Notice that I am not saying that the diagrams are perfectly equivalent to the verbal glosses of what they mean. 
I accept that diagrams have the capacity to convey gestalt or figural properties in a way that cannot be duplicated in words, and that properties of this kind may be very important for our understanding of how language works as part of the overall cognitive system.³ I am not saying that cognitive linguists should give up diagrams and use only verbal paraphrases instead. However, diagrams cannot achieve their purpose without verbal support, and we therefore must have some theory about the nature of the verbal items that form an essential part of the representational system.

[Figure 4a. ON (profile only). Figure 4b. SURFACE configuration]

What, for example, is the status of terms such as ‘vector’, ‘Agonist’, ‘force tendency’, and ‘path’? Are these terms primitives of the representational system, and if not, how can their own conceptual content be analysed? Does it matter that such terms are technical and unknown to the speech community at large? Does it matter that they have no equivalents in most languages of the world, thus effectively tying the representational system to one language – i.e., English? In my view there are some fundamental issues here for cognitive linguistics.

One further point: even simple diagrams often (perhaps always) rely on culture-specific, ‘Western’ interpretive conventions and iconography. Interpretive conventions that arise from Western literacy practices (such as the institution of writing, the convention that print is read from left to right, and the existence of books and other printed materials as portable individual objects) can seem so natural to the encultured person that their artificiality is seldom noticed. For example, it seems very natural to Westerners that the ‘top’ of the page (i.e., the side canonically held furthest from the body) can represent a higher position than the ‘bottom’ of the page.
It seems natural that moving from left to right across the page can represent the passage of time (as when in generative parlance we speak of a word or phrase being ‘in left-most position’, meaning that it is pronounced before the rest of the sentence). Furthermore, the institutions of representational art, and more recently photography, have entrenched the convention that, all other things being equal, images will be read as representing shapes viewed from one side and “in perspective”.

We are reminded of the culture-specific nature of these conventions when we consider cultures that lack literacy (as most cultures do) and representational art. In the traditional Aboriginal cultures of the Australian Western Desert, for example, visual representations are usually made on the ground (as sand-drawings) or on the human body (as ceremonial designs). The usual viewpoint is from above (an aerial view) rather than from one side. In sand-drawings the placement of elements is usually done with respect to an absolute, external frame of reference; for example, a figure placed on the east side of the drawing represents someone who is on the east in the scene being depicted. For someone raised in this tradition, even Western diagrams like those in Figure 1 above will not convey the intended meanings. Figures 5 and 6 show some figures from typical Western Desert sand-drawings (Bardon 1979; Munn 1973: 120).

[Figure 5a. A person. Figure 5b. A camp. Figure 5c. Two people in camp]

[Figure 6a. Kangaroo track. Figure 6b. Emu track. Figure 6c. Dingo track]

In their own cultural context, figures such as the U-shape and the concentric circles are instantly recognisable as depicting human figures and camp or waterhole, respectively. Similarly, depictions of animal tracks (such as kangaroo, emu, and dingo) are instantly recognisable as indicating the presence or movements of those animals.
Taking the perspective of a non-Western culture can dramatise the fact that something like the arrow symbol (→) of Western iconography, which is heavily relied upon in cognitive linguistic diagrams, is by no means a transparent and purely iconic sign of movement or directionality. For someone raised in the traditional Central Australian cultures, for example, it looks more like an emu track (Figure 6b) than anything else. The culture-specific character of visual representation warrants more detailed treatment, but for present purposes it is enough to note that signs such as the arrow symbol and the use of left-to-right sequencing to represent the passage of time are another way in which schematic diagrams may covertly assume verbal support – at least, if we aspire to a representational system which can be used across languages.

Scenarios, models and conceptual metaphor

I turn now to the role of verbal (or quasi-verbal) representation as used in work on prototypical scenarios, cultural models, etc. and in work on conceptual metaphor. My position is that the value of much of this work is compromised because the language of the representations is not sufficiently theorised.

As representative of prototypical scenarios, consider the influential treatment of anger in Lakoff (1987; cf. Lakoff & Kövecses 1987: 213f.). This is a five-stage scenario which opens as follows:

Stage 1, Offending Event: Wrongdoer offends S. Wrongdoer is at fault. The offending event displeases S. The intensity of the offense outweighs the intensity of the retribution (which equals zero at this point), thus creating an imbalance. The offense causes anger to come into existence.

Stage 2, Anger: Anger exists. S experiences physiological effects (heat, pressure, agitation). Anger exerts force on the Self to attempt an act of retribution.
Subsequent stages portray an attempt to control the anger, a loss of control leading to an outbreak of ‘angry behaviour’, and a final stage of retribution such that the intensity of the retribution balances that of the offense and the anger disappears.

The basic idea behind the proposal is widely accepted: that the meaning of the English word anger is based on an ideal or prototypical scenario which involves an experiencer construing someone else as having done something bad, and because of that, experiencing a ‘bad feeling’ and a concomitant desire to do something bad to this person in return. However, the formulation as given above does not actually say this. Instead of using simple terms such as ‘do’, ‘bad’, ‘want’, ‘because’ and so on, it is phrased in complex words such as offending event and retribution, which obscure the semantic content rather than making it explicit. Words like offend and retribution are surely of comparable (or greater) complexity than anger itself, and just as deserving of explication. Presumably, additional prototypical scenarios will be required for them. How are they to be phrased? What are the implications of “scenarios within scenarios”? Does it matter that a child may know the word anger (or angry) prior to acquiring the words offend and retribution?⁴ These are serious questions and it is unsettling to think that they have not yet been widely identified and addressed within cognitive linguistics.

Another cognitive linguistic tool which employs a propositional (or quasi-propositional)⁵ style of representation is conceptual metaphor, a notion which has proved enormously fertile since it was introduced by Lakoff and Johnson (1980). Canonical examples include those shown in (1a) and (1b) below, along with some of the expressions which are supposed to instantiate them.
These illustrate conceptual metaphors of the so-called ‘ontological’ type, which are supposed to establish a set of figurative correspondences between the elements of two domains: a concrete source domain and an abstract target domain.

(1) a. THEORIES ARE BUILDINGS
       We will show that the theory is without foundation.
       We need to buttress that argument with more support.
       Some of the arguments are well constructed.
    b. ANGER IS THE HEAT OF LIQUID IN A CONTAINER
       She really got steamed up.
       He was seething.
       He just exploded.

As work on conceptual metaphors advanced, however, certain problems became apparent. On the one hand, the correspondences between source and target domains are not comprehensive; one cannot, for example, speak of a theory having walls or a roof, or as having inhabitants. On the other hand, many attested correspondences are not specific to particular source domains or target domains; rather, the mappings are many-to-many (Kövecses 1995; Grady 1997). One proposal which may alleviate both problems is to re-cast the formulation in terms of broader and more general metaphorical mappings.⁶ For example, Grady (1997) proposed the metaphors shown in (1c); cf. Kövecses’ (1995: 326–328) Complex Systems Metaphor.

(1) c. ORGANISATION IS PHYSICAL STRUCTURE
       PERSISTING IS REMAINING ERECT

I have a lot of sympathy for Grady’s proposal, but from the point of view of the language of representation, the metaphors in (1c) are framed in terms (such as ‘organisation’, ‘structure’, and ‘persisting’) which are more abstract and remote from ordinary usage than those in (1a) and (1b). If these terms were supposed to have a privileged theoretical status this could be problematical, but my impression is that Grady does not intend them as such.
He evidently does not intend them to be semantically transparent – i.e., self-explanatory, because he goes to some effort to explain them. For example, he explains (quoting the American Heritage Dictionary) that the term ‘structure’ is to be understood as implying “a complex entity composed of arranged parts”. The ‘parts’ of a theory (such as its premises, claims, arguments, and supporting facts) can be seen as “arranged in certain logical relationships” in an analogous fashion to the arrangements of the physical parts of a complex physical object. In similar fashion, Kövecses (1995: 328) explains that ‘complex systems’ (which include theories, society, and complex interpersonal relationships such as marriage and friendship) resemble complex objects in the following ways: “they do not exist first and then they are made; they are made for a purpose; they have a function; they have a large number of parts that interact with each other; they require effort to make and maintain; the stronger they are the longer they last”. This kind of conceptual unpackaging is moving in the right direction. Complex notions are being resolved (or partially resolved) into simpler notions, and in the process semantic relations, which were implicit, are being made explicit. For example, Grady and Kövecses have identified two key components of ‘structure’ as ‘something which has many parts’, and which is ‘made by people’. Once this unpackaging has been done, the proposition that the same schema can be applied both to abstract objects and to concrete physical objects becomes much easier to appreciate, and, in my view, much easier to accept. Of course there is still more to ‘structure’ in the relevant sense but for present purposes we need not grapple with the substantive details. My concern is rather with the language of the representation. 
If complex terms like ‘structure’, ‘organisation’ and ‘persisting’ are not the end of the line, but merely “shorthand” (Grady 1997: 274) for other, more articulated meanings, then what is their theoretical status? Are they intermediate steps in the chain of analysis (in the same way that, for example, certain complex organic molecules can be broken down into simpler molecules before being further decomposed into atoms of their constituent elements)? Or are they merely approximations on the road to more elaborate and explicit analyses? In either case, no analysis is complete until it has been resolved into the simplest possible terms. It is legitimate to expect some theoretical account of the role of verbal elements in the representation of conceptual metaphors (schemas, etc.).

One further theoretical issue is the question of language-specificity in the representational system. Does it or does it not matter if, at a particular level of analysis, the terms of the analysis are specific to the language at issue? Plausibly, the answer could depend on the level of the analysis. At an initial and fairly concrete level, one might expect the conceptual metaphors of Russian, Japanese, or Yankunytjatjara to fall out in terms of language-specific words of Russian, Japanese or Yankunytjatjara. At a deeper or more articulated level of analysis, however, one might expect the terms of the analysis to become less language specific. Endorsing a proposal by Lakoff (1993) to this effect, Cienki (1998: 141) provides comparative evidence from English and Russian that “higher-level metaphors for event structure are the ones that are more likely to be shared cross-culturally, while the lower-level metaphors are more likely to vary across cultures”. In my view this is a very interesting proposal, worthy of systematic research across a range of languages and cultures.
On the other hand, an equally provocative (and incompatible) proposal has also been made in the literature, namely, Mühlhäusler’s (1995) claim that even the most fundamental metaphorical mappings can have a language-specific character, so that what is literal in one language is metaphorical in another. An issue like this goes to the heart of the cognitive linguistic project to understand the nature of human conceptualisation. Though we cannot enter this debate here (cf. Goddard 1996a), I am mentioning it to support my general point that issues of profound theoretical importance hinge on the metalanguage of representation, and to urge cognitive linguists to engage with these issues in a more sustained fashion.

In summary, I have argued in this section: (a) that no matter how valuable diagrammatic representations may be, they cannot do without verbal representation, and (b) that in any case, verbal representation has played a leading role in cognitive linguistic practice, in the areas of scenarios, schemas, and conceptual metaphors. However, I have also argued that (c) cognitive linguistics has not sufficiently problematicised the role of verbal representations, and thus (d) runs the risk of unwittingly employing contradictory or self-defeating practices, and (e) at the same time misses the opportunity to focus on certain fundamental theoretical issues. As far as I am aware, the only stream of cognitive linguistics that has a well-developed position on these issues, backed by wide-ranging empirical research, is the natural semantic metalanguage approach. That is, only this approach problematicises and theorises the role of verbal elements in semantic/conceptual representations.

3. The challenge of abstract, culture-laden vocabulary

From a methodological point of view, the need for verbal explication seems particularly pressing in relation to complex abstract vocabulary, such as terms for emotions and attitudes (e.g., happy vs. joyful, love vs. pity), values and social ideals (e.g., honest vs. sincere, freedom vs. duty), and speech-acts (e.g., praise vs. compliment, vow vs. swear). For words like these there seems to be no alternative to a propositional style of representation: essentially, a verbal explication. The fact that words of this kind tend to be highly culture-specific poses an extra descriptive challenge, while at the same time adding a dimension of theoretical importance for any approach to language that seeks to articulate culture-specific conceptualisations.

As mentioned, Anna Wierzbicka is responsible for numerous studies of abstract culture-specific vocabulary, especially in European languages and in Japanese; see especially Wierzbicka (1992, 1997, 1999, in press). In this section, however, I draw on research by other NSM researchers, myself and Catherine Travis, in specific relation to value terminology. The studies to be summarised here both involve explicating subtle differences between apparently close translation equivalents across languages (Malay and English, Japanese and English). My intention is to illustrate the effectiveness of the paraphrase method in cross-linguistic applications, and at the same time to throw up a challenge to cognitive linguists who reject this methodology to show how they would cope with the same data.

Malay ikhlas vs. English sincere

Goddard (2001a) presents a contrastive semantic analysis of the Malay cultural key word ikhlas, and its conventional English translation equivalent sincere. In English language newspapers in Malaysia, it is not uncommon to see sincerity identified as one of the most important values. For example, Dr.
Mahathir Mohamad, the then Prime Minister of Malaysia, was reported as telling the 1996 General Assembly of the UMNO political party that the party supported a culture based on “good manners, discipline, hard work, sincerity and fairness” (New Straits Times 11/10/96, 17). The National Literature Laureate, Abdullah Hussain, included sincerity among the list of values he said could be strengthened by literature (New Straits Times 10/4/96, Life and Times, 9). A newspaper column by an Islamic educator was headed “Sincerity is pure and absolute and the panacea against all vices” (New Straits Times 20/4/96).

Reading examples like these, one gets the feeling that the word sincerity is being used in a peculiarly Malaysian sense – perhaps something more akin to ‘selflessness’. It comes as no surprise to find that such usages have their source in the Malay word ikhlas.

Though ikhlas is usually given the gloss ‘sincere’ in bilingual Malay-English dictionaries, it can be used in a broader range of contexts than English sincere or sincerely. There is certainly an overlap in the range of use. In particular, ikhlas can be used to indicate that a person who is conveying some kind of ‘positive message’ deserves to be believed. In formal contexts, this use of ikhlas can often be translated as ‘sincere’, as in (2). In informal situations, it sounds more natural to translate ikhlas using a phrase such as ‘(to) really mean it’. For example, someone offering a compliment could back it up by saying (3).⁷

(2) Nampak-nya dia ber-cakap dengan ikhlas.
    look-3 3sg intr-talk with sincere
    He seemed to be speaking sincerely (with ikhlas).

(3) Percaya-lah. Saya betul-betul ikhlas.
    believe-emph I really-really sincere
    Believe me. I really mean it.

There are, however, many contexts in which English sincere can be used but in which ikhlas is quite impossible.
For example, in English one can speak of sincerely believing something, sincerely admiring someone, or sincerely wanting something (see below). None of these uses are possible with Malay ikhlas. Conversely, unlike sincere, Malay ikhlas is frequently coupled with beri ‘give’, tolong ‘help’, and other benefactive verbs. For example, to urge someone to accept a gift one could say:

(4) Saya beri dengan ikhlas, terima-lah.
    I give with sincere receive-emph
    I’m giving (it) with ikhlas, accept it.

The former Malaysian P.M., Dr. Mahathir, a well-known commentator on Malay culture, has said that Malay people have a tendency to suspect a “hidden agenda” (the proverbial udang di balik batu ‘prawn under the rock’) behind any good gesture:

Tetapi kita orang Melayu terutamanya suka sangat memikir, ‘Kalau dia memberi kepada saya sesuatu apa tujuan di sebaliknya?’ Kita selalu bertanya.
‘But people, especially Malays, tend to harbour thoughts (suspicions). “If he gives me something, what’s the hidden motive behind it?” we always ask’ (Utusan 7/8/96, 6).

To say that something is done dengan ikhlas is to repudiate the idea that there is any hidden, self-interested motive.

In a similar vein, ikhlas is often used about cinta (roughly, ‘romantic love’) and about domestic relationships. Used about love, ikhlas has about the same emotional “weight” as the English word true in the expression true love. It is a common word in pop songs; for example, the singer Nurul’s (Sept. 1996) album and hit song Ikhlasnya Cintaku ‘My love is so ikhlas’. In saying this, I don’t want to suggest that ikhlas means the same thing as true as in true love, where, roughly speaking, what is at issue is the “faithfulness” of the love. Rather, ikhlas is concerned with the “purity” of the lover’s motives.
For example:

(5) Cinta sejati lahir dari hati yang ikhlas dan niat-nya untuk
    love genuine born from heart lig sincere and desire-3 for
    kekal selamanya. Bila hati-nya ikhlas kita, men-cinta-i dia
    lasting forever when heart-3 sincere 1du.incl tr-love-appl 3sg
    akan ber-kawan dengan kita tanpa niat buruk.
    will intr-friend with 1du.incl without desire bad
    Genuine love is born of a heart which is ikhlas and it seeks after permanence. When someone loves with an ikhlas heart he befriends us without any bad motives.

So far we have seen ikhlas being used in contexts where, to speak metaphorically, it indicates that a person acts with a “pure” motive. Ikhlas can also be used to indicate that something is done “freely”, and not as a result of being under pressure or coercion. In discussing love and marriage, it is not uncommon to find ikhlas opposed to terpaksa ‘forced’; for example:

(6) Per-kahwin-an itu biar-lah ikhlas. I tak setuju kahwin
    noml-marry-noml that let-emph sincere I not agree marry
    ter-paksa.
    inv-force
    Marriage should be free (ikhlas). I won’t agree with a forced marriage.

The same usage can be found in politics too. For example, in July 1996 a group of members from one political party crossed over en bloc to a rival party. In an English language newspaper, a spokesperson was reported as saying that “he and the other members of S46 were sincere in declaring themselves as members of Pas. . . ‘We were not forced or persuaded by any party to join Pas’, he said” (New Straits Times 25/7/96, 4).

With this broad range of use, how can the meaning of ikhlas be explicated? In particular, is it polysemous or not? Some Malay dictionaries give unitary definitions such as hati suci ‘pure of heart’ and putih hati ‘white hearted’, but figurative expressions like these do not really make the meaning explicit (plus, they would not be translatable into some languages).
Other dictionary definitions are disjunctive, for example rela atau jujur ‘willing or honest’ (Kamus Harian Federal), implying polysemy.

I suggest that ikhlas can be explicated as in [A]. This depicts an act (which could include a speech-act) as being ikhlas if (a) the person wants to do it, (b) because he or she thinks it would be good to do so, and (c) not for any other reason. The first components combine the elements of “voluntariness” and “good intentions”, and the final component rules out any other causal factors being involved (thus excluding self-interest or coercion).⁸

[A] X did something dengan ikhlas =
    X did something because X wanted to do it
    X wanted to do it because X thought like this:
        it will be good if I do this
    not because of anything else

Turning now to English sincere, English dictionaries tend to concentrate on the “genuineness” aspect of the meaning. For example, the Little Oxford Dictionary gives ‘free from pretence or deceit, genuine, frank, not assumed or put on’. This formulation is satisfactory, as far as it goes, but it does not make explicit the fact that one speaks of someone saying something sincerely, being sincere, etc., only in relation to words or actions that can be seen in a positive light. For example, one can sincerely thank, sincerely apologise, or sincerely praise, but not *sincerely threaten or *sincerely abuse; similarly, one can sincerely admire or sincerely appreciate someone, but hardly *sincerely despise them; one may smile sincerely, but not *snarl sincerely.

Although sincere (sincerely, etc.) can be used in a wide variety of contexts, it is always connected with what we can call “self-expression” on behalf of the speaker. Obviously this applies in the case of the verb say itself and other speech-act verbs, and also with expressive actions such as smile and weep.
The ‘self-expression factor’ is less obvious in connection with attitude verbs such as admire and appreciate, verbs of intention such as seek, try, and intend, and with believe (or related nouns such as belief and conviction). However, when one considers examples such as the following, it is clear enough that they all imply some verbal expression by the subject.

(7) a. We sincerely appreciate your efforts.
    b. Her admiration for him was sincere and unreserved.

(8) a. We sincerely hope you will take advantage of our offer.
    b. Brezhnev sincerely sought peace.

(9) a. We sincerely believe that wisdom will prevail.
    b. He sincerely believed that he had a mission from God.

Often, as in the (a) examples, the subject is first-person (I or we), in which case the sentence amounts to a profession of attitude, intention, or belief by the speaker. But even with third-person subjects, as in the (b) examples, the term sincerely (sincere, etc.) implies some ‘act of saying’. For instance, it wouldn’t make sense to say that She sincerely admired Bill Clinton unless she had expressed this admiration to someone. One reflection of this fact is that such sentences become unacceptable if adverbs like secretly or privately are inserted into them (e.g., *Secretly, she sincerely admired Bill Clinton; *Privately, he sincerely believed he had a mission from God).

Since it seems that sincerely has a special affiliation with saying, explication [B] below sets out its meaning in the frame involving saying something. The first part alludes to the potential perception that X spoke not from the heart, so to speak, but because of an expectation that someone else would approve of it (because of thinking ‘someone will think it is good if I say this’). The phrasing here is compatible with various possible motives, such as to create a good impression or to satisfy social expectations.
The next component repudiates this potential perception. X’s real reason for speaking is that X was thinking: ‘I want to say what I think, I want to say what I feel’.

[B] X said it sincerely =
    X said something
    someone could think that X said it because X thought like this:
        someone will think that it is good if I say this
    X didn’t say it because of this
    X said it because X thought like this:
        I want to say what I think, I want to say what I feel
    people think that it is good when someone does this

Comparing explications [A] and [B], it should be plain that the resemblance between sincere and Malay ikhlas is rather superficial. As Trilling (1972: 2) says, sincere “refers primarily to a congruence between avowal and actual feeling”. Ikhlas, in contrast, is not primarily about one’s true motives and feelings, but about the goodness of one’s intentions. How could such differences be brought out purely in terms of diagrams, without recourse to verbal explication?

Japanese omoiyari vs. English empathy

Travis (1998a) presents an insightful contrastive semantic analysis of the Japanese cultural key word omoiyari, and its nearest English translation equivalent empathy. (Other glosses found in Japanese-English dictionaries, and in scholarly commentaries, include ‘kind’, ‘considerate’, ‘thoughtful’, ‘sympathetic’, ‘compassionate’, ‘sensitive’, and ‘caring’.) Travis argues that a full understanding of omoiyari provides valuable insights into Japanese culture, revealing a great deal about the Japanese indirect communicative style, the importance of being “in tune” with others’ unexpressed desires and feelings, and the “interdependence” on which group relations are based in Japan.
Drawing on a wide range of intercultural commentaries (e.g., Lebra 1976; Barnlund 1975), she argues that omoiyari represents a kind of intuitive understanding of the unexpressed feelings, desires and thoughts of others, and doing something for those others on the basis of this understanding.

For example, sentence (10) is about the writer’s ex-husband.⁹ The use of omoiyari implies that the ex-husband did these things for the writer without her communicating to him that she would like him to do them. It implies an understanding of the writer’s feelings, and also her desires, in terms of what she would need to live on ‘without any trouble’ in the manner to which she has become accustomed.

(10) Watashi ga hyaku-made ikite-mo komaranai-yoo ni,
     I subj hundred-until live:ger-even.if trouble:neg-ensure dat
     bantan totonoete okuridashite kudasai-mashi-ta. (Rikon
     everything get.ready:ger set.up:ger give.to.me-pol-past divorce
     shi-ta-to iu koto.) Konna yasashii, omoiyari no aru
     do-past-quot called thing this.much kindness empathy poss be
     otto iya, wakare-mashi-ta node otoko-tte hajimete desu.
     man no separate-pol-past therefore man-quot first.time:ger is:pol
     He sent me away (I mean, divorced me), setting up everything for me so that I wouldn’t be in any trouble if I lived to be 100. I have never known such a yasashii (‘kind’) husband – no, I mean man, we’ve split up – with omoiyari.

Consider also the following example, where not having omoiyari (omoiyari ga nai) implies a lack of understanding of others. This comes from the Japanese psychologist Takeo Doi’s (1971: 4) discussion of Japanese and American entertaining styles, in which he comments on the markedly different treatment guests receive in these two countries. While in America the guest is given a series of choices about what to drink and how they would like it served, in Japan the host assesses what the guest would like and serves it.
The host is expected to know the guest’s desires, and to automatically satisfy them. As for the possibility in the West of inviting guests to “help themselves”:

(11) ‘Go jibun o tasuke-nasai’ de wa funare-na kyaku ni
     hon self obj help-imp well.then top unfamiliar-adj guest dat
     taishite amari ni mo omoiyari no nai kotoba-to
     regarding:ger very dat also empathy poss neg word-quot
     omowarenai ka.
     think:neg ques
     To leave a guest unfamiliar with the house to ‘help himself’ would seem excessively lacking in omoiyari.

In addition to having an understanding of another person, an essential part of omoiyari is actually doing something on the basis of this understanding. This has already been evidenced in the two examples given above, which imply that the person with omoiyari has done or would do something for another. Similarly, a person could not be described as having omoiyari for someone who was sick or in trouble simply on account of “empathising” with that person. It would be necessary to actually go on and do something about it – otherwise, one would be said simply to kawaisoo ni omou ‘feel sorry for him/her’, rather than to have omoiyari.

Another subtle point about the meaning of omoiyari is that it does not necessarily imply a focus on the other person’s feelings (specifically, wanting the other person to feel good). The following quote describes the writer’s grandmother as having omoiyari, and refers specifically to the grandmother’s wish to die at a time when it was neither too hot nor too cold, so as not to cause her family the meiwaku ‘trouble’ of holding a funeral at an inconvenient time. Clearly there is no implication that the grandmother wanted to die at the right time of year so that others would feel something good.
(12) Watashi-tachi wa, donna toki-mo minna no koto o
     I-plural top how.much time-even we.all poss thing obj
     kangaete kure-ta sobo no koto o sonna fuu
     think:ger give.to.me-past granny poss thing obj that.much manner
     ni hanashi, ima-made no yasashi-sa omoiyari o aratamete
     dat story now-until poss kind-ness empathy obj again:ger
     kanji-ta no desu.
     feel-past prt it.is.so:pol
     When saying such things about our grandmother, who always used to think of us, we all felt even more strongly her yasashii-ness (‘kindness’) and her omoiyari.

On the basis of this and a good deal of other evidence, Travis (1998a) proposes an explication for the frame omoiyari ga aru ‘to have omoiyari’. It is presented here in a modified form.

[C] X has omoiyari =
    X often thinks like this about other people:
        I can know what this person feels
        I can know what this person wants
        this person doesn’t have to say anything about it to me
        I can do some good things for this person because of this
        I want to do these things
    X does some things because of this

The explication is framed in terms of how X ‘often thinks about other people’ to capture the notion that it refers to a permanent characteristic of someone’s personality, as opposed to something that manifests itself in a one-off incident, and that it is not uniquely directed towards any particular other person or group of people. Subsequent components represent the notion of having an understanding of another person, or of being “in tune” with them, without verbal communication. The penultimate pair of components reflect the subject’s (imputed) belief that they can do things which would be to the other person’s benefit and that they wish to do so.
The final component reflects the fact that omoiyari implies that this attitude translates into practice in some way, i.e., that the subject actually does some things (without specifying whether or not they are in fact beneficial to the other person in a particular situation).

As Travis says, an explication like this is not particularly lengthy or complicated. Presented in this way, omoiyari is relatively easy to understand, as opposed to the very complex concept it seems to be when explained via a list of apparently close English equivalents (such as empathetic, caring, sensitive, thoughtful, considerate, and so on), each of which is both somewhat similar and somewhat different to omoiyari.

To underscore this point, Travis presents a parallel analysis of the English concept of empathy. This word is not, of course, a particularly salient concept in Anglo society, but it is frequently employed in the cultural literature on Japan as a gloss for omoiyari. Essentially, Travis’ explication (slightly modified) presents empathy as a capacity to appreciate someone else’s bad feelings, based on being able to imagine how it would feel to be in the same situation.

[D] X has empathy =
    X can think like this about someone else:
        I know that something bad happened to this person
        I know that this person feels something bad now because of this
        I can know how this person feels
        because when I think about it, I can know how I would feel if something like this happened to me

There are at least three differences between omoiyari and empathy. First, empathy is focused specifically on bad feelings; one cannot empathise with someone who is feeling good (for example, if someone announces that they have won a trip around the world, the response I really empathise with you is quite inappropriate unless intended sarcastically).
Second, the kind of understanding evident in empathy is not based on intuition, but on imagining oneself in the same situation as another person, “putting oneself in their shoes”. Third, empathy does not imply that one actually does anything for that person on the basis of one’s understanding, as omoiyari does.

The existence of these various differences does not diminish the fact that omoiyari and empathy both imply some kind of understanding of others, but equally the importance of the shared component should not be overstated. As Travis (1998a) says, the most illuminating perspective is to be found not in the recognition that the meanings are somehow similar, but by ascertaining exactly where the meanings coincide and where they vary. How this exploration of cross-cultural semantics could possibly be conducted without verbal explication is entirely unclear.

4. Concluding remarks

In Section 2, I argued on theoretical and methodological grounds for the indispensability of verbal explication. In particular I argued that diagrams cannot stand alone without verbal support, and, moreover, that cognitive linguistic diagrams often rely on complex culture-specific iconographic conventions, which are “smuggled in” without the necessary acknowledgment or explanation. In Section 3, I sought to show that by taking a fine-grained approach to verbal explication one can deal with subtle nuances of culture-rich vocabulary with a degree of success which would be unattainable by other means. In this concluding section I will take the argument one step further, by briefly considering the status of the NSM approach to verbal explication as compared with other possible approaches.

Before that, however, I want to reiterate my view that the indispensability of verbal explication does not mean that one can or should give up diagrammatic representation.
On the contrary, diagrammatic (schematic, figural) representation can have a valuable role to play in depicting meaningful aspects of language which are not symbolic or conceptual in nature, such as iconic-indexical effects, psycholinguistic processes, and experiential image schemas;¹⁰ see Goddard (2002b). I would still insist, however, that even when used for these purposes diagrammatic representation cannot stand alone. It always requires some semiotic support in the form of verbal explanation.

Now as Dirk Geeraerts (p.c. 2001) has pointed out to me, even if the arguments advanced in the main body of this paper convincingly establish the need for verbal explication (paraphrase, definitional analysis), they are largely neutral with respect to the choice of natural semantic metalanguage as the descriptive language: “The argument against a purely diagrammatic form of representation supports any form of propositional representation, whether it is couched in the NSM language, a featural representation, dictionary-like entries, or even formal semantics”. I will therefore briefly mention several arguments in favour of the NSM approach in comparison with these alternatives.

The most basic point is that cognitive linguistics cannot continue to approach verbal explication in a casual manner, disregarding theoretical and empirical grounding. If one compares dictionary-style entries and feature-based representations, on the one hand, with the NSM approach, on the other, then in my view one sees a very sharp contrast. The NSM approach has a well-developed theoretical basis and a large body of empirical support from cross-linguistic studies conducted over several decades (see Wierzbicka 1996; Goddard & Wierzbicka Eds. 2002 and the works listed in Appendix 2), while the alternatives do not.
As for formal semantics, it has a well-developed theory and methodology but its range of application is very narrow, and its theoretical commitments are orthogonal to those of cognitive linguistics. In any case, I would like to see cognitive linguists who reject the notion of paraphrase within a constrained metalanguage take up the challenge of developing a systematic alternative method of verbal explication – be it structuralist feature-style representations, or conventional dictionary-style explanations, or even some modified form of formal semantics. Better to see a range of positions being debated and discussed, than to have continued theoretical indifference to a fundamental issue.

Unlike the abstract and technical categories of feature-style analysis and formal semantics, semantic primes are demonstrably present as word-meanings in basic vocabulary. They are grounded in the everyday linguistic experience of language users – apparently in all languages. On account of their definitional simplicity, they provide for a maximally fine-grained, explicit and transparent depiction of conceptual meanings. Being framed in natural language (albeit a standardised and constrained subset of natural language), NSM explications can be substituted directly or indirectly into contexts of use and their accuracy assessed against the evidence of usage and against native speaker intuitions.¹¹

Finally and importantly, the natural semantic metalanguage approach is committed to avoiding the terminological ethnocentrism that arises when the vocabulary of English, and other European languages, is uncritically used as a descriptive vocabulary for semantics.
It seems obvious that to represent the concepts of widely divergent languages and cultures in English-specific terms is necessarily to distort the linguistic conceptualisations inherent in those languages – unless, that is, one assumes that English and other European languages are specially “gifted” as tools of conceptual representation. Is this indeed the assumption of those who steadfastly ignore the non-translatability of their descriptive metalanguage?¹² Or are we to assume that ethnocentrism is inevitable in cognitive linguistics (perhaps in the interpretive sciences generally) and that one must simply “grin and bear it”? Even so, one would like to see this position explained and defended, and to ask how the concomitant analytical relativism can be minimised or counteracted.

The NSM approach provides cognitive linguistics with a well theorised, empirically grounded, and non-ethnocentric methodology for verbal explication. To what extent “mainstream” cognitive linguists will choose to adopt and implement the approach in their own work remains to be seen. I have the impression that interest in NSM work is growing, especially among the new generation of younger cognitive linguists. In any case, I hope to have shown in this chapter that the cognitive linguistics “mainstream” can benefit from considering the principles behind the NSM approach.

Notes

* For helpful comments on earlier versions of this paper I would like to thank Nick Enfield, Catherine Travis, June Luchjenbroers, Dirk Geeraerts, and an anonymous reviewer. All correspondence concerning this article should be addressed to Cliff Goddard, School of Languages, Cultures and Linguistics, University of New England, Armidale N.S.W. 2351, Australia.
1. A good deal of Geeraerts’ (1999) critique is directed against what he sees as Wierzbicka’s excessive reliance on semantic intuition and her “outspoken idealistic commitments”. Against this he counterposes a so-called “empiricist” approach which proceeds “not by denying the importance of intuition, to be sure, but by supplementing and supporting any reliance on introspection with corpus analysis or experimentation” (1999: 170). In recent years, however, Wierzbicka has been relying increasingly on corpus-based evidence, e.g., Wierzbicka (2002b, in press). Geeraerts also accuses NSM researchers of a “monosemous bias”, but this charge appears ill-conceived when one considers the meticulous documentation of grammatical and lexical polysemy undertaken in NSM studies such as Wierzbicka (1988, 1998), Goddard (2000, 2003), among others.

2. Exponents of semantic primes in different languages are of course not expected to be equivalent in every respect. While their primary (simplest) senses can be matched across languages, their secondary, polysemic meanings may differ widely. For example, English feel and Malay rasa have the same primary sense, but English feel has a secondary meaning related to ‘touching’ which is not shared by the Malay word, while Malay rasa has a secondary meaning ‘taste’ which is not shared by English feel (Goddard 2002a). It should also be pointed out that the term ‘lexical’ is used in a broad sense to include not only words, but also bound morphemes and phrasemes. Even when exponents of semantic primes take the form of single words, there is no need for them to be morphologically simple, and they can also have variant forms (allolexes or allomorphs). All these factors mean that testing the cross-linguistic viability of the proposed lexical primes is no straightforward matter. It requires rich and reliable data, and careful language-internal analysis of polysemy and allolexy (cf. Goddard & Wierzbicka Eds., 2002).
3. That is, I am not falling into the “standard trap”, as characterised by Johnson (1987: 4), of saying that “since we are bound to talk about preconceptual and non-propositional aspects of experience always in propositional terms, it must follow that they are themselves propositional in nature”.

4. A defender of the “retribution scenario” could perhaps argue that the wording is unimportant because the scenario is not intended to represent a propositional meaning, but this “catch all” move would considerably weaken the verifiability of the scenario. From the point of view of representing the conceptual reality of a young child, it would surely be preferable to re-formulate the scenario into simpler terms which are familiar to young children. For NSM studies of early conceptual and lexical acquisition, see Goddard (2001c) and Tien (forthcoming).

5. Lakoff (1993) has argued that, despite appearances, conceptual metaphors are not propositional but are merely shorthand for a set of preconceptual correspondences or mappings. This would make statements of conceptual metaphor “quasi-propositional”, rather than literally propositional in nature.

6. An interesting consequence of generalising conceptual metaphors is that it partially undermines the original claim of Lakoff and Johnson (1980) that abstract concepts are understood in terms of concrete, experience-near concepts. As noted by Grady (1997: 273), a metaphor such as ORGANISATION IS STRUCTURE is framed in terms which are too general to represent experiential domains.

7. Interlinear gloss symbols are as follows.
1du.incl: first-person dual inclusive, 3: 3rd person, 3sg: third-person singular, adj: adjectiviser, appl: applicative suffix, dat: dative marker, emph: emphatic, ger: gerundive, hon: honorific, imp: imperative, intr: intransitive prefix, inv: involuntary prefix, lig: ligature, neg: negative, noml: nominaliser, obj: object marker, past: past tense, pol: politeness marker, poss: possessive, prt: particle, ques: question particle, quot: quotative, subj: subject marker, top: topic marker, tr: transitive prefix.

8. Although the NSM approach assumes that words have stable meanings which can be captured in paraphrase explications, such as [A], this does not mean that the approach is committed to a narrow invariance hypothesis. It is recognised that in natural discourse the interpretation of words in context is always influenced by linguistic context and by situational and cultural factors. Important aspects of such interpretative processes can be modelled using the theory of cultural scripts (cf. Goddard & Wierzbicka Eds., 2004).

9. Interlinear glosses have been added to the following three Japanese examples from Travis (1998a).

10. Since this point is sometimes misunderstood, it is worth stating explicitly that upholding the irreducibility of symbolic (conceptual) meaning in no way commits one to denying the existence of experiential schemas. One may very well accept that embodied, preconceptual experiential schemas underlie, constrain, and support the emergence of conceptual meaning without accepting that conceptual meaning is reducible to experiential schemas. Johnson (1987: 5) expresses a similar view, though from the opposite perspective, so to speak: “I am perfectly happy with talk of the conceptual/propositional content of an utterance, but only insofar as we are aware that this propositional content is possible only by virtue of a complex web of non-propositional schematic structures that emerge from our bodily experience” (italics in original).
11. Needless to say, these merits do not safeguard the analyst against all error. No doubt a good deal of the extant NSM work could be revised and improved, as shown in fact by successive revisions and improvements undertaken by NSM scholars themselves, in various domains.

12. I do not wish to imply that NSM researchers are the only cognitive linguists who are concerned with these issues. In particular one thinks of anthropologically oriented researchers, such as Palmer (2003), who has argued that indigenous conceptualisations are best revealed by the language-internal explanations and commentaries of native speakers.

References

Amberber, Mengistu (in press). Semantic primes and their grammar in Amharic. In C. Goddard (Ed.), Crosslinguistic Semantics. Amsterdam: John Benjamins.
Amberber, Mengistu (2003). The grammatical encoding of thinking in Amharic. Cognitive Linguistics, 14(2/3), 195–220.
Amberber, Mengistu (2001). Testing emotional universals in Amharic. In J. Harkins & A. Wierzbicka (Eds.), Emotions in Crosslinguistic Perspective (pp. 35–67). Berlin: Mouton de Gruyter.
Ameka, Felix (2002). Cultural scripting of body parts for emotions: On ‘jealousy’ and related emotions in Ewe. Pragmatics & Cognition, 10(1), 1–25.
Ameka, Felix (1996). Body parts in Ewe. In H. Chappell & W. McGregor (Eds.), The Grammar of Inalienability (pp. 783–840). Berlin: Mouton de Gruyter.
Ameka, Felix (1994). Ewe. In C. Goddard & A. Wierzbicka (Eds.), Semantic and Lexical Universals – Theory and Empirical Findings (pp. 57–86). Amsterdam: John Benjamins.
Athanasiadou, A. & E. Tabakowska (Eds.). (1998). Speaking of Emotions: Conceptualization and expression. Berlin: Mouton de Gruyter.
Bardon, Geoff (1979). Aboriginal Art of the Western Desert. Adelaide: Rigby.
Barnlund, Dean (1975). Public and Private Self in Japan and the United States: Communicative styles of two cultures. Tokyo: Simul.
Bugenhagen, Robert D. (2002). The syntax of semantic primes in Mangaaba-Mbula. In C. Goddard & A. Wierzbicka (Eds.), Meaning and Universal Grammar – Theory and Empirical Findings. Volume II (pp. 1–64). Amsterdam: John Benjamins.
Bugenhagen, Robert D. (2001). Emotions and the nature of persons in Mbula. In J. Harkins & A. Wierzbicka (Eds.), Emotions in Crosslinguistic Perspective (pp. 73–118). Berlin: Mouton de Gruyter.
Bugenhagen, Robert D. (1994). The exponents of semantic primitives in Mangap-Mbula. In C. Goddard & A. Wierzbicka (Eds.), Semantic and Lexical Universals – Theory and Empirical Findings (pp. 87–108). Amsterdam: John Benjamins.
Chappell, Hilary (2002). The universal syntax of semantic primes in Mandarin Chinese. In C. Goddard & A. Wierzbicka (Eds.), Meaning and Universal Grammar – Theory and Empirical Findings. Volume I (pp. 243–322). Amsterdam: John Benjamins.
Chappell, Hilary (1994). Mandarin semantic primitives. In C. Goddard & A. Wierzbicka (Eds.), Semantic and Lexical Universals – Theory and Empirical Findings (pp. 109–148). Amsterdam: John Benjamins.
Chappell, Hilary (1986). The passive of bodily effect in Chinese. Studies in Language, 10, 271–296.
Cienki, Alan (1998). STRAIGHT: An image schema and its metaphorical extensions. Cognitive Linguistics, 9, 107–149.
Doi, Takeo (1971). Amae no Koozoo [The Anatomy of Dependence]. Tokyo: Koobundoo.
Enfield, N. J. (2002). Combinatoric properties of Natural Semantic Metalanguage expressions in Lao. In C. Goddard & A. Wierzbicka (Eds.), Meaning and Universal Grammar – Theory and Empirical Findings. Volume II (pp. 145–256). Amsterdam: John Benjamins.
Enfield, N. J. (1999). On the indispensability of semantics: Defining the ‘vacuous’. In J. Mey & A. Boguslawski (Eds.), ‘E Pluribus Una’. The One in the Many. Special Issue of RASK, International Journal of Language and Communication, 9(10), 285–304.
Geeraerts, Dirk (1999). Idealistic and empiricist tendencies in cognitive semantics. In T. Janssen & G. Redeker (Eds.), Cognitive Linguistics: Foundations, scope and methodology (pp. 163–194). Berlin/New York: Mouton de Gruyter.
Goddard, Cliff (2003). Dynamic ter- in Malay (Bahasa Melayu): A study in grammatical polysemy. Studies in Language, 27(2), 287–322.
Goddard, Cliff (2002a). Semantic primes and universal grammar in Malay (Bahasa Melayu). In C. Goddard & A. Wierzbicka (Eds.), Meaning and Universal Grammar – Theory and Empirical Findings, Volume I (pp. 87–172). Amsterdam: John Benjamins.
Goddard, Cliff (2002b). Ethnosyntax, ethnopragmatics, sign-functions, and culture. In N. J. Enfield (Ed.), Ethnosyntax. Explorations in Grammar and Culture (pp. 52–73). Oxford: Oxford University Press.
Goddard, Cliff (2001a). Sabar, ikhlas, setia – patient, sincere, loyal? A contrastive semantic study of some “virtues” in Malay and English. Journal of Pragmatics, 33, 653–681.
Goddard, Cliff (2001b). Hati: A key word in the Malay vocabulary of emotion. In J. Harkins & A. Wierzbicka (Eds.), Emotions in Crosslinguistic Perspective (pp. 167–195). Berlin: Mouton de Gruyter.
Goddard, Cliff (2001c). Conceptual primes in early language development. In M. Pütz, S. Niemeier, & R. Dirven (Eds.), Applied Cognitive Linguistics I: Theory and language acquisition (pp. 193–227). Berlin/New York: Mouton de Gruyter.
Goddard, Cliff (2000). Polysemy: A problem of definition. In Y. Ravin & C. Leacock (Eds.), Polysemy and Ambiguity: Theoretical and applied approaches (pp. 129–151). New York: Oxford University Press.
Goddard, Cliff (1998a). Semantic Analysis: A practical introduction. Oxford: Oxford University Press.
Goddard, Cliff (1998b). Bad arguments against semantic primitives. Theoretical Linguistics, 24(2/3), 129–156.
Goddard, Cliff (1996a). Cross-linguistic research on metaphor. Language & Communication, 16(2), 145–151.
Goddard, Cliff (1996b). The “social emotions” of Malay (Bahasa Melayu). Ethos, 24(3), 426–464.
Goddard, Cliff (1994). Lexical primitives in Yankunytjatjara. In C. Goddard & A. Wierzbicka (Eds.), Semantic and Lexical Universals – Theory and Empirical Findings (pp. 229–262). Amsterdam: John Benjamins.
Goddard, Cliff (1992). Traditional Yankunytjatjara ways of speaking – A semantic perspective. Australian Journal of Linguistics, 12, 93–122.
Goddard, Cliff (1991a). Testing the translatability of semantic primitives into an Australian Aboriginal language. Anthropological Linguistics, 33(1), 31–56.
Goddard, Cliff (1991b). Anger in the Western Desert – A case study in cross-cultural semantics of emotion. Man, 26(2), 265–279.
Goddard, Cliff (1990). The lexical semantics of ‘good feelings’ in Yankunytjatjara. Australian Journal of Linguistics, 10(2), 257–292.
Goddard, Cliff (Ed.). (1997). Studies in the Syntax of Universal Semantic Primitives. Special issue of Language Sciences, 19(3).
Goddard, Cliff & Anna Wierzbicka (Eds.). (2004). Cultural Scripts. Special issue of Intercultural Pragmatics, 1(2).
Goddard, Cliff & Anna Wierzbicka (Eds.). (2002). Meaning and Universal Grammar – Theory and Empirical Findings. Volumes I and II. Amsterdam: John Benjamins.
Goddard, Cliff & Anna Wierzbicka (Eds.). (1994). Semantic and Lexical Universals – Theory and Empirical Findings. Amsterdam: John Benjamins.
Grady, Joseph E. (1997). Theories are buildings revisited. Cognitive Linguistics, 8(4), 267–290.
Harkins, Jean (2001). Talking about anger in Central Australia. In J. Harkins & A. Wierzbicka (Eds.), Emotions in Crosslinguistic Perspective (pp. 197–215). Berlin: Mouton de Gruyter.
Harkins, Jean & David P. Wilkins (1994). Mparntwe Arrernte and the search for lexical universals. In C. Goddard & A. Wierzbicka (Eds.), Semantic and Lexical Universals – Theory and Empirical Findings (pp. 285–310). Amsterdam: John Benjamins.
Hasada, Rie (2001). Explicating the meaning of sound-symbolic Japanese emotion terms. In J. Harkins & A. Wierzbicka (Eds.), Emotions in Crosslinguistic Perspective (pp. 217–253). Berlin: Mouton de Gruyter.
Hasada, Rie (1998). Sound symbolic emotion words in Japanese. In A. Athanasiadou & E. Tabakowska (Eds.), Speaking of Emotions: Conceptualization and expression (pp. 83–98). Berlin: Mouton de Gruyter.
Hawkins, Bruce (1984). The semantics of English spatial prepositions. PhD thesis. University of California.
Johnson, Mark (1987). The Body in the Mind. Chicago: Chicago University Press.
Junker, Marie-Odile (in press a). Semantic primes and their grammar in a polysynthetic language: East Cree. In C. Goddard (Ed.), Crosslinguistic Semantics. Amsterdam: John Benjamins.
Junker, Marie-Odile (in press b). Are there emotional universals? Evidence from the native American language East Cree. Culture & Psychology.
Junker, Marie-Odile (2003). A native American view of the “mind” as seen in the lexicon of cognition in East Cree. Cognitive Linguistics, 14(2/3), 167–194.
Kamus Harian Federal: Bahasa Malaysia – Inggeris – Bahasa Malaysia (1995). Mohd Salleh Daud (Ed.). Kuala Lumpur: Federal Publications.
Kornacki, Pawel (2001). Concepts of anger in Chinese. In J. Harkins & A. Wierzbicka (Eds.), Emotions in Crosslinguistic Perspective (pp. 255–289). Berlin: Mouton de Gruyter.
Kornacki, Pawel (1995). Aspects of Chinese cultural psychology as reflected in the Chinese lexicon. PhD Thesis. Australian National University.
Kövecses, Zoltán (1995). American friendship and the scope of metaphor. Cognitive Linguistics, 6(4), 315–346.
Lakoff, George (1993). The contemporary theory of metaphor. In A. Ortony (Ed.), Metaphor and Thought (pp. 202–251). Cambridge: Cambridge University Press.
Lakoff, George (1990). The Invariance Hypothesis: Is abstract reason based on image-schemas? Cognitive Linguistics, 1(1), 39–74.
Lakoff, George (1987). Women, Fire and Dangerous Things. Chicago: Chicago University Press.
Lakoff, George & Mark Johnson (1980). Metaphors We Live By. Chicago: The University of Chicago Press.
Lakoff, George & Zoltán Kövecses (1987). The cognitive model of anger inherent in American English. In D. Holland & N. Quinn (Eds.), Cultural Models in Language and Thought (pp. 195–221). Cambridge: Cambridge University Press.
Langacker, Ronald W. (1999). A study in unified diversity: English and Mixtec locatives. In J. Mey & A. Boguslawski (Eds.), ‘E Pluribus Una’. The One in the Many (pp. 215–256). Odense: Odense University Press.
Langacker, Ronald W. (1990). Concept, Image, and Symbol. The Cognitive Basis of Grammar. Berlin: Mouton de Gruyter.
Lebra, Takie Sugiyama (1976). Japanese Patterns of Behavior. Honolulu: The University Press of Hawaii.
McCawley, James D. (1983). Review of Anna Wierzbicka’s Lingua Mentalis: The semantics of natural language. Language, 59(3), 654–659.
Mühlhäusler, Peter (1995). Metaphors others live by. Language and Communication, 15(3), 281–288.
Munn, Nancy D. (1973). Walbiri Iconography. Ithaca: Cornell University Press.
Niemeier, Susanne (1997). Introduction. In S. Niemeier & R. Dirven (Eds.), The Language of Emotions. Amsterdam: John Benjamins.
Onishi, Masayuki (1997). The grammar of mental predicates in Japanese. Language Sciences, 19(3), 219–233.
Onishi, Masayuki (1994). Semantic primitives in Japanese. In C. Goddard & A. Wierzbicka (Eds.), Semantic and Lexical Universals – Theory and Empirical Findings (pp. 361–386). Amsterdam: John Benjamins.
Palmer, Gary (2003). Talking about thinking in Tagalog. Cognitive Linguistics, 14(2/3), 251–280.
Peeters, Bert (2002). Métalangue sémantique naturelle au service de l’étude du transculturel. Travaux de linguistique, 45, 83–101.
Peeters, Bert (2000). “S’Engager” vs. “To Show Restraint”: Linguistic and cultural relativity in discourse management. In S. Niemeier & R. Dirven (Eds.), Evidence for Linguistic Relativity (pp. 193–222). Amsterdam: John Benjamins.
Peeters, Bert (1997a). Using the natural semantic metalanguage in the French classroom. Paper delivered at Fifth International Cognitive Linguistics Conference, Amsterdam.
Peeters, Bert (1997b). The syntax of time and space primitives in French. Language Sciences, 19(3), 235–244.
Peeters, Bert (1994). Semantic and lexical universals in French. In C. Goddard & A. Wierzbicka (Eds.), Semantic and Lexical Universals – Theory and Empirical Findings (pp. 423–444). Amsterdam: John Benjamins.
Stanwood, Ryo E. (1999). On the Adequacy of Hawai’i Creole English. PhD dissertation. University of Hawai’i.
Stanwood, Ryo E. (1997). The primitive syntax of mental predicates in Hawai‘i Creole English: A text-based study. Language Sciences, 19(3), 209–217.
Talmy, Leonard (1988). Force dynamics in language and cognition. Cognitive Science, 12, 49–100.
Tien, Adrian (2005). The Semantics of Children’s Mandarin Chinese: The first four years. PhD thesis. University of New England.
Travis, Catherine (2003). The semantics of the Spanish subjunctive. Its use in the natural semantic metalanguage. Cognitive Linguistics, 14(1), 47–69.
Travis, Catherine (2002). La Metalengua Semántica Natural: The Natural Semantic Metalanguage of Spanish. In C. Goddard & A. Wierzbicka (Eds.), Meaning and Universal Grammar – Theory and Empirical Findings. Volume I (pp. 173–242). Amsterdam: John Benjamins.
Travis, Catherine (1998a). Omoiyari as a core Japanese value: Japanese-style empathy? In A. Athanasiadou & E. Tabakowska (Eds.), Speaking of Emotions: Conceptualization and expression (pp. 83–103). Berlin: Mouton de Gruyter.
Travis, Catherine (1998b). Bueno: A Spanish interactive discourse marker. BLS, 24, 268–279.
Trilling, Lionel (1972). Sincerity and authenticity. London: Oxford University Press.
Wierzbicka, Anna (in press). English Meaning & Culture. New York: Oxford University Press.
Wierzbicka, Anna (2002a). Semantic primes and universal grammar in Polish. In C. Goddard & A. Wierzbicka (Eds.), Meaning and Universal Grammar – Theory and Empirical Findings. Volume II (pp. 65–144). Amsterdam: John Benjamins.
Wierzbicka, Anna (2002b). Right and wrong: From philosophy to everyday discourse. Discourse Studies, 4, 225–252.
Wierzbicka, Anna (1999). Emotions Across Languages and Cultures: Diversity and universals. Cambridge: Cambridge University Press.
Wierzbicka, Anna (1998). The semantics of English causative constructions in a universal-typological perspective. In M. Tomasello (Ed.), The New Psychology of Language (pp. 113–153). Mahwah, NJ: Lawrence Erlbaum.
Wierzbicka, Anna (1997). Understanding Cultures Through Their Key Words. Oxford: Oxford University Press.
Wierzbicka, Anna (1996). Semantics, Primes and Universals. Oxford: Oxford University Press.
Wierzbicka, Anna (1992). Semantics, Culture and Cognition. Oxford: Oxford University Press.
Wierzbicka, Anna (1988). The Semantics of Grammar. Amsterdam: John Benjamins.
Wierzbicka, Anna (1972). Semantic Primitives. Translated by Anna Wierzbicka and John Besemeres. Frankfurt/M.: Athenäum Verlag.
Wilkins, David P. (1986). Particles/clitics for criticism and complaint in Mparntwe Arrernte (Aranda). Journal of Pragmatics, 10(5), 575–596.
Wilkins, David P. (2000). Ants, ancestors and medicine: A semantic and pragmatic account of classifier constructions in Arrernte (Central Australia). In G. Senft (Ed.), Systems of Nominal Classification (pp. 147–216). Cambridge: Cambridge University Press.
Ye, Zhengdao (2004). The Chinese folk model of facial expressions: A linguistic perspective. Culture & Psychology, 10(2), 195–222.
Ye, Zhengdao (2002). Different modes of describing emotions in Chinese: Bodily changes, sensations and bodily images. Pragmatics and Cognition, 10(1/2), 321–356.
Ye, Zhengdao (2001).
An inquiry into “sadness” in Chinese. In J. Harkins & A. Wierzbicka (Eds.), Emotions in Crosslinguistic Perspective (pp. 359–404). Berlin: Mouton de Gruyter. Yoon, Kyung-Joo (2004). Korean maum vs. English heart and mind: Contrastive semantics of cultural concepts. In C. Moskosky (Ed.), Proceedings of the 2003 Conference of the Australian Linguistics Society. [www.newcastle.edu.au/school/lang-media/news/als2003/proceedings. html] Yoon, Kyung-Joo (2003). Constructing a Korean Natural Semantic Metalanguage. PhD thesis. The Australian National University. Appendix 1 Semantic primes – English exponents (after Goddard & Wierzbicka Eds., 2002) Substantives: Determiners: Quantifiers: Descriptors: Evaluators: Intensifier: Mental predicates: Speech: Events and actions: Existence and possession: Life and death: Time: Space: Logical concepts: Augmentor: Taxonomy, partonomy: Similarity: i, you, someone, something/thing, people, body this, the same, other/else one, two, all, much/many, some big, small good, bad very want, feel, think, know, see, hear say, word, true do, happen, move there is, have live, die when/time, now, after, before, a long time, a short time, for some time, moment where/place, here, above, below, side, near, far, inside, touching, be (somewhere) not, maybe, if, can, because more kind of, part of like Notes: • primes exist as the meanings of lexical units (not at the level of lexemes) • exponents of primes may be words, bound morphemes, or phrasemes • they can be formally, i.e. morphologically, complex • they can have different morphosyntactic properties, including word-class, in different languages • they can have combinatorial variants (allolexes) • each prime has well-specified syntactic (combinatorial) properties. 
Appendix 2

Selected NSM studies of languages other than English

Language Korean Lao (Tai) Mangaaba-Mbula (Austronesian) Malay (Austronesian) Mandarin Chinese (Sinitic) Polish (Indo-European) Spanish (Indo-European) Hawaii Creole English Primes and syntax Descriptive semantic studies Comprehensive studies Yoon (2003) Yoon (2004) Enfield (2002) Enfield (1999) Bugenhagen (1994, 2002) Bugenhagen (2001) Goddard (2002a) Chappell (1994, 2002) Kornacki (1995, 2001) Wierzbicka (2002a) Travis (2002) Stanwood (1997, 1999) Goddard (1996b, 1997, 2001a, b) Chappell (1986), Ye (2001, 2002, 2004) Wierzbicka (1997) Travis (1998b, 2003) Partial studies Amharic (Ethiosemitic) Arrernte (Pama-Nyungan) Cree (Algonquian) Ewe (Niger-Congo) French (Indo-European) Japanese Yankunytjatjara (Pama-Nyungan) Harkins/Wilkins (1994) Junker (in press a) Ameka (1994) Peeters (1994, 1997b) Onishi (1994, 1997) Goddard (1991a, 1994) Amberber (2001, 2003, in press) Harkins (2001), Wilkins (1986, 2000) Junker (2003, in press b) Ameka (1996, 2002) Peeters (2000, 2002) Hasada (1998, 2001), Travis (1998a) Goddard (1990, 1991b, 1992)

For a more comprehensive listing, consult the NSM Homepage: www.une.edu.au/arts/LCL/disciplines/linguistics/nsmpage.htm

“How do you know she’s a woman?” Features, prototypes and category stress in Turkish kadın and kız

Robin Turner
Bilkent University, Turkey

This paper examines Turkish words for girls and women in order to investigate the relationship between categorization, culture and personal interaction. In doing so, it serves as a test case for a model which attempts to integrate prototype and feature-based categorisation based on a distinction between defining and typical features, both of which are subdivided into strong and weak features (the latter being more heavily dependent on context).
I also consider a phenomenon I term category stress, which results from situations where feature-based and prototype-based categorisation conflict.

Keywords: Turkish, categorization, prototype, stress

Introduction

It has become a commonplace observation that different cultures categorise phenomena in different ways. We “cut nature up, organize it into concepts, and ascribe significances as we do, largely because we are parties to an agreement to organize it in this way” (Whorf 1956: 214). If we can avoid the armchair cultural linguistics of the “Eskimos have twenty words for snow” variety, a comparison of categories across cultures can reveal much, not only about the cultures involved, but also about the nature of categorisation itself.1 This is perhaps clearest where ‘natural kinds’ and ‘functional kinds’ (Lehrer 1990: 372) overlap. It may not come as a surprise that some languages draw no distinction between a turtle and a tortoise, or a solicitor and a barrister, but when the boundaries of such basic concepts as ‘man’ or ‘woman’ are drawn differently, we might expect this to be indicative of a difference in attitudes towards these concepts which is not merely linguistic.

The categories woman and girl provide a good example of the interaction between ‘natural’ and ‘functional’ kinds. Humanness (as distinct from humanity) and femaleness (as distinct from femininity) can be regarded as the result of natural discontinuities. Apart from a few extremely fuzzy cases, one requires no specific cultural apparatus to perceive what is or is not human or female. In contrast, the other major element in the categorisation of these terms in English is adulthood, which is not only culturally specific but also context-specific: someone may be referred to as a ‘girl’ in one context and a ‘woman’ in another, and choosing the appropriate term requires considerable sociolinguistic competence.
Of course, there is much more to the category woman than the supposedly simple features [+human], [+female] and [+adult]. However, as I shall argue later, there is some value in adopting such a traditional semantic analysis alongside the more current prototype-based view. In deciding whether a particular human female should be classed as a woman or a girl, simply looking for the presence or absence of the feature [adult] is obviously simplistic; nevertheless, we can assume that, in English, our idea of adulthood, and the extent to which it applies to a particular person in a particular context, is the most important factor in the equation.

In other languages and cultures, though, adulthood may not be the most significant factor involved. The Turkish terms kız and kadın approximate to ‘girl’ and ‘woman’ respectively, but to refer to an unmarried woman as a kadın would be a serious faux pas, since the most important factor in distinguishing between kız and kadın is not age but sexual experience; all things being equal, a kız is a virgin and a kadın is not. It is important to bear in mind that there is an asymmetry between terms for men and women here. Not only is sexual experience unimportant in the transition from oğlan (‘boy’) to erkek (‘man’); the former term is in fact rare: when it is necessary to specify a male child, the term erkek çocuğu (‘man child’) is more common. Erkek seems to have only one defining feature, [+male], since it is commonly used for male animals as well (e.g. erkek köpeği – ‘dog’ as opposed to ‘bitch’).

In traditional semantic terms, this cultural difference in categorisation can be explained quite simply: in both English and Turkish a woman is [+human] [+female], but the languages differ in ascribing the feature [+adult] in one case, and [–virgin] in the other.
This approach is, however, inadequate in explaining examples such as “I’m going out with the girls”, which in English may be uttered by a seventy-year-old, or the Turkish use of kız to greet any female friend, irrespective of age.

(1) N’aber, kız?
    what news girl?
    ?“How’s things, girl?”

Furthermore, there are terms such as ‘International Women’s Day’ or ‘women’s sports’, which in both languages refer to all female humans, not just adults or non-virgins.

On the other hand, a ‘fuzzy’ prototype-based analysis is on its own equally inadequate. A middle-aged, unmarried woman (spinster) is almost as far removed from the prototype of kız as she is from that of girl, but outside certain specific contexts, she still may not be placed in the kadın category: category boundaries, while changeable, are often anything but fuzzy.

In this study of the terms kız and kadın, I will argue that both feature-based and prototype-based models are necessary in order to explain categorisation acts; however, instead of a simple binary feature bundle I use a variable weighted feature approach that involves a distinction between ‘defining’ and ‘typical’ features (which indicate category membership and prototypicality respectively), and a further distinction between those features that remain fairly constant and those whose salience varies according to context and communicative intent. Furthermore, I also propose the concept of ‘category stress’, which can be seen as a kind of cognitive dissonance resulting from disparity between feature-based and prototype-based categorisation processes. This may result from a number of factors, both contextual and cultural; it also often results in infelicitous categorisations or a search for alternative categories.
Thus, in situations where the “strictly speaking” use of kız would apply to an item far removed from the prototype (as in the middle-aged spinster example), alternative terms such as bayan or hanım (both roughly meaning ‘lady’) may be used.

Views of categorisation

Since the publication of Lakoff’s (1987) Women, Fire and Dangerous Things, it has become common to divide theories of categorisation into traditional, feature-based semantics in one camp, and cognitive approaches based on prototypes, metonymy and metaphor in the other, with Aristotle cast as the villain of the piece (Wierzbicka 1990: 364). However, Aristotle himself had a more sophisticated view of categorisation than is commonly supposed, and, with his distinction between ‘essential’ and ‘accidental’ attributes, was the first to introduce the idea that not all features are of equal importance. The problem with the Aristotelian view lies not so much in the distinction between essential and accidental attributes as in its failure to realise that some accidental (i.e. non-defining) attributes are anything but accidental. To give one of Aristotle’s favourite examples, “white man” (Metaphysics, VII(6): 1031), the attribute ‘white’ is obviously not essential to being a man, but falls within an accepted colour-range that is probably an element in the process of categorising a creature as a ‘man’: “white man” and “black man” indicate differences between men, but “green man” or “purple man” imply something odd is going on (maybe what we are referring to is not actually a man but a leprechaun or an alien, or maybe the colour term is being used metaphorically). If we view features as “focal values in a continuous cognitive space” (Jackendoff 1992: 205), we need some set of rules for determining their relationship and relative importance, rather than simply lumping them together.
The realisation that not all features are created equal gave rise to the “weighted feature-bundle” approach (Coleman & Kay 1981). A simple feature bundle fails to describe the internal structure of a category, nor does it give an accurate picture of its relationship with other categories (Langacker 1987: 19–20). Therefore an alternative is to rank features from most to least essential. However, while the idea of assigning different weightings to features is useful, it is still necessary to draw a distinction between types of feature in terms of those that define a category and those that establish centrality within that category. For this reason Lehrer (1974) proposed a distinction between ‘obligatory’ and ‘optional’ features, and similar approaches have been adopted by Lipka (1986), and Wierzbicka (1985). What these approaches have in common is an attempt to reconcile feature- and prototype-based categorisations. From a different perspective, Jackendoff (1983) and Pustejovsky (1995) have also addressed this problem. Jackendoff in particular suggests that the combination of an atomistic feature-based system with preference rules can explain “categories with fuzzy boundaries and family resemblance properties à la Wittgenstein and Roth” (1992: 206).

In any case, it is obvious that mere resemblance to a prototype is in itself insufficient as a basis for categorisation. As Wierzbicka (1990: 350) points out, resemblance does not explain why “an ostrich is a bird but a bat is not”, as the latter is in many ways closer to our celebrated prototypical robin than the former. Furthermore, Cruse (1990: 388) argues, “It is not easy to see how the boundaries of a category can be derived from its prototypes.”

Another problem is raised by context.
Cruse points out that “It is at least possible that different criteria are used with different categories, and perhaps even different criteria on different occasions of judgement with the same category, under different sorts of contextual pressure” (1990: 384). Following Hymes,

    the role of language as a device for categorizing experience and its role as an instrument of communication cannot be so separated, and indeed, the latter includes the former. This is the more true when a language, as is often the case, affords alternative ways of categorizing the same experience, so that the patterns of selection among such alternatives must be determined in actual contexts of use. (Hymes 1972: 33)

It is clear, then, that categorisation is partly determined by contextual factors and partly by the speaker’s state of mind and intention in communicating.

A categorisation act may also be influenced by a scenario: “a culturally defined sequence of actions; a story schema” (Palmer 1996: 75). Holland and Skinner claim that female American students’ categorisations of types of men are based on “a taken-for-granted relationship between males and females”, and that by using specialised terms, such as ‘jock’, ‘nerd’ and so on, women “relate types to a set of scenarios in which the prototypical male/female relationship is disrupted” (1987: 103). Disruption of a scenario results in what I have termed ‘category stress’, and often results in a search for alternative categories. In this case, such categories as ‘jock’ or ‘nerd’ could be seen as a way out when a male fails to meet the requirements of the scenario (e.g. by spending the whole of a date talking about football or computers).
Our middle-aged unmarried Turkish woman is another example, since in the prototypical scenario she would have married sometime between the ages of fifteen and twenty-five (Atalay 1992: 271), and would have made the transitions from [+virgin] to [–virgin] and [–adult] to [+adult] more or less simultaneously. In Turkey, as in many cultures, the concepts of adulthood and marriage are intertwined, especially for women; marriage provides a rite of passage that enables the normally fuzzy boundary between child and adult to become much more clear-cut. A late or very early marriage can thus give rise to category stress.

Finally, it is important to remember that diachronic factors are also important in categorisation and category stress; in fact historical linguistics as a field is largely concerned with the process of change in categories, such as amelioration and so forth. Societies change, and their languages change with them “through time and incessant patter” (Palmer 1996: 6), although there is frequently a time lag. Because cultural models organise large amounts of information successfully and are thus resistant to change (Holland & Skinner 1987: 105), the result is category stress.

From this review of the literature, we may raise the following hypotheses:

1. Both features and prototypes play an important role in categorisation; neither approach on its own is adequate.
2. Features are not of equal importance in assigning items to a category, and can be broadly grouped into ‘defining features’ and ‘typical features’.
3. While the status of some features remains fairly constant, that of others varies according to a number of contextual and communicative factors.
4. Contextual and cultural factors may lead to disparity between feature- and prototype-based categorisation (category stress).

In the rest of this paper I will test these hypotheses using the terms kız and kadın as benchmarks.
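One way to make the hypotheses above concrete is the following minimal Python sketch. It is an illustration only, not part of the analysis itself: the class and function names, the boolean encoding of features, the weight of 1.0 and the 0.5 threshold are all assumptions introduced here. Hypothesis 3 (context-dependent salience) is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Category:
    """A category with defining and typical features (hypothesis 2)."""
    name: str
    defining: dict  # feature name -> value required for membership
    typical: dict   # feature name -> (prototypical value, assumed weight)


def is_member(cat, referent):
    """Feature-based test: every defining feature must match."""
    return all(referent.get(f) == v for f, v in cat.defining.items())


def typicality(cat, referent):
    """Prototype-based score in [0, 1]: weighted share of typical features matched."""
    total = sum(w for _, w in cat.typical.values())
    if total == 0:
        return 1.0
    matched = sum(w for f, (v, w) in cat.typical.items() if referent.get(f) == v)
    return matched / total


def category_stress(cat, referent, threshold=0.5):
    """Hypothesis 4: a member by features, yet far from the prototype."""
    return is_member(cat, referent) and typicality(cat, referent) < threshold


# Hypothetical feature assignments; the weight 1.0 is an assumption.
KIZ = Category("kız", {"human": True, "female": True, "virgin": True},
               {"adult": (False, 1.0)})
KADIN = Category("kadın", {"human": True, "female": True, "virgin": False},
                 {"adult": (True, 1.0)})

# The middle-aged spinster discussed earlier, and a hypothetical schoolgirl.
spinster = {"human": True, "female": True, "virgin": True, "adult": True}
schoolgirl = {"human": True, "female": True, "virgin": True, "adult": False}

for person in (spinster, schoolgirl):
    print(is_member(KIZ, person), typicality(KIZ, person), category_stress(KIZ, person))
```

Under these assumptions the spinster counts as kız by her defining features but scores zero on typicality, so the sketch flags category stress for her and not for the schoolgirl. Keeping membership and typicality as two separate functions mirrors hypothesis 1: neither test alone decides the outcome.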
With the exception of example (13), which is presented as a “theoretically possible” sentence, the phrases and sentences used are examples of Standard Turkish. These data include utterances by friends and relatives (referred to by their initials) and examples from popular media.2 Some examples come from concordances provided by Petek Kurtböke from her “Ozturk Corpus” (for a more detailed description of the data-collecting procedure and results of the concordancing, see Turner 1998).3

Defining features of kız and kadın

I have claimed that in assigning items to the categories kız and kadın, the feature [±virgin] is more important than [±adult], in contrast to the English categories girl and woman, where the reverse is the case. However, this claim obviously needs to be tested if it is not to fall into the “twenty words for snow” category. The first piece of evidence is that the question given in (2) would receive an answer based on the referent’s sexual experience or marital status rather than age; it can function either as “Is she a virgin?” or “Is she married?”

(2) kız mı, kadın mı?
    girl int. woman int.
    “Girl or woman?”

In the prototypical scenario, loss of virginity and marriage coincide, but in practice, of course, they often do not. It is therefore necessary to establish which criterion – [±virgin] or [±married] – is more important in categorising such peripheral cases. After all, in some languages, such as Greek, it is marriage which makes one a woman (gineka), rather than a girl (kopela). An example that illustrates the priority of [±virgin] is given in (3).

(3) Sen orta-okul-da-yken kadın ol-muş-sun
    you middle-school-loc.-while woman become-said-2ndsing.
    “They say you became a woman when you were at Middle School.”
    (Comedian “Huysuz Virjin” to singer Ajda Pekkan, Star TV, 11/7/98)

Kadın olmak, ‘to become a woman’, is also defined as ‘to have one’s hymen broken’ (Türk Dil Kurumu 1998).
Additionally, the medical test to establish virginity (still legal, though widely condemned) is colloquially known as kız kontrolu, ‘girl test’. We may conclude, then, that [+virgin] is a more important feature of kız than [–adult], though not as vital as [+human] and [+female], since it may on occasion be overridden, as we shall see later. [+virgin] in kız and [–virgin] in kadın are defining features, but weak ones; an absence of strong defining features, such as [+human] and [+female], marks a usage as clearly metaphorical, as in the case of kız neyi, the smallest size of reed flute (ney), which is a metaphorical extension of the kız category.

Typical features

If [±adult] is not of prime importance in distinguishing between kız and kadın, it is obviously important in establishing centrality in a category, and thus may be termed a ‘typical feature’. I would also argue that it is a ‘strong’ typical feature, in that in some contexts it may override [±virgin]. One case is where someone is not a virgin, but is nevertheless very young. An example was provided by the media furore surrounding the marriage of Sarah, a fourteen-year-old English tourist, to a Turkish waiter. Had Sarah been Turkish this might not have been worthy of comment, since in this area of Turkey (Kahramanmaraş) marriage at this age, though illegal, is still common; she would thus have made the normal transition from kız to kadın. However, once the case made the press, Sarah was largely referred to as kız rather than the technically accurate kadın. When I pointed this out to a Turkish-speaker, the reaction was “well, I suppose strictly speaking she is a kadın, but . . . ” (ŞH). Similarly, outside medical contexts, children who are the victims of rape are generally referred to as kız, as would be a girl who had broken her hymen accidentally (TK, AA).
What seems to be happening here is a disruption to the normal scenario, or, as I have called it, category stress. Lack of a typical feature is not usually enough to justify exclusion from a category, but extreme cases can change the weighting of features, so that a strong typical feature can override a weak defining feature. In the case of Sarah, her age was seen as sufficiently atypical to override the [–virgin] feature, and placing her in the kız category added to the sense of moral outrage, which draws on other typical features (or connotations) of kız, such as innocence and vulnerability. These are associated with the feature [–adult], but probably more so for girls than for boys. On their own, these weak typical features would not be sufficient to override a defining feature, but they may add their weight to this process in combination with [–adult].

Not all cases need be as extreme as that of Sarah, though, and there is a fair degree of latitude in whether to refer to a married woman as kadın or kız. In addition to the absolute age of the person referred to, her age relative to the speaker can play a part. An older woman may refer to a younger woman as kız irrespective of the marital status of the latter.4 It is, however, very rare for the reverse to occur – i.e., for [+adult] to override [+virgin]. Outside contexts in which [±adult] becomes irrelevant (which will be discussed later), I have observed hardly any instances of virgins being referred to as kadın.

Another typical feature of kız is [+intimate], since one thing girls prototypically do is form close peer friendships.
The phrase kız kıza (“girl-to-girl”) conjures up images of intimate conversation, while erkek erkeğe has the same connotations as its English equivalent, “man-to-man”: the emphasis is less on intimacy (though that may be involved) and more on honesty, or in competitive situations, fairness (erkek erkeğe dövüşmek translates as “fight fairly”). Kız kıza is a popular title for websites dealing with “girl-talk” (e.g. Turkstudent.net 2005) and the peer intimacy aspect has been used to market products. As we have seen in the case of “going out with the girls”, in both English and Turkish [+intimate] can be regarded as a strong typical feature, since it may override both [+virgin] and [–adult], though this is not true for all languages.5 In Turkish a woman going out with her workmates may say a sentence like (4), even though nearly all the ‘girls’ in this case were married or divorced. Interestingly, there seems to be no male equivalent (cf. “going out with the boys/lads”). Again, then, we see an asymmetry between gender-specific terms.

(4) kız-lar-la gid-iyor-um
    girl-pl.-with go-prog.-1stsing.
    “I’m going with the girls” (NT)

There seem to be two processes at work in this apparent miscategorisation, along with other examples, such as hadi kızlar! (‘Come on, girls!’). The first is that the presence of even one unmarried woman in the group would make the use of kadın infelicitous; the second is the peer-friendship element. As in English, the term would probably not be used by an outsider, especially a male one, since what would be understood there would not be [+intimate] but [–adult]; in other words, it would be seen as patronising. The case is clearer when only one person is being addressed. As we saw earlier, N’aber kız? (‘How’s things girl?’) may be used to greet any female friend, though its use is probably rather more common in female-female than male-female exchanges.
Kız is also used paternalistically, emphasizing the [–adult] feature, and this use is particularly common with the first person singular possessive (kızım, ‘my girl’). This may well be a metaphorical extension of the polysemic meaning of kız as ‘daughter’, which I will discuss later. Like the intimate use, it may be felicitous or infelicitous depending also on whether the speaker is seen as occupying an appropriate social/discourse role. Older friends, relatives and sometimes even strangers are often expected to play a fatherly/motherly role, so this use of kız or kızım may be appropriate, but it can equally well be seen as condescending. For example, in a television debate (Siyaset Meydanı), an older male participant repeatedly addressed a young woman with whom he was arguing as kızım, and although she did not verbally object, her anger was visible. As a viewer put it:

(5) Görü-yor mu-sun nasıl aşağılı-yor
    see-prog. int.-2ndsing. how lower-prog.
    “Do you see that? He’s really putting her down.” (NT)

Note that there is no danger of infelicity when referring to a third party as kız, as mentioned earlier.

Context and topic

Women and men

We have seen how context and communicative intent can cause a strong typical feature to override a weak defining feature. In addition, strong contextual or topical pressure may sometimes simply eliminate a weak defining feature, as in the aforementioned ‘International Women’s Day’ example. As in English, topic-based categorization based on “women as opposed to men” results in kadın being stripped down to its strong defining features [+human] and [+female].6 Thus, it is possible to say kadın to someone who would normally be a member of kız:

(6) Dünya Kadın Gün-ün kutlu olsun
    world woman day-2ndpos. celebrated be-3dimp.
    “Congratulations on ‘International Women’s Day’.” (NT, to NH, opening telephone conversation)

Similar cases arise when kadın is found in collocation with erkek (‘man’) or with haklar (‘rights’) as in the following examples:

(7) kadın mı, erkek mi?
    woman int. man int.
    “Male or female?”

(8) kadın ve erkek iş-çi-ler-i
    woman and man work-er-pl.-pos.
    “male and female workers” (ŞH) (Ozturk Corpus)

(9) köprü-ler-in altı-ndan, kadın hak-lar-ı-ndan, kadın-erkek
    bridge-pl.-gen. under-abl. woman right-pl.-pos-abl. woman-man
    eşit-liğ-i-nden yan-a çok su-lar geç-tiğ-i için
    equal-ness-pos.-abl. side-dat. very water-pl. pass-part.-pos. for
    “As for women’s rights and male-female equality, much water has flowed under the bridge.” (Milliyet, 7/12/92)

This raises the question of whether we have a case of polysemy: one distinct meaning of kadın as [+human], [+female] and [–virgin], and another meaning as simply [+human] and [+female]. However, as Wierzbicka (1992: 14) argues, “polysemy must never be postulated lightly”. Since all members of the first kadın category postulated are automatically members of the second, it might be more parsimonious to assume that there is just one kadın category, and that this category may expand, in certain cases, by losing the [–virgin] feature. The same applies, of course, to English woman, where the [+adult] feature is dropped under the same circumstances.

Other collocations

Common collocations to do with services have the effect of eliminating the [±virgin] distinction. No one would assume that a kadın kuaförü (‘women’s hairdresser’) would only cut the hair of married women. Similarly, few would think that sexual experience is a prerequisite for seeing a kadın doktoru (‘gynaecologist’), and if one were thrown out of a kız yurdu (‘girl’s hall of residence’) for not being a virgin, it would be on moralistic rather than semantic grounds.
The first two cases employ the ‘women as opposed to men’ sense of kadın, while with kız yurdu [–adult] overrides [+virgin]. In the case of occupations, the prototypical member of the collocated category pushes out exceptions. Thus a kadın doktor (without the accusative/possessive -u suffix) simply means ‘female doctor’. Aside from the use of kadın to mean woman as opposed to man, it is the case that most doctors are married, and the small number of doctors in the kız category is insufficient to warrant use of the phrase kız doktor. A similar consideration applies in the case of kız öğrencisi, which literally means ‘girl student’ but in practice conveys ‘female student’. Turkish tends to force a choice between kız and kadın for ‘female’, since the literal word for female, dişi, is generally only used (i) for animals, (ii) as an insult through metaphorical extension (BÇ), (iii) as a way of emphasising female sexuality, again perhaps using the animal metaphor, or (iv) in collocations that are seen as somehow ‘odd’, such as dişi Rambo, ‘female Rambo’ (Milliyet 27/3/99). Prototypically, female students are members of kız, and this is even extended to those students who obviously do not belong to this category. The following television news headline illustrates this:

(10) Profesör-ler, kız öğrenci-ler-i kullan-ıyor
     Professor-pl. girl student-pl.-acc. use-prog.
     “Professors are using girl students.” (Star TV News, 2/2/98)

‘Use’, in this case, means ‘have sex with’, referring to a scandal at Izmir University. Obviously if the students in question have been ‘used’, they are not strictly speaking members of kız, but the collocation overrides this and perhaps also adds to the sense of outrage. There is also a certain ambiguity here, though, since there may be a sense that the professors are actually deflowering students, who would, at the time, be classed as kız.
Note that again the asymmetry applies: the male counterpart of kız öğrencisi is erkek öğrencisi (‘man student’), not oğlan öğrencisi (‘boy student’), as seen in example (11).

(11) Brighton College, yaklaşık 500 erkek ve kız öğrencisi olan
     Brighton College close-to 500 man and girl student-pos. being
     yatılı bir okuldur
     boarding a school-is
     “Brighton College is a boarding school for around 500 male and female students.” (Promeths.com 2005)

Collocation requires lexical conformity almost by definition. It is therefore not surprising that collocations are based on prototypical instances, and these exclude atypical cases.

Causes and effects of category stress

I have argued that category stress occurs when there is a disparity between the results of feature-based and prototype-based categorisations. Sometimes this disparity is inconsequential, as when we call something a cup because it is used for drinking, even though it may actually look more like a bowl. However, with categories like woman and kadın, there is more at stake. The prototypes have more psychological impact; miscategorisation can have undesirable consequences; and difficulty in categorisation is more stressful. All three defining features of woman have a direct impact on social identity, and fuzziness or ambiguity in any of these can result in discomfort, humour or even fear. To see how these reactions can be exploited, consider Lolita ([±adult]), Twelfth Night ([±female]), or Invasion of the Body Snatchers ([±human]).

Normally, we would expect category stress to be a rare phenomenon; if it routinely occurred at the boundaries of categories, such categories would probably be altered over time to avoid confusion or infelicitous usage. It would be premature to say whether features are abstracted from prototypes, or whether prototypes are constructed from commonly occurring features.
In either case, the two work in parallel; otherwise they would not work at all. Nevertheless, at the periphery of a category there are bound to be some items that strike us as ‘odd’, like flightless birds or promiscuous priests.

A more interesting cause of category stress is social change. I have stated that there is usually a lag between social change and linguistic change, and this is probably greater the more socially and psychologically salient a particular aspect of the cultural model is. In Turkey, the main causes of category stress in kız and kadın are the rise in the average age of marriage and cultural Westernisation. In the past, early marriage was the norm, but now it is rare for women to marry before the age of twenty.7 Despite the strong social sanctions still in operation, this has inevitably led to an increase in pre-marital sexual activity amongst young women (something which had always been considered normal in young men).

As previously mentioned, one occasional result of category stress is humour. An ostrich is seen as a funny kind of bird, and a promiscuous priest might be the subject of a joke.8 An example of this in the case of kız is the following line spoken by a stereotypical ‘schoolmarm’ in a comic film:

(12) Yaş-ım otuz beş. Ve kızım. El değ-me-miş. . .
     age-gen.1st thirty five and girl-1st hand place-neg.-part.
     “My age is thirty-five. And I am a ?girl. Undefiled . . .” (Hababam Sınıfı)

This is quite culture-specific humour, arising from the contrast between the strict definition (‘she is a girl’) and the prototype (‘she is not what one would expect on hearing the word’), plus the fact that one would not normally allude so obviously to one’s virginity.

A more common effect of category stress is alternative categorization – i.e., use of a different word. Hanım and bayan are both acceptable alternatives, though somewhat formal.
These can be used when one is not sure of the status of the person in question, and they are also commonly used of older, unmarried women, in order to avoid the [–adult] implications of kız. Bayan, although literally meaning ‘lady’ (and also a formal title similar to ‘Ms’), seems to be becoming a neutral term with defining features [+human] and [+female] and a typical feature [+adult].9 It is, for example, the normal term used in sports, as in (13):

(13) tek bayan-lar-da
     single lady-pl.-loc.
     “In the women’s [tennis] singles.” (Ozturk Corpus)

An extreme example of bayan shedding its ‘ladylike’ associations is

(14) bayan terörist-i
     ?lady terrorist-pos.
     “female terrorist” (TRT News)

However, this use was still greeted with amusement by a Turkish colleague (AA), which seems to indicate that escaping from one form of category stress may lead to another.

As for the features of kız and kadın themselves, it is possible that these may eventually change to reflect changes in the cultural model. However, given the continued importance given to virginity in Turkish society as a whole (rather than the progressive urban elite), this seems highly unlikely in the near future.

Caveats and conclusions

Perhaps because of the fluid nature of ‘meaning’, it is all too easy to think up a semantic theory and find language examples that seem to justify it. In using kadın and kız, I have deliberately chosen terms and contexts that place considerable strain on a semantic model, and an ability to perform successfully in interpreting the linguistic data should not be taken as proof of the validity of that model, but merely serve as an indication that it has potential.
In particular, it should be stressed that this model makes no claims with regard to neurology; it attempts to explain linguistic behaviour in cognitively plausible terms, but does not presume to assert that the brain processes semantic information in exactly the same way as the model proposes. A point worth emphasising is that the use of traditional semantic notation should not be interpreted as support for the idea of atomistic binary features. Features are themselves categories, and are subject to fuzziness, prototype effects, metaphorical extension, and so forth. Writing, for example, [+virgin] is simply a convenient way of indicating that in the view of the person performing the categorisation act, the item to be categorised fits their minimum criteria for virginity. Even such an apparently non-gradable category as virgin has peripheral members, such as “technical virgin”; criteria for membership may also vary across cultures, so strictly speaking I should have used the term [±bakire] rather than [±virgin].10 Assignment of a positive or negative sign to features is also somewhat arbitrary. [+female] could be, and often is, written as [–male]. My choice of the former is simply a reflection of an assumption that femaleness is not perceived simply as the absence of maleness. I retained the conventional [–adult], rather than using [+child], because children may be viewed teleologically as potential adults, while women are not viewed as potential men. In the case of [+virgin] a positive rather than a negative feature was used because in both English and Turkish virginity is seen as a positive attribute or even a possession; something that may be ‘lost’ (English) or ‘broken’ (Turkish). This may not only be due to the cultural importance attached to virginity, but also to the physical existence of the hymen. It would also be possible to develop the model further to give a better idea of the internal structure of a category. 
Some features are subordinates of others; for example [+virgin] implies [+human], since one would not normally speak of a virgin cat. Similarly, some features are typical features of categories alluded to by other features; for example [+vulnerable] is a typical feature of [child] – i.e., [–adult].

One application of the model that has not been much examined in this study is its use in describing metaphor. For example, the item kız neyi (the smallest ney, or reed flute) is clearly metaphorical, as is (15):

(15) kız gibi araba
     girl like car
     “beautiful new car”

In kız neyi, the weak typical feature [+small], itself a typical feature of [–adult], is used to create the metaphor; in (15) a metaphorical extension of [+virgin] is used – kız gibi is explained as bozulmamış – literally meaning ‘unbroken’ or ‘unspoiled’ (AH). We can postulate, therefore, that a metaphor may not simply involve a transfer of features from a source to a target domain, but a creative metaphorical extension of features themselves: a kind of ‘meta-metaphor’. This follows naturally from the assumption that features are themselves categories.

This type of feature-based approach could be useful in distinguishing between deeply-buried metaphors and ‘obvious’ metaphors of the kız neyi or ‘female joint’ type. It is possible that the obviousness of the metaphor has an inverse relationship to the number and strength of features transferred from the source to the target domain; it may also depend on whether features are transferred “as is” or are themselves metaphorically extended. For example, when Captain Kirk says of the Enterprise “She is a beautiful woman, and I love her!” he is deliberately confusing an object with a human. The starship lacks the features [+human], [+female] and [+adult], and the metaphor succeeds by taking the one feature [+female] and metaphorically extending it.
On the other hand, referring to a pet as ‘she’, rather than the customary ‘it’ for animals, is less obviously metaphorical, since the pet possesses at least one defining feature of woman, [+female], in its original feature-bundle, rather than in an extended form.

The notion of category stress is, I have suggested, of use in illuminating some types of sociolinguistic behaviour, language change and even art (as in the Twelfth Night example), whether or not it is coupled to the particular feature analysis discussed here. Categorisation, as I have argued, reveals much about culture, and I would suggest that ‘stressful’ categorisation acts may be particularly revealing. In fact, my interest in the kadın and kız categories, and a realisation that virginity is not only socially but linguistically significant, arose from the following embarrassing exchange (in English) with a Turkish student:

(16) Author: “She’s a nice woman.”
     Student: “How do you know she’s a woman?”

Notes

1. The idea that Eskimos have twenty words for snow is a linguistic ‘urban myth’ ably exploded by Geoffrey Pullum in The Great Eskimo Vocabulary Hoax (1991).

2. Subjects referred to are:
   AA: Female, 25, English teacher
   BÇ: Male, 19, student
   AH: Male, 55, news photographer
   NH: Female, 25, travel agent
   ŞH: Female, 45 (?), housewife
   TK: Female, 28, English teacher
   NT: Female, 28, ceramics teacher
   All are native speakers of Turkish, living in Ankara and born in either Ankara or Izmir.

3. The Ozturk Corpus is collected from Australian Turkish community newspapers. With a few exceptions (Kurtböke 1996) which are not relevant to this study, the corpus can be seen as representative of Standard Turkish.

4. I recently noticed my wife doing this, and when asked why, received the answer “I don’t know – probably because she’s ten years younger than me.”
5. In Greek, for example, an adult female does not ‘go out with the girls’, but with the women – ginekes (Sophia Piperis, personal communication, 1998).

6. This also eliminates the asymmetry of kadın and erkek, by providing erkek with the feature [+human], since we are obviously not talking about ‘women as opposed to male creatures of any species’. Incidentally, erkek, the commonest term for ‘man’, is never used in the sense of ‘human being’. Occasionally its near-synonym, adam, is used like this, but normally one would say insan (‘person’). Similarly, the problem of the ambiguous male third person pronoun does not arise, since Turkish pronouns have no gender.

7. For example, in 1991 only 28% of female 15 to 19-year-olds were married, compared to 82% of women in the 50–54 age range who had married at 19 or younger (calculated from figures in Atalay 1992: 271).

8. It is interesting that jokes may employ both stereotypes and peripheral category members, as in the Turkish joke about the frustrated housewife and the blind imam (unfortunately so culture-specific as to be virtually untranslatable).

9. One may refer to a child as bayan, but it is normally modified as küçük bayan, ‘little lady’.

10. The Turkish for ‘virgin’, bakire (from Arabic), simply means a female with an unbroken hymen, so a technical virgin is still a virgin. This can in turn lead to category stress, since the importance placed on the hymen encourages a lot of ‘technicality’, in a manner reminiscent of 1950s America.

References

Aristotle (1987). Metaphysics. In J. L. Ackrill (Ed.), A New Aristotle Reader. Oxford: Clarendon Press.
Atalay, Besir (1992). Türk Aile Yapısı Araştırması. Ankara: DPT Sosyal Planlama Genel Müdürlüğü.
Coleman, Linda & Paul Kay (1981). Prototype Semantics: the English word, lie. Language, 57, 26–44.
Cruse, D. A. (1990). Prototype theory and lexical semantics. In S.
Tsohatzidis (Ed.), Meanings and Prototypes: Studies in linguistic categorization (pp. 382–402). London: Routledge.
Holland, Dorothy & Debra Skinner (1987). Prestige and intimacy: the cultural models behind Americans’ talk about gender types. In D. Holland & N. Quinn (Eds.), Cultural Models in Language and Thought. Cambridge: Cambridge University Press.
Hymes, Dell (1972). Towards Ethnographies of Communication: the analysis of communicative events. In Pier Paolo Giglioli (Ed.), Language and Social Context. Harmondsworth: Penguin.
Jackendoff, Ray S. (1983). Semantics and Cognition. Cambridge, MA: The MIT Press.
Jackendoff, Ray S. (1992). Languages of the Mind: Essays on mental representation. Cambridge, MA.
Kurtböke, N. Petek (1996). A Corpus-Based Analysis of the Turkish Community Newspapers in Australia: a progress report. In Proceedings of the VIIIth International Conference on Turkish Linguistics. August 7–9 1996. Ankara.
Lakoff, George (1987). Women, Fire, and Dangerous Things: What categories reveal about the mind. Chicago: University of Chicago Press.
Langacker, Ronald W. (1987). Foundations of Cognitive Grammar: volume I: theoretical prerequisites. Stanford, CA: Stanford University Press.
Lehrer, Adrienne (1974). Semantic Fields and Lexical Structure. Amsterdam: North-Holland.
Lehrer, Adrienne (1990). Prototype theory and its implications for lexical analysis. In S. Tsohatzidis (Ed.), Meanings and Prototypes: Studies in linguistic categorization (pp. 368–381). London: Routledge.
Lipka, Leonhard (1986). Semantic Features and Prototype Theory in English Lexicography. In D. Kastovsky & A. Szwedek (Eds.), Linguistics Across Historical and Geographical Boundaries. Berlin: Mouton de Gruyter.
Palmer, Gary (1996). Towards a Theory of Cultural Linguistics. Austin: University of Texas Press.
Promeths.com (2005). [Online] “İngiltere’de Ortaokul ve Lisede Misafir Öğrencisi” Available at http://www.promeths.com/programlar/ortaokullise/ingiltere.php
Pullum, Geoffrey K. (1991).
The great Eskimo vocabulary hoax, and other irreverent essays on the study of language. Chicago: University of Chicago Press.
Pustejovsky, James (1995). The Generative Lexicon. Cambridge, MA: MIT Press.
Türk Dil Kurumu (1998). Güncel Türkçe Sözlük [Online] Available at: http://tdk.org.tr/sozluk.html
Turkstudent.net (2005). Kız kıza. [Online] Available at: http://www.turkstudent.net/cat/558
Turner, Robin (1998). Culture, context and categorisation: a feature- and prototype-based study of Turkish terms for women. Unpublished MA dissertation, Surrey University.
Whorf, Benjamin L. (1956). Language, Thought and Reality: selected writings of Benjamin Lee Whorf (Ed. John B. Carroll). New York: Wiley.
Wierzbicka, Anna (1985). Lexicography and Conceptual Analysis. Ann Arbor: Karoma.
Wierzbicka, Anna (1990). ‘Prototypes Save’: on the uses and abuses of the notion of ‘prototype’ in linguistics and related fields. In Savas L. Tsohatzidis (Ed.), Meanings and Prototypes: studies in linguistic categorization. London: Routledge.
Wierzbicka, Anna (1992). Semantics, Culture, and Cognition: universal human concepts in culture-specific configurations. Oxford: Oxford University Press.

Cross-linguistic polysemy in tactile verbs*

Iraide Ibarretxe-Antuñano
University of Zaragoza

The link between the semantic field of tactile perception and that of emotions has been long established (Kurath 1921; Buck 1949). Within the framework of Cognitive Semantics, Sweetser (1990) analyses the semantic extensions that occur in perception verbs. Taking Sweetser’s study as a starting point, in the first half of this paper, I analyse the metaphorical scope of tactile verbs, not only in English, but also in two other languages, Basque and Spanish. In the second half, I explain how these polysemous structures are obtained, what the semantic packaging of these extended meanings is.
In other words, I show how the semantic content of the lexical items (tactile verb and arguments) interacts and contributes to the creation of each semantic extension.

Keywords: polysemy, metaphor, touch, cognitive linguistics

Introduction: Tactile perception and emotions

The link between the semantic field of tactile perception and that of emotions has been long established. Authors such as Kurath (1921) and Buck (1949) had already pointed out the relationship between these two domains in Indo-European languages in the first half of the twentieth century. Although these studies are thorough investigations into the etymology and polysemous senses of tactile words, they do not provide a motivated account of why these different meanings are related to these words in particular. More recent studies within the cognitive semantics framework (Lakoff 1987; Johnson 1987; Langacker 1987, 1991) have tried to show that the polysemous structure of tactile words is motivated. That is to say, the fact that a lexical item has different meanings is not whimsical, but motivated by our experience and understanding of the world. These different meanings are not random, but structured by means of cognitive devices such as metaphor.

Within the cognitive semantics model, Sweetser (1990) analyses the semantic extensions that occur in perception verbs. Like Kurath and Buck, she relates the physical sense of touch to emotional feeling and to the general sense of perception.1 She also proposes that these extended meanings are not particular to English only, but cross-linguistic.

Taking Sweetser’s study as a starting point, in the first half of this paper, I have analysed the different meanings conveyed by tactile verbs, not only in English but in two other languages: Basque and Spanish. Following Kövecses’ (1995, 2000) terminology, the ‘metaphorical scope’ of the verbs of touch in these three languages seems to be broader than that proposed by the studies mentioned above.2 The aim of this paper, however, is not only to give an account of the meanings conveyed by these verbs in these three different languages, but also to explain how these polysemous structures are obtained, what the semantic packaging of these extended meanings is; in other words, how the semantic content of the lexical items (tactile verb and arguments) interacts and contributes to the creation of each semantic extension.

Previous studies on polysemy (Brugman 1981; Vandeloise 1991; Herskovits 1986; among many others) offer detailed descriptions of the different polysemous senses of specific lexical items, the relations that hold among them, the conceptual motivation for such relations, and so on. What these analyses do not explicitly address, however, is whether these different meanings result from the different senses of a polysemous verb, through the interaction between the semantics of the verb and its arguments, or whether it is the choice of a particular argument that really determines the different meanings. I examine this issue in the second half of this paper.

Metaphorical scope of tactile verbs revisited

The semantic field of tactile perception is usually linked only to the domain of emotion. However, if we review the different meanings that these verbs can convey in English, Basque and Spanish, we find that they map not only onto the field of emotions but also onto other semantic fields.3 The verbs used in this case are touch in English,4 ukitu in Basque,5 and tocar in Spanish.
The linguistic data come from three different sources: (i) monolingual and bilingual dictionaries; (ii) several corpora: English (The Lancaster-Oslo/Bergen Corpus-LOB, The British National Corpus-BNC),6 Basque (Present-day Basque Reference Corpus-EEBS), and Spanish (Reference Corpus for Present-day Spanish-CREA); and (iii) examples for the most part constructed by me, occasionally on the basis of an utterance that I have seen or heard used.7 Native speakers were always consulted concerning the naturalness of these examples.
In the first instance there are two concrete extended meanings found in the three languages. One meaning is ‘to partake of food or drink’ as in (1), (2) and (3).
(1) John hardly touched the food
(2) Jonek ez du ia janaria ikutu
    john.erg neg aux.3s hardly food.abs touch.per
    “John hardly touched the food”
(3) Juan apenas ha tocado la comida
    john hardly has touched the food
    “John hardly touched the food”
In these three examples we learn that John did not eat much of his food, so in these cases, the meaning is ‘to partake of food’. If we change the direct object food for drink, then the meaning will be ‘to partake of drink’ instead.
It has been suggested (Barcelona, p.c.) that instead of having the meaning ‘to partake of food or drink’, which is too specific, it would be better to propose a more general meaning like ‘to partake of something’. That would cover not only sentences like John hardly touched the food, but also examples like I didn’t touch a penny of your money. Although this proposal is sensible to some extent, I keep the former for two reasons. First, because several dictionaries contain this entry as a separate one (cf. am, col). Second, because intuitively these two sentences do not imply exactly the same meaning. In my opinion, the inferences resulting from the two examples are different.
A sentence like John hardly touched the food can only make reference to one action ‘to eat’ (or ‘to drink’ if we change the direct object to a drink), and the verb ‘to touch’ can be replaced by the verb ‘to taste’. In the second sentence, the verb ‘to touch’ is not related to the meaning ‘to eat’ (or ‘to drink’) and therefore, this substitution for ‘to taste’ is not possible. Here the meaning refers more to the fact that I have not taken any money from that person, where ‘taken’ can be understood as the physical action of grabbing something, if not ‘to steal’ it.
Another physical meaning is ‘to affect’ as in (4), (5) and (6).
(4) Just don’t touch anything in my room (am)
(5) Nork ukitu nau, nork ukitu ditu nire soinekoak?
    who.erg touch.per aux.1s who.erg touch.per aux.3s my dress.abs.pl
    “Who touched me, who touched my dresses?” (is)
(6) ¿Quién tocó mis vestidos?
    Who touched my dresses
    “Who touched my dresses?”
These three examples imply that not only has physical contact occurred, but there has also been a change of location. In (4), the speaker does not want the other person to change anything in his/her room; whereas in both (5) and (6), the speaker is asking who changed the position of the dresses from the place where they were before. This meaning, which I term ‘to affect’, also has a metaphorical extension as we shall see below.8
As far as metaphorical meanings are concerned, there are three meanings: ‘to affect’, ‘to reach’ and ‘to deal with’. We have already seen that ‘to affect’ can be understood physically as in (4), (5), and (6), but it also has a metaphorical interpretation as in the examples below.
(7) The appeal touched her heart (lob)
(8) Edertasunak ukitu du azkenean Iñakiren bihotz gogorra
    beauty.erg touch.per aux.3s end.loc iñaki.gen heart strong.abs
    “In the end, beauty changed Iñaki’s hard feelings” (is)
(9) Juan le tocó el corazón a María
    john she.dat touched the heart to mary
    “John touched Mary’s heart” (cse)
In these examples what is affected is the emotional side of the person in question. In (7), the appeal was very emotive to this person; she was not able to remain with the same feelings or ideas she had before hearing it. In (8), Iñaki’s feelings are changed too, as a result of the beauty that he saw in a person or thing. Finally, in (9), John also affected, i.e. changed, Mary’s feelings.
Although the emotional perspective of touch has been seen as an independent metaphorical mapping (Sweetser 1990: 37/43), I would like to include it as part of this wider meaning domain ‘to affect’. There are other examples in these languages where we have the same ‘contact-to-effect’ chain and that can also be included under this label. For instance, in Basque there is the expression ardoa ukitu, (lit.) ‘touch wine’, which means that the wine is spoilt and can no longer be drunk. In Spanish, when a person wins the lottery it is very common to say Me tocó la lotería, (lit.) ‘the lottery touched me’, in which case the lottery is the agent that provokes the change in me; that is to say, I became rich.
A second metaphorical meaning is ‘to reach’ as in (10), (11), and (12) below.
(10) He touched the high point in his career (col)
(11) 1685etik aurrera agintearen gailurra ukitu zuena
     1685.abl forward mandate.gen top.abs touch.per aux.3s.who
     “He who reached the top of his mandate from 1685 onwards” (is)
(12) Ha tocado el punto más alto de su carrera
     has touched the point most high of his career
     “He has reached the peak of his career” (osd)
These three examples imply that there is a point, an aim to be reached or that the moment to do something or end-point has arrived.
In (10), (11) and (12), this end-point is the success achieved in a career.9 In other cases, as in (13) and (14), the end-point is spatial.10
(13) The ship touches at Tenerife (col)
(14) meros transeúntes que han tocado puerto mientras hacían cruceros. . .
     mere.p passerby.p which have touched port while made cruises
     “. . .Just travellers who arrive here while they are on a cruise. . .” (crea)
The ship in (13) and the passengers in (14) have arrived at their destination, at the dock. In both examples, the fact that the ship is going to stay in the dock for a brief period of time is also implied.11 In Spanish, however, this is not always the case:
(15) El barco tocó puerto ayer
     the ship touched port yesterday
     “The ship arrived yesterday”
In (15), the information we are given is simply that the ship arrived, but not about the length of time it will stay.
In Spanish there is a further usage of this meaning ‘to reach’ in the sense of ‘reaching the time to do something’ as in (16), where it is implied that the time to pay has come, and in (17), where we are about to reach the end of a five year period. This usage is very interesting because it is etymologically related to the onomatopoeic origin of the verb tocar. In old times the tolling of the bells used to announce events in villages. Even today one can hear the church bells calling people to prayer. In Spanish this is referred to as tocar a misa, (lit.) ‘touch to mass’. Nowadays, we do not use bells for these matters anymore, but we use the same construction tocar a, which reflects this tradition, to indicate that the time to do something has come. The end point is temporal in these examples.
(16) Tocan a pagar
     touch.3.p to pay
     “It is time to pay” (rae)
(17) Durante estos cinco años que ya tocan a su fin. . .
     while these five years which already touch.3.p to their end
     “In this five year period that is about to end. . .” (crea)
A third metaphorical meaning in the sense of touch is ‘to deal with’ as in (18), (19), and (20).
(18) I wouldn’t touch that business (am)
(19) Nik ez nuke gai hori ikutuko
     I.erg neg aux.1s topic that.abs touch.f
     “I wouldn’t touch that issue”
(20) Hasta el momento no ha tocado el tema de la alfabetización
     until the moment neg has touched the topic of the literacy
     “Until now he hasn’t dealt with the literacy issue” (crea)
In these examples we are told that these people do not want or have not yet had the chance to deal with a specific subject (a business in (18), and some kind of issue in (19) and (20)). If we insert adverbial expressions such as luzez, ‘for a long time’, or en muchas ocasiones, ‘on many occasions’, the meaning ‘deal with’ shifts slightly, as in examples (21) and (22).
(21) Unibertsitate-gaia luzaz ukitu dut
     university-topic.abs long.in touch.per aux.1s
     “I’ve dealt with university matters for a long time” (is)
(22) En muchas ocasiones hemos tocado el tema de una posible intervención de las fuerzas armadas
     on many occasions have.1p touched the topic of a possible intervention of the forces armed
     “We have dealt with a possible intervention by the armed forces on many occasions” (crea)
Due to the semantics of these specific adverbial expressions, what we imply is that we have dealt with the same subject for quite a long time, repeatedly. As a result we become very familiar with the subject, and come to know it fairly well. The meaning shifts from ‘deal with’ to ‘be familiar with’ (know by experience). Tocar can also mean ‘to deal with superficially’, as in English when a word like barely and/or the preposition on is inserted, as in (23) and (24) respectively.
(23) He barely touched on the incident in his speech (amgd)
(24) To some extent I shall be touching on points already made by previous speakers (lob)
In summary, the four major semantic extensions in tactile verbs analysed in this section include: ‘partake of food/drink’, ‘affect’ (physically and metaphorically), ‘reach’, and ‘deal with’. These semantic extensions represent four different ways in which the domain of tactile perception is conceptually linked to different experiential domains. These ‘links’ or mappings between domains are shared by the three languages under investigation, English, Basque and Spanish.
The fact that these polysemes are found in different genetically unrelated languages should not come as a surprise. These mappings take place at a conceptual level. As has been argued in the Cognitive Linguistics literature (cf. ‘embodiment’, Johnson 1987), this conceptual level represents the way we understand and interact with the world; our own experience of what surrounds us. We, as human beings, have the same perceptual apparatus for the sense of touch, and therefore, it is only natural that the experiences that we have with the sense of touch – how we perceive with this sense, its limitations and advantages, the type of information available through this sense – are used as the conceptual basis for these metaphorical meanings.12 But, in the case of Basque, English, and Spanish speakers, we also have to bear in mind that they are entrenched in the same Western culture (cf. Gibbs 1999), and therefore, they share – at least, as far as these meanings are concerned – a common view and conceptualisation of the tactile sense.
The role of culture in conceptualisation is an important issue because it has been shown that in some cases, the ‘embodiment’ of the senses is not sufficient to explain why certain sensory modalities are linked to certain cognitive processes (cf. Classen 1993; and Howes 1991, for an enlightening exposition of how the senses are conceptualised by different cultures). As Ong (1967 [1991: 26–27]) puts it:
    Cultures vary greatly in their exploitation of the various senses and in the way in which they relate their conceptual apparatus to the various senses. It has been a commonplace that the ancient Hebrews and the ancient Greeks differed in the value they set on the auditory. The Hebrews tended to think of understanding as a kind of hearing, whereas the Greeks thought of it more as a kind of seeing, although far less exclusively as seeing than post-Cartesian Western man generally has tended to do.
The relation between visual/auditory perception and cognition is a good example to show that culture really matters. In Cognitive Linguistics, the link between vision and cognition has been generally accepted as one of the most consistently universal mappings in this domain. Authors such as Sweetser (1990) suggest that vision has primacy as the modality from which verbs of higher intellection, such as ‘knowing’, ‘understanding’, and ‘thinking’, are recruited, whereas hearing verbs, such as hear or listen, would not take these readings, because they are more “connected with the specifically communicative aspects of understanding, rather than with intellection at large” (1990: 43). Although it is true that this correspondence is systematically found in many languages, and certainly, in the three languages under investigation here (cf. Ibarretxe-Antuñano 1999a, 2002), it is far from being universal. Evans and Wilkins (2000) have shown that Australian languages do not conceptualise intellection as vision, but as hearing.
Furthermore, these authors claim that one of the possible reasons that could explain why Australian languages behave differently has to be found in the cultural and social practices of the Aboriginal people. In our case, we find that there are no significant differences in the conceptualisation of the sense of touch in English, Basque, and Spanish. Speakers of these three languages, therefore, seem to share both their experience and understanding of the tactile sense, and the background and practices of Western culture – despite individual differences.
Another issue that I would like to point out is that the semantic extensions that we have discussed in this paper are not the only ones that can be found in English, Basque, and Spanish. As discussed elsewhere (Ibarretxe-Antuñano 1999a, 2002), each of these languages creates further mappings from the domain of touch onto other semantic domains. In English, for example, the verb touch can convey the meaning of ‘to ask for a loan’, as in Touch a friend for five dollars (am). In Basque, we also find the semantic extension ‘to consider, to weigh up’ with the verb haztatu ‘touch’. In Spanish, the verb tocar ‘touch’ also means ‘to be a relative’ and ‘to fall to’. However, it is important to notice that the usage of these meanings is quite peripheral in comparison with the other extensions discussed above. There are two reasons that support this: (i) these extensions – especially in the case of English and Basque – are restricted to certain dialectal variations. The meaning ‘to ask for a loan’ is typical of American English, and the Basque verb haztatu is more common in northern dialects; (ii) these meanings are hardly ever used.
Although we would need a statistical analysis of corpus data to be really sure about their status, a random search of a hundred examples in the corpora that we have used in these three languages retrieves no cases of these usages.
The following sections examine how these polysemous senses are lexicalised. The main goal is to test whether these semantic extensions emerge from interaction between the semantic content of the verb and that of its arguments; and then, to determine what elements intervene in the lexicalisation, as well as to what extent each element is semantically responsible for such meanings. Contrary to the results in Section 2, the lexicalisation tools and techniques that languages possess vary from one to another. As a consequence, the lexicalisation patterns of these semantic extensions apply only to one language, and not cross-linguistically.
3. Compositional polysemy: The semantic packaging of lexical items
A word is understood as polysemous if all its multiple meanings are systematically related. One of the most important goals in Cognitive Linguistics has been to show that the multiple semantic extensions of a lexical item are related not in an arbitrary but in a systematic and natural way by means of several cognitive mechanisms such as image schemas, metaphor and metonymy. Numerous studies within this framework have shown that this is a strong hypothesis. A classic example is the analysis of the preposition ‘over’ (Brugman 1981; Lakoff 1987). These authors offer a very detailed exposition of the relationships among the different semantic extensions of the preposition ‘over’. However, neither of them explicitly acknowledges that these meanings are possible not only thanks to the semantic content of the preposition itself, but also to the insertion of very specific lexical items. Let us consider some examples to illustrate this point.
The central meaning of ‘over’ is one that combines elements of both ‘above’ and ‘across’ as in (25). The ‘above-across’ meaning has several variants as in (26), (27), and (28):
(25) The plane flew over
(26) The bird flew over the yard
(27) Sam climbed over the wall
(28) Sausalito is over the bridge
These are just four different examples taken from Brugman’s analysis of the preposition ‘over’. According to this author, the central sense of the preposition ‘over’ (‘above-across’) has different variants depending on (i) contact or lack of contact between the landmark (LM) and the trajector (TR); (ii) the position and extension of the LM; and (iii) the end-point focus. However, not all these extra bits of information are contained in the preposition itself; instead, they are contained in other elements of the sentence.
For instance, the fact that in some cases ‘over’ implies contact is not inferred from the preposition but from the verb used. In (27), the information provided by the verb, ‘climb’, automatically entails that there is contact between the subject, “Sam” (the TR) and “the wall” (the LM), because it is impossible to climb a wall without touching it. In a similar way, the no-contact characteristic of ‘over’ in (25) and (26) is also implied by the verb ‘to fly’. In most cases, when we say that something is flying, we visualise the flying object (bird, plane. . .) as not touching any surface. In (27), the additional information that the LM is vertical is not only provided by the LM (“the wall”) itself, but also by the verb ‘to climb’, which by default implies an upward movement. Even in the case of an end-point focus, as in (28), where this meaning is said not to be added by anything in the sentence but to be “the result of a general process that applies in many, but not all English prepositions” (Lakoff 1987: 424), the other members of the sentence contribute to this meaning.
Without the static verb ‘to be’, which implies that there is no movement, and ‘the bridge’ (a structure with a beginning and an end), the end-point focus could not be inferred. Based on these examples,13 it can be argued that the polysemy in the preposition ‘over’ is not only obtained by the semantic content of this preposition, but also in conjunction with the semantic content of the words that accompany it in the sentence in which it occurs.
The emergence of different senses from an interaction between a preposition and co-occurring elements is not an isolated case. A similar situation can be found in the case of the semantic extensions of tactile verbs described in Section 2. For example:
(29) John hardly touched the food
One of the cross-linguistic extensions of tactile verbs is ‘to partake of food (or drink)’ as illustrated in (1), reproduced here as (29). The reason why we interpret this sentence with this sense lies not only in the presence of the verb ‘to touch’, but also in those elements that directly complement it, such as “the food” and the adverbial “hardly”. Without either of these two elements, it would be impossible to infer a meaning like ‘to partake of food’. If we removed the adverbial, as in John touched the food, the meaning would correspond to either the prototypical meaning of touch, or to the semantic extension ‘affect’. If we change the complement that denotes some kind of edible object for some other concrete element as in John hardly touched the table, the interpretation of this sentence would be the same as in the case before: a prototypical ‘touch’ or ‘affect’. The same situation occurs both in Basque and Spanish.
(30) Jonek janaria ikutu du
     john.erg food.abs touch.perf aux.3s
(31) Juan tocó la comida
     john touched the food
(32) Jonek ez du ia mahaia ikutu
     john.erg neg aux.3s almost table.abs touch.perf
(33) Juan apenas tocó la mesa
     john hardly touched the table
If the adverbs ia ez and apenas are removed, as in (30) and (31),14 or if the complement is exchanged for one such as mahai and mesa, as in (32) and (33), we obtain interpretations similar to those in the English examples. Therefore, it is possible to predict that whenever the complement of the verb ‘to touch’ refers to an edible object, then the meaning is ‘to partake of food’.
The situation is somewhat different in the following examples.15
(34) Nork ukitu ditu nire soinekoak?
     who.erg touch.perf aux my dress.abs.p
     “Who touched my dresses?”
(35) Ha tocado el punto más alto de su carrera
     has touched the point more high of his career
     “He touched the highest point in his career”
In (34), the extended meaning is ‘to affect, physically’. Someone has changed the state in which the clothes were and we want to know who that person is. In order to infer this meaning we need an entity that is able to carry out the action of touching, as well as an entity that can be touched by the subject. Unlike in (33), where the choice of both subject and complement is not very wide, there are many entities that can carry out both tasks (see previous examples (30), (31), (32), and (33)). This meaning does not depend upon such a restrictive choice of arguments.
The same statement can be made about the second sentence. In (35), the meaning is ‘to reach’. In this case, the fact that an end-point is implied is not only conveyed by the nature of the tactile verb itself, but also by the complement el punto más alto, ‘the highest point’, that denotes a limit to that metaphorical action of ‘touching’.
And since el punto más alto is without dimension, we get the achievement reading of ‘to reach’. As in the other examples, there are many other entities, like ‘bottom’ and ‘eternity’, that can be placed in this position.
In these two examples, the semantics of the other elements of the sentence plays a role in the overall meaning, but the importance of these elements is not as decisive as in the previous example (33). In order to obtain the meanings ‘to affect, physically’ and ‘to reach’, it is necessary to have subjects who are able to touch, and complements that can be touched. The achievement of that meaning, however, is not as dependent on these arguments as in (33). In (34) and (35) the intrinsic meaning of the verb itself plays a much more important role than that of its arguments.
Finally, we have sentence (36) with the meaning ‘to affect’.
(36) John touched Mary
This sentence is highly ambiguous; several interpretations are available simultaneously. (36) can imply physical contact between John and Mary, i.e. the prototypical meaning of touch; the meaning ‘to affect, physically’ as in a situation where John is not expected by Mary and when he touches her, he makes her shiver; and the meaning ‘to affect, metaphorically’, in which case an emotional reaction from Mary is implied. Without any more information about the context in which this sentence is uttered, one cannot decide whether (36) should be interpreted physically or metaphorically.
Unlike the other examples, the interpretation of (36) cannot be predicted from the semantic properties of the arguments that the verb takes. “John” and “Mary” are too vague to constrain the semantic extension that takes place in this example. In the case of the meaning ‘to partake of food’, the complement, “the food”, constrains the semantic extension of the verb, because there are not too many things that can be done with food, apart from eating, cooking. . .
With “John” and “Mary” the case is different: the possibilities for these two entities are infinite; and yet the meanings include only the prototypical meaning, and ‘to affect’ (both physically and metaphorically).
In sum, based on these sets of examples, we can divide these polysemous senses into two groups. On the one hand, examples like (36), where it is not possible to predict what the interpretation is by means of the choice of arguments, are called ‘unpredictable’ cases of polysemy; and on the other hand, those where the choice of arguments leads to a specific predictable extension of meaning are called ‘predictable’ cases.16 The latter are further classified depending on the degree of influence of the semantics of the arguments involved. A meaning such as ‘to partake of food’ is mainly determined by the arguments and other elements in the sentence, whereas in meanings like ‘to affect, physically’ and ‘to reach’ it is the verb that mainly governs the choice of arguments and meaning. The former are called ‘argument-driven extensions’ and the latter, ‘verb-driven extensions’.
These two groups, therefore, reveal that the weight of the semantics of the different elements in the overall meaning of a sentence is not the same in all extended meanings, but hierarchically organised according to the degree of influence of the lexical items involved. I call this graded involvement of elements in the creation of polysemes ‘compositional polysemy’.
Although three different types of semantic extensions have been proposed so far, it is important to notice that all these meanings must have something in common in order to be extended from the physical sense of touch, and also in order to explain why the same extensions of meaning happen in English, Basque and Spanish. Otherwise, it will be impossible to say why other sentences like (37) are ruled out.
(37) Peter touched the joke
The reason why this example is not felicitous when no context is given lies in the fact that “the joke” is not a ‘touchable’ type of concept – i.e., a joke cannot be touched in any possible abstract way, as el punto más alto, ‘the highest point’, is in example (35) above. From a cognitive linguistics point of view, the fact that “the joke” is not licensed with the verb ‘to touch’ stems from the way we experience this sense in our lives, in the human embodiment of this sense (Johnson 1987). Therefore, all these meanings must fulfil the ‘verb property requirement’, which in the case of tactile verbs is the condition of being ‘touchable’. That is, the verb arguments must be able to touch, if they are subjects, or be able to be touched, if they are complements.
4. Cross-linguistic polysemy: Meaning and lexicalisation across languages
In the previous section, I have shown that extended meanings are obtained by the interaction of the semantic content of both the perception verb and its complements. The role of the semantics of both the perception verb and its complements is not the same in all extended meanings; in some cases, the verb is more important and in some other cases, the complements are.
One common characteristic of all the examples analysed in Section 3 is that the explanations provided were applicable to the same cases in the three languages under investigation. In other words, the same elements were crucial in the lexicalisation of those meanings in English, Basque and Spanish, and as a result they were classified under the same degree of compositionality. However, this is not always the case. In most cases, as authors such as Talmy (1991, 2000) have shown, languages show a great deal of variation in mapping lexical resources onto semantic domains.
The systematic relations between semantic elements – meaning – and surface elements – linguistic forms – do not usually show one-to-one correspondence across language types. In fact, this relationship may take different forms, with multiple semantic elements being expressed by one surface element, or a single semantic element being expressed by multiple surface elements.
Let us illustrate this point with the example of unpredictable polysemy, as in John touched Mary. If we translate this sentence into Basque (38), and Spanish (39), using exactly the same elements (“John”, “Mary”, and “touch”), the results are quite different.
(38) Jonek Miren ukitu zuen
     john.erg mary.abs touch aux.3s
     “John touched Mary”
(39) Juan tocó a María
     john touched to mary
     “John touched Mary”
In both languages, the only possible interpretation of (38) and (39) is the prototypical meaning of physical touching. In these sentences, it is understood that ‘John physically touched Mary’. In no way can they have the metaphorical ambiguity that exists in the English version.
This is not to say that it is impossible to express the metaphorical reading ‘to affect’ in these two languages with tactile verbs. This is perfectly possible as we saw in examples (8) and (9) in Section 2. In Basque as well as in Spanish the mapping between the physical domain of ‘touch’ and that of ‘to affect’ is also allowed; but in order to obtain this meaning it is necessary to add a verb complement that denotes feelings. The direct object – bihotz gogorra ‘hard heart’ in (8), and el corazón ‘the heart’ in (9) – fulfils this necessity. The heart in these examples is not understood as a physical object, but as the seat of feeling. In the cognitive approach literature, ‘heart’ is a metaphorical realisation of the image schema of a container, where heart is a container for feelings (Kövecses 1986; Lakoff & Johnson 1980).
In fact, as Moliner (1983) points out, in Spanish the verb tocar needs expressions, such as el corazón, ‘the heart’, el amor propio, ‘one’s own pride’, la dignidad, ‘dignity’, in order to imply this interpretation.17
These examples show that, although the same semantic mappings between different domains take place cross-linguistically, the strategies that each language follows to express such meanings are different. What in one language can be expressed by a single lexical item (i.e., a verb), in other languages may require several lexical items (i.e., a verb and arguments) to generate the same meaning.
This statement has important implications for our theory of polysemy and its cross-linguistic character. First of all, it is important to make a distinction between conceptual mappings on the one hand, and overt realisations of those conceptual mappings on the other; between the links established between different domains of experience – those discussed in Section 2 – and the different strategies that languages follow to overtly express those links. In other words, one issue appeals to our conceptualisation of the world, which is shared by all humans with the same cultural background; the other, to the linguistic means that each language in particular has to lexicalise those conceptualisations.
In previous analyses of polysemous lexical items (cf. Brugman 1981; Lakoff 1987), there was no distinction between these two concepts. If a lexical item was to be taken as polysemous in itself, that is to say if polysemous senses were localised in one lexical item without taking into account the semantic content of the other words that co-occur with this lexical item, then both conceptual structure and overt expression of such conceptual structure were the same.
If the conceptual structure were cross-linguistic, and conceptual structure and the overt expression of such conceptual structure were the same, then, transitively, it could be argued that both were cross-linguistic. However, I have shown that this is not the case. Lexical items are not generally polysemous in themselves, unless they are cases of ‘unpredictable polysemy’. They need the help of the semantic content of other lexical items in order to obtain those polysemous senses and, as shown in this section, the lexical items required to trigger and build the different extended polysemous readings are not the same in every language. It is for these reasons that I will consider that the verbs themselves are not polysemous, but that the conceptual domain of sense perception is polysemous. The different mappings presented in Section 2 are not to be taken as semantic extensions of the perception verbs themselves, but as polysemous senses of the conceptual domain of sense perception. I will call the group of these extended meanings ‘conceptual polysemy’.

In sum, I argue that when we analyse the meanings that take place in a semantic field, we need to distinguish and address two different sides. On the one hand, we need to establish its ‘conceptual polysemy’, i.e. the conceptual mappings that take place between different domains of experience. This conceptual polysemy is constrained by the bodily basis of the semantic field under analysis. Because this bodily basis is shared by and common to all humans with the same cultural background, conceptual polysemy is cross-linguistic. On the other hand, it is necessary to establish which elements are involved in the creation of such conceptual polysemy, and to what extent their semantic content participates in the creation of such extended meanings.
Therefore, conceptual polysemy can be considered a cross-linguistic phenomenon, but the classification of extended meanings under the three different degrees of compositionality is only a language-particular one.

Conclusions

In this paper I have examined the polysemy that exists in tactile verbs, one per language, in three genetically unrelated languages, English, Basque and Spanish. The two major concerns have been, on the one hand, a description of the semantic extensions that take place in this domain of tactile perception; and on the other, the study of the lexicalisation patterns and elements needed to convey these polysemous senses. As I have shown, tactile verbs do not only express physical contact. There are five semantic extensions shared by these three languages: ‘to partake of food/drink’, ‘to affect, physically’, ‘to affect, metaphorically’, ‘to reach’, and ‘to deal with’. The last three senses prove that the metaphorical scope of this domain is much more productive than that described in other studies (Sweetser 1990).

With respect to the lexicalisation of these meanings, I have proposed the idea of ‘compositional polysemy’, i.e. different polysemes of a lexical item – the tactile verb in this case – are obtained through the interaction of the semantic content of both the lexical item itself and its different co-occurring elements. The weight of the semantics of these elements in the creation of these semantic extensions is not the same; it varies according to the degree of semantic influence of these elements on the overall meaning. That is to say, in some meanings, the role played by the arguments of the verb is crucial. In some other meanings it is the verb that governs the choice of arguments and meaning. These cases are predictable polysemous meanings: the former is an ‘argument-driven extension’, and the latter a ‘verb-driven extension’.
Finally, there is a third class of meanings, where interpretation is not predictable by means of the choice of arguments. These are unpredictable cases of polysemy.

Our model for the analysis of polysemy, therefore, can be situated between what is known as the maximization of polysemy, i.e. the word itself carries most of the polysemous workload and speakers just have to choose correctly in context, and the minimization of polysemy, i.e. most of the workload is on the side of the speaker, who has to interpret the meaning from the context (see Behrens 1999; Cruse 1986, 2000; Lyons 1977, 1995, among others). On the one hand, our proposal recognises the importance of contextual elements in the creation of polysemous senses, but on the other, it establishes the necessity for (i) a graded typology of contextual involvement, and (ii) a constraint that restricts the participation of co-occurring elements to those that are compatible with the conceptual properties characterising the polysemous word analysed – tactile verbs in this paper. These last two elements differentiate our analysis from more traditional pragmatic approaches to polysemy, where many of these polysemous senses are the result of contextual effects (cf. Sperber & Wilson 1995; see Nerlich & Clarke 2001, for more information about polysemy in relation to pragmatics).

Finally, it has been shown that the phenomenon of compositional polysemy is found cross-linguistically. What differs from language to language is the degree of compositionality of a given semantic extension, which is language-specific.

Notes

* This research is supported by Grant BFI99.53.DK from the Basque Country Government’s Department of Education, Universities and Research. I would like to thank June Luchjenbroers and an anonymous referee for their valuable comments, and especially June for her never-ending patience.
The author can be contacted at <[email protected]>.

1. Sweetser argues that in all Indo-European languages, the verb to feel is the same as the verb indicating general perception; for instance, the verb sentir (< Latin sentire) in Spanish. However, this is overstated because it does not hold in languages such as Russian (Moiseeva 1998: 160).

2. “The scope of metaphor is simply the full range of cases, that is, all the possible target domains, to which a given specific source concept (such as war, building, fire) applies” (Kövecses 2000: 81).

3. Due to space constraints, in this paper I limit myself to enumerating and describing what those semantic fields are; I do not go into much detail about the cognitive mechanisms – metaphor and metonymy, for example – that make such mappings between conceptual domains possible. Those interested in this topic may consult Ibarretxe-Antuñano (1999a, 1999b, 2000, 2003).

4. In each of these languages there are more verbal realisations of the sense of touch than just the specific verb I have chosen to illustrate the main theoretical points put forward in this paper. The fact that I only analyse one per language is only due to length restrictions. The theoretical claims are therefore applicable to any tactile verb, or to any perceptual verb for that matter, as I have shown elsewhere (Ibarretxe-Antuñano 1999a).

5. Ukitu is the verb used in Standard Basque. In some of the examples discussed in this section, the verb ikutu is also used. This is a variant in the Guipuzcoan and Biscayan dialects.

6. The right to use the BNC is granted by Oxford University Press to researchers working on the FrameNet project, International Computer Science Institute and the University of California, Berkeley.

7. These examples occur without any bracketed indication of the source.
8. It has been suggested by one of the anonymous reviewers that these two physical semantic extensions in touch, ‘partake’ and ‘affect’, could be considered different interpretations of the literal physical touch calculated from the minimally necessary condition of touching for each of these activities (eating, taking, etc.), instead of conventional meanings. It is true that both activities, either when we eat/drink or cause a physical effect on something, require physical contact to be performed. If we are going to eat something, we necessarily have to touch the food; if we want to change the place where something is, we have to touch it. However, I think that these activities go beyond physical touch and cannot be considered as simple implicatures of touching. Touching is a necessary condition for these activities, but not a sufficient explanation. In both cases, the physical activity – partaking, affecting – stands for the action that caused this result, the touching, and therefore, they can be considered cases of the metonymy result for action.

9. This positive interpretation is explained in Lakoff and Johnson (1980). (10), (11), and (12) are examples of what they call ‘orientational’ metaphors: “metaphorical concept that organises a whole system of concepts with respect to one another” (1980: 15). Up is always related to good, high status and it is opposed to down, which implies bad, low status; as in the expression to touch bottom.

10. The fact that the endpoint in these examples is spatial, i.e. there is a physical destination at which these people arrived – the dock – causes an ambiguous interpretation. On the one hand, there is physical contact between the ship and the dock and therefore, this interpretation could be understood as metonymical.
However, on the other hand, there is a metaphorical mapping in these sentences because the expression touch at refers to the activity of arriving or reaching a destination. I would like to thank one of the anonymous reviewers for drawing this point to my attention.

11. This is possible thanks to the semantic contribution of the preposition at in (13) and that of the phrase mientras hacían cruceros ‘while they were cruising’ in (14). As I will explain in more detail in Section 3, these are cases of compositional polysemy.

12. For a detailed discussion on the conceptual basis of the semantic extensions of tactile verbs, see Ibarretxe-Antuñano (2000). In this paper, the sense of touch is characterised in terms of prototypical properties. These are drawn from psychological and physiological descriptions of this sense. Each semantic extension selects a number of these properties. These selected properties are to be taken as the bodily basis for the semantic extensions. For instance, the meaning ‘to reach’ selects three properties: (i) <contact>: the perceiver must have physical contact with the object perceived, (ii) <closeness>: the object perceived must be in the vicinity of the perceiver, and (iii) <limits>: the perceiver is aware of the boundaries imposed by the object perceived.

13. More discussion on other extensions of over can be found in Ibarretxe-Antuñano (1999a: 183).

14. The Basque equivalent to the English adverb hardly is the adverb ia ‘almost’ together with the negation ez.

15. In order to avoid unnecessary repetition of the same explanations in each of these examples, I only include one example per language. The same explanations are applicable to the equivalent sentences in the other two languages reproduced in Section 2.

16. The labels ‘predictable’ and ‘unpredictable’ are not to be taken just as descriptive terms for the individual examples analysed in this section.
In our opinion, in every polysemous word, we can find semantic extensions that can be easily ‘predicted’ or ‘guessed’ from the semantics of the co-occurring elements, and semantic extensions that cannot be ‘predicted’ or ‘guessed’.

17. As I have shown elsewhere (Ibarretxe-Antuñano 1999c), there is another possibility to lexicalise this meaning in Basque: to change the verb ukitu for the etymologically related hunkitu. The latter is generally used in the metaphorical sense, and does not refer to physical touching unless an adjunct denoting a physical instrument such as eskuz ‘with the hand’ is inserted.

References

Aulestia, Gorka (1989). Basque-English Dictionary. Reno and Las Vegas: University of Nevada Press. (au)
Behrens, Leila (1999). Aspects of polysemy. In David A. Cruse, Franz Hundsnurscher, Michael Job, & Peter Rolf Lutzeier (Eds.), Lexicologie – Lexicology, Vol. 1 (pp. 135–167). Berlin: Walter de Gruyter.
Brugman, Claudia (1981). The Story of Over. MA Thesis. University of California at Berkeley.
Buck, Carl D. (1949). A Dictionary of Selected Synonyms in the Principal Indo-European Languages. Chicago: Chicago University Press.
Classen, Constance (1993). Worlds of Sense: Exploring the Senses in History and Across Cultures. London: Routledge.
Collins English Dictionary and Thesaurus (1993). Italy: Harper Collins Publishers. (col)
Collins Spanish-English-Spanish Dictionary (1996). Glasgow: Harper Collins Publishers. (cse)
Cruse, David A. (1986). Lexical Semantics. Cambridge: Cambridge University Press.
Cruse, David A. (2000). Meaning in Language. Cambridge: Cambridge University Press.
Diccionario de la Real Academia de la Lengua Española (1984). Madrid: RAE. (rae)
Evans, Nick & David Wilkins (2000). In the mind’s ear: The semantic extensions of perception verbs in Australian languages. Language, 76(3), 546–592.
Gibbs, Raymond W. Jr. (1999). Taking metaphor out of our heads and putting it into the cultural world. In Raymond W. Gibbs, Jr. & Gerard J. Steen (Eds.), Metaphor in Cognitive Linguistics (pp. 145–166). Amsterdam and Philadelphia: John Benjamins.
Herskovits, Anna (1986). Language and Spatial Cognition: An Interdisciplinary Study of the Prepositions in English. Cambridge: Cambridge University Press.
Howes, David (Ed.). (1991). Varieties of Sensory Experience: A Sourcebook in the Anthropology of the Senses. Toronto: University of Toronto Press.
Ibarretxe-Antuñano, Iraide (1999a). Polysemy and Metaphor in Perception Verbs: A Crosslinguistic Study. PhD Thesis. University of Edinburgh.
Ibarretxe-Antuñano, Iraide (1999b). Metaphorical mappings in the sense of smell. In Raymond W. Gibbs, Jr. & Gerard J. Steen (Eds.), Metaphor in Cognitive Linguistics (pp. 29–45). Amsterdam and Philadelphia: John Benjamins.
Ibarretxe-Antuñano, Iraide (1999c). Predictable vs. unpredictable polysemy. In S. J. Hwang & Arle Lommel (Eds.), LACUS Forum, 25, 201–211.
Ibarretxe-Antuñano, Iraide (2000). An inside look at the semantic extensions in tactile verbs. In Francisco J. Ruiz de Mendoza (Coord.), Panorama actual de la lingüística aplicada. Conocimiento, procesamiento y uso del lenguaje (pp. 1053–1060). Logroño: Universidad de La Rioja.
Ibarretxe-Antuñano, Iraide (2002). Mind-as-body as a cross-linguistic conceptual metaphor. Miscelánea. A Journal of English and American Studies, 25, 93–119.
Ibarretxe-Antuñano, Iraide (2003). El cómo y el porqué de la polisemia de los verbos de percepción. In Clara Molina, María Luisa Blanco, Juana Marín, Ana Laura Rodríguez, & Manuela Romano (Eds.), Cognitive Linguistics in Spain at the turn of the century / La Lingüística Cognitiva en España en el cambio de siglo (pp. 213–228). Madrid: Universidad Autónoma de Madrid.
Johnson, Mark (1987). The Body in the Mind. The Bodily Basis of Meaning, Imagination and Reason. Chicago: Chicago University Press.
Kövecses, Zoltán (1986). Metaphors of Anger, Pride, and Love: A Lexical Approach to the Study of Concepts. Amsterdam and Philadelphia: John Benjamins.
Kövecses, Zoltán (1995). American Friendship and the Scope of Metaphor. Cognitive Linguistics, 6(4), 315–346.
Kövecses, Zoltán (2000). The Scope of Metaphor. In Antonio Barcelona (Ed.), Metaphor and Metonymy at the Crossroads. A Cognitive Perspective (pp. 79–92). Berlin and New York: Mouton de Gruyter.
Kurath, Hans (1921). The Semantic Sources of the Words for the Emotions in Sanskrit, Greek, Latin and the Germanic Languages. Menasha, WI: George Banta.
Lakoff, George (1987). Women, Fire and Dangerous Things. What Categories Reveal about the Mind. Chicago: Chicago University Press.
Lakoff, George & Mark Johnson (1980). Metaphors We Live By. Chicago: Chicago University Press.
Langacker, Ronald W. (1987). Foundations of Cognitive Grammar, Vol. I: Theoretical Prerequisites. Stanford, CA: Stanford University Press.
Langacker, Ronald W. (1991). Foundations of Cognitive Grammar, Vol. II: Descriptive Application. Stanford, CA: Stanford University Press.
Lyons, John (1977). Semantics. Cambridge: Cambridge University Press.
Lyons, John (1995). Linguistic Semantics. Cambridge: Cambridge University Press.
Moiseeva, Nadezda (1998). Verbs of perception in Russian. In M. Giger, T. Menzel, & B. Wiemer (Eds.), Lexicologie und Sprachveränderung in der Slavia. Studia Slavica Oldenburgensia 2 (pp. 153–164). Oldenburg: Bibliotheks- und Informationssystem der Universität Oldenburg.
Moliner, María (1983). Diccionario del Uso del Español. Madrid: Gredos.
Nerlich, Brigitte & David D. Clarke (2001). Ambiguities we live by: Towards a pragmatics of polysemy. Journal of Pragmatics, 33, 1–20.
Oxford Spanish Dictionary (1994). Oxford, New York, Madrid: OUP. (osd)
Ong, Walter J. (1967). The shifting sensorium. In D. Howes (Ed.), Varieties of Sensory Experience: A Sourcebook in the Anthropology of the Senses (pp. 25–30). Toronto: University of Toronto Press.
Sarasola, Ibon (1984–1995). Hauta-lanerako Euskal Hiztegia. Zarauz: Itxaropena. (is)
Sperber, Dan & Deirdre Wilson (1995). Relevance. Communication and Cognition. Oxford: Oxford University Press.
Sweetser, Eve E. (1990). From Etymology to Pragmatics. Metaphorical and Cultural Aspects of Semantic Structure. Cambridge: Cambridge University Press.
Talmy, Leonard (1991). Path to realisation: A typology of event conflation. In Proceedings of the Seventeenth Annual Meeting of the Berkeley Linguistics Society (pp. 480–519).
Talmy, Leonard (2000). Toward a Cognitive Semantics. Cambridge, MA: MIT Press.
The American Heritage Dictionary (1992). Houghton Mifflin Company. 3rd edition. (am)
Vandeloise, Claude (1991). Spatial Prepositions: A Case Study from French. Chicago: Chicago University Press.

How experience structures the conceptualization of causality*

Maarten Lemmens
Université Lille 3, France

The present article sketches some variations in the conceptualization of causative events, in particular as coded by a subgroup of lexical causatives – i.e., verbs of killing. However, my analysis aims to go one step further than a mere description of these variations by showing their experiential alignment, adopting a moderate experiential point of view. By examining a sizeable corpus of lexical causatives, and the variations between the event construals that they entail, the present paper will outline some of the factors that play a role in structuring our experience, and thus our coding of causation. The analysis shows how the choice between two different models at work in the grammar of causative events, viz. the transitive or the ergative model, aligns in subtle ways with the specifics of the event experienced.
Keywords: lexical causatives, causative alternation, transitivity, ergativity

Introduction

One of the basic tenets of Cognitive Grammar is that meaning is equated with conceptualization. That is, semantic structure is defined as conceptualization “tailored to the specifications of linguistic convention” (Langacker 1987: 99).1 The meaning of a linguistic expression is a cognitive structure characterized relative to cognitive domains “where a domain can be any sort of conceptualization: a perceptual experience, a concept, a conceptual complex, an elaborate knowledge system, etc.” (Langacker 1991a: 3). As Lakoff (1987) has shown, most of these are idealized cognitive models (ICMs). Such models are similar to Fillmore’s frames, defined as “unified frameworks of knowledge, or coherent schematizations of experience” (Fillmore 1985: 223). Against the background of such larger conceptual structures, linguistic structures impose their own specifications, which brings us to another pivotal claim of Cognitive Grammar, viz. that “linguistic expressions and grammatical constructions embody conventional imagery” (Langacker 1988: 7). Meaning thus relies on our ability to conceptualize the same object or situation in different ways. As Casad summarized it: “the speaker’s ability to conceptualize situations in a variety of ways is, in fact, the foundation of cognitive semantics” (1995: 23).

The present article sketches some such variations in the conceptualization of causative events, in particular as coded by a subgroup of lexical causatives – i.e., verbs of killing. However, my analysis aims to go one step further than a mere description of these variations by showing their experiential alignment, adopting a moderate experiential point of view, as proposed by Lakoff (1987) and others.
Lakoff defines the experientialist strategy as an attempt “to characterize meaning in terms of the nature and experience of the organisms doing the thinking” (1987: 266). In line with what was said before, experience is not to be understood in an individual sense, but in a broad sense: “the totality of human experience and everything that plays a role in it” (1987: 266). By examining a sizeable corpus of lexical causatives, and the variations between the event construals that they entail, the present paper will outline some of the factors that play a role in structuring our experience, and thus our coding of causation.2 The choice of a causative model, viz. transitive or ergative, aligns in subtle ways with the specifics of the event experienced. For example, the ergative predilection of the suffocate verbs (e.g., ‘asphyxiate’, ‘suffocate’, or ‘choke’), as emerging from both historical and contemporary data, can be explained as having an experiential basis. My data further suggest that the opposition between external and internal causation aligns firstly with the paradigmatic opposition between the transitive and the ergative; and secondly, within the ergative model, with the opposition between ‘effective’ and ‘non-effective’ constructions (or ‘causative’ vs. ‘non-causative’ in more traditional terminology). Experience from a more culturally or ideologically coloured point of view will be at the heart of the changes that have occurred in the semantic and constructional evolution of ‘abort’. By looking at the data in this way, we can arrive at a better characterization of the conceptual structures of these verbs as well as the constructions in which they occur, revealing coding patterns and/or tendencies that would otherwise have gone unnoticed.
Two models of causation

In general, my analysis of lexical causatives follows Davidse’s (1991, 1992) account, which posits that the English grammar of actions and events is governed by two distinct causative models, viz. the transitive and ergative paradigms. These two models represent different ways of conceptualizing causative processes, implying different conceptual centres and different participant relations. In Halliday’s terms, they are said to project different ‘inherent voice’ relations. The following summary hardly does justice to Davidse’s innovative work and merely mentions the most basic distinctions relevant to my present purpose.3

The transitive paradigm, as realized in the example John killed Mary, centres around an Agent, who directs a prototypically volitional action onto an inert Affected. In more specific terms, in the transitive system the Actor-Process combination is the nuclear building block: it can be isolated in an objectless transitive (e.g., Soldiers trained to kill), where the Actor is the transitive instantiation of a more schematic Agent. This system is a linear one that prototypically extends to the right to incorporate a fully passive Affected, called the ‘Goal’.

The ergative paradigm, in contrast, centres around the Affected, which in addition to being affected is also active. Its conceptual independence is reflected in the fact that this participant can be isolated in a one-participant construction with an ‘agentive’ participant – e.g., Mary suffocated. In an ergative construal (with either one or two participants), the process is conceptually dependent on the ‘Medium’, which is the entity that is affected yet also co-participates in the event (much like a medium in the ESP sense). The process-medium cluster is semi-autonomous vis-à-vis the ergative Agent, called the ‘Instigator’.
In other words, unlike the linear transitive, the ergative system is a nuclear one with two processual layers: the instigated process and the instigation of the process, which need not be co-extensive in time or space (Davidse 1991: 67ff.). It is left-oriented, in that the basic conceptualization may be opened up to include the Instigator.

Unlike many Asian, Australian or Amerindian languages, English does not indicate transitivity and/or ergativity by overt case marking. The two paradigms manifest themselves in more covert ways, as reflected in, among other things, different alternation patterns. In his cognitive reinterpretation of nominative/accusative and ergative/absolutive case marking, Langacker also observes that the transitive and ergative patterns are not only coded by morphological markings, but find “numerous other linguistic manifestations” (1991b: 381). On the basis of the most essential alternation patterns, the transitive and ergative paradigms can be distinguished as in Table 1.

Table 1. Paradigmatic instantiations

Construction                           Transitive                   Ergative
EFFECTIVE (Ag-Proc-Af)                 John killed Mary             John suffocated Mary
                                       Actor-Process-Goal           Instigator-Process-Medium
OBJECTLESS                             John killed                  –
                                       Actor-Process-(Goal)
NON-EFFECTIVE (AcMe-Proc)              Mary died                    Mary suffocated
                                       Actor-Process                Medium-Process
PSEUDO-EFFECTIVE ((φ)Ag-Proc-(φ)Af)    Mary died a slow death       The house blew a fuse
                                       Actor-Process-Range          Setting-Process-Medium

The effective constructions are more specific instantiations of the agent-process-affected schema. While formally identical, the underlying semantics is different for the transitive and the ergative instantiations: for the latter the Affected still co-participates, as reflected in the possibility of forming an ergative non-effective.4 The objectless transitive maximizes the transitive focus on the actor-process unit. However, it is to be regarded as an effective construction, since the Goal is still very much implied (cf. Rice 1988; see also Lemmens 1998b: 140–146 for a more elaborate description).5 Such an objectless agent-process construction is not possible with ergative verbs as they centre on the Affected. For example, in the sentence John suffocated, John cannot be interpreted as the Agent who causes someone else’s suffocation; only as the entity who has suffocated (i.e., affected by the verb).

The semantic value of the ergative non-effective construction is that it neutralizes whether the process was self-instigated or instigated by an external Instigator. As Smith (1978) has argued, a construction is positively marked for the features of external control, as well as independent activity (cf. also Haspelmath 1993: 90). The ergative effective resolves the voice vagueness. The traditional intransitive, here called ‘transitive non-effective’ (e.g., Mary died; John stumbled), is regarded as a subtype of the transitive paradigm, as it too centres around a volitional or non-volitional Agent. The pseudo-effectives, not really relevant to the present discussion, have one participant that is not a true participant (marked by the symbol φ); for the transitives, it is a pseudo-Goal (called a ‘Range’); and for the ergatives, a setting functions as a pseudo-Instigator (see Davidse 1991: 115–140). Note that the area of variability is different for both, and pertains to a non-nuclear participant.

Transitivity and ergativity in the field of killing

In her classification of verbs, Levin (1993) distinguishes two categories of verbs of killing, the murder verbs, which are manner-neutral, and the poison verbs, which “lexicalize a means” – i.e., are “verbs which relate to actions which can be ways of killing” (1993: 232). Next to these, she distinguishes a group of suffocate verbs, as a subgroup of processes involving the body, and defines them as “[relating] to the disruption of breathing” (1993: 224).
While Levin’s classification is correct for the murder verbs, her semantic grid is not refined enough when it comes to her class of poison verbs, which is a relatively heterogeneous group both lexically and grammatically, something of which she herself is well aware. To remedy some of these shortcomings, I propose an alternative classification of the field which does more justice to the verbs’ lexical as well as constructional prototypes. Diagrammatically, my classification can be represented as in Figure 1 (the labels of the categories have been underlined; members are in italics).

[Figure 1. General classification of verbs of killing. Terms appearing in the figure: transitive, ergative, murder, lynch, butcher, kill, slaughter, execute, slay, massacre, assassinate, suffocate, decapitate, throttle, decollate, stifle, behead, strangle, asphyxiate, choke, smother, instrument, drown, knife, starve, garrot, action, famish, stab, shoot, hang]

While this figure is a gross oversimplification, it nonetheless sheds some light on the internal structure of the field and the paradigmatic home-ground of some of the items. Most obvious is the transitive predilection of the whole field. The reason is straightforward: the concept of killing someone is quite compatible with the meaning of the transitive paradigm, in which an inert Goal is affected by the action of a volitional Actor. As indicated by the thickness of the circle, the murder verbs are prototypical within the field, and their markedly transitive character radiates to the rest of the field. The other group relevant to the present article is that of the suffocate verbs. Levin’s original group of suffocate verbs has been extended to include the prototypically transitive verbs ‘smother’, ‘strangle’ and ‘throttle’. Despite these three, the suffocate verbs are prototypically ergative (as are the starve verbs), as will be elaborated below.
Two important nuances need to be added to the diagram. First, there are differences in the prototypicality of the different subgroups, which also show prototype effects. Moreover, the boundaries of the subgroups cannot always be sharply delineated. In other words, the lexical field of killing emerges as a prototype-based category, whose external boundaries are not sharply delineated, and whose internal structure shows considerable flexibility and differences in salience (see Lemmens 1998b for more discussion). Second, the diagram presents a constructional (i.e., paradigmatic) classification of verbs as fixed and stable; whereas, in fact, these are features of the specific clause construal in which the lexical and constructional meanings combine to form a complex, semantically well-motivated unit (cf. Note 4; see also Lemmens 1998b for detailed descriptions of semantically motivated paradigmatic shifts).

Examples of experiential grounding

In view of the different conceptual nuclei of the transitive and the ergative models, Actor vs. Medium, it is sensible to assume that events that are perceived as centring around either of these participants will activate a different coding model. While watertight predictions are impossible to make, the data examined confirm this view. As said, the field of killing generally tilts to the transitive side, which aligns with the typical experience of a kill-event as involving a (typically) volitional Agent who does something to an inert (and involuntary) Goal. In some events, the Affected is experientially more salient, leading to an ergative coding. The following subsections describe some specific investigations into this experiential grounding.

Focus on volitionality

The more salient the volitionality of an Actor, the more likely that a typical transitive coding will be used.
In the field of ‘killing’, this usually means a conceptualization in terms of a murder verb. Consider the case of a premeditated murder. In such events, the intentionality of the Actors is quite salient, as they definitely plan to kill someone. Given the availability of a particular lexical term to code such an event, ‘murder’, this will most likely be the coding used. As Geeraerts, Bakema and Grondelaers (1994) have shown, the more unique a referent, the more likely it is that a unique term will be used. Conversely, the more a referent is described in terms of a certain concept, the more that concept can be said to be entrenched. Clearly, a given event may be coded in alternate ways, and thus, the choice of ‘murder’ cannot be rigidly predicted.

It can be noted in passing that the importance of the victim in an assassination event does not constitute evidence that the construction should be ergative. As for the ergative system, it is not the Affected’s importance tout court, but their degree of (co-)participation in the event – i.e., its importance vis-à-vis the process itself – that opens up ergative constructions (e.g., the ergative non-effective). This, it can be added, is definitely not possible with ‘assassinate’.

Accidental causation

The murder verbs saliently incorporate the notion of volitionality into their semantic structure, and resist codings with non-volitional or inanimate Actors. Ironically, the verb most often cited in the literature as typically transitive is the general verb ‘kill’, which in fact conforms least to the intentional Actor prototype of transitives. Its generality as well as non-prototypicality are explained by the interaction of a number of factors:

1. its high lexical flexibility (the data show a 22% metaphor ratio vs. less than 9% for the other murder verbs)
2. a less stringent implication of goal-achievement (as for instance reflected in the common hyperbolical use, mostly in the progressive – e.g., my feet are killing me)
3. the possibility of a non-volitional Actor – e.g., he killed the woman without will or conscious mind
4. the possibility of an inanimate Actor – e.g., they were killed by stray bullets or shrapnel

It is precisely the latter environment that may at first be difficult to explain, yet allows interesting comparisons with die verbs (e.g., die, perish, etc.). While the semantic difference between ‘kill’ and ‘die’ may typically be unproblematic, there seems to be a strong degree of overlap when it comes to coding casualties in accidents, diseases, and the like. A coding in terms of ‘kill’ for casualties in accidents is quite frequent and occurs in more than 71% of such events described in the wsj corpus, as opposed to 23% in terms of ‘die’ (cf. Table 2 below). This may be surprising, given the absence of a volitional Actor, which would typically trigger a transitive coding. Taking a ‘plane crash’ as an example, the most common codings can be represented as in Figure 2 (see Langacker 1991b, for the diagrammatic conventions).

[Figure 2. Settings and Agents. Panels: (a) die in a crash; (b) be killed in a crash; (c) be killed by a crash; (d) the crash killed 112 people. Diagram labels: setting, in, tr, lm.]

In all four constructions, the event (‘the crash’) is nominalized and consequently represented as a thing (indicated by a circle), lethally affecting the victim (the change of state is represented by the squiggly line). The difference between (c) and (d) represents the semantic purport of the passive construction, viz. a trajector/landmark reversal (Emanatian 1993; cf. also Langacker 1987: 120ff.).

Of the codings with ‘kill’, some 45% code the accident as the Actor, diagram (c) or (d), which makes it the largest subgroup of inanimate Actors with ‘kill’ (60%).
This presents clear counterevidence to Dirven’s (1993: 95) claim that “killed by [. . .] is incompatible with circumstantial causes such as accidents”, from which he deduces that the circumstances can be expressed only by means of ‘in. . .’, as in, be killed in an accident. What is true, of course, is that for the latter construction, the agent is highly schematic (as indicated by the cross-hatching), and the accident cannot be added overtly as the agent – e.g., *in that crash 111 people were killed by it. The reason is that within one series of temporally contiguous segments of an action chain, which Ryder (1991) conveniently terms ‘episodes’, one and the same participant cannot simultaneously function as Setting and as Agent (cf. Note 6). The absence of an overt Agent in a ‘circumstantial passive’ construction is experientially grounded: in events like the ones reported, there is usually no Agent more salient than the event itself. If anything is to function as Agent it is either the event itself or a participant situated ‘within’ that event, and one that furthermore acquires sufficient prominence to be construed as Agent. While the latter is in principle possible (e.g., In the explosion, seven people were killed by flying shards, one person died from a heart attack), it is marked, as indicated by its total absence in all corpora consulted (containing over 500 passives with the verb ‘kill’). Sansò (2000) correctly observes that the passive leaves the trajector (here the victim) on stage while removing the causer: “the speaker can choose to focus on the patient, thus displaying empathy with him, or can choose to embrace the maximal scope of what is on-stage, thus conceptualizing the event as a whole” (2000: 3).

At first sight, the use of ‘kill’ in the above context may not seem to align with our immediate experience, which probably explains Dirven’s observation. The relationship between ‘kill’ and ‘die’ in the above contexts can indeed be puzzling (cf.
DeLancey 1984), and probes into the difference between Agents and Causers. Within the scope of this paper, I cannot elaborate on this issue (see Lemmens 1998b: 123–126), but merely indicate that a coding with ‘kill’ still furnishes a more active construal, whereas that with ‘die’ brings in a cause that straddles the border between a nuclear participant and a circumstantial setting (see also Davidse 1991; Talmy 1985). Given the overlap between the two verbs in the coding of accidents and diseases, an absolute generalization is impossible to make. Nevertheless, the corpus reveals a clear pattern, tabulated in Table 2.

Table 2. Distribution of kill and die for accidents and diseases

Lexical item              Type of event
                        accident    disease     Total
die     Freq                36        102         138
        Col Pct           23.2%      85.7%       50.4%
        Row Pct           26.1%      73.9%
kill    Freq               111         17         128
        Col Pct           71.6%      14.3%       46.7%
        Row Pct           86.7%      13.3%
other   Freq                 8          –           8
        Col Pct            5.1%                   2.9%
        Row Pct            100%
Total   Freq               155        119         274
        Row Pct           56.6%      43.4%      100.0%

As can be deduced from the table, the ratio of ‘kill’ and ‘die’ is reversed for the two different events. It can be noted in passing that neither the type of accident or disease, nor the number of people killed, can account for the different conceptualizations; on the whole, the contexts are comparable if not identical. Table 2 nevertheless indicates the prototypical conceptualizations: ‘kill’ will more readily be used for events that are prototypically caused by external and perceptually distinguishable causes, whereas ‘die’ will be more typical in cases of less perceptible causation, often of the kind that comes ‘from within’. I hold the view that in the prototypical case a transitive construal entails a maximally ‘external’ point of view: the Actor is an entity external to the Goal and directly impinging on it, from the outside.
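The percentages in Table 2 follow from the raw frequencies alone, so they can be checked mechanically. The following is a minimal Python sketch of that check, not part of the original analysis: the dictionary layout and the function name `percentages` are illustrative assumptions, and the zero entered for ‘other’ under disease is inferred from the column total of 119 rather than read off the table.

```python
# Raw frequencies from Table 2 (rows: lexical item, columns: type of event).
# Assumption: 'other' has 0 disease cases, inferred from 102 + 17 = 119.
counts = {
    "die":   {"accident": 36,  "disease": 102},
    "kill":  {"accident": 111, "disease": 17},
    "other": {"accident": 8,   "disease": 0},
}

EVENTS = ["accident", "disease"]


def percentages(counts):
    """Return per-cell column and row percentages plus the column totals."""
    col_totals = {e: sum(row[e] for row in counts.values()) for e in EVENTS}
    result = {}
    for item, row in counts.items():
        row_total = sum(row.values())
        result[item] = {
            e: {
                "col_pct": 100 * row[e] / col_totals[e],  # share within event type
                "row_pct": 100 * row[e] / row_total,      # share within lexical item
            }
            for e in EVENTS
        }
    return result, col_totals


table, col_totals = percentages(counts)
for item, cells in table.items():
    for event, pct in cells.items():
        print(f"{item:5s} {event:8s} "
              f"col {pct['col_pct']:5.1f}%  row {pct['row_pct']:5.1f}%")
```

Under these assumptions the figures for ‘die’ and ‘kill’ agree with the table to one decimal place; the ‘other’ share of accidents comes out at 5.2% under ordinary rounding, against the 5.1% printed in the table.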
The ergative predilection of the suffocate verbs

The internal-external alignment emerges even more strongly from the group of suffocate verbs where it motivates not only the opposition between the (prototypically) transitive members of the group (‘strangle’, ‘throttle’, and ‘smother’) and the ergative ones, but also within the latter group, the typical occurrence of an effective construction when combined with an external cause. The suffocate verbs can be distinguished by which part of the respiratory system is affected (lungs, throat, mouth, nose), and how it is typically affected (constriction, immersion, coverage, etc.). This distinction can be represented as in Figure 3. Although this distinction does not represent a rigid dichotomy, it gives a first assessment of the experiential basis of the ergative preference of the group as a whole, as well as the transitive character of some of its members and the mixed character of some others. The latter group are those that oscillate (both synchronically and diachronically) between the two paradigms. Within the confines of this article, these cannot be elaborated in full detail; some discussion of such particular uses is found in Lemmens (2005a, subm.).

[Figure 3. Internal/external and ergative/transitive correlation. Labels: external zone, internal zone. Groups: transitive (strangle, throttle); transitive-(ergative) (smother, stifle); ergative (drown, choke, suffocate, asphyxiate).]

That is, the ergative conception seems to align strongly with how a suffocation process is experienced, for a number of reasons. First, the causes of suffocation events, although multifarious, are typically imperceptible, as in the case of gases or (lung) diseases. This imperceptibility encourages a conception of suffocation as independent of the cause or instigation, which is an essential feature of an ergative construal.
The lower perceptibility of a cause often leads to different (metaphorical) conceptualizations of it (e.g., as covering, enveloping, etc.), which gives rise to different lexical choices (‘smother’, ‘drown’, etc.). Clearly a transitive conception (e.g., with the cause as Actor) is not fully excluded, but the corpus shows that this is uncommon with these types of clauses.

Secondly, the conceptual independence of the caused process is reinforced by the (prototypical) temporal distance between the instigation and the consequences, leading to an enhanced focus on the process itself. Moreover, as is also typical of an ergative conception of events, there is a low salience of goal-achievement, which contrasts sharply with the foregrounding of this property in the case of prototypically transitive verbs, such as the murder verbs, but also ‘strangle’ and ‘throttle’. The latter two verbs typically code a process that ends in the death of the victim, whereas for the ergative suffocate verbs this feature is much less salient (though not excluded). Consider, for instance, the common usage of ‘choke’, in reference to (mostly non-lethal) swallowing the wrong way. The suffocate verbs also often occur in hyperbolical uses, in which speakers (deliberately) exclude the end point of the process from the conceptualization, focusing instead on the suffering of the Medium. (Recall that for the murder verbs only ‘kill’ occurs in such hyperbolical constructions.) Such usage is also common with ‘starve’, when it refers to the state of being very hungry and excludes the lethal outcome; for some speakers this has been attested as the default case. In other words, an ergative conception can come to focus strongly on the Medium’s activity, thereby not only excluding the instigation but also the end-point of the process. It can be noted that the hyperbolical uses only occur in the -ing form, in line with the semantics of this form, viz.
the exclusion of the endpoints of the process coded by the verb.

Thirdly, the distinction between the external and internal parts of the affected respiratory system is an additional experiential factor that motivates the transitive/ergative distinction. It is quite logical that the external perspective lines up with the focus on the Agent, since, most naturally, the affected parts are those that can be easily accessed by an external Agent. Moreover, the more external parts of the respiratory system are also those not really active in the respiration process. As is typical of transitives, ‘smother’, ‘strangle’, and ‘throttle’ altogether omit the notion of the victim’s participation. These verbs conceptualize the interruption of normal respiration as ‘externally’ inflicted on a fully inert patient. In other words, the more perceptible (and consequently also more external) the cause, the more likely it is that a transitive conception is used to encode the event. Clearly, it is not excluded that typically ergative verbs, like ‘suffocate’ or ‘choke’, are selected to encode events where the cause is clearly and exclusively external. What is striking, however, is that in those cases the coding is typically effective (‘causative’), an intuition that is confirmed by both diachronic and synchronic data – see the examples below.

(1) The broider’d band That underbraced his helmet at the chin. . . Choaked him. (oed, 1790)
(2) The man who choked the Emir (oed, 1866)
(3) . . . suddenly he’d . . . grab him by the throat and choke him. (wsj)

Alternatively, both diachronic and synchronic data indicate that a non-effective construction occurs with internal causes.
For example, one of the typical uses of ‘choke’ is in the context of swallowing the wrong way, an event in which the cause is maximally internal (an unbreathable substance entering the trachea), and the bodily (re)action of the victim is quite salient. As can be expected, a non-effective construction is used, and is, in fact, the only possibility:

(4) a. Riddle laughs so hard he starts to choke on his salad.
    b. *the salad choked Riddle.
(5) a. Sorrille almost choked on his tongue.
    b. *Sorrille was almost choked by his tongue.

The b-sentences can only be acceptable when the salad and the tongue are, in one way or another, seen as external causes. It could be argued that in reference to swallowing the wrong way, ‘choke’ no longer codes an instigatable process, but realizes an intransitive construal of the event, with the cause, the unbreathable substance, in a periphrastic coding in an ‘on’-complement (cf. Note 7). Once again, one should not misinterpret these observations as indicating that the ontological situation rigidly predicts the construal. Nevertheless, the data unequivocally reveal a subtle alignment between the specifics of the event and the occurrence of an effective or a non-effective construal, or of an ergative or a transitive coding.

The ergative predilection of the suffocate verbs (as opposed to the Agent-oriented murder verbs) also emerges from a diachronic analysis of verbs of killing. Halliday (1985: 146) observes that “the coming of [the ergative] pattern to predominance in the system of modern English is one of a number of related developments that have been taking place in the language over the past five hundred years or more”. My data show that such an ergativization may show up unexpectedly (e.g., with prototypically intransitive ‘die’), but also that particularly the suffocate verbs seem to have been subject to ergativization (see Lemmens 1998a).
From the late 16th century up to the early 20th century, all suffocate verbs started to occur in ergative constructions. On top of that, there was considerable lexical overlapping, with some verbs encroaching on the semantic space of others. For example, the verb ‘strangle’, which in present-day English is fairly restricted in semantic coverage, came to be used to report on deaths caused by the sword, disease, poisoning, drowning or suffocation by gases, etc. In these cases ‘strangle’ often occurred in ergative non-effective constructions – consider examples (6)–(8) below.

(6) Hanybal . . . stranglyd with poisoun. (oed, 1443)
(7) The swearde shal strangle them (oed, 1535)
(8) She fell into the pond yesterday . . . She nearly strangled . . . (nov)

This lexical and constructional flexibility is absent for the murder verbs, for which no ergativized use has been attested in any of the corpora. They are too strongly tied to their Agent-centredness to allow a conceptual shift of focus to the second participant involved in the process.

The ergativization process that characterized Early Modern English has not yet been analysed to the fullest, and it is unclear what motivates it for the whole of the English lexicon. I like to believe that it can be explained by changes in how humans came to see (i.e., experience) the world, but this is a hypothesis that needs to be verified in a much larger context than I have done so far. In any case, the ergativization of the suffocate verbs can be explained against the background of the typical conceptualisation of a suffocation event.

Ideologically determined transitivization

That the experience of events is highly influenced by cultural and ideological assumptions is nicely illustrated by the lexical and constructional evolution of the verb ‘abort’.
This has been described in full detail elsewhere (see Lemmens 1997, 1998a), but it can be reiterated here that changes in the way we interact with the world have had a definite impact on the conceptualization of an abort-event. Previously, the conceptualization underlying the item pertained to a spontaneous termination of a pregnancy, and logically, it activated the ergative model. Indeed, the second participant, the fetus, was experienced as having the potential of self-instigating and sustaining the process, which escaped the control of the woman. However, medical and technological advances have given us greater control over such processes: parents and doctors are now more in control, as volitional beings who can target the abortion process onto a fetus or, in more recent usages, onto the woman. Logically, then, the causative model, activated by present-day ‘abort’, is the transitive one in which the fetus is reduced to an inert Goal, or even omitted from the episode altogether in an objectless transitive – e.g., the mother can abort, if she so chooses; or in a relatively recent type of construction with the woman as Affected – e.g., the doctor aborted the woman, or fewer women are aborted. Here are two recent examples (returned by a Google search):

(9) Many Tibetan women are aborted and sterilised after the first birth (www.hsph.harvard.edu/Organizations/healthnet/SAsia/repro3/tibet.html)
(10) in Communist China where women are aborted by government order when pregnant for the second time (www.creationism.org/csshs/v14n1p15.htm)

Apparently, ‘abort’ is also evolving in its lexical structure.
While in the wsj corpus dating from 1989, the literal (prototypical) use can still be argued to be that of prematurely terminating a pregnancy (75% of the cases), recent data (and recent comments by native speakers) suggest that more and more, the prototype is shifting to the more schematic meaning of “to terminate a process”, of which the halting of a pregnancy is a specific instantiation. It cannot be denied, however, that this instantiation differs from the others in its selection of a causative model, as it only activates the transitive, whereas the others can invoke the ergative model – e.g., the takeoff aborted, vs. the pilot aborted the takeoff. This type of usage has become quite common in the domain of computer terminology – e.g., the program will abort.

Conclusions and prospects

The above analyses have illustrated that the characteristics of a specific event subtly influence the way in which this event is conceptualized, in ways that may not be immediately obvious from introspection. While variations in conceptualization cannot be excluded, some clear tendencies have been highlighted, from which certain predictive power can be distilled. In a nutshell, the more volitional a participant seems to be whilst engaged in some causative process, the more likely it is that s/he will surface as a volitional Actor in a transitive construction; the more autonomous the process, vis-à-vis its cause, the more likely an ergative conception will be triggered. Moreover, within the field of ‘killing’, another parameter has been shown to be relevant – i.e., the internal or external nature of the cause. While these conclusions are quite compatible with what is commonly assumed in Cognitive Linguistics (e.g., Lakoff 1987: 54–55), I do not characterize it in terms of directness or indirectness of causation, but in terms of participation in the process, which for the transitive and ergative construal is quite different.
The present study presents only the onset of a more fully-fledged onomasiological analysis – i.e., one that starts from the referent and examines how it is coded. Further research is warranted, preferably of the type that does not start from textual material. Clearly, corpus analysis, such as that underlying the present research, is most relevant for discovering certain tendencies that would otherwise go unnoticed. For example, in addition to the observations above, my analysis of unprototypical (i.e., inanimate) Actors with ‘kill’ has revealed that there is a certain alignment with the type of Affected: Goals that are lower on the Silverstein hierarchy (e.g., low-level organisms) tend to take a low-level Actor as well (e.g., the antibody kills infected cells). This correlation is worth exploring further with other types of events, and may correct some of the widely held views on the prototypicality of certain types of Agents. A possible drawback inherent to corpus analysis is that it is still fairly much a posteriori: it starts from a conceptualization and tries to discover the experiential motivation behind it. To counterbalance that bias, one could set up experiments to elicit narrations from speakers when watching a filmed event, for example (cf. also Slobin 1996, 2000; Lemmens 2005b on this type of research). Such an investigation can shed further light on the different mental imagery employed in encoding an event and what triggers it.

Notes

* All correspondence concerning this article should be sent to Maarten Lemmens, at Université Lille3, U.F.R. Angellier (English), Lille, France. Email: [email protected]

1. This paper regroups and reconsiders some of the more elaborate descriptions in Lemmens (1997, 1998b). I thank the anonymous reviewer for the valuable comments on an earlier version of this paper. Responsibility for the final product is of course mine.

2. The corpora used are the ACL-Wall Street Journal Corpus (WSJ: 5,353,500 words); the Leuven Drama Corpus (1,029,660 words); a collection of contemporary American short stories (1,066,875 words); a collection of 19th century novels (746,525 words); and citations from the OED on CD-ROM (11,713 citations).

3. The reader is referred to Davidse (1991, 1992) for more elaborate descriptions. For my own modifications to her theory and some changes in terminology, see Lemmens (1998b).

4. Not all constructions with prototypically ergative verbs allow the non-effective construction – e.g., John opened a tin of baked beans (Davidse 1991: 63) or The horn drowned out the opening bell (see Lemmens 1998b: 178–187). These constructions no longer activate the ergative model. For further discussion, see the cited works.

5. Among other things, the two works cited will provide ample illustration that virtually all transitive verbs allow the objectless transitive in the proper context.

6. See also Langacker’s notion of “scope of predication”, Nishimura’s (1994) “coherent actions chain”, or DeLancey’s (1984) “event schema”.

7. On the use of on (and with in other contexts), see Lemmens (1998b: 168–173).

References

Casad, Eugene (1995). Seeing It in More than One Way. In John Taylor & Robert E. MacLaury (Eds.), Language and the Cognitive Construal of the World (pp. 23–49). Berlin & New York: Mouton de Gruyter.
Davidse, Kristin (1991). Categories of Experiential Grammar. Unpublished PhD Thesis, K. U. Leuven.
Davidse, Kristin (1992). Transitivity/Ergativity: The Janus-Headed Grammar of Actions and Events. In M. Davies & L. Ravelli (Eds.), Advances in Systemic Linguistics (pp. 105–135). London: Pinter.
DeLancey, Scott (1984). Notes on Agentivity and Causation. Studies in Language, 8, 181–213.
Dirven, René (1993). Dividing up Physical and Mental Space into Conceptual Categories by means of English Prepositions. In Cornelia Zelinsky-Wibbelt (Ed.), The Semantics of Prepositions. From Mental Processing to Natural Language Processing (pp. 73–97). Berlin & New York: Mouton de Gruyter.
Emanatian, Michele (1993). Figure-Ground Reversal in Grammar. Paper presented at the Third International Cognitive Linguistics Association Conference, Leuven, 18–23 July.
Fillmore, Charles J. (1985). Frames and The Semantics of Understanding. Quaderni di Semantica, 6, 222–254.
Geeraerts, Dirk, Peter Bakema, & Stefan Grondelaers (1994). The Structure of Lexical Variation: Meaning, Naming, and Context. Berlin & New York: Mouton de Gruyter.
Halliday, M. A. K. (1985). An Introduction to Functional Grammar. London: Arnold.
Haspelmath, Martin (1993). More on the Typology of Inchoative/Causative Verb Alternations. In Bernard Comrie & Maria Polinsky (Eds.), Causatives and Transitivity (pp. 87–120). Amsterdam & Philadelphia: John Benjamins.
Lakoff, G. (1987). Women, Fire and Dangerous Things. What Categories Reveal about the Mind. Chicago: Chicago University Press.
Langacker, Ronald W. (1987). Foundations of Cognitive Grammar, Vol. 1. Stanford: Stanford University Press.
Langacker, Ronald W. (1988). An Overview of Cognitive Grammar. In Brygida Rudzka-Ostyn (Ed.), Topics in Cognitive Linguistics (pp. 3–48). Amsterdam & Philadelphia: John Benjamins.
Langacker, Ronald W. (1991a). Concept, Image, and Symbol. Berlin & New York: Mouton de Gruyter.
Langacker, Ronald W. (1991b). Foundations of Cognitive Grammar, Vol. 2. Stanford: Stanford University Press.
Lemmens, Maarten (1997). The Influence of World Conception on Transitivity and Ergativity: a Case Study. In Eve Sweetser, Kee Dong Lee, & Marjolijn Verspoor (Eds.), Lexical and Syntactic Constructions and the Construction of Meaning (pp. 363–382). Amsterdam & Philadelphia: John Benjamins.
Lemmens, Maarten (1998a). The experiential basis of lexical and constructional flexibility: a diachronic and synchronic study. Leuvense Bijdragen, 87, 79–113.
Lemmens, Maarten (1998b). Lexical Perspectives on Transitivity and Ergativity. Causative Constructions in English [CILT 166]. Amsterdam & Philadelphia: John Benjamins.
Lemmens, Maarten (2005a). Des constructions causatives sans objet: un complément à l’analyse récente de Goldberg. In Claude Delmas & Mireille Quivy (Eds.), 6 Etudes en linguistique anglaise, CERCLES Revue pluridisciplinaire du monde anglophone [Occasional Papers] (pp. 79–113). Université de Rouen.
Lemmens, Maarten (2005b). Motion and location: toward a cognitive typology. In Geneviève Girard-Gillet (Ed.), Parcours linguistiques. Domaine anglais [CIEREC Travaux 122] (pp. 223–244).
Lemmens, Maarten (forthcoming). More on objectless transitives and ergativization patterns in English. Thematic issue of Constructions.
Levin, Beth (1993). English Verb Classes and Alternations. A Preliminary Investigation. Chicago: Chicago University Press.
Nishimura, Yoshiki (1994). Agentivity in Cognitive Linguistics. In W. Noth (Ed.), Origins of Semiosis (pp. 487–530). Berlin & New York: Mouton.
Rice, Sally (1988). Unlikely Lexical Entries. Berkeley Linguistics Society, 14, 202–212.
Ryder, Mary-Ellen (1991). Mixers, Mufflers and Mousers: The Extending of the -er Suffix as a Case of Prototype Reanalysis. Berkeley Linguistics Society, 17, 299–311.
Sansò, A. (2000). The domain of demotion: a new view of passive constructions. Paper presented at the 2nd International Conference on Contrastive Semantics & Pragmatics, Cambridge, 10–13 September, 2000.
Slobin, Dan I. (1996). Two ways to travel: Verbs of motion in English and Spanish. In M. S. Shibatani & S. A. Thompson (Eds.), Grammatical Constructions: Their form and meaning (pp. 195–200). The Hague: Mouton.
Slobin, Dan I. (2000). Saturation of a semantic field. Paper presented at the conference on Language Culture and Cognition, Leiden, 22–23 March, 2000.
Smith, Carlota S. (1978). Jespersen’s ‘Move and Change’ Class and Causative Verbs in English. In M. A. Jazayery, E. C. Polome, & W. Winter (Eds.), Linguistic and Literary Studies in Honor of Archibald A. Hill. Vol. II: Descriptive Studies (pp. 101–109). The Hague: Mouton.
Talmy, Leonard (1985). Lexicalization Patterns: Semantic Structure in Lexical Forms. In Timothy Shopen (Ed.), Language Typology and Syntactic Description. Vol. III: Grammatical Categories and the Lexicon (pp. 57–149). Cambridge: Cambridge University Press.

Internal state predicates in Japanese
A cognitive approach*

Satoshi Uehara
Tohoku University, Japan

This paper examines internal state predicates, or “subjective predicates”, in Japanese, which exhibit some grammatical behavior different from their counterparts in other languages like English. It employs Langacker’s framework on subjectivity (1985, 1991a), and argues that these predicates represent what Langacker calls the “egocentric viewing arrangement”, in which the construal of the event/situation is optimally subjective. The consideration presented in this paper demonstrates that this “subjective construal” analysis of Japanese internal state predicates can uniformly account for the grammatical behaviors exhibited by them, namely, the person restriction, the implicit reference to their experiencer subject, and their formation of the so-called “double nominative” constructions. It furthermore discusses implications for the cognitive framework’s cross-linguistic applicability.

Keywords: subjectivity, internal state predicates, viewing arrangement, person restriction, Japanese
Introduction It is widely known that Japanese possesses a large group of internal state predicates called subjective predicates (Kuroda 1973; Kuno 1973; Aoki 1986; Iwasaki 1993; Backhouse 1993; Uehara 1998b). They are so called because they denote, by default, the speaker’s internal states, such as feelings and emotional reactions. They have attracted linguists’ attention because they exhibit some grammatical behavior different from internal state predicates in other languages like English. In this paper I will employ a cognitive linguistic framework for subjectivity (Langacker 1985, 1991a) in order to analyze the internal state predicates in Japanese, and I will argue that these predicates represent what Langacker calls “subjective construal” (1991a: 316), in which the construal of the situation (by, the conceptualizer) is highly subjective. I will also demonstrate that this framework can uniformly ac- JB[v.20020404] Prn:13/02/2006; 13:35 F: HCP1513.tex / p.2 (109-170) Satoshi Uehara count for the grammatical behaviors exhibited by internal state predicates, and also that it is compatible with, and thus receives strong support from, the recent discourse studies on Japanese (Iwasaki 1993; Uehara 1998b). . Japanese internal state predicates Following Iwasaki’s (1993: 15) description, internal state predicates are said to be those “adjectival predicates showing sensations (atui ‘hot’), emotions (kanasii ‘sad’) or desires (hosii ‘want’)”. Of course, other languages like English have words with similar semantic functions (e.g., the English glosses for the Japanese internal state predicates given above). The most basic and widely known characteristic of internal state predicates in Japanese is that they strictly require the sentence subject to be in the first person when used in simple declarative sentences. An example is uresii ‘glad’, illustrated in (1) and (2):1, 2 (1) Watasi wa uresii. 1.sg top glad.nonpast “I am glad.” (2) *Zyon wa uresii. 
    John top glad.nonpast
    “John is glad.”

To express the proposition in (2) in Japanese, one must make explicit reference to the evidence on which it is based. For example, one must say something like “John looks glad”, “John is showing the signs of being glad”, and so on, as illustrated in (3) and (4) below:

(3) Zyon wa uresii yoo-da.
    John top glad appear.nonpast
    “John appears to be glad.”

(4) Zyon wa uresi-gatte iru.
    John top glad-showing the sign of.nonpast
    “John shows the signs of being glad.”

Two points should be noted here about the person restriction of Japanese internal state predicates. First, this restriction is lifted in ‘non-reportive’ or non-communicative discourse such as the literary mode (Kuroda 1973). This seems to be because the point of view in sentences in the literary mode is not necessarily associated with the narrator, but can be freely interpreted with respect to a specific character in the story.3 Thus, although the sentence in (2) sounds infelicitous and is starred in naturally-occurring, communicative discourse in Japanese, it would be acceptable in the literary mode. Secondly, manifestation of this restriction in the internal state predicates of a language is a matter of degree, and its degree seems to vary from language to language. Thus some may point out that internal state predicates in other languages (e.g., in English) do show a similar restriction. In the case of Japanese, however, this structural restriction appears to be relatively strong, which may explain the attention this restriction in Japanese has attracted in the linguistics literature, as well as the large variety of Japanese evidential markers used to get around this restriction (see Aoki’s 1986 discussion of Japanese evidential markers).
Other properties of Japanese internal state predicates

The best known of the characteristics of internal state predicates in Japanese is the restriction regarding the person of predicate subjects discussed in the previous section. However, it is not the only characteristic property, and I will discuss here two other properties that are much less well known but are highly relevant to an understanding of this phenomenon.

The first is that the experiencer subjects of these internal state predicates are typically unexpressed in Japanese discourse. Japanese is a ‘pro-drop’ language, and utterances frequently do not linguistically code their clausal arguments. This is especially true of the experiencer subject argument of internal state predicates – i.e., the subject argument whose role is experiencer (cognizer) of the state denoted by them, such as watasi, ‘I’ in (1). Thus, Uehara (1998b), who analyzed an English novel and its Japanese translation, has shown that by default the first person (cognizer) subject of the internal state predicates is unexpressed unless some discourse factors favor explicit mention of it. For example, (1) given earlier is typically expressed without the first person subject, as in (5):

(5) Uresii.
    glad.nonpast
    “I am glad.”

One can easily see a functional link between this second property of the internal state predicates in Japanese and the first. Since the predicate form can carry the function of expressing the (first) person of its experiencer (i.e., if the predicate form is not marked with any evidential markers, then the experiencer is in the first person), there is no need for the first person experiencer to be overt; it can be unexpressed without causing any ambiguity.

The other property of internal state predicates in Japanese to be noted here, which does not appear to be in any way related to the other two, is that internal state predicates are among those predicates which take what Kuno (1973: 79) calls “Ga for Object Marking”.
The particle ga is a nominative marker, and is usually used to mark the subject of a sentence (while the particle o is used to mark the object). However, according to Kuno, ga is used to mark the object for the predicates in question. Since the ‘real’ subject whose role is experiencer is marked with ga as well (or frequently topicalized and marked with the topic marker wa), internal state predicates form what is known as a ‘double nominative’ (or ‘double subject’) construction, as illustrated by using an internal state predicate of desire hosii, ‘want’, in (6):

(6) (Watasi ga/wa) sake ga hosii.
    1.sg nom/top sake nom want.nonpast
    “I want sake.”

In fact, they constitute a major group of the double nominative constructions, and Kuno lists the following in (7) as subjective predicates in his list of predicates which take ga for object marking:4

(7) -tai ‘be anxious to’
    arigatai ‘to be grateful for’
    hazukasii ‘to be bashful/ashamed of’
    hosii ‘to want’
    itosii ‘to think tenderly of’
    kawaii ‘to hold dear’
    kutiosii ‘to be regretful of’
    natukasii ‘to miss, to feel yearning for’
    netamasii ‘to be jealous of’
    nikurasii ‘to be hateful of’
    omosiroi ‘to be interested in’
    osorosii ‘to be afraid/fearful of’
    urayamasii ‘to be envious of’

-Tai ‘be anxious to’ at the top of (7) combines with verbal stems to yield compound adjectives ‘be anxious/want to do . . . ’, and marks the verbal object with ga as in (8) and (9):

(8) (Watasi wa) sake ga nomi-tai.
    1.sg top sake nom drink-be anxious to.nonpast
    “I am anxious to/want to drink sake.”

(9) (Watasi wa) kono hon ga yomi-tai.
    1.sg top this book nom read-be anxious to.nonpast
    “I am anxious to/want to read this book.”

As a result the number of subjective predicates involving ga for object marking is not limited to the dozen or so listed above in (7); the instances can be easily multiplied.
In this section, internal state predicates in Japanese have been examined in terms of the characteristic properties that distinguish them from their close equivalents in other languages like English. Namely: (1) there is a strong structural restriction in terms of the person of their subjects; (2) they typically appear without their (experiencer) subjects; and (3) the nominative marker ga in these constructions marks what are considered to be grammatical objects. Although these three properties of internal state predicates in Japanese have been discussed, or described, separately from each other in the existing linguistic literature, they have never before been treated as a single system. In the next section I will consider a cognitive semantic analysis of these internal state predicates, to see if these three characteristics can come together to reveal common, underlying principles.

3. Cognitive Grammar approach to subjectivity

One of the foundational claims of cognitive semantics is that “an expression’s meaning cannot be reduced to an objective characterization of the situation described: equally important for linguistic semantics is how the conceptualizer chooses to construe the situation and portray it for expressive purposes” (Langacker 1991a: 315). In analyzing subjective predicates in Japanese, a formal framework is needed by which the relationship between the speaker (i.e., the conceptualizer) and the event described (as well as the role the speaker plays in conceptualizing the event) can be explicitly described and examined. Langacker’s (1985, 1991a, and 1991b) cognitive semantic theory of subjectivity represents one such framework, and provides a theoretical vocabulary for analyzing the internal state predicates in Japanese; the necessary constructs are briefly introduced below.
Definition of linguistic subjectivity

Langacker explains ‘subjectivity’ as follows: “Subjectivity pertains to the observer role in viewing situations where the observer/observed asymmetry is maximized” (Langacker 1985: 109). Consider such a viewing situation, which he calls an optimal viewing arrangement, diagramed in Figure 1a. In the diagram, ‘S’ stands for the subject of conception (or observer), ‘O’ for the object of conception (or observed), the arrow for the direction of conception, and the broken-line circle for the objective scene. With respect to an optimal viewing arrangement in Figure 1a, he notes, “S can be characterized as maximally subjective, and O as maximally objective” (Langacker 1985: 121).

This optimal viewing arrangement is contrasted with what he calls the egocentric viewing arrangement, diagramed in Figure 1b, where the locus of viewing attention is expanded to include the position of S and his/her immediate surroundings. The subject of conception S is no longer simply an observer, but to some degree an object of conception as well, and in this situation, S receives a more objective construal while the scene conceived becomes more subjective. Thus, the conceptualization diagrammed in Figure 1b represents the semantic structure of ‘subjective’ expressions, and Langacker defines a subjective expression “as one that includes the ground – or some facet of the ground – in its scope of predication (i.e., its base)” (Langacker 1985: 113, emphasis in the original). (The term ‘ground’ is used in Cognitive Grammar to indicate the speech event, its setting, and its participants.)

Figure 1. (a) Optimal viewing arrangement; (b) egocentric viewing arrangement. From Langacker (1985: 121)
The subjectivity scale

As is obvious from expressions like ‘maximally subjective’ and ‘more subjective’, subjectivity is a matter of degree, depending on how prominently the ground is conceived in the overall conceptualization. Thus, Langacker introduces the notion of a subjectivity scale, along which linguistic expressions can be ranked. The pair of sentences in (10) illustrates two levels of the gradience (Langacker 1991a: 326).

(10) a. Vanessa is sitting across the table from Veronica.
     b. Vanessa is sitting across the table from me.

So far as the locative relationships are concerned, the sense of across in (10a) is fully objective in that it profiles the spatial configuration without regard to speaker-hearer position.5 (10b) is more subjective in that one of the participants, namely the reference point for the across relation, is the speaker (first person); the expression includes the ground in its scope of predication, as in the conceptualization pattern represented by the diagram in Figure 1b above.

Langacker identifies a further subjective level of construal, using the same situation for the already subjective construal in (10b). The contrast between the two levels, according to Langacker (1991a: 328), is reflected in the following pair of sentences in (11):

(11) a. Vanessa is sitting across the table from me. (=10b)
     b. Vanessa is sitting across the table.

Figure 2. Panels (a) and (b)

The two describe precisely the same spatial configuration, and they both presuppose the reference point for the spatial configuration they describe (so (11b) may sound awkward without context). The only structural difference is that the presupposed reference point is covert in (11b). Conceptually, they are both subjective in that they take the speaker as the reference point.
However, the formal distinction between overt (11a) and covert (11b) reference to the speaker iconically reflects its being construed with a lesser or greater degree of subjectivity.6 (11a) suggests a detached outlook in which the speaker treats her own participation as being on a par with anybody else’s (‘objective construal’ of the speaker), whereas (11b) identifies the reference point with the speaker and portrays the situation “through her eyes”. In other words, the scene depicted by the sentence in (11a) and that in (11b) are like (a) and (b), respectively, in Figure 2. In Figure 2a, the person with the glasses is the speaker, myself, while in Figure 2b, I the speaker sit on this side of the table, the vantage point for the scene. The thick line in Figure 2b represents the speaker’s camera angle and indicates that the vantage point is specified with the speaker. In (11b), as Traugott (1995: 49) puts it, “what is strengthened is specifically the subjective stance of the speaker.” Thus, of the two already subjective expressions in (11), b is the more subjective.

The representations in Figure 2 of the scenes depicted by the sentences in (11a) and (11b) illustrate why (12a) is considerably more natural than (12b):

(12) a. Look! My picture’s in the paper! And Vanessa is sitting across the table from me!
     b. ?Look! My picture’s in the paper! And Vanessa is sitting across the table!

Langacker (1991a: 329) explains as follows: “Examining a picture of oneself involves self-construal that has a high degree of objectivity, for it literally implies an external vantage point.
This is consistent with the objectivity conveyed by the speaker’s explicit self-reference in the final clause of (12a), but inconsistent with the subjectivity signaled by the lack of the speaker’s explicit self-reference in (12b).”

A formal way of representing the fine-grained contrast between two subjective structures like the pair in (11) is called for, since they both represent the same situation in the diagram in Figure 1b, and since those in Figure 2 depict only one specific instance (with the two ‘slots’ elaborated with Vanessa and myself). In fact, the sitting-across case in (11) represents a productive, spatial configuration pattern in English as illustrated in (13) and (14) [taken from Langacker 1985, 1991a]:

(13) a. There is snow all around me.
     b. There is snow all around.

(14) a. The store is through the tunnel from here.
     b. The store is through the tunnel.

Thus, many spatial configuration expressions in English have a pair of established senses analogous to those of sitting-across. The difference between the two semantic structures is shown in Figure 3, using Cognitive Grammar style representations. In Figure 3, tr abbreviates the trajector, the technical term for ‘figure within a profiled relation’, and lm the landmark, ‘less prominent entity in the relation’. In the case of (11), the trajector is Vanessa, and the landmark is the table in the across relationship. The profiled across relationship is represented by the thick dotted arrow, whose starting point indicates R, the reference point with respect to which the trajector is located. In (a), Sp, the speaker, takes a vantage point external to the described situation (Sp’), where the reference point happens to be the speaker, while in (b), the reference point is identified with the speaker’s vantage point.
Figure 3. Panels (a) and (b). OS: onstage region; thick dotted arrow: across relation; tr: trajector; lm: landmark; rp: reference point; Sp: speaker

4. Subjective construal and Japanese internal state predicates

Having examined the Cognitive Grammar analysis of the sitting-across predicate in English, one can easily see that exactly the same analysis holds for subjective, spatial configuration expressions in Japanese as well, such as the me-no-mae-ni-suwatteiru (‘sitting-right-in-front’) predicate in (15):

(15) a. Hanako ga watasi no me no mae ni suwatte iru.
        Hanako nom 1.sg gen eye gen front at sitting be.nonpast
        “Hanako is sitting right in front of me.”
     b. Hanako ga me no mae ni suwatte iru.
        Hanako nom eye gen front at sitting be.nonpast
        “Hanako is sitting right in front.”

Thus, basically Figures 2a and 3a and their discussions apply to (15a) and Figures 2b and 3b to (15b). The Cognitive Grammar analysis can uniformly apply to the subjectivity phenomena in spatial configuration expressions in English and Japanese. I will now propose a similar line of analysis for the internal state predicates in Japanese, which are also called subjective predicates. By examining similarities and differences between the two phenomena, I will demonstrate that the Cognitive Grammar approach to subjectivity not only has cross-linguistic applicability but also can help explicate cross-linguistic variations in linguistic subjectivity.

The speaker’s role

Internal state predicates in Japanese resemble the (English) sentences of spatial relation in (11) in that the speaker (the conceptualizer) plays a prominent role in their conceptual structure.
Whether or not encoded linguistically, the speaker occupies the role of experiencer/cognizer of a state denoted by the predicate in the former, and that of viewer of a denoted situation in the latter. Consider the hosii ‘want’ example reproduced in (16):

(16) a. Watasi wa sake ga hosii.
        1.sg top sake nom want.nonpast
        “I (explicit) want sake.”
     b. Sake ga hosii.
        sake nom want.nonpast
        “I (implicit) want sake.”

The hosii ‘want’ sentences in (16) describe a cognizer’s psychological state, where the speaker is the cognizer, and functions as the subject (as opposed to the object) of cognition. In a very similar manner, the sitting-across sentences in (11) describe a spatial configuration, where the speaker is the reference point for the figure in the across relationship, and functions as the viewer, i.e., the subject of perception. Thus, in either case, the speaker functions as the subject of conception.

The Japanese internal state predicate patterns resemble the patterns of spatial configuration in another respect. In both, the (b) pattern, where the subject of conception is not linguistically coded, typically assumes the speaker for that implicit role. Thus ‘speaker’, not just a schematic ‘person’, is present in the semantic structure of (b) sentences.

These two points of resemblance motivate positing, for the two subjective ‘want’ sentences in (16a) and (16b), the semantic structures illustrated in Figures 4a and 4b, respectively. In Figure 4a, the cognizer of Japanese internal state predicates (i.e., the speaker) is explicitly encoded as in (16a), and the speaker is somewhat objectively construed since he puts himself more or less onstage and sees himself as if through the eyes of someone else. In Figure 4b, in contrast, the speaker is not linguistically encoded but is necessarily evoked as in (16b), and the construal of the situation is highly subjective since the speaker is equated with the cognizer and serves as the vantage point for what is conceptualized.
The scenes depicted by the specific sentences Watasi wa sake ga hosii ‘I (explicit) want sake’ in (16a) and Sake ga hosii ‘I (implicit) want sake’ in (16b) can be shown in Figure 5a and Figure 5b, respectively. In the scene in Figure 5a, the person with the glasses on the left, who is wanting and imagining the sake, is the speaker, myself. The scene in Figure 5b, on the other hand, is what I the speaker cognize, that is, what appears in the conceptual space of the person on the left in Figure 5a, whose viewpoint is the assumed vantage point for the conceptualization. Again, the thick line represents the speaker’s cognition and indicates that the vantage point is specified with the speaker.7

Figure 4. Panels (a) and (b). OS: onstage region; line: cognizing relation; cr: cognizer; cd: cognized entity; Sp: speaker

Figure 5. Panels (a) and (b)

Difference in the default pattern in subjectivity

Japanese internal state predicates differ most from spatial configuration expressions (in English and Japanese) in the default degree of subjectivity. Although construals of different degrees of subjectivity are available in both types, a close examination of the behavioral and structural patterns of each construal in the two types reveals a different default construal in each. Objective construal seems to be more of the default case in the spatial configuration expressions, while in the Japanese internal state predicates the subjective perspective is the norm. This can be seen in the fact that the most subjective construal in spatial configuration expressions is compositional in its formulation, while that in Japanese internal state predicates is non-compositional and its subjectivity is inherent to the lexicon (as their alternative name subjective predicates suggests).
The (English) predications of spatial relations (e.g., across, around) have nothing inherently subjective in their semantic structure. In other words, their profile does not involve any element of the speaker (the ground). Thus around, for example, has a profile whose trajector and landmark are both instantiated by things in the objective scene, so that the composite structure has the whole profile in the on-stage region. The composite relational profile becomes subjective only if the spatial predication is combined with the ground (as in Figure 1b) in its compositional process. In sharp contrast with the spatial configuration expressions, Japanese internal state predicates can be analyzed as subjective in their lexical structure. The cognizer slot (‘e-site’) of their lexical semantic structure is specified with the ground, which can be made overt by the compositional process with its overt form such as watasi, ‘I’. In terms of the lexical structure organization, Japanese internal state predicates thus resemble a ‘deictic verb’, come (Langacker 1985), or kuru, ‘come’, in Japanese, for that matter, where the goal position of its movement is specified with the ground. Its ground element can be made overt, as in, He came here, but its phonological presence is not required (or “subcategorized”) as in, He came. Thus the subjective pattern in spatial configuration expressions in (11) is subjective only by virtue of its composition with the ground, while the subjective pattern in Japanese internal state predicates in (16) is subjective by itself. This contrast in the default structure status is in fact compatible with other observations about them. 
The compositional status of the subjectivity of the spatial configuration patterns is evident in Langacker’s description of the subjective construal as the extension of an objective one, and the fact that in some situations, as in (12) above, the most subjective construal becomes infelicitous.

The lexical and default status of the subjectivity of the Japanese internal state predicate patterns in (16) is nicely in line with its typological markedness pattern.8 Japanese internal state predicates can be used in three patterns: with the unexpressed first person cognizer, with the explicit first person cognizer, and with the third person cognizer. Of the three, the pattern with the third person cognizer is the most marked, since the predicate has to be structurally marked (i.e., with evidential markers, as in §2). Of the two less marked patterns with the first person cognizer, the unexpressed cognizer pattern is the less marked one behaviorally since it is the typical and most frequently attested pattern for them (as in §2). Thus, in the case of internal state predicates in Japanese, the markedness order (from less to more marked) is: (i) the subjective construal of the speaker (i.e., (16b)); (ii) the more objective construal of him/her (i.e., (16a)); (iii) the objective construal of the third person (e.g., (3) and (4)). In fact, for the obligatory use of evidential markers in the most objective pattern, it can be postulated that the most subjective construal (the first person cognizer) is so inherent to the semantic structure of internal state predicates in Japanese that the maximally objective construal, where the third person takes the cognizer role, is rendered unacceptable as it is. Such obvious markedness patterns, structural or behavioral, do not appear to exist in the example of spatial relations.
In sum, the two conceptual structures, spatial configuration expressions and Japanese internal state predicates, resemble each other as seen in the previous section, but the shared similarity is present only at the composite structure level. They differ in their compositional pattern, and, more importantly, in their component lexical structure.

The role in the event structure

The analysis of Japanese internal state predicates so far presented offers an interesting account of the nominative ga for object marking, the other of their characteristic properties. In Cognitive Grammar, the grammatical subject is defined as the trajector of the clause level relational predication, and the trajector is in turn defined as the primary figure in a profiled relation. The most salient figure in the unmarked type of conceptual structure (the most subjective construal) of internal state predicates in Japanese is what the speaker conceives – i.e., the object of her cognition (see Figure 4b and, more specifically, Figure 5b). In fact, it is the only profiled ‘thing’ in her conceptual space,9 where the construal of the cognizer speaker is subjective and she recedes from the scope of her conception into its background. This means that this cognized thing is the trajector, and therefore the subject rather than the object of the predicates. Thus, the use of the nominative case marker ga for the cognized thing is not without reason: the cognized thing is marked with the nominative because it is the subject in the lexical semantic structure of the internal state predicates in question. This pattern parallels that of the aforementioned deictic verb kuru, ‘come’, where the thing moving toward the speaker is the trajector/subject and is marked with the nominative marker ga in Japanese.

Figure 6.
Let us compare the scene depicted by an internal state predicate hosii ‘want’ in Figure 5b to that for the deictic verb come in Figure 6 [slightly modified from a visual aid for the verb come in Hatasa et al. (www.sla.purdue.edu/fll/JapanProj/FLClipart/)]. In Figure 6, the thing/person moving toward the speaker is the object of perception (or ‘the perceived thing’) from the vantage point of the speaker at the goal point of the profiled movement, but its position in the overall conceptual structure (with the speaker simply assumed in the background) motivates its being marked with the nominative.

This analysis of the cognized entity as the subject, rather than the object, of internal state predicates in Japanese naturally raises another question regarding the status of the other overt nominal (i.e., the speaker) in the somewhat objective construal in Figures 4a and 5a. The present analysis, accordingly, analyzes it more or less as an entity in the discourse space, somewhat external to the lexicosyntactic semantic structure of the internal state predicates. As Uehara (1998b) has shown, the speaker is made overt only when discourse factors (e.g., focus) favor explicit mention of the cognizer role for Japanese internal state predicates. In other words, the somewhat objective construal in Figure 4a holds only when the discourse space is evoked. Hence, the overt cognizer role is in the discourse space, and need not be assigned any grammatical role in the lexical or sentence level conceptual structure.10

This can be illustrated again with the parallel case of the deictic motion verb kuru, ‘come’. The speaker’s role as the perceiver of the thing moving toward her is crucial in the deictic motion profile whether she is linguistically encoded or not.
However, its importance lies in its function as the vantage point for the conceptualization, and differs from that of the subject, the most salient figure in a relational profile. Thus, the moving thing is the subject; the cognizer speaker is not.

A large corpus text analysis conducted and published in 1997 by Kokuritu Kokugo Kenkyujo (The National Language Research Institute) strongly supports the claim here. They examined a total of 3146 instances of the nominative particle ga in the text corpus, and made a list of all the semantic roles they occupy and the predicates they occur with. All and only instances of ga for such internal state predicates as listed in (7) above are used with what is cognized (among what they call ‘objective’). They attested instances of ga with the cognizer role for internal state predicates as well (among what they call ‘experiencer’), but all their predicate forms are marked with the evidential marker -garu, ‘to show the signs of’, such as hosi-garu, ‘show signs of wanting’, indicating that the cognizer in question is not the speaker (see (4) in §2). Unfortunately, they do not extend their analysis to cover the 3477 nouns marked with the topic particle wa in the data, except that they note that they would posit ga as the underlying case marker for 3382 instances (97.3%) of them (1997: 210). Therefore, the text discourse data shows that only the cognized thing is marked with the nominative particle ga for the unmarked forms of internal state predicates, while for their forms with evidential markers, the (third person) cognizer is marked with ga. When the first person cognizer is overt, it is most likely to be marked with the discourse marker of topic.

In sum, the cognitive semantic analysis presented here can nicely account for the otherwise problematic use of the nominative particle ga for the grammatical object, as well as the other characteristic properties of internal state predicates in Japanese.
Their lexical semantic structure is inherently specified with the speaker as the cognizer, which is realized as their person restriction to the effect that some extra morpheme (evidential marker) is required to rid themselves of the lexical specification in expressing the third person’s internal states. The same inherent specification with the speaker leads to their default and most frequent use in which the cognizer speaker is assumed and covert. The cognizer role of Japanese internal state predicates, which is specified with the speaker, is in the background (like the speaker’s role in the verb come), leaving the stimulus role for the most salient figure on the onstage region – the best candidate for the predicate subject – and thus motivating its nominative marking. The construal patterns of internal state predicates in Japanese motivate their behavior in morphology, syntax and discourse.

Subjective construal in the Japanese language

The discussion so far demonstrates that internal state predicates in Japanese can be best characterized as ‘deictic’ verbs (Langacker 1985) on a par with the verb come in English or kuru, ‘come’, in Japanese. Their lexical semantic structure is characterized with the inherent presence of the speaker, or the ground element, in the base somewhere off of its central, onstage region. What they profile is the object of conception from the vantage point of the speaker. In the case of internal state predicates in Japanese, what they profile is what is cognized by the speaker, where the speaker’s role as the cognizer is assumed and in the background.
The fact that lexical entries of a deictic nature exist not only for motion verbs but also for internal state predicates in Japanese, and that such internal state predicates are large in number and productive as well in forming other subjective predicates combining with stem forms of verbs in general, suggests the following: the speaker’s perspective is rather frequently assumed in Japanese and prominent for the viewing arrangement for expressions in the language. In other words, the subjective construal more or less represents the unmarked pattern in Japanese, unlike in other languages such as English.

Some universalists may doubt that the speaker’s perspective is typically assumed and represents the unmarked pattern in Japanese, despite strong support from recent studies on Japanese discourse (Iwasaki 1993; Uehara 1998b). These studies have demonstrated that the relationship between the speaker’s perspective and the sentential subject in Japanese is more direct than in other languages like English. For example, Uehara (1998b) analyzed an English story and its Japanese translation, and observed that when the speaker is coded as the subject in English sentences such as in (17), these are typically rendered into Japanese as in (17’) [see also Iwasaki 1993: 80]:

(17) Then I saw a girl standing there.

(17’) Suruto, onnanoko ga soko ni tatte ita.
      then a girl nom there at standing be.past
      (lit.) “Then, a girl was standing there.”

In the contrastive patterns in (17) and (17’), the speaker’s role as discoverer of a situation, which is encoded explicitly as the subject of the main (discovery) predicate in English, is structurally missing altogether along with the predicate of the discovery in the typical Japanese construal. Here again, the speaker’s perspective is assumed and linguistically unencoded in Japanese.

Getting back to the cognitive analysis of the nominative ga for the grammatical object, we can see the parallelism in the sentences in (17) and (17’).
One can see that a girl is the grammatical object in ‘I saw a girl standing there’, but not in ‘a girl was standing there’. Ga appears to mark the grammatical object only if we (force ourselves to) take the more objective construal typical in English, for the originally ‘subjective’ predicates in Japanese. The ga-marked nominal is in fact functioning as the grammatical subject in the unmarked subjective conception of them, as in (17’).

The cross-linguistic variation in the unmarked viewing arrangements pointed out here has a profound implication for theories of verbal semantics. In cognitive as well as many other linguistic theories, discussions of event structures assume the optimal viewing arrangement as the unmarked one, where “the roles of the observer and the observed are fully distinct” (Langacker 1991b: 550). Thus, in Cognitive Grammar the ‘canonical’ event model represents “the normal observation of a prototypical action ... a single event observed from a vantage point external to its setting” (1991b: 545 [emphasis added]), and accordingly, the canonical viewing arrangement “incorporates the canonical event model” [emphasis in the original]. Languages like Japanese, therefore, present a counter-example to the canonical status of such a viewing arrangement. In the viewing arrangement prominent in Japanese, the roles of the observer and the observed are not fully distinct, and events are frequently observed from a vantage point internal to their setting. The canonical event model is in fact canonical as the event structure model for languages like English, but not so for Japanese. The term ‘canonical’ in the ‘canonical event model’, then, should not be taken in the cross-linguistic sense, but in a language-specific sense.
Taking a cross-linguistic approach and characterizing the event structure model in terms of different viewing arrangements does not, of course, undermine any of the previous, fruitful work in verbal semantics. Rather, it adds another dimension to it, by making it possible now to discuss the canonical viewing arrangement and the canonical event model for each language, and differences in the degree of subjectivity among languages’ canonical viewing arrangements. This newly added dimension is the “subjective axis” (Achard 1996: 1168) viewing arrangement, and without it events are assumed to hold in the “objective axis” (Achard 1996) alone. The subjective axis is, as it were, the ‘third’ dimension added to the former, ‘two dimensional’ event structure model. (Event structures are depicted two-dimensionally, anyway.) The cognitive analysis presented above for the problematic case marking of internal state predicates in Japanese thus represents an approach using this new event structure model. The traditional ga for the grammatical object analysis, on the other hand, is a natural consequence of its theoretical presupposition that event structures be of two dimensions.

Furthermore, the current research can arguably shed new light on the study of linguistic subjectivity itself. For example, the existence of speaker-perspective oriented languages like Japanese seems to suggest cases of objectification, as opposed to those of subjectification, exemplified by Langacker (1991a) with the synchronic compositional process of spatial configuration expressions.11 The semantic compositional process of internal state predicates in Japanese, in fact, represents the synchronic process of objectification: lexically subjective, internal state predicates become less subjectively construed in their compositional process to the sentence and discourse levels.
Taking an evidential marker in expressing the third person’s internal states as in (3) and (4) is literally a process of objectifying lexically subjective, internal state predicates, with the evidential marker functioning as an objectifying morpheme. Cases of the diachronic objectification process in Japanese must await future research,12 but Uehara’s (1998a) analysis of the major parts of speech in Japanese suggests a positive outlook. Uehara has shown that adjectival predicates are structurally divided into two categories, (Canonical) Adjectives and Nominal Adjectives, and that the former are represented by words in the native stratum, while the latter are words of later coinage, mostly of non-native origin. Interestingly, most, if not all, subjective predicates (including all of those listed in (7) above) belong to the Canonical Adjective category. This suggests that later additions to the language’s lexical stock are all words of the non-subjective type, possibly leading to an objectification of the overall lexical structures. Of course, further and more detailed research on this aspect of linguistic subjectivity is necessary.13

Conclusion

In this paper I have given a cognitive semantic account for the grammatical behaviors exhibited by internal state predicates in Japanese. In this account, their grammatical behaviors, such as ga being used for object marking and the impossibility of a third person subject, all come together and have conceptual motivations. The analysis of internal state predicates in Japanese presented here crucially relies on, and therefore provides support for, the Cognitive Grammar theory of subjectivity. This approach has cross-linguistic applicability in that it makes it possible to examine and analyze subjectivity-related phenomena uniformly in more than one language, and thus provides a mechanism for considering cross-linguistic variation in subjectivity, as well as for handling language-specific realizations of this phenomenon.
Notes

* Earlier versions of this paper were presented at the Research Issues for Cognitive Linguistics Workshop at the Australian Linguistics Institute in July 1998; and at the Fourth Conference on Conceptual Structure, Discourse and Language, in October 1998. I am grateful to the audiences of those conferences, especially to Eve Sweetser and Ron Langacker for their valuable comments. I would also like to thank June Luchjenbroers, the editor of the volume, and anonymous reviewers for their insightful comments and encouragement. Bob Sanders and Andrew Barke also deserve my thanks for checking my English. All the remaining faults are my own. The research was supported in part by a 1998 Grant-in-Aid from the Ministry of Education, Science and Culture (# 10680303). All correspondence concerning this article should be sent to: Satoshi Uehara, Graduate School of International Cultural Studies, Tohoku University, Japan. Email: [email protected]

1. The following abbreviations are used in the gloss: ACC = Accusative; GEN = Genitive; NOM = Nominative; NONPAST = Nonpast tense marker; PAST = Past tense marker; PL = Plural; POL = Politeness marker; SG = Singular; TOP = Topic marker.

2. In normal circumstances, the subjective predicates in Japanese can be used only with the 1st person singular experiencer. In some contexts, however, where the speaker can know and representatively express the emotion shared by her group members, the plural form is possible:

(18) Konna kekkoona mono o itadaite, watasitati wa hizyooni uresii desu.
this valuable thing acc receiving 1.pl top extremely glad.nonpast pol
‘We are extremely glad receiving this valuable thing (from you).’

It should also be noted here that the term subjective subsumes a number of related concepts even in cognitive linguistics alone, and that the term here is different from that in Achard (1996), which analyzes the French equivalent of want (vouloir) as inherently subjective. The current analysis examines the subjectivity of main predicates while his discusses that of complement clauses. The following French data from his analysis (slightly modified) show that the person restriction dealt with here for Japanese subjective predicates does not apply to French vouloir ‘want’:

a. Je veux revenir. ‘I want to come back.’
b. Jean veut revenir. ‘John wants to come back.’

3. Similar distinctions in mode of discourse in other languages are noted in Benveniste (1971) and Banfield (1982). Banfield (1982: 12), for example, compares Kuno’s ‘non-reportive’ style in Japanese to “a literary style known to modern grammarians under the French term style indirect libre and the German erlebte Rede.”

4. There are predicates, other than subjective ones, which also form the double nominative constructions in Japanese (e.g. dekiru ‘capable/possible’ as in Taroo wa tenisu ga dekiru (Taro top tennis nom capable.nonpast) ‘Taro is capable of/good at playing tennis.’). See Croft (1991) for an analysis from a cognitive and typological perspective on such non-subjective, double nominative (and other non-canonical case-marking) constructions in Japanese and other languages. The double nominative construction in Japanese arguably constitutes a case of constructional polysemy, where non-subjective senses can be argued to be extensions from the core, subjective ones. However, it is beyond the scope of the current paper. See the discussion on the diachronic objectification process below for a relevant point.
5. Thus, Langacker intends that (10a) is spoken by someone other than Veronica: as shown in Figure 1a, Veronica is fully distinct from the speaker as his object of conception O in (10a). Interestingly, however, we can also think of a situation in which (10a) is spoken by Veronica herself (a more felicitous example would probably be Talk to George uttered by the former US President George Bush pointing to himself). Such an interpretation of (10a), which is called (10a’) here, is more subjective than (10a) because the former does refer to the speaker. (10a’) is more objective than (10b) because the former uses a non-deictic expression. It should be noted in this connection that Langacker (1991a) discusses the ‘maximally objective’ level (e.g. Vanessa jumped across the table), compared to which the senses of across in (10a) and (10b) are more subjective in that they represent not concrete motion, but the abstract construal of a conceptualizer (the speaker) tracing a mental path “in order to locate the trajector vis-à-vis the reference point” (ibid.: 327). This is another kind of subjectification phenomenon and is often referred to as “subjective motion”.

6. Traugott and Dasher (2002: 98) argue against Langacker’s analysis here, observing that the reference point is not necessarily the speaker, but could be someone else for sentence (11b), given certain contexts for it. However, they seem to be confusing “maximally subjective” at the basic level with that at the secondary level, or the cases of what I call “perspective transfer” (Uehara 1998b), where the speaker/narrator more or less takes, and describes the situation from, the perspective of someone else. The case under discussion for (11b) is the basic level case where the speaker is the default reference point. (11b) in that sense depicts what the speaker sees in real time.
It should also be noted in passing that according to Traugott (1995), this principle that zero expression is iconic of maximum subjectivity holds only for subjectification processes involving “constructions which originate in argument structure (events, particularly motion events, and the participants in them)” (Traugott 1995: 48), but not for others such as adversative connectives and focus particles. In contrast with Langacker’s, Traugott’s discussion of subjectification originated with the latter.

7. Without that specification, the viewpoint can be anybody else’s, and such scenes are objectively construed ones, like the one for the expression Sake ga aru (sake nom exist) ‘There is some sake.’

8. The concept of markedness, as typologically interpreted, can handle multi-valued categories (e.g. “most/more/less/least marked” in addition to “marked/unmarked” in the classical two-valued model) and thus connect markedness patterns to typological hierarchies. See Croft (1990), for example, for a detailed discussion of the model of typological markedness.

9. ‘Thing’ here is a technical term in Cognitive Grammar, which designates a region in some domain, and corresponds to the semantic pole of a noun.

10. It should be noted that whether a certain role is external or internal to the lexical semantic structure is a matter of degree (Cognitive Grammar assumes the lexicon-syntax-discourse continuum). Thus, some internal state predicates in Japanese may have their cognizer role almost as inherent as those internal to them, which may explain why many take the cognizer role for the subject argument of such internal state predicates, degrading their stimulus role to their object argument.

11. Traugott (1995) points out that “Langacker’s work focuses primarily on ‘subjectivity’ as a gradient phenomenon found synchronically,” (p. 32).
12. Some native speakers of Japanese have pointed out that in their speech some of the internal state predicates no longer require any evidential markers in expressing the third person’s internal states. If research finds that more internal state predicates have lost their person restriction in more people’s speech, it can constitute an instance of diachronic objectification.

13. Ikegami (1999: 93) claims that the so-called “pro-drop” nature of Japanese, whereby third-person (and other person) subjects are frequently omitted, is the result of an extension from the “cognizer-less” nature of “prototypical” subjective predicates.

References

Achard, Michel (1996). Perspective and syntactic realization: French sentential complements. Linguistics, 34, 1159–1198.
Aoki, Haruo (1986). Evidentials in Japanese. In Chafe & Nichols (Eds.), Evidentiality: The linguistic encoding of epistemology. Norwood: Ablex.
Backhouse, A. E. (1993). The Japanese language: An introduction. Oxford: Oxford University Press.
Banfield, Ann (1982). Unspeakable sentences. Boston: Routledge and Kegan Paul.
Benveniste, Emile (1971). Problems in general linguistics. (Translation by Mary E. Meek). Coral Gables: University of Miami Press.
Croft, William (1990). Typology and universals. Cambridge: Cambridge University Press.
Croft, William (1991). Syntactic categories and grammatical relations: The cognitive organization of information. Chicago: University of Chicago Press.
Ikegami, Yoshihiko (1999). Nihongo rashisa no naka no “syukansei” (“Subjectivity” in Japanese-like-ness). Gengo (Language), 84–94. Tokyo: Taishûkan.
Iwasaki, Shoichi (1993). Subjectivity in grammar and discourse: Theoretical considerations and a case study of Japanese spoken discourse. Philadelphia, PA: John Benjamins.
Kokuritsu Kokugo Kenkyûjo (1997).
Nihongo ni okeru hyôsôkaku to shinsôkaku no taiôkankei (Cases and Japanese postpositions), Kokuritsu Kokugo Kenkyûjo Report 113. Tokyo: Sanseidô.
Kuno, Susumu (1973). The structure of the Japanese language. Cambridge, MA: MIT Press.
Kuroda, S.-Y. (1973). Where epistemology, style and grammar meet: A case study from Japanese. In S. R. Anderson & P. Kiparsky (Eds.), A Festschrift for Morris Halle (pp. 377–391). New York: Rinehart and Winston.
Langacker, Ronald W. (1985). Observations and speculations on subjectivity. In Haiman (Ed.), Iconicity in Syntax. Amsterdam/Philadelphia: John Benjamins.
Langacker, Ronald W. (1987). Foundations of Cognitive Grammar. Vol. 1, Theoretical prerequisites. Stanford: Stanford University Press.
Langacker, Ronald W. (1991a). Concept, image, and symbol: The cognitive basis of grammar. Berlin: Mouton de Gruyter.
Langacker, Ronald W. (1991b). Foundations of Cognitive Grammar. Vol. 2, Descriptive application. Stanford: Stanford University Press.
Traugott, Elizabeth Closs (1995). Subjectification in grammaticalisation. In Stein & Wright (Eds.), Subjectivity and subjectivisation: Linguistic perspectives (pp. 31–54). Cambridge: Cambridge University Press.
Traugott, Elizabeth Closs & Richard B. Dasher (2002). Regularity in semantic change. Cambridge: Cambridge University Press.
Uehara, Satoshi (1998a). Syntactic categories in Japanese: A cognitive and typological introduction. Tokyo: Kurosio Publishers.
Uehara, Satoshi (1998b). Pronoun drop and perspective in Japanese. In Akatsuka, Hoji, Iwasaki, Sohn, & Strauss (Eds.), Japanese/Korean linguistics Vol. 7 (pp. 275–289). Stanford: CSLI.

Figure, ground and connexity
Evidence from Xhosa narrative

David Gough
Christchurch Polytechnic Institute of Technology, New Zealand

This article examines how the discourse concepts of background/foreground and connexity are significant organisational factors in the Xhosa verbal system.
After defining background/foreground and connexity, evidence from Xhosa folk narrative is presented to show how these features provide a richer and more coherent explanation of the structure of the verbal system than the traditional analysis. The article advocates an orientation to language which holds that its nature can and should be explained in terms of discourse/cognitive factors rather than seeing it as a discrete, separate and internally describable ‘module’.

Keywords: background, foreground, connexity, discourse, Xhosa (Bantu)

Introduction

In this paper the perspective taken is that discourse pragmatic and cognitive factors are essential to understanding the way in which language is organised and structured. Such an approach focuses on the notion that the nature of language can and, indeed, should be described in terms outside language itself – beyond, that is, the dominant conceptualisation of language which sees it as an independent mental module or faculty and therefore as being the way it is because it is the way it is (for a convincing critique of this perspective, see Givón 1990). This paper will argue, specifically, that the concepts of ‘grounding’ and ‘connexity’ contribute significantly to explaining the structure of, at least, the Xhosa verbal system. It will do so by focussing on data taken from folk narratives.

Connexity/dependence

Analysis of discourse reveals a basic principle: the relative syntactic dependence of a clause signals its relative conceptual connection or integration to its discourse context (Gough 1986: 79; see also Givón 1990 for a similar perspective). The concept of connexity in this regard refers to the degree to which information in a particular clause is seen as, on the one hand, independent from or, on the other hand, integrated with or ‘dependent on’ the information in a previous clause.
While there are a variety of lexical and grammatical features that encode such connexity (cf. Thompson 1987), of particular interest to this paper is the fact that in many languages ‘dependent’ or ‘subordinate verb’ forms tend to display less prototypical features of verbs, such as tense-aspect, modality and agreement (see Gough 1986), which together constitute the traditional category of ‘finiteness’ (see also Givón 1990; Carlson 1992).

Background and foreground information

The distinction between background and foreground is, of course, basic to human perception. It is also one of the most basic concepts in discourse analysis (see Wallace 1982; Givón 1987; and Tomlin et al. 1997 for overviews). In metaphorical terms the foreground event clauses of a narrative form its skeleton – its basic structure, which advances the story itself. The event clauses are arranged in terms of temporal sequence forming an event line. According to Hopper and Thompson (1980) background information adds flesh to this skeleton, not advancing the story but rather characterising the backdrop against which the story develops. For this reason it is also known as durative descriptive information (to be referred to as d/d information in the discussion below).

(1) a. Yahamba lahamba.
       He travelled and travelled.
    b. Lithe lisahamba njalo ladibana nomvundla.
       While he was so travelling, he met a rabbit.
    c. Lafika ijoni labuza kumvundla ukuba khange liwubone umvundla.
       The soldier arrived and asked the rabbit whether it had seen a rabbit at all.
    d. Umvundla lo nawo wayenxiba indevu apha phezu komlomo.
       (The rabbit was wearing a moustache here above the mouth)
    e. Wabuza umvundla, ‘kunjani lo mvundla uwufunayo?’
       The rabbit asked, ‘What’s this rabbit like that you’re looking for?’
    f. Lathi elijoni ukuphendula ukuphendula, ‘Ufana nawe.’
       The soldier answered, ‘He looks like you.’
    g. Wathi umvundla, ‘Hayi, zange ndiwubone umvundla oneendevu.’
       The rabbit said, ‘I’ve never seen a rabbit with a beard.’
    h. Wathi umvundla, ‘Hayi, hamba, mhlawumbi uphazamile.’
       The rabbit said again, ‘No, go, maybe you’re mistaken.’
    i. Hayi ke, nejoni laqonda okokuba mhlawumbi liphazamile.
       Anyway, the soldier too thought he was perhaps mistaken.
    j. Lahamba, labuyela umva.
       He travelled and went back.
    k. Lithe lisahamba njalo, laqonda ukuba, ‘Hayi...’
       While he was so travelling, he thought, ‘No...’

Here we may note that each successive event clause advances the story line and that it is either temporally or causally consequential to the clause that precedes it. Changing the order of any of these clauses would change our interpretation of the events they encode. The d/d information, however, is off the event line. We may note that (d), for example, is not temporally or causally related to the events that precede or follow it. Rather it represents parenthetical background information necessary for the comprehension of the events.

In conceptual terms, the distinction between durative descriptive and foregrounded ‘event’ information can be seen in terms of temporal grounding. Such temporal grounding is parallel to the organisation of visual information. According to Eysenck (1984: 33) a fundamental way in which visual information is organised is the ‘segregation of the visual field into one part called the figure and another part called the ground’.
In general, the figure has ‘thing-like’ qualities, is well-defined and bounded; while the ground in which the figure is perceived is, in contrast, continuous, less definite and boundless. An example of this is the figure of a house perceived against the background of the sky. Events can be seen as temporal figures: perceived as temporally bound and discrete against a temporal background of continuous and durative situations. Such grounding, which is basic to perception, thus also appears to form an important organisational principle in language. Wallace (1982: 214), for instance, presents the hypothesis that certain linguistic categories “function to differentiate linguistic figure from linguistic ground”, a perspective more recently illustrated in Langacker’s Cognitive Grammar, as discussed by Ungerer and Schmid (1996). Similarly, Longacre (1981: 329) notes that the figure-ground categories, once distinguished solely on a semantic basis, are “more and more seen to correlate with the morphosyntactics of the world’s languages”. My analysis supports this particular perspective.

Some points on past approaches

The framework sketched above is of significance for two broad reasons. Firstly, it allows an alternative analysis of Xhosa verbal categories as opposed to the dominant taxonomic framework which continues to hold sway in the categorisation of Bantu verbal forms. Traditionally, this framework, known as the Dokean framework, uses the terms ‘mood’ and ‘tense’ as convenient labels to refer to quite diverse verbal categories. (The framework was named after the Bantuist Clement C. Doke. For a critique see Khoali 1993.) The result was a mixed bag of verbal inflections all falling under the same general rubric with little indication of systematicity.
Mood itself is rather vaguely defined (beyond simply being a ‘form of the verb’), and the discussion of these various ‘forms’ does not posit any underlying systematicity. Traditional accounts (e.g. Davey 1973) typically simply name the ‘moods’ (typically indicative, consecutive, participial and subjunctive) and then go on to describe these as disparate items. With regard to two of the verbal categories discussed in this chapter, for instance, the consecutive is described simply as a ‘subordinate mood type’ signalling consecutive actions in the past (Davey 1973: 106), while the participial mood is merely described as a verb form used for actions simultaneous to those in a main clause (Davey 1973: 106; Du Plessis 1978: 135). As will be demonstrated in more detail in this chapter, these verbal categories appear, though, to be systematically structured around key discourse concepts.

The second reason why a functionally oriented framework is significant is that it allows some insight into some of the debates that have emerged on discourse pragmatic accounts of grammar, particularly surrounding the coding of foreground and background information. Research has specifically questioned the claim that there is a straightforward relationship between subordinate clauses and background information on the one hand, and between independent clauses and foreground information on the other (cf. Thompson 1987; and Wallace 1987 for some questions in this regard). As grounding and dependence are treated separately here, this may present an alternative approach to the issues involved.

Connexity, grounding and Xhosa narrative

The concepts of grounding and connexity referred to above appear to form the organisational basis of a good deal of the Xhosa verbal system.
In particular I will show that the verbal forms referred to as the participial, consecutive and indicative moods, as well as the so-called ‘continuous tense’, rather than being isolated grammatical structures, form a sub-system that is structured around grounding and connexity.

The consecutive mood

The consecutive marker is -a- (to be referred to here as cons). The structure of the consecutive is: Subject concord-a-Verb Stem, for example,

(2) ixhego li-a-thetha > ixhego lathetha
    old-man he-cons-talk
    “and the old man spoke”

The consecutive has been traditionally described as a ‘subordinate mood type’ with the function of, inter alia, encoding consecutive actions in the past (Davey 1973: 106). Consider the following example:

(3) UThemba uye evenkileni wathenga ukutya
    Themba he-perf-ind-go loc-shop he-cons-buy food
    wagoduka
    he-cons go-home
    “Themba went to the shop, bought food and went home.”

Here the first (non-consecutive) clause of the sentence uses the ‘independent’ indicative mood (perfect) while the second (consecutive) clause uses the dependent consecutive mood. Connection is thus not expressed through an overt conjunction such as ‘and’ in English, but rather through a verbal inflection. It is significant to note that the consecutive is not marked for tense; it inherits polarity from a preceding main clause; and it has limited aspectual marking in relation to verb forms traditionally regarded as ‘independent’. It is thus marked for less finiteness than other verb forms. The following is a textual example of the consecutive taken from a folk narrative:

(4) a. wabetha kuyo ephondweni
       he-cons-hit to-it loc-horn
       “He hit it on the horn”
    b. kwasuka kwaphuma ukutya
       it-cons-go it-cons-come-out food
       “and some food came out”
    c. watya
       He-cons-eat
       “and he ate”
    d. wahlutha
       he-cons-full
       “and got full”
    e.
       wagoduka
       he-cons-go-home
       “and went home.”

The consecutive according to this approach encodes two things: connexity and foregrounded event information. Unlike the indicative past or perfect, the consecutive is marked for connexity, signalled by its less than finite form, to the clause that precedes it. Furthermore, unlike the participial which, as we shall see, also encodes such connexity, it does not involve a focus on the internal structure of the situation it encodes. All the consecutive clauses in (3), for example, refer to temporally bounded situations that move the time of the story forward, and all can be answers to the question, “what happened then?”. With no focus on either the internal structure of the situation or its temporal orientation, the focus of the consecutive is the occurrence of the event itself.

If the consecutive signals connexity, then breaks in the conceptual relatedness of the narrative should be indicated by the non-use of the consecutive. In such places the so-called independent indicative mood should occur. This is indeed supported by the following example (here indperf indicates the indicative perfect) taken from a Xhosa folk narrative. Note here that example (4.2) follows on from example (4.1).

(4.1) a. hayi ke uhambile ke umntwana nenqwelo yakhe
         no-then she-travel-perf then child with-carriage of-her
         “So then, the child travelled with her carriage.”
      b. wayifihla ke lo mtwana inqwelo etyholweni
         she-cons-it-hid then this child carriage loc-bush
         “Then the child hid the carriage in the bush.”
      c. wafika apha emdanisweni
         she-cons-arrived here loc-dance
         “She arrived here at the dance.”
      d. yaye inkosi idanisa nezaa ntombi zimbini
         he-pct chief he-part-dance with-those girls they-two
         “The chief was dancing with those two girls.”

(4.2) a.
         hayi okunene uyithathile le ntombi isangena emnyango
         no truly he-her-take-perf this girl she-part-enter loc-doorway
         “So then truly, he took the girl as she entered the door.”
      b. wayixhwila ngoko
         he-cons-her twirl then
         “He twirled her around then,”
      c. wathi nanku umfazi ungenile
         he-cons-say here-is wife she-part-enter-perf
         “and said, ‘This is my wife, she has entered,’”
      d. wadinisa naye ngobusuku bonke
         he-cons-dance with-her with-night all
         “and he danced with her the whole night.”

In both of these cases, the sections are distinct: in Givón’s terms (1990: 826), there is a thematic break between these sections. In (4.1) the common orientation of the clauses is the series of events leading up to the girl’s arrival at the chief’s party. The ideas in (4.2) are distinct from those in (4.1) as the orientation now switches to focus on the chief’s actions. Just as there is a break in conceptual connexity, there is a matching break in syntactic connexity or dependence with the occurrence of a clause using the indicative mood.

The participial mood

The form of the positive participial is: Subject Concord + Verb stem, for example,

(5) ixhego li-cula > ixhego licula
    old-man he-part-sing
    “the old man singing”

The participial morpheme itself is realised supra-segmentally through certain perturbations of the tonal form of the verb (historically as the result of a high toned morpheme -*ki-) and it has a durative significance. It occurs in subordinate clauses and while it displays both polarity and a range of aspectual markings, it is not marked for tense, with the time orientation being an inherited feature of the associated independent clause. Consider the following individual examples with their associated discourse contexts:

(6) a. baya emdanisweni elila njalo lo mntwana
       they-cons-go loc-dance she-part-crying like-this this child
       “They went to the dance, this child crying so.”
    b.
       wahamba ethwela umthwalo
       she-cons-travel she-part-carry load
       “Then she travelled, carrying her load.”
    c. wafika engekho
       he-cons-arrive she-neg-part-there
       “Then he arrived, she not being there.”

Traditionally participial clauses of the above type have been described as a mood type occurring only in subordinate clauses and encoding actions simultaneous to those in the main clause (for example, Du Plessis 1978: 135). If this were an adequate description then the information encoded in the participial would have the same status as that encoded in consecutive clauses, that is, encoding foreground events. However, it appears that the information is of a different status, encoding background information as defined above. The participial clauses in the examples above, as well as participial clauses more generally, do not, I claim, code events and do not thus form part of the event line advancing the story line. They, like the consecutive, encode syntactic connexity to the clauses they follow. Unlike the consecutive, however, they are marked for ‘durative’ aspect, and thus, rather than representing actions or events, they encode unbounded temporally continuous situations. It is in terms of these situations that the associated consecutives, representing bounded events, are foregrounded. The situation is therefore not, as traditional descriptions would have it, simultaneous to the event, but forms, rather, its durative background so that the bounded and momentary event is located within the temporally durative framework established by the participial. In (6a) above, for example, the event of the girl’s going to the dance is given the temporal backdrop of the girl’s crying and in (6b) the girl’s travelling is similarly located in the durative backdrop of her carrying a load. Neither of these participial clauses contributes to the movement of narrative time.
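The two segmental templates given above, Subject concord-a-Verb stem for the consecutive and Subject concord + Verb stem for the positive participial, can be illustrated with a minimal Python sketch. The function names are hypothetical, and the vowel-elision rule is an assumption made only for illustration: it drops the final vowel of a consonant-initial subject concord before -a-, which reproduces li-a-thetha > lathetha from example (2) but is not claimed to cover every concord. The tonal marking of the participial cannot be shown in plain orthography and is ignored here.

```python
VOWELS = "aeiou"


def consecutive(subject_concord: str, stem: str) -> str:
    """Build Subject concord + -a- + verb stem (hypothetical helper)."""
    # Assumed simplification: only consonant + vowel concords such as "li"
    # are handled; vowel-only concords follow other rules not sketched here.
    if len(subject_concord) < 2 or subject_concord[-1] not in VOWELS:
        raise ValueError("concord shape not covered by this sketch")
    # The concord's final vowel is dropped before the consecutive marker -a-.
    return subject_concord[:-1] + "a" + stem


def participial(subject_concord: str, stem: str) -> str:
    """Build Subject concord + verb stem (segmental shape only, no tone)."""
    return subject_concord + stem


print(consecutive("li", "thetha"))  # lathetha, as in example (2)
print(participial("li", "cula"))    # licula, as in example (5)
```

The sketch makes the formal contrast visible: the two forms differ segmentally only in the presence of -a-, while the functional contrast between foregrounded event and durative background is carried by that marker together with the tonal pattern of the participial.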
Research into the participial in other Bantu languages supports this view. Wald (1975) and Poulos (1982) argue, respectively, that in Swahili and Zulu the participial is, in both form and function, a temporal relative clause. Poulos (1982: 210) states that the participial, like other relative clauses, has a ‘restrictive force’; what participial clauses restrict as relative clauses is the “dimension of time” (1982: 219). This approach supports the present view of the participial in terms of its backgrounding function.

The continuous tense

The form of the so-called continuous tense is Subject Concord-a-(ye/be) participial, for example:

(7) si-a-(ye/be) sihamba > sasihamba
we-past-pct we-part-travel
“we were travelling”

The form given above has been traditionally labelled the (remote) past continuous tense (pct), which has been described as indicating “an action which was in progress . . . at some time in the past” (Davey 1973: 87). The pct, typically a fully finite form, is a compound utilising an auxiliary verb -be (also realised as -ye and optionally elided), which encodes the notion of ‘being’, preceded by the past tense marker -a-. As complement to this auxiliary, the participial indicates the temporal domain or durational situation of this being. In the illustration above the being is restricted to the temporal domain of ‘travelling’. The pct encodes, in terms of this durational basis, an unbounded situation as opposed to an event. It is important to note in this respect that the pct does not as a whole form the durative background of a contingent event as does the participial on its own. Rather, the pct indicates an independent ‘scene’. In narrative, pcts usually cluster together to form the initial setting of the tale, which functions as an orientation to the body of the story events. Consider the following example:

(8) a. kwakukho umntwana ekwakusithiwa ngujon nabanye
It-pct-it-present child part-it-pct-said cop-John with-others
abantwana bakokwabo
children of-home
“There once was a child called John and other children at home.”
b. ke ngoku ke lo mntwana wayengathandwa kokwabo
Then now then this child he-pct-neg-like-pass cop-home
enikwa iinkonzo zombona
he-part-give-pass husks of-maize
“Now then, this child was not liked at home, being given maize husks.”

In such settings there is no focus on the movement of narrative time as such. Rather, the durative setting orientating the audience to the story world is described before the events occurring against this backdrop are described. The following examples illustrate the use of pcts, not in the initial setting, but in the body of the narrative itself:

(9) a. lafika ijoni labuza kumvundla ukuba khange
He-pct-arrive soldier he-pct-ask loc-rabbit that ever
uwubone na umvundla
he-it-see-subj ques rabbit
“The soldier arrived and asked the rabbit whether it had seen a rabbit at all.”
b. umvundla nawo wayenxiba indevu apha phezu komlomo
Rabbit with-it he-pct-wear moustache here above of-mouth
“The rabbit was wearing a moustache here above the mouth.”
c. wabuza umvundla unjani lo mvundla uwufunayo
He-cons-ask rabbit it-how this rabbit you-it-want-rel
“The rabbit asked, ‘What’s this rabbit like that you’re looking for?’”

In these examples we may see that pct clauses are clearly off the event line, representing background information. The pct forms are thus backgrounding in function. They encode, not the bounded events holding only for the moment of their occurrence, but temporally unbounded situations which hold for the narrative world in general. Furthermore, unlike the participial, the pct indicates an independent scene. We are now in a position to see how the concepts of grounding and connexity are fundamental to the organisation of the Xhosa verbal system.
This can be represented in the following diagram:

Table 1. Grounding and Coherence relations
GROUNDING: Foreground event Background non-event
COHESION: Consecutive mood Participial mood Indicative mood Non-continuous Aspect Connected Nonconnected

Discussion

The framework proposed here is an attempt to show that in Xhosa there is a systematic basis, in terms of discourse functions, to what have been labelled fairly arbitrarily as ‘moods’. Through this paper, I hope to have demonstrated the value of an orientation to language which holds that its nature can and should be explained in terms of factors outside of language, as narrowly conceived by some branches of both traditional taxonomic and current theoretical language study. Without this orientation, which does not see language as the product of a separate ‘module’, we will remain at the whim of a view of language that is effectively removed, abstracted and isolated from the humans whose cognitive activities it is supposed to define. From this perspective, it is hoped that the concept of connexity, in addition to that of grounding as explored in this paper, may allow some insight into language study as an essentially human endeavour.

References

Carlson, R. (1992). Narrative, subjunctive and finiteness. Journal of African Languages and Linguistics, 13, 59–85.
Davey, A. S. (1973). The Moods and Tenses of the Verb in Xhosa. Master’s dissertation, University of South Africa, Pretoria.
Du Plessis, J. A. (1978). Isixhosa 4. Goodwood: Audiovista.
Eysenck, M. W. (1984). A Handbook of Cognitive Psychology. London: Lawrence Erlbaum.
Givón, T. (1987). Beyond Background and Foreground. In R. Tomlin (Ed.), Coherence and Grounding in Discourse (pp. 175–187). Philadelphia: John Benjamins.
Givón, T. (1990). Syntax: A Functional Typological Introduction, Vol. II. Amsterdam: John Benjamins.
Gough, D. (1986). Xhosa Narrative: An Analysis of the Production and Linguistic Properties of Discourse with Particular Reference to Iintsomi Texts. Doctoral thesis, Rhodes University.
Hopper, P. J. & S. A. Thompson (1980). Transitivity in grammar and discourse. Language, 56, 251–299.
Khoali, B. T. (1993). Cole’s Dokean model: Issues and Implications. South African Journal of African Languages, 13(1), 29–32.
Longacre, R. (1981). A spectrum and profile approach to discourse analysis. Text, 1(4), 337–359.
Poulos, G. (1982). Issues in Zulu Relativization. Department of African Languages, Rhodes University. Communication No. 7.
Thompson, S. A. (1987). “Subordination” and Narrative Event Structure. In R. Tomlin (Ed.), Coherence and Grounding in Discourse (pp. 435–452). Philadelphia: John Benjamins.
Tomlin, Russell S., L. Forrest, M.-M. Pu, & H. K. Myung (1997). Discourse Semantics. In T. van Dijk (Ed.), Discourse as Structure and Process (pp. 63–111). London: Sage.
Ungerer, F. & H.-J. Schmid (1996). An introduction to cognitive linguistics. Harlow: Addison Wesley Longman.
Wald, B. (1975). Variation in the System of Tense Markers of Mombasa Swahili. Doctoral thesis, Columbia University, New York.
Wallace, S. (1982). Figure and Ground: The Interrelationships of Linguistic Categories. In P. J. Hopper (Ed.), Tense-aspect between Semantics and Pragmatics (pp. 201–223). Philadelphia: John Benjamins.

Discourse organization and coherence

Ming-Ming Pu
University of Maine at Farmington

This chapter investigates discourse organization and coherence from a cognitive perspective and demonstrates that stories produced in different forms and languages are strikingly similar with regard to their structural organization, coherence building, and event coding.
Speakers/writers are generally quite sensitive to the episode boundary information, and organize narratives into separate yet interrelated episodes. They seek and achieve coherence through establishing story frame, focusing on the central character, systematically tracking references, and maintaining topic continuity. Discourse organization and the establishment of coherence seem to be a systematic and even automatic process, which is governed by our underlying cognitive activities and driven by our subconscious attempt to enable our addressee to establish mental representations congruent with our own in discourse processing.

Keywords: discourse structure, discourse coherence, cognitive activities, episode and episode boundaries

Introduction

Researchers in various fields have investigated and shed light on how speakers and writers organize discourse to achieve coherence in terms of thematic structure and information units. It has been shown that coherence is not only an observable artifact of the external text or discourse, but also a cognitive phenomenon in the mind that processes the discourse. Van Dijk and Kintsch (1983), for example, propose the construction of a mental representation of text as consisting of both microstructure and macrostructure, reflecting local and global organization respectively. Similarly, Givón (1995: 63) argues that text is represented in part as a network of connected nodes (chunks). This network structure displays both hierarchical organization, where nodes are connected both ‘upward’ and ‘downward’ to other hierarchically adjacent nodes, and sequential chaining, where nodes are connected to both preceding and following sequentially adjacent nodes.
Furthermore, many studies (Chafe 1992, 1994; Fox 1987; Lichtenberk 1996; Pu 1995; and Tomlin 1987) have explicated the importance not only of speakers’ and writers’ underlying cognitive constraints but also of their awareness of the addressee in discourse processing, which they show by signaling discourse units and prompting information retrieval. These studies demonstrate how speakers employ explicit versus implicit anaphora to mark changes of episode, shifts in location, and interventions of the main storyline. Chafe (1992, 1994), in particular, describes the constraints upon speakers in casual and unplanned conversation and how speakers assess the current status of a given idea/event/referent in their listeners’ minds and systematically verbalize it as given/old, accessible, or new information in an ongoing discourse.

Although researchers have agreed that cognitive operations underlie the overall discourse structure to guarantee coherence in the external discourse, it is not always clear how mental representations of discourse are construed during comprehension, and how they are realized during production. One of the most elusive and slippery issues is discourse structure, which is paramount in the study of discourse organization and coherence; yet structural units such as episode, paragraph, event, theme, etc. are not conceptually and theoretically well defined and are prone to misinterpretation. The identification and discussion of mental representations of these discourse units, on the other hand, are also problematic because they are based mostly on text-oriented notions such as ‘paragraph’, ‘discourse segment’, ‘sequence of thematically related sentences’ etc., and hence risk the problem of circularity.
While also taking a cognitive approach to discourse organization and coherence, the present study aims to investigate cognitive activities underlying discourse organization and coherence, specifically the structure of episodes and its mental representations. The study first tries to define and identify, independent of linguistic information, conceptual structures of episode, and then addresses the issue of how these structural units are construed and represented in discourse comprehension and production. The study uses narrative data elicited from both English and Mandarin Chinese speakers to demonstrate the universal characteristics of discourse organization and information packaging regardless of the speakers’ linguistic background, since discourse processing is constrained by general human cognitive activities (Chafe 1994; Gernsbacher 1990; Tomlin 1987). The following section details a narrative study, and in subsequent sections I will consider a number of arguments and claims that have appeared in the literature relating to discourse structure and coherence, using speech and written samples taken from the study.

The narrative study

The narrative study was conducted to examine how speakers process incoming information and organize it into a structured and coherent discourse, and how they deliver such a structure in discourse production. The different conditions and tasks were designed to test whether the structural unit of episode has psychological relevance, and what the common characteristics or ‘universal rules’ of structural organization and information packaging in producing narratives are, given some general cognitive activities underlying discourse processing.

Episode and episode boundary

The present study argues that the basic structural unit of narrative discourse is the episode, which corresponds to the speaker’s mental representations of a narrative.
Since the construction of episodes plays a crucial role in the present study, definitions are given below for the theoretical concepts of episode and episode boundary, which are drawn basically from Chafe (1994), van Dijk and Kintsch (1983), Pu (1995) and Tomlin (1987). An episode is defined cognitively as a memory unit in the flow of information processing. Linguistically, it is a semantic unit subsumed under a macroproposition, which functions to unify ideas of the unit. The macroproposition is generally a topical expression, featuring a global predicate (that denotes a global event or actions), a specific cast of participants, and/or time and place coordinates. Episodes in a discourse may be of varying length or scope. An episode is conceived of as a part of a whole discourse, having a beginning and an end. The beginning and end of an episode are defined in terms of propositions subsumed under the same macroproposition, while the propositions preceding the first and following the last proposition of an episode should be subsumed under different macropropositions. The transition between macro-propositions represents episode boundaries. They are normally marked by expressions denoting changes in time, place, scenery, participants, perspective, possible world, etc. Cognitively, boundaries may also be manifestations of attention shifts. Studies have shown the existence of episodes as chunks in narrative memory and episode-formation appears to be a virtually automatic process in story processing (Black & Bower 1979; Guindon & Kintsch 1982). Other research into story comprehension (Haberlandt, Berian, & Sandson 1980; Gernsbacher 1990) suggests that cognitive processes inside an episode are different from those at or around the episode. 
Comprehenders map the current information onto a developing structure within an episode when incoming information coheres with the previously presented information, while they shift from actively building one structure to start another between episodes when incoming information is less coherent. The process of shifting costs more mental effort than mapping and thus comprehenders have more difficulty accessing information that occurs after an episode boundary than before a boundary (Anderson, Garrod, & Sanford 1983). It seems that comprehenders are quite sensitive to the cues that prompt them to carry out either a mapping or a shifting process. On the other hand, in order to convey their intended message successfully, speakers and writers must give their addressees signals or cues to help them build up a discourse representation congruent with their own. The present study aims to investigate and demonstrate how speakers and writers organize narrative discourse into episodes during narrative production and how they orally convey the structural network to their addressees.

Stimulus material

The stimulus material came from a children’s picture storybook (Krahn 1981), which has no written text and depicts several adventures of a little boy on a certain day. The book consists of 8 episodes, each of which has 8 pictures and is headed by a subtitle denoting a particular adventure with a picture clock showing the time of the day. In each of the episodes, the main character, a little boy named Alex Pumpernickel, is accompanied by a different secondary character of either the same or a different gender. Three episodes were chosen for our study. A total of 24 pictures (Krahn, F. (1981). Here comes Alex Pumpernickel! Boston: Little, Brown & Co), with the subtitle and picture clock removed, were made into an adapted picture book of 12 pages (2 pictures per page).
The purpose of the experiment was to establish whether the subjects would perceive, organize, produce and retrieve the non-verbal information as episodes, as would be predicted by the episode theory (Schank & Abelson 1977; van Dijk & Kintsch 1978). Visual rather than verbal material was chosen because (1) the processing and organization of information are considered general, rather than language-specific, cognitive activities (Bagget 1979); (2) with the subtitle and picture clock removed from the stimulus material, the subjects’ recognition of episodes in this experiment would be independent of linguistic information, and we would thus avoid risking the problem of circularity in defining and identifying episodes in discourse; and (3) the picture book consists of separate but related episodes. If episodes have psychological content, subjects should be able to identify and store these episodes as memory representations, and recall them verbally as separate episodes.

Method and procedures

There were two narrative tasks for the subjects: an oral on-line (i.e., impromptu) description of the picture sequence and a recall of the pictures afterwards. The instruction was presented to the subjects in written form, and did not mention or suggest that the pictures ‘tell a story’ or ‘stories’. In the on-line task, subjects were asked to describe each picture while paging through the picture sequence for the first time. It was expected, as explained earlier, that subjects would recognize visual cues at the beginning of an episode, and would employ larger coding material at such junctures, to lay a foundation for the new substructure and signal the shift to the listener. In the recall task following the on-line description, subjects were asked to retell the picture sequence from memory.
The purpose of the dual-narrative task was to see how a speaker would construct a narrative without a specific discourse plan (i.e., without knowing what was happening next), as contrasted to a planned or structured oral narrative from memory. The recall task, on the other hand, was carried out in either oral or written form: half of the subjects retold the story orally and the other half wrote the recall.

Forty subjects participated voluntarily in the experiment. Twenty were native English speakers from Northern State University in the United States, and twenty were native Mandarin Chinese speakers from the Central China University of Finance and Economics in China. All subjects were undergraduates and about half in each group were women.

Results and discussion

In general, the speakers and writers of both languages produced very similar narratives in terms of episode organization, event coding and information patterning. They recognized the three episodes in the picture sequence and used them in their story construction. They followed the main story-line and encoded the important events of each episode. They also processed story information such as given-new and background-foreground in a consistent way. The remaining sections of this paper will discuss the general characteristics of episode construction, information processing and event encoding. I will use oral and written narrative samples from both languages to illustrate these characteristics, with each example coded to capture: the relevant language (E-English or C-Chinese); task type (O-on-line or R-recall); recall mode (RS-spoken recall or RW-written recall); and the subject number (1 to 20).

Episodic structure

Many studies have demonstrated the existence of episodes as memory chunks in discourse processing.
Speakers, who are constrained by working memory limitations, would try to organize the overall discourse contributions into smaller semantic units, each of which is dominated by a macroproposition. Comprehenders, on the other hand, would capture this episode structure in their mental representation by building separate substructures to represent each episode (Gernsbacher 1990) because whatever portion of the incoming information that is to survive in longer-term memory must be translated rapidly into some form of episodic mental representation (Givón 1995: 62). The results of the present study give support to the psychological relevance of episode structure in discourse production and comprehension: although there was no written/linguistic clue in the stimulus material suggesting there were three episodes in the picture sequence, subjects of both languages recognized them, often with overt remarks. In both on-line and recall conditions, subjects consistently organized the picture sequence into three semantic units in their narrative production (as was intended by the author in the original picture storybook), and frequently signaled and separated the units linguistically. More interestingly, in the recall task five subjects (three English and two Chinese) could only recall two of the episodes at first and then realized that one (always the middle) episode was missing from their recall. The way they finally recalled the second episode (‘Boy and Fly’) was informative. Each subject first recalled the macroproposition, and then the whole episode came flowing out. 
Some exact wordings used by the subjects are: “Well, I remembered it’s the boy chasing the fly”, “Okay, it’s about the kid swatting a fly,” or “Yes, it’s about the child and the fly.” The memory lapse in the recall task gives further evidence for the existence of episodes as chunks in memory and for the monitoring role that macropropositions play in discourse processing: information is organized, stored, retrieved, and forgotten as episodes, and macropropositions function to unify the ideas of the episode.

In the on-line description task, speakers were very sensitive to the nonlinguistic cues of episode shifts, such as change of location, change of scenery, change of activities, and change of characters, which were used in building episode structure. Most speakers recognized boundaries between episodes in the picture sequence and marked the beginning of a new episode in their oral narratives accordingly. A new episode normally starts with an adverbial phrase of time or location, as exemplified by the following.

(1) Outside in the backyard, the boy is playing tennis with a girl. . . . (EO3)

(2) And then, the boy is walking on the street. . . . (EO6)

(3) zhe yi.ding shi ling.yi.ge gu.shi, yin.wei zhe nan.hai xian.zai zai
this must be another story because this boy now at
ke.ting li, . . .
living-room in
“This must be another story because the boy is now in a living room. . . . ” (CO2)

(4) ran.hou, zhe nan.hai chu.qu he yi.ge nu.hai da wang.qiu, . . .
then this boy go-out with a girl play tennis
“After that, the boy goes out to play tennis with a girl. . . . ” (CO4)

In most cases, the adverbial phrase is accompanied by a reinstatement of the major character with a full NP. The function of the full NP, however, is twofold.
First, building a new mental structure for a new episode consumes more cognitive effort on the part of the speaker, for whom information of the previous episode becomes less accessible at this point (Chafe 1994; Gernsbacher 1990). The speakers would then use a full NP or a proper name to quickly reactivate reference because “[t]he less predictable the information is, or the more important, the more prominent or larger coding it will receive” (Givón 1993: 196, emphases in the original). Second, the use of a full NP at the beginning of a new episode serves as a signal to the listener, who needs to build a new mental structure for the incoming episode. In the written narrative the episode boundary was made even more explicit. In addition to adverbial phrases, eight out of ten English writers and all ten Chinese writers used blank lines, numerical devices, or paragraph structure with or without indentation to separate episodes. Of the two English subjects who recalled their story in one written piece without any visual demarcation, one managed to indicate a new episode by repeating and underlining an adverb at the beginning of the episode: “next, . . . ”. The general characteristics of the language user’s perception and formation of discourse units demonstrate the nature of discourse organization and the importance of episode structure in language production and comprehension. The story information was not only hierarchically organized and produced as a series of episodes, but also so stored and retrieved. Achieving and maintaining coherence This section examines how speakers construct macrostructures for episodes, and how they relate events within an episode both linearly and hierarchically to achieve local and global coherence. Our data show that during the on-line description task when the episodic structure was being built, subjects tried to seek coherence both externally in non-verbal materials and internally in mind. 
Specifically, they construed a story frame (with temporal and spatial reference and central character effect), and maintained referential and topical continuity to achieve coherence of the discourse.

Story frame

Although ‘story-telling’ was not mentioned in the instruction of the narrative tasks, almost all subjects were prepared to tell a story of some sort at the beginning. They tried to organize the not-yet-known information into a familiar and controllable structure or frame, namely a story, thus making their first attempt at obtaining discourse coherence. The story frame sets a macrostructure for the discourse, to which incoming information can be related and explained. The following examples are typical starts of the on-line description.

(5) Once upon a time, there was a little boy (EO4)

(6) The story starts with a boy . . . (EO2)

(7) zai zhe.ge xiao gu.shi li, wo kan.jian yi.ge nan.hai, . . .
at this little story in I see a boy
“In this little story, I see a boy . . . ” (CO7)

(8) xian.zai wo yao gei ni.men jiang yi.ge gu.shi
now I want to you tell a story
“Now I’m going to tell you a story . . . ” (CO3)

Once the ‘story-frame’ was set but little other information was available, subjects tried to derive macropropositions as quickly as possible so as to relate subordinate actions and information to the macroproposition (see also Guindon & Kintsch 1982; Kintsch 1995) in a story. One such macroproposition is the establishment of the central character. Subjects quickly identified the central character at the beginning of the on-line task (as shown in (5)–(7) above), and then concentrated on his actions and purposes to achieve discourse coherence.
Though required to describe each picture in the storybook, which contains a great deal of information, subjects did not describe indiscriminately everything in the picture sequence but were more concerned about the actions and goals of the central character, and the cause and outcome of the actions and events. They elaborated on the pictures that were regarded as important in carrying out the story line, and explained the events and actions that added to the understanding of the story, but only touched upon (some even omitted) the pictures that were not critically related to the theme of the story. Our data show that the overall mentions of the main character are more than twice as many as those of any secondary character. In the first episode (‘Boy and Tennis Ball’), for example, despite both characters appearing together in each of the eight pictures, two thirds of the subjects focused on the boy, describing his actions and adventure in detail yet mentioning the other character only occasionally. Examples are given in the following passages.

(9) He’s climbing up the window, and he’s looking into the house for the ball. He goes into the house and tries to find the ball, . . . and all the while, the little girl is standing there, watching. . . . (EO1)

(10) ta pa.shang chuang.zi, wang wu li kan, ta faxiang na qiu zai
he climb-up window toward room in see he find that ball at
yige guo li. yu.shi ta pa.jin wu li, xiang ba na.ge qiu cong guo
a pot in so he get-in room in want om that ball from pot
li lao chu.lai, . . . zui.hou ta ba qiu lao.le chu.lai, shang.mian
in scoop out finally he om ball scoop out around-side
zhan.le xu.duo tang.xi ran.hou ta he xiao nu.hai ba qiu na.dao
stick much candy then he and little girl om ball take-to
hou.yuan li, . . .
back-yard in
(om = object marker)
“He climbs up the window and looks into the room, and he finds the ball in a pot.
So he gets into the room and tries to scoop the ball out of the pot. . . . Finally he gets the ball out of the pot, which is wrapped in sticky candy. Then he and the little girl take the ball to the back-yard, . . . ” (CO2)

The central character is the back-bone of the story, chaining actions and events throughout the main storyline. Focusing on the central character is a very important strategy in storytelling, which allows speakers to be selective in presenting the incoming information, enables them to stay on the main storyline, and hence allows them to obtain and maintain coherence. The ‘central character’ strategy plays an important role not only in achieving discourse coherence but also in facilitating comprehension. A story is considered coherent and easy to comprehend as long as the actions, intentions and purposes of the central character are stated and explained, while those of secondary characters can be marginalized. Indeed, as observed by Garrod and Sanford (1988: 174), “an unexplained action on the part of a main character results in a delay to processing the sentence in which that action occurs, while such an action on part of a secondary character results in no such delay.”

Accompanying the central character in the early stage of the on-line task is the set-up of the temporal and spatial reference, which not only delineates the story frame but also functions to mark the shift or transition at the beginning of a new episode. For example,

(11) Once upon a time, there was a little boy, . . . (EO4)

(12) Late that day, the boy is out on the street walking, . . . (EO11)

(13) tian wan.le, nanhai hui.dao wu.li, . . .
it late boy return house
“It’s late now, the boy goes back to the house, . . . ” (CO2)

(14) yige qing.lang.de xiawu yige nanhai he yige nuhai zai . . .
a fine afternoon a boy and a girl are
“One fine afternoon, a boy and a girl are playing . . . ” (CO3)

It is of interest here that the time of an episode or events existed only in the speaker’s mind, since nothing in the visual stimuli themselves indicates time once the picture clock has been removed. The spatial reference, on the other hand, is established when speakers take cues from the pictures. As mentioned previously, these subjects were very sensitive to boundary information and employed it in encoding events. In the picture sequence the most readily available episode-shift information was a change of location, such as from a living room to a street, from the street to a backyard, etc. Subjects immediately recognized the shift, and marked it linguistically in their narratives to lay the foundation, so to speak, for the new episode. Some of the examples are:

(15) Outside on the street, the boy . . . (EO7)

(16) xian.zai zhe nan.hai chu.xian.zai da.jie shang. . .
now this boy appear-at street on
“Now the boy appears on the street. . . . ” (CO5)

Once temporal and/or spatial references are set globally, they are maintained locally throughout an episode to give the listener a coherent time frame and the spatial orientation of the episode. The following examples contain some of the typical episode-medial time and locative phrases, which help achieve and maintain local coherence of the episode.

(17) Just as he puts the newspapers together, . . . (EO6)

(18) He climbs onto a chair next to the couch, . . . (EO15)

(19) deng lao tai.tai yi zhuan.guo jie jiao, . . .
wait old lady just turn-over street corner
“As soon as the old lady turns around the street corner . . . ” (CO9)

(20) chuangzi li shi yi.ge chu.fang, . . .
window in is a kitchen
“It’s a kitchen (inside the window), . . . ” (CO17)

The use of temporal and spatial reference is another attempt that speakers make to obtain and maintain coherence of the discourse.
Although explicit cohesion markers, as mentioned above, occurred frequently in our narrative data, they are neither necessary nor sufficient to make a discourse coherent. During the on-line description, subjects also maintained temporal and spatial coherence implicitly by connecting the order of events sequentially, and moreover, they sought and achieved discourse coherence by relating subordinate actions and events hierarchically to the higher level goal or dominant macroproposition of the episode, once the temporal and/or spatial framework was set. This was shown in speakers’ descriptions when they were puzzled by an action or a motive of the central character in an episode. Though unsure of the goal or purpose of a particular action or scene, subjects would try to tie it to the macroproposition of the episode. For example, in the second episode (‘boy and fly’), many speakers were not certain why the boy was messing with the newspapers in pictures 6 and 7. They paused, hesitated, and/or expressed uncertainty about the ongoing event, but managed nevertheless to come up with explanations that contribute to the theme of the episode, viz., the boy’s attempt to swat a fly. The following excerpts exemplify such an effort.

(21) He’s going through the papers . . . I guess he’s looking for the fly. (EO8)

(22) ran.hou, ta ba bao.zhi pao qi.lai, ren.de dao.chu dou.shi. ta zai
     then he om newspaper throw up throw everywhere he is
     ta yi.ding shi xiang rang na.ge cang.ying fei chu.lai
     he must is want let that fly fly out
     “Then, he throws the newspapers everywhere. He must be trying to get the fly to fly out.”

Furthermore, when a surprising outcome or climax occurred late in an episode, subjects would try to make sense out of it and incorporate it into the developing episode, especially if it did not meet the speaker’s earlier expectations.
For example, in the third episode (‘boy and lobster’), the main character’s true objective did not become evident until the 7th picture, in which the boy opens the bag he is helping the old lady carry. Every subject described the event and many commented:

(23) The boy didn’t really want to help the lady, he was just too curious. (EO9)

(24) ta hen xiang zhi.dao bao li cang.zhe she.me dong.xi, suoyi ta cai
     he very want know bag in hide what thing so he just
     yao bang.mang
     offer help
     “He was curious about what’s in the bag. That was why he offered help.” (CO13)

Our narrative data have given further support to the linear and hierarchical organization of discourse, demonstrating how subjects link micropropositions to one another at a local level, and at the same time relate them to the global macroproposition of an episode. In general, subjects used cues from the picture sequence to maintain local coherence, but more importantly, it was their extensive background knowledge, along with the picture information, that enabled them to achieve global coherence, i.e., to infer goals and plans, and use them to explain actions and events throughout discourse. These inferences hold over large distances in a network and are made regardless of local coherence being possible (Trabasso, Suh, & Payton 1995: 212).

In the recall task, on the other hand, subjects used the same strategies in achieving temporal, spatial and thematic coherence, but were more organized and concise in their recall since they already had the settings, plans, actions and goals of the episodes in mind and were freer in their choice of picture/event description. The following passages exemplify orally recalled episodes that follow the main story-line and describe only the major events.

(25) The next episode has to do with the boy attempting to swat a fly. He is on a chair trying to swat a fly.
He leaps off the chair to get the fly and swats the newspapers on his dad instead. His dad sits up, looks around, and loses the newspapers on the floor. The boy searches through the papers to find the fly swatter and once again goes after the fly. (ERS5)

(26) tian wan le, xiao nan.hai hui jia le. zai lu.shang kanjian yige
     it dark little boy return home at street see an
     lao taitai lin.zhe liang.ge hen chen.de daizi. xiao nan.hai hen xiang
     old lady carry two very heavy bag little boy very want
     zhidao bao.li shi sheme jiu pao guo.qu yao bang lao tai.tai ti
     know bag in is what just run over want help old lady carry
     dai.zi. lao tai.tai hen gan.dong jiu ba yige dai.zi gei ta bei
     bag old lady very touched just om a bag give him carry
     dang lao tai.tai zhuan.shen jin.ru yi.ge xiao xiang.zi, xiao.hai
     when old lady turn enter a small alley little-kid
     gan.jin dun.xia.lai, ba dai.zi da.kai. mei xiang.dao dai.zi li
     hurriedly squat down om bag open not expect bag in
     pa.chu yi.zhi da long.xia, yao.le ta yi.kou. ta teng.de zhi.du,
     climb.out a big lobster bite him a.bite he hurt cry
     dan.ye zhi.hao wu.ke.nai.he.di geng.zai lao tai.tai houmian hui
     but have.to helpless follow old woman behind return
     jia.le
     home
     “It was late and the little boy went home. (He) saw an old lady on the street, carrying two heavy bags. The boy very much wanted to know what’s inside the bag, so (he) ran over to help the old lady. The old lady was touched and gave him a bag to carry. When the old lady turned into an alley, the kid hurriedly squatted down and opened the bag. Out climbed a big lobster unexpectedly and bit him. He was hurt and crying, and helplessly followed the old lady home.” (CRS2)

Reference tracking

Another implicit way of establishing and maintaining coherence is reference tracking, which subjects managed in consistent ways.
It has long been noted that there is a correlation between the cognitive status of a referent and the linguistic form encoding the referent. Researchers have demonstrated that forms that signal the most restrictive cognitive status (in high focus) are always those with less or the least phonetic content, namely unstressed pronouns, clitics, and zero pronominals (Chafe 1987; Givón 1989; Gundel, Hedberg, & Zacharski 1993; Pu 1995; Tomlin & Pu 1991). Indeed, our narrative data show that once the central character, ‘the little boy’, was established at the beginning of the storytelling, it was very frequently encoded by pronominals (e.g., lexical and zero pronouns) throughout the remainder of the narrative because it was the focus of attention of subjects in their description tasks. The supporting character (i.e., the old lady, the man, and the little girl, respectively), on the other hand, has to reside mostly outside of subjects’ focus of attention due to the limited capacity of focal attention (Just & Carpenter 1992; Gathercole & Baddeley 1993; Gundel 1998), and is therefore frequently referred to by full NPs. Examples (25) and (26) above are taken from the spoken recall data, where both subjects systematically pronominalized the central character and nominalized the secondary character within the episode, even though the secondary character (i.e., ‘his dad’ in (25) and ‘the old lady’ in (26)) was just mentioned in the preceding sentence. Examples (27) and (28) below are taken from the on-line task and the written recall respectively, and reveal the same patterns of reference tracking in the narrative.

(27) A little boy is walking on the street. He meets an old lady carrying some bags. He asks the lady what’s in the bags, and the lady gives him one of the bags. The lady walks off and he’s holding the bag. . . . (EO11)

(28) ta zhai.zai yi.ge yi.zi shang da cang.ying. cang.ying fei wang
     he stand-on a chair on swat fly fly fly toward
     sha.fa shang de yi.dui bao.zhi shang. ta hui.pai da quo.qu,
     sofa on a-pile newspaper on he raise-swatter hit over
     que bu.liao jin.xin.le bao.zhi di.xia tang.zhe.de yi.ge nanren zhe
     but not-expect awake newspaper under lie a man this
     nanren zheng tang zai sha.fa hang shui.jiao, bei ta da.xin.le, hen
     man just lie on sofa on sleep by him hit-awake very
     sheng.qi. nan.ren yi.xia.zi zuo.qi.lai, . . .
     angry man suddenly sit.up
     “He was standing on a chair to swat the fly. The fly flew toward a pile of newspapers on the couch. He raised the swatter to hit it, only to wake a man who was lying under the newspapers. The man was sleeping on the couch and was very angry when woken up by him. The man sat up suddenly . . . ” (CRW2)

Table 1. Reference-tracking results

                 English                           Chinese
          Central        Secondary          Central        Secondary
          N     %        N     %            N     %        N     %        Total
NP        246   26.42    297   65.56        239   27.44    313   78.25    1095
PN        494   53.06    116   25.61        285   32.72     29    7.25     924
Zero      191   20.52     40    8.83        347   39.84     58   14.50     636
Total     931            453                871            400            2655

Table 2. Boundary results for all tasks

          On-line task        Oral recall         Written recall
          NP    PN    Zero    NP    PN    Zero    NP    PN    Zero    Total
English    50   10    0       29    1     0       28    2     0       120
Chinese    51    9    0       28    2     0       30    0     0       120
Total     101   19    0       57    3     0       58    2     0       240

Table 1 presents the results of anaphor use in tracking reference in our narrative data, in which the tokens of full NPs (=NP), lexical pronouns (=PN), and zero anaphors (=Zero) and their respective distribution rates are calculated with regard to the central and secondary character. Table 1 shows the distinct patterns of tracking characters in narrating the story: subjects focused their attention on the central character throughout the narrative (the anaphoric tokens for the central character are roughly twice as many as those for the secondary characters), and consistently used less explicit coding forms to refer to it due to its restrictive or privileged cognitive status.
On average, lexical and zero pronouns account for about 73% of all anaphors referring to the main character, whereas these reduced forms account for only about 28% of all references made to the supporting characters.

Also of interest in the reference management is the ‘boundary effect’, which accounts for the relatively higher rate of full NPs referring to the central character (about 27% on average), as discussed briefly in the section on episodic structure. Although the central character of the story remains the same throughout the three episodes, subjects would nonetheless use a full NP (e.g., a definite or demonstrative NP, or a repeated proper name) to reinstate the referent at the beginning of a new episode, regardless of its referential distance (i.e., the number of clauses between the current and the last mention of the referent; see Givón 1987). The boundary results are presented in Table 2, where the alternative anaphoric forms used at the beginning of an episode for the first mention of the central character are listed for each of the narrative tasks.

The boundary effect is found to be very strong in our narrative study. In the on-line task, the majority of subjects (13 in the Chinese and 14 in the English group) used a full NP for the first mention of the central character in each episode. In the recall task, be it oral or written, when subjects had established the three episodes in their mental representations, the boundary effect is shown to be even stronger: the overwhelming majority (19 in Chinese and 18 in English) consistently used a full NP to reinstate the central character at the beginning of each of the three episodes. The boundary effect is again a manifestation of our cognitive constraints and activities underlying the pronominalization process.
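The proportions quoted above can be rechecked directly from the token counts in Tables 1 and 2. The following Python sketch is only an illustration: the dictionary layout and the helper names `share` and `mean_share` are assumptions introduced here, and ‘on average’ is taken to mean the unweighted mean of the English and Chinese rates, which the text does not state explicitly.

```python
# Token counts transcribed from Table 1 (anaphoric form by language and role).
TABLE1 = {
    ("English", "central"):   {"NP": 246, "PN": 494, "Zero": 191},
    ("English", "secondary"): {"NP": 297, "PN": 116, "Zero": 40},
    ("Chinese", "central"):   {"NP": 239, "PN": 285, "Zero": 347},
    ("Chinese", "secondary"): {"NP": 313, "PN": 29, "Zero": 58},
}

# Token counts transcribed from Table 2 (form of the first mention of the
# central character at an episode boundary, both languages pooled).
TABLE2 = {
    "on-line":        {"NP": 101, "PN": 19, "Zero": 0},
    "oral recall":    {"NP": 57, "PN": 3, "Zero": 0},
    "written recall": {"NP": 58, "PN": 2, "Zero": 0},
}


def share(counts, forms):
    """Return the percentage of tokens in `counts` whose form is in `forms`."""
    total = sum(counts.values())
    return 100.0 * sum(counts[form] for form in forms) / total


def mean_share(role, forms):
    """Unweighted mean of the English and Chinese shares for one role.

    Assumption: this is how 'on average' is read here; pooling the two
    languages before dividing gives nearly the same figures.
    """
    shares = [share(TABLE1[(lang, role)], forms) for lang in ("English", "Chinese")]
    return sum(shares) / len(shares)


reduced_central = mean_share("central", ("PN", "Zero"))
reduced_secondary = mean_share("secondary", ("PN", "Zero"))
full_np_central = mean_share("central", ("NP",))
boundary_np = {task: share(counts, ("NP",)) for task, counts in TABLE2.items()}

print(f"reduced forms, central character:   {reduced_central:.1f}%")
print(f"reduced forms, secondary character: {reduced_secondary:.1f}%")
print(f"full NPs, central character:        {full_np_central:.1f}%")
for task, pct in boundary_np.items():
    print(f"full NP at episode boundary, {task}: {pct:.1f}%")
```

Under these assumptions the reduced forms come to roughly 73% for the central character and 28% for the secondary characters, and full NPs mark the first mention of the central character in about 84% of episode openings in the on-line task and in 95% or more in the two recall tasks.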
Within an episode, when speakers’ attention is sustained, a referent that has been focused on (e.g., the central character) can keep its cognitively privileged status of being most accessible and identifiable, and speakers would use a less explicit anaphor to code the referent. However, between episodes, when speakers’ attention shifts, a referent that has been focused on would lose its privileged activation status due to the change in the memorial and attentional process and become less accessible, at which juncture speakers would use an explicit anaphor to reactivate the referent.

Not only is speakers’ referential choice governed by their own cognitive activities, but it is also based partially on their assessment of the hearers’ cognitive status with respect to a particular referent in order to facilitate comprehension. Speakers would use pronominals for the central character within an episode to maintain referential coherence and to keep listeners focused on the same character. At the beginning of a new episode, however, they would facilitate listeners’ shifting process by using a self-defining NP for the quick and easy reactivation of the same referent, because shifting is cognitively more costly than mapping, and thus comprehenders have more difficulty accessing information that occurs after a unit boundary than within a boundary (Gernsbacher 1990).

Topic continuity

Closely related to the strategy of reference tracking is the establishment of topic continuity, another important means of maintaining local discourse coherence. Topic continuity is best embodied in a topic chain that consists of several clauses over a span of discourse, within which each clause is understood as being about the same topic (Li & Thompson 1979: 33). In a topic chain, the topic is set up in the first clause and typically left unspecified (i.e., with zero anaphora) in subsequent clauses because its referent is most accessible, identifiable and recoverable from discourse context.
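The notion of a topic chain can be made concrete with a small segmentation routine. The Python sketch below is an illustration rather than the procedure used in the study: it assumes that each clause about the topic has already been coded as NP, PN or Zero, and that a chain opens at an explicit form and runs through the zero anaphors that follow it. The function name `topic_chains` and the sample coding, one possible reading of the clause sequence in the English recall example (31) later in this section, are hypothetical.

```python
from typing import List

EXPLICIT = {"NP", "PN"}  # full noun phrase, lexical pronoun
ZERO = "Zero"            # zero anaphor


def topic_chains(forms: List[str]) -> List[List[str]]:
    """Split the coding forms used for one topic into topic chains.

    Each element of `forms` is the form that encodes the topic in one
    clause. An explicit form (NP or PN) opens a new chain; every zero
    anaphor is attached to the chain that is currently open.
    """
    chains: List[List[str]] = []
    for form in forms:
        if form not in EXPLICIT and form != ZERO:
            raise ValueError(f"unknown coding form: {form!r}")
        if form in EXPLICIT or not chains:
            chains.append([form])    # explicit mention: start a new chain
        else:
            chains[-1].append(form)  # zero anaphor: continue the open chain
    return chains


# Hypothetical coding of example (31), from "he stopped" onward:
# he stopped / He opened, 0 was bitten / He was scared, 0 cried,
# 0 picked up, 0 followed.
example_31 = ["PN", "PN", "Zero", "PN", "Zero", "Zero", "Zero"]
chain_lengths = [len(chain) for chain in topic_chains(example_31)]
print(chain_lengths)
```

On this coding the excerpt splits into chains of one, two and four clauses, the last two corresponding to the climax and the ending of the episode.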
It has been argued that topic chains are largely responsible for the prevalence of zero anaphora in Chinese discourse. Indeed, our recall data show that topic chains are a common device used by Chinese speakers in their coding of events within an episode, where a topic persists over a span of discourse. For example,

(29) ta zhan.zai yi.zi.shang, ju.zhe cang.yin pai, kan.jian cang.ying luo
     he stand in chair-on raise swatter see fly fall
     zai bao.zhi shang, jiu hao.bu.yu.yu.di pai.le xia.qu, que bu.liao
     at newspaper on just not.hesitate swat down but not-expect
     pai.zai yi.ge ren shen.shang, . . .
     hit-at a man body
     “He stood on a chair, Ø poised his flyswatter, Ø saw the fly fall onto a pile of newspapers, and Ø swung the flyswatter down without hesitation, but Ø hit a man instead . . . ” (CRS15)

(30) ta zou shang.qian, re.xin.di yao bang lao nai.nai na na.ge xiao
     he walk forward eagerly offer help old granny carry that small
     bao, jie.guo cheng nai.nai guai.wan shi tou.tou da.kai kou.dai,
     bag end-up as granny turn-corner time stealthily open bag
     jie.guo fa.xian shi yi.dai pang.xie
     end-up find is a-bag crab
     “He steps forward, Ø eagerly offers to help the old granny carry a small bag, as the old granny turns the corner, Ø secretly opens the bag, and Ø finds a bag full of crabs.” (CRS9)

Passage (29) describes, in a topic chain, an action sequence of the boy swatting a fly. The topic chain is all about the topic, the boy. Once the topic is established in the first clause of the action sequence, it is encoded by a zero anaphor in the remainder of the sequence. In fact, topic chains can be formed in Chinese discourse regardless of whether there is intervening material between two clauses containing the topic, and regardless of whether there is another discourse entity that may cause referential ambiguity. Passage (30) above is another excerpt taken from the Chinese oral recall data.
The topic is again ‘the boy’. Although there is an intervening clause in the middle of this event sequence, i.e., ‘as the old granny turns the corner,’ the topic chain resumes after this clause, which mentions a referent other than the topic. Moreover, there are two characters described in the event sequence, but the chain of zero anaphora refers unambiguously to the topic even though the last two zero anaphors could syntactically be coreferential with the secondary character, ‘the old lady’.

It is not surprising that Chinese speakers used topic chains to keep the story flowing within an episode because Chinese is considered a discourse-oriented and topic-prominent language. English, in contrast, is regarded as a subject-oriented language, where an explicit subject is usually required. Nevertheless, the use of topic chains is not uncommon in English recalls. When the topic persists over a span of discourse and can be easily identified, English speakers leave it unspecified, as do Chinese speakers. For example,

(31) Then the curiosity of the boy got the better of him and he stopped. He opened up the bag, but Ø was bitten by this big lobster that jumped out of the bag. He was scared, Ø cried a little bit, Ø picked up the bag, and Ø followed his mother back into the house. (ERS12)

The excerpt is taken from the English spoken recall data and describes the climax and ending of the third episode (Boy and Lobster). Within the episode, the central character, ‘the boy’, is repeatedly referred to by either a lexical pronoun or a zero anaphor because the referent is continuous and is already activated in the preceding clause. Nevertheless, example (31) shows that even zero anaphora and lexical pronouns are not used interchangeably in English discourse.
In this passage, the climax of the episode is described in two clauses (i.e., ‘he opened up the bag, but Ø was bitten . . .’), the first of which creates some kind of suspense and the second of which reveals the unexpected outcome. If a lexical pronoun had been used to refer to ‘the boy’ in the second clause, the tight cause-result sequence would have been broken, and the continuity of the climax lost. Similarly, the next four clauses describing the ending of the episode are chained by zero anaphora, illustrating an action sequence of the topic, viz., the boy. As in Chinese, such topic chains, which are used to encode action or event sequences of the same topic, among other things, indicate maximum coherence within an episode. However, when such maximum coherence is disrupted in an episode, the topic chain would end. In the above example, there is a transition or minor thematic gap between the climax and the ending of the episode (i.e., ‘he was scared, Ø cried a little bit . . .’), where the speaker used a lexical pronoun to end the last topic chain and start the next one. The alternative use of lexical versus zero pronouns is further exemplified in (32) below.

(32) The boy looks fairly upset. He starts to try to straighten the newspapers, but he (uh), kind of gives up, Ø gathers them together, Ø throws them toward the man, and Ø continues to pursue the fly with the flyswatter. (ERS15)

This passage depicts the boy’s persistent pursuit of the fly in the second episode (Boy and Fly), where the first clause describes how ‘the boy’ looks, and then a series of clauses is used to describe what he does (i.e., ‘he starts to try to straighten the newspapers, but he . . .’). Although the series of clauses is about the same topic, there is a minor thematic gap between the first and the rest of the clauses, namely, the boy’s attempt to straighten the newspapers is in conflict with his purpose to swat the fly.
Hence the topic chain does not start at the first clause of the series, but after the occurrence of the minor discontinuity. Whereas zero anaphora is employed in the topic chain to describe the boy’s continued effort to pursue his goal, a lexical pronoun is used (‘but he (uh), kind of gives up . . .’) at the juncture of the minor thematic gap.

The correlation between a minor thematic gap and the use of a lexical pronoun within an episode is further evidenced in our on-line data. In the on-line description task, minor discontinuity exists not so much in the picture sequence of an episode as in subjects’ mental representations, because the page-turning itself creates a gap between the description of the last pair of pictures and that of the next pair. At that point subjects did not know what to expect but had to be prepared to quickly comprehend the incoming information and connect it with the previously presented information in order to tell a coherent story. Therefore, when the description continues after a page is turned, speakers would use a lexical pronoun to resume the central character even though the new pair of pictures continues to depict the same action or event sequence of the same referent. For example,

(33) The old lady walks off, and he’s holding the bag. And he looks into the bag, he looks interested. (laugh . . . ) A crab comes out and bites him on the hand. The boy looks pretty upset, and he follows the old lady home. (EO13)

(34) ranhou ta ba bao.zhi ren xiang ta ba, ta kan.jian cang.ying
     then he om paper throw to his dad he see fly
     cong bao.zhi li fei chu.lai, ta you qu zhui cang.ying le
     from paper in fly out he again go chase fly
     “He then threw the paper at his dad, he saw the fly fly out from inside the papers, and he ran after the fly again.” (CO11)

(35) The boy goes through all the papers, looking for something, maybe the fly.
The boy throws the newspaper onto the man on the couch. He then finds, . . . spots the fly, and he continues chasing the fly. (EO9)

(36) ta gou bu.zhao cang.ying, suoyi ta cong yi.zi shang tiao xia.lai
     he reach not fly so he from chair on jump down
     da, dan ta que yi.pai.zi pai.zai bao.zhi shang
     swat but he instead a-swatter hit-at paper on
     “He can’t reach the fly, so he jumps off the chair, but he swats the newspapers instead.” (CO17)

The passages (33)–(36) describe the same scenes from the second and third episodes as do passages (29)–(32). However, the latter frequently employ topic chains, while in the former the same topic is realized by a more explicit anaphor in each of the clauses. As I have explained, the page-turning imposes a minor gap in our sustained attention in the description task, and an attention gap in mind results in a thematic discontinuity in text.

The operation of topic continuity in both English and Chinese narrative production reflects a general cognitive principle in language processing: “Expend only as much energy on a task as is required for its performance” (Givón 1983: 18). In other words, the least explicit anaphora assumes the most thematic or topic continuity, while more explicit anaphora bridges thematic gaps or signals thematic discontinuity of various degrees.

Conclusion

The present study has demonstrated, with data taken from a narrative study, that stories produced in different forms and languages are strikingly similar with regard to their structural organization, coherence building, and event coding. Speakers and writers are largely responsive to episode boundary information, and generally organize their narratives into separate yet interrelated episodes.
Within an episode, when the incoming information is mapped onto the previously presented information, speakers sought and achieved local and global coherence through establishing a story frame, focusing on the central character, systematically tracking references, and maintaining topic continuity. Between episodes, when speakers (and listeners also) shift from actively building one structure to starting another, they would try to mark the episode boundary. The boundary effect not only reflects speakers’ mental representations of episodes in discourse production, but also serves to signal to their addressee the advent of such a boundary in order to facilitate comprehension. In general, discourse organization and coherence establishment seem to be a systematic and even automatic process, which is governed by our underlying cognitive activities and driven by our subconscious attempt to enable our addressee to establish mental representations congruent with our own in discourse processing.

References

Anderson, A., S. Garrod, & A. Sanford (1983). The accessibility of pronominal antecedents as a function of episode shifts in narrative text. Quarterly journal of experimental psychology, 35A, 427–440.
Baggett, P. (1979). Structurally equivalent stories in movie and text and the effect of the medium on recall. Journal of verbal learning and verbal behavior, 18, 333–356.
Black, J. B. & G. H. Bower (1979). Episodes as chunks in narrative memory. Journal of verbal learning and verbal behavior, 18, 109–118.
Chafe, Wallace (1992). The flow of ideas in a sample of written language. In W. C. Mann & S. A. Thompson (Eds.), Discourse description: Diverse linguistic analyses of a fund-raising text (pp. 268–294). Amsterdam: John Benjamins.
Chafe, Wallace (1994). Discourse, consciousness, and time: The flow and displacement of conscious experience in speaking and writing. Chicago: The University of Chicago Press.
Fox, B. A. (1987). Anaphora in popular written English narratives. In R. S. Tomlin (Ed.), Coherence and grounding in discourse (pp. 121–167). Amsterdam: John Benjamins.
Garrod, S. C. & A. J. Sanford (1988). Thematic subjecthood and cognitive constraints on discourse structure. Journal of pragmatics, 12, 57–72.
Gathercole, S. E. & A. D. Baddeley (1993). Working memory and language. Hillsdale, NJ: Lawrence Erlbaum.
Gernsbacher, M. A. (1990). Language comprehension as structure building. Hillsdale: Erlbaum.
Givón, T. (1983). Topic continuity and word order pragmatics in Ute. In T. Givón (Ed.), Topic continuity in discourse: Quantitative cross-language studies (pp. 343–363). Amsterdam: John Benjamins.
Givón, T. (1987). Beyond foreground and background. In R. Tomlin (Ed.), Coherence and grounding in discourse (pp. 173–188). Amsterdam: John Benjamins.
Givón, T. (1989). Mind, code and context: Essays in pragmatics. New Jersey: Erlbaum.
Givón, T. (1993). English Grammar: A Function-based Introduction, Vol. I & II. Amsterdam: John Benjamins.
Givón, T. (1995). Coherence in text vs. coherence in mind. In M. A. Gernsbacher & T. Givón (Eds.), Coherence in spontaneous text: Typological studies in language 31 (pp. 59–115). Amsterdam: John Benjamins.
Guindon, R. & W. Kintsch (1982). Priming macrostructures. Technical report. Colorado: University of Colorado.
Gundel, J. K. (1998). Centering Theory and the Givenness Hierarchy: Towards a Synthesis. In M. Walker, A. Joshi, & E. Prince (Eds.), Centering theory in discourse (pp. 183–198). Oxford: Clarendon Press.
Gundel, J., N. Hedberg, & R. Zacharski (1993). Cognitive status and the form of referring expressions in discourse. Language, 69, 274–307.
Haberlandt, K., C. Berian, & J. Sandson (1980). The episode schema in story processing. Journal of verbal learning and verbal behavior, 19, 635–651.
Just, M. A. & P. A. Carpenter (1992). A capacity theory of comprehension: Individual differences in working memory. Psychological Review, 99(1), 122–149.
Kintsch, W. (1995). How readers construct situation models for stories. In M. A. Gernsbacher & T. Givón (Eds.), Coherence in spontaneous text: Typological studies in language 31 (pp. 139–160). Amsterdam: John Benjamins.
Krahn, F. (1981). Here comes Alex Pumpernickel! Boston: Little, Brown & Co.
Li, Charles N. & S. A. Thompson (1979). Third person pronouns and zero-pronouns in Chinese discourse. In T. Givón (Ed.), Discourse and syntax (pp. 311–335). New York: Academic Press.
Lichtenberk, F. (1996). Patterns of anaphora in To’aba’ita narrative discourse. In B. Fox (Ed.), Studies in anaphora: Typological studies in language 33 (pp. 379–411). Amsterdam: John Benjamins.
Pu, Ming-Ming (1995). Anaphoric patterning in English and Mandarin narrative production. Discourse processes, 19(2), 279–300.
Schank, Roger C. & R. P. Abelson (1977). Scripts, plans, goals and understanding. Hillsdale, NJ: Erlbaum.
Tomlin, Russell S. (1987). Linguistic reflections on cognitive events. In R. S. Tomlin (Ed.), Coherence and grounding in discourse: Outcome of a symposium (pp. 455–479). Amsterdam: John Benjamins.
Tomlin, R. S. & M. M. Pu (1991). The management of reference in Mandarin discourse. Cognitive linguistics, 2(1), 65–93.
Trabasso, Tom, Soyoung Suh, & Paula Payton (1995). Explanatory coherence in understanding and talking about events. In M. A. Gernsbacher & T. Givón (Eds.), Coherence in spontaneous text: Typological studies in language 31 (pp. 189–214). Amsterdam: John Benjamins.
van Dijk, T. & W. Kintsch (1978). Cognitive psychology and discourse: retelling and summarizing stories. In W. U. Dressler (Ed.), Current trends in text linguistics (pp. 61–81). Berlin and New York: Mouton de Gruyter.
van Dijk, T. & W. Kintsch (1983). Strategies in discourse comprehension. New York: Academic Press.
, Aoki, Haruo , Ariel, Mira Arin, Dorothea Neal , , , Aristotle , Aslin, R. Atalay, Besir , Athanasiadou, A. B Backhouse, A. E. Baddeley, Alan , Baggett, P. Bakema, Peter Baker, C. Ball, T. M. , Banfield, Ann Bardon, Geoff Barnlund, Dean Barsalou, Larry W. Basso, Keith , Bates, Elizabeth A. , , , Bauer, Laurie Bell, Allan , –, Bensch, P. A. Benveniste, Emile Berian, C. Bernárdez, Enrique Bever, T. Black, J. B. Blumstein, Sheila E. Boomer, David S. Boroditsky, L. Bower, G. H. Bowerman, Melissa , , , –, Bownds, M. D. Brown, G. , , Brown, Roger , , Brugman, Claudia , , , Buck, Carl D. Bugenhagen, Robert D. Burgess, C. –, , C Carlson, R. Carpenter, K. , Carroll, David W. , , Carroll, Pat , , Casad, Eugene Chafe, Wallace , , –, , Chalkley, M. Chao, Y.-R. Chappell, Hilary Chater, N. Chomsky, Noam Church, Kenneth W. , , , , –, Cienki, Alan , Clark, Eve , , , , , , , , , Clark, Herbert H. , , , , , , , , , Cocude, M. Coleman, Linda Colston, H. Comrie, Bernard Contini-Morava, Ellen , , , Cooper, L. A. Coulson, Seana , , , , , , , Creider, Chet Croft, William , , , Cruse, D. A. , Cutler, Anne D Davey, A. S. , , Davidse, Kristin –, , Dechert, Herbert W. , DeLancey, Scott , Denis, M. Deutsch, W. Deverson, Tony Dirven, René , Dixon, Robert M. W. , Doi, Takeo Du Plessis, J. A. , Duranti, Alessandro , E Elman, Jeffrey, L. , , , , , , , , Emanatian, Michele Enfield, Nick J. , Erman, Britt Evans, Zoe Eysenck, M. W. F Fauconnier, Gilles , , , , Feld, Steven Fillmore, Charles J. , , Finch, S. Fodor, Jerry Forceville, Charles Fortune, G. Fox, Barbara A. , Friedrich, Paul G Ganong, William F. Garrod, S. C. , Gaskell, M. Geeraerts, Dirk , , , , JB[v.20020404] Prn:21/04/2006; 9:24 F: HCP15NI.tex / p.2 (198-338) Name index Gernsbacher, M. A. , , , , Gibbs, Raymond W. Jr. , , , , Givón, Talmy , , , , , , , , , Glenberg, A. M. , , Goddard, Cliff , , , , , , – Goldberg, Adele , Gordon, Elizabeth , , Gough, Dave , , , Grice, Paul Grondelaers, Stefan , Guindon, R. , Gundel, J. 
Guthrie, Malcolm – H Haaften, Ton van Haberlandt, K. Halliday, M. A. K. , Hannan, M. Harkins, Jean Hart, B. Hasada, Rie Haspelmath, Martin Hawkins, Bruce Hebb, Donald Hedberg, N. Herskovits, Anna Hinton, Geoffrey Hoenkamp, Edward , , –, Holland, Dorothy Holmes, Janet , –, Hopper, R. J. Hutchins, Edwin Hymes, Dell I Ibarretxe-Antuñano, Iraide , , , , , Ikegami, Yoshihiko Iwasaki, Shoichi , , J Jackendoff, Ray , , , –, Janda, Laura Johnson, Mark H. , , , , , , , , , , , , , , , , Johnson-Laird, P. N. , , , , , , , , , , , , , , , , , Junker, Marie-Odile K Karmiloff-Smith, Annette , , Kay, Paul , Kempen, Gerard , , –, Kempson, Ruth Kendon, Adam Kennedy, J. M. Kerzel, D Khoali, B. T. Kintsch, W. , , , Kitto, Catherine Klahr, D. , Klatt, Dennis H. , Kohonen, T. Kokuritsu Kokugo Kenkyûjo Kornacki, Pawel Kosslyn, S. M. , Kövecses, Zoltán , , , , , Krahn, F. Krauss, R. M. Kuczaj, S. Kuipers, Joel C. Kuno, Susumu , , , Kurath, Hans Kuroda, S.-Y. , Kurtböke, N. Petek , L Lachter, J. Lakoff, George , –, –, , , , , , , , , , , , , , , , , , , , , , , , , , Lambrecht, Knud Langacker, Ronald W. , , –, , , , –, , , , , , , , , –, , , , , , , , , , , , Leakey, Louis S. B. , Lebra, Takie Sugiyama Lee, P. U. , , Lehrer, Adrienne , Leinbach, J. , Lemmens, Maarten , , , , , , , , , Levelt, Willem J. M. –, , Levin, Beth , Li, Charles N. , , , , , , , , –, –, Li, Ping , , , , , , , , –, –, Liddle, Scott Lipka, Leonhard Longacre, R. Lucas, Margery M. Luchjenbroers, June , , , , , , , , , , , , Lund, K. –, , M Maclagan, Margaret A. , MacWhinney, Brian , , , , , , , , –, , , Mandler, Jean M. Mann, V. A. , Maratsos, M. Marchand, H. Marchman, Virginia A. Marslen-Wilson, William D. , , , Matlock, Teenie , , , , , , Matsumoto, Yo , McCawley, James D. McClelland, James L. , , , McCloud, S. McNeill, David , Metzler, J. Miikkulainen, R , Miller, G. A. Moiseeva, Nadezda Moliner, María Munn, Nancy D. 
Mühlhäusler, Peter
Mylne, Tom

N
Newmeyer, F. J.
Newport, E.
Niemeier, Susanne
Nishimura, Yoshiki
Nolan, Francis J.
Noordman, Leo
Nooteboom, Sieb

O
Olbrechts-Tyteca, L.
Onishi, Masayuki

P
Palmer, Gary
Pardoen, Justine A.
Parisi, D.
Parker, Simon
Payton, Paula
Pederson, E.
Peeters, Bert
Perelman, C.
Pinker, Steven
Plunkett, Kim
Prince, Alan
Pu, Ming-Ming
Pullum, Geoffrey K.
Pustejovsky, James
Pylyshyn, Zenon

R
Radden, G.
Rader, Russell
Raupauch, Marius
Redington, M.
Reiser, B. J.
Repp, Bruno H.
Rice, S.
Risley, T.
Rumelhart, David
Ryder, Mary-Ellen

S
Saffran, J.
Sanders, G.
Sanders, Ted
Sandson, J.
Sanford, A. J.
Sansò, A.
Saussure, Ferdinand de
Schank, Roger C.
Schilperoord, Joost
Schmid, H.-J.
Schvaneveldt, Roger W.
Scollon, Ron
Seidenberg, Mark
Shepard, R. N.
Shirai, Y.
Silverstein, Michael
Skinner, Debra
Slobin, Dan I.
Smith, Carlota S.
Spitulnik, Debra A.
Spitzer, M.
Spooren, Wilbert
Stanwood, Ryo E.
Starks, Donna
Stevens, Kenneth N.
Stubbs, Michael
Suh, Soyoung
Sweetser, Eve
Swinney, David A.

T
Tabakowska, E.
Talmy, Leonard
Tanenhaus, Michael K.
Thompson, Sandra A.
Tomasello, Michael
Tomlin, Russell S.
Trabasso, Tom
Traugott, Elizabeth Closs
Travis, Catherine
Trilling, Lionel
Turner, Mark
Turner, Robin
Tversky, B.
Tyler, Lorraine K.

U
Uehara, Satoshi
Ungerer, F.

V
van Dijk, Teun A.
Vandeloise, Claude
Vendler, Z.
Verhagen, Arie

W
Wald, B.
Wallace, S.
Warren, Beatrice
Warren, Paul
Watson, Catherine I.
Whorf, Benjamin Lee
Wierzbicka, Anna
Wilkins, David P.
Williams, R.
Wong Scollon, Suzanne
Woodman, Claudia

Y
Ye, Zhengdao
Yoon, Kyung-Joo

Z
Zacharski, R.

Subject index

A
accessibility
accessible (information)
acquisition
activation
actor
addressee
affective, affected (case)
agent (case)
agreement
ambiguity
animacy
argument
association
attention
Australian
Austronesian
autonomous

B
back-propagation
background (v. foreground)
Bantu
Basque
beats
Blending Theory (Conceptual Integration Theory)
  completion
  composition
  elaboration
blends
‘bottom-up’ (processing)

C
case
categorization
causality
causation
causative
  external
  internal
  non-causative
causatives
Chewa
Chinese
cluster tree
cognition
cognitive
  domain
  factors
  Grammar
  model
  Semantics
  status
coherence
cohesion
cohesion markers
‘comfort zone’
completion
composition
comprehension
concept
conceptual
  blending
  dependency
  structure
conceptual blends
Conceptual Integration Theory (Blending Theory)
conceptualizer
configuration
connectionism
connexity
constraint
construal
construction
context
conversation
cooperative
corpus
cross-linguistic
cryptotype
cue
cultural linguistics
culture
culture-specific

D
decontextualized (image)
default
deictic
deixis
dependence
discourse function
discourse production
discourse structure
discourse unit
discursives
Dokean (framework)
domain
double subject
durative
Dutch
Dyirbal

E
effective constructions
egocentric
elaboration
emblem
embodiment
emergent
emergent structure
emotions
English
episode
episode boundary
ergative
event
experiencer
experiential

F
F-space
  ‘inside’
  ‘outside’
feature overlap
fictive (motion)
figurative
figure
focus (of attention)
force dynamics
foreground (v. background)
Formal, Formalist
frame
function words
functional
functorization

G
gender
generativist
gesture
  complex gestures
  deictic gestures
  simple gestures
given (information)
goal (case)
ground
grounding

H
hidden units
homophony

I
iconic
Idealized Cognitive Model (ICM)
ideology
image schema (schemata)
Incremental Procedural Grammar
indexicals
Indo-European
inference
innateness
input
  data (information)
  spaces
input spaces
instigator
integration
interactive
internal state

L
Landmark (LM)
language acquisition
learning
lexical co-occurrence
lexicalization, lexicalisation
lexically driven

M
Mandarin (Chinese)
mapping
markers
  case
  cohesion
  evidential
  nominative
  tense
  topic
medium
memory
mental spaces
metalanguage
metaphor
metaphorical scope
model
  computational
module
mood (participial)
motion
mutual
  ground
  information

N
narrative
network
  neural
  semantic
neutralization
  complete
  partial
new (information)
New Zealand English

O
om
output
  layer
  pattern

P
Pakeha
pantomime
paradigm
parallel (processing)
path
patient
pattern completion
pause patterns
perception
performance
perspective
polysemy
problem solving
processing
  bottom-up
  language
  phonetic
  top-down
production
projection
  partial
prominence
proposition
prototype
prototype effects

R
range (pseudo-goal)
recall
recognition
recovery
reference
register
relativism
role
  semantic
  syntactic
rule

S
salience
scenario
schema
schematicity
schematization
scope
self organization
  network
  map
semantic
  extension
  field
  structure
semantic space
semantics
  cognitive
  formal
Shona
sign language
social
space
  blended
  mental
  physical
Spanish
speaker
speech act
statistical learning
structure building
subjectivity
subjectivity scale
subordinate mood type
surrogate
symbolic

T
Tagalog
temporal grounding
tense
  conceptualization
  continuous
thematic coherence
theme
theory
thinking for speaking
‘top-down’ (processing)
topic
topic continuity
training
trajector
trajectory (TR)
transitive
Turkish
turn-taking

U
unit
  processing
universal
usage-based

V
variation
verb
  ‘killing’
  emotion
  motion
  processing
viewpoint
visual
voice

W
weights
Whorf
working memory

X
Xhosa

In the series Human Cognitive Processing the following titles have been published thus far or are scheduled for publication:

17 LANGLOTZ, Andreas: Idiomatic Creativity. A cognitive-linguistic model of idiom-representation and idiom-variation in English. 2006. xii, 326 pp.
16 TSUR, Reuven: ‘Kubla Khan’ – Poetic Structure, Hypnotic Quality and Cognitive Style. A study in mental, vocal and critical performance. 2006. xii, 252 pp.
15 LUCHJENBROERS, June (ed.): Cognitive Linguistics Investigations. Across languages, fields and philosophical boundaries. 2006. xiii, 334 pp.
14 ITKONEN, Esa: Analogy as Structure and Process. Approaches in linguistics, cognitive psychology and philosophy of science. 2005. xiv, 249 pp.
13 PRANDI, Michele: The Building Blocks of Meaning. Ideas for a philosophical grammar. 2004. xviii, 521 pp.
12 EVANS, Vyvyan: The Structure of Time. Language, meaning and temporal cognition. 2004. x, 286 pp.
11 SHELLEY, Cameron: Multiple Analogies in Science and Philosophy. 2003. xvi, 168 pp.
10 SKOUSEN, Royal, Deryle LONSDALE and Dilworth B. PARKINSON (eds.): Analogical Modeling. An exemplar-based approach to language. 2002. x, 417 pp.
9 GRAUMANN, Carl Friedrich and Werner KALLMEYER (eds.): Perspective and Perspectivation in Discourse. 2002. vi, 401 pp.
8 SANDERS, Ted J.M., Joost SCHILPEROORD and Wilbert SPOOREN (eds.): Text Representation. Linguistic and psycholinguistic aspects. 2001. viii, 364 pp.
7 SCHLESINGER, Izchak M., Tamar KEREN-PORTNOY and Tamar PARUSH: The Structure of Arguments. 2001. xx, 264 pp.
6 FORTESCUE, Michael: Pattern and Process. A Whiteheadian perspective on linguistics. 2001. viii, 312 pp.
5 NUYTS, Jan: Epistemic Modality, Language, and Conceptualization. A cognitive-pragmatic perspective. 2001. xx, 429 pp.
4 PANTHER, Klaus-Uwe and Günter RADDEN (eds.): Metonymy in Language and Thought. 1999. vii, 410 pp.
3 FUCHS, Catherine and Stéphane ROBERT (eds.): Language Diversity and Cognitive Representations. 1999. x, 229 pp.
2 COOPER, David L.: Linguistic Attractors. The cognitive dynamics of language acquisition and change. 1999. xv, 375 pp.
1 YU, Ning: The Contemporary Theory of Metaphor. A perspective from Chinese. 1998. x, 278 pp.