Universität Stuttgart
Institut für Maschinelle Sprachverarbeitung
Azenbergstraße 12
70174 Stuttgart
Germany

Segmental foreign accent

Author: Daniel Duran
Examiner: Apl. Prof. Dr. phil. Bernd Möbius
Supervisors: Matthias Jilka, Bernd Möbius
Diploma thesis (Diplomarbeit) no. 63
Start of work: 1 September 2007
End of work: 1 March 2008

I hereby declare that I have written this thesis independently and have used no literature other than that indicated. All quotations and paraphrased borrowings are marked as such, with precise indication of the source.

Daniel Duran
Esslingen, 1 March 2008

Abstract

This thesis examines segmental foreign accent phenomena, i. e. individual sounds as spoken by non-native speakers of a language and their characteristic deviations from the language norm. The first chapter introduces basic terminology and identifies some difficulties in defining several concepts essential to foreign accent research and to research on second language acquisition in general. Terms like accent, native and foreign language, or bilingualism are introduced, and their definitions as proposed in the literature are discussed. Chapter 2 gives an overview of the variables considered in experimental studies on L2 speech production and foreign accent, and of the factors which have been found to influence the degree of foreign accent. This chapter is primarily concerned with extralinguistic variables which correspond to characteristics of the examined speakers, like gender, age or previous language experience. However, the influence of a speaker’s L1 on his or her L2 is also discussed in that chapter. Chapter 3 gives an overview of the research literature on various segmental acoustic, i. e. phonetic and phonological, manifestations of foreign accent. The most frequently analysed acoustic phenomena are identified, and fields which have received only marginal attention in foreign accent research are addressed.
In chapter 4, theories and models which are used to explain the foreign accent phenomenon are reviewed. Some of the theories and models presented there are concerned with (second) language acquisition in general and not particularly with segmental phenomena of foreign accent. They are nevertheless relevant to the topic of this thesis. Besides general models of the human language capacity, models concerned with specific phenomena in the domain of segmentals are presented. Chapter 5 addresses methodological issues important to experimental studies on foreign accent. It discusses the criteria by which subjects for experiments on foreign accent should be selected, the problems with various task designs, and the problems an experimenter faces when evaluating and interpreting the data. Various methods of measuring the degree of foreign accent are pointed out. Finally, chapter 6 presents an experimental study on the realisation of a phonological vowel opposition in German by non-native speakers in comparison to bilingual and native speakers of German. For this study, speech samples of twenty speakers were recorded and acoustically analysed; ten of these speakers are non-native speakers who did not learn German until school age. The study focuses on the acoustic features of vowel quality and vowel duration and compares their realisations between the speakers.

Contents

Abbreviations
1 Introduction
  1.1 Definitions
    1.1.1 Accent
    1.1.2 Language acquisition vs. language learning
    1.1.3 Native language – First language
    1.1.4 Foreign language – Second language
    1.1.5 Bilingualism
    1.1.6 Interlanguage phonology
    1.1.7 Foreign accent
  1.2 Summary
2 Factors affecting degree of foreign accent
  2.1 Affective and psychological factors
    2.1.1 Motivation
    2.1.2 Language learning aptitude
  2.2 Gender
  2.3 Formal L2 instruction
  2.4 L1 background and L1-L2 combination
    2.4.1 Language distance
    2.4.2 L1 proficiency and influence of L2 on L1
  2.5 Language use patterns
  2.6 Exposure to L2 surrounding and amount of L2 experience
    2.6.1 Length of residence
    2.6.2 Age of arrival
  2.7 Age of L2 learning (AOL)
  2.8 Speaker-independent factors
  2.9 Summary
3 Manifestations of foreign accent
  3.1 Segmentals I: Consonants
    3.1.1 VOT
  3.2 Segmentals II: Vowels
  3.3 Phonotactics
  3.4 Suprasegmentals
  3.5 Voice quality
  3.6 Summary
4 Theories on foreign accent
  4.1 Universal grammar and foreign accent
  4.2 The critical period hypothesis
    4.2.1 Nature not nurture
    4.2.2 Problems with the critical period hypothesis
    4.2.3 A sensitive period
    4.2.4 Summary
  4.3 Contrastive analysis, phonetic transfer and interference
  4.4 Direct realism
  4.5 The Speech Learning Model
    4.5.1 Summary
  4.6 The perceptual magnet effect
    4.6.1 Summary
5 Methodological issues
  5.1 Subject selection and control group
  5.2 Obtaining data: The task
  5.3 FA rating by native speaker judges
    5.3.1 Scaling foreign accent
    5.3.2 The judges
    5.3.3 Native speakers’ judgments and acoustic features of foreign accent
  5.4 Foreign accent detection by acoustic measurements
  5.5 Criteria for native-likeness of speech
6 Experimental study
  6.1 German vowels
    6.1.1 Acoustic correlates of the German vowel opposition
    6.1.2 German vowels: Summary
  6.2 The participants
  6.3 The speech material
  6.4 Procedure
    6.4.1 Part I: Interview
    6.4.2 Part II: Production experiment
  6.5 Acoustic analysis: method
  6.6 Acoustic analysis: results
    6.6.1 Vowel quantity
    6.6.2 Vowel quality
    6.6.3 Tenseness
    6.6.4 Effects of L1
    6.6.5 Age effects
  6.7 Discussion
7 Summary and conclusions
A Tables and figures
B Wordlists
Bibliography

List of Figures

4.1 Critical periods
4.2 NLM Theory
6.1 The German monophthongs
6.2 Screenshot of labels in WaveSurfer
6.3 Vowel duration (per group)
6.4 [iː]∼[ɪ] and [yː]∼[ʏ] of speaker A03
6.5 Disturbing signal
6.6 Voice quality parameters RCG and SKG
A.1 The F1/F2 vowel spaces of speakers A01, A02 and A03
A.2 The F1/F2 vowel spaces of speakers A04, A05 and A06
A.3 The F1/F2 vowel spaces of speakers A07, A08 and A09
A.4 The F1/F2 vowel spaces of speakers A10, B01 and B02
A.5 The F1/F2 vowel spaces of speakers B03, B04 and B05
A.6 The F1/F2 vowel spaces of speakers B06, B07 and B08
A.7 The F1/F2 vowel spaces of speakers C01 and C02 and the reference points
B.1 Printed version of word list A

List of Tables

1 Plotting symbols
6.1 Denominations for the German vowel classes
6.2 The vowel contrast pairs
6.3 Reference F1/F2 values and standard deviations
6.4 P-values of t-tests of long and short vowels
A.1 Demographic speaker characteristics
A.4 Formant values of German monophthongs (from literature)
A.5 Vowel duration ratios
A.6 Formant values (A01)
A.7 Formant values (A02)
A.8 Formant values (A03)
A.9 Formant values (A04)
A.10 Formant values (A05)
A.11 Formant values (A06)
A.12 Formant values (A07)
A.13 Formant values (A08)
A.14 Formant values (A09)
A.15 Formant values (A10)
A.16 Formant values (B01)
A.17 Formant values (B02)
A.18 Formant values (B03)
A.19 Formant values (B04)
A.20 Formant values (B05)
A.21 Formant values (B06)
A.22 Formant values (B07)
A.23 Formant values (B08)
A.24 Formant values (C01)
A.25 Formant values (C02)
A.26 Within-speaker comparison of formant values
A.27 Formant differences (reference values)
A.28 Formant differences (group B)
A.29 Voice quality parameters: A01 and A02
A.30 Voice quality parameters: A03 and A04
A.31 Voice quality parameters: A05 and A06
A.32 Voice quality parameters: A07 and A08
A.33 Voice quality parameters: A09 and A10
A.34 Voice quality parameters: B01 and B02
A.35 Voice quality parameters: B03 and B04
A.36 Voice quality parameters: B05 and B06
A.37 Voice quality parameters: B07 and B08
A.38 Voice quality parameters: C01 and C02
A.39 Summary

Abbreviations and notational conventions

The following notational conventions and abbreviations are used in this thesis:

For phonetic and phonological transcriptions the International Phonetic Alphabet (IPA) is used. For reasons of readability, SAMPA-like symbols are used in the vowel diagrams. The equivalences between IPA, SAMPA and the diagram plotting symbols are shown in table 1.

IPA    SAMPA  plots
øː     2:     2
œ      9      9
a      a      A
aː     a:     a
ɛ      E      E
eː     e:     e
ɛː     E:     3
ɪ      I      I
iː     i:     i
ɔ      O      0
oː     o:     o
ʊ      U      U
uː     u:     u
ʏ      Y      Y
yː     y:     y

Table 1: Plotting symbols

Where language names appear in pairs, e. g. “German-Croatian”, the first refers to a speaker’s first language and the second to the same speaker’s second language. In reference to bilinguals, the first language refers to a speaker’s dominant language and the second to his or her non-dominant language (in the case of non-balanced bilinguals). The language codes listed below are used according to the ISO 639-1 standard¹.

AOL      age of learning (a second language)
AOA      age of arrival (in a second language’s speech community)
bg       language code: Bulgarian
C        consonant
CPH      critical period hypothesis (for language acquisition)
de       language code: German
DD       Daniel Duran, the author
F0       fundamental frequency
F1       first formant
F2       second formant
F3       third formant
FA       foreign accent
it       language code: Italian
IPA      International Phonetic Alphabet
hr       language code: Croatian
hu       language code: Hungarian
ka       language code: Georgian
L1       first language = native language
L2       second language
LOR      length of residence (in a second language’s speech community)
NLM      native language magnet
pl       language code: Polish
pl.      plural
ro       language code: Romanian
ru       language code: Russian
SAMPA    Speech Assessment Methods Phonetic Alphabet
sd       standard deviation
SLM      speech learning model
tr       language code: Turkish
uk       language code: Ukrainian
UG       universal grammar
V        vowel
VOT      voice onset time
/. . . /  phonological transcription
[. . . ]  phonetic transcription
<. . . >  orthographic representation

¹ http://www.infoterm.info/standardization/iso_639_1_2002.php

Chapter 1

Introduction

It is not only since the world has apparently been growing smaller and smaller that individuals with different language backgrounds come into contact. Language contact has happened at all times and at all levels, from two single individuals to whole societies. The human language capacity enables people to recognise, by the way their interlocutors speak, whether they belong to the same language community or are foreigners. The main focus of this thesis lies on this foreign accent as a specific outcome of a language contact situation at the level of the individual: the segmental deviations in the speech of non-native speakers from the norm of a language.

Since foreign accent is the focus of this thesis, the following sections in this chapter are concerned primarily with the domain of pronunciation, i. e. the production of speech sounds and utterances. Other language domains like morphology, syntax, semantics or pragmatics are not considered here, even though they might be (at least) as important as pronunciation to the topics discussed in this introductory chapter or in later parts of this thesis.

1.1 Definitions

In order to discuss the foreign accent phenomenon, some basic terminology has to be defined first. There are often terminological inconsistencies or ambiguities in the literature on second language acquisition or foreign accent research – even with respect to basic terminology.
The following sections provide an overview of the problems with defining some of these terms and of their usage in the literature. As mentioned above, this overview is concerned only with the domain of pronunciation relevant to foreign accent research. This means that topics like (second) language proficiency or attainment in language learning will be discussed primarily from a phonetic and phonological point of view, with pronunciation in mind.

1.1.1 Accent

The first term which needs to be discussed here is accent. This is a highly ambiguous term. There are two meanings of accent which are of interest in this thesis:

1. The term accent is often used in the (general) meaning of a way of speaking. This can be either a characteristic regional pronunciation or the pronunciation of a specific group of people or non-native speakers – i. e. a foreign accent (which will be further discussed in this thesis).
2. On the other hand, accent is often used synonymously with stress or emphasis placed on a particular syllable or word. The terms word accent and pitch accent are often used in these latter cases.

Up to this point the term has already been mentioned several times in this thesis. In all of these cases, “accent” was used in the first meaning mentioned above. In most cases there is no confusion about the interpretation of the term, as it can be inferred from the context. In texts on second language acquisition, sociolinguistics or dialectology, for example, it is most likely that “accent” is used in the first sense to refer to a specific way of speaking. Works on phonology or prosody most likely use the term “accent” in the second sense to refer to stress or emphasis. However, in works covering both areas, the term can be highly ambiguous. Jilka (2000), for example, uses the term to refer to “foreign accent”, “pitch accent” and “word accent”.
In this case, the meaning of the term cannot be understood without the context, as can be seen in sentences like the following one: “The total of 1300 pitch movements they produced is measured and compared instrumentally, further analysis being restricted to the tonal movements associated with accented words” (Jilka, 2000, p. 46). The terms “accent”, “accentedness” or “accented” will be used in this thesis only in the first sense, to refer to a specific (marked) pronunciation – especially the pronunciation of non-native speakers (the following sections provide more in-depth discussions of the associated terms and concepts). The experiment presented in chapter 6 covers some aspects of German phonology. To avoid confusion in that context, the word stress will be used exclusively to refer to the prominence or emphasis of a certain syllable within a word, the so-called word or lexical stress – even though “(word) accent” is often used in the respective literature¹.

¹ The considerations in this section also hold true for the German language and the respective usage of the term “Akzent” in the literature. This is especially important in chapter 6, which cites several sources on German phonology.

1.1.2 Language acquisition vs. language learning

Some researchers explicitly distinguish between language acquisition and language learning. The former is used to refer to the unconscious, effortless, spontaneous process which young children go through when they are exposed to one or more languages and begin to speak. The latter term refers to the process of consciously learning a second language later in life. The term acquisition thus might refer to a naturalistic setting, while the term learning might refer to an instructional setting like the language classroom. Language acquisition and language learning are not distinguished in this thesis, and the terms will be used interchangeably according to the cited sources.

1.1.3 Native language – First language

The question of what exactly constitutes a person’s native language is not as easy to answer as it might seem. Several different criteria have been proposed for the definition of what makes a specific language a person’s mother tongue or native language. Often, the term native language is not defined at all in the literature on second language acquisition or foreign accent. Further, the terms native language and first language (abbreviated L1) are often used interchangeably. There are also several other terms. Lenneberg (1967), for example, uses the term primary language. This section will provide an overview of various possible linguistic definitions of what constitutes an individual’s native language.

As a theoretical approach to a definition of the term native language one might consider a person’s proficiency, (grammatical) competence or performance in a language². The language a person “knows best” could then be called the native language of this person. Although there are numerous tests assessing language proficiency (e. g. the “Test of English as a Foreign Language – TOEFL” or the “Test of Spoken English – TSE”, or the “Deutsche Sprachprüfung für den Hochschulzugang – DSH” for German), there is, unfortunately, no generally accepted measure of proficiency. There is also no general agreement upon the terms competence and performance. This thesis will not go into the details of the discussion on these two concepts, despite its general relevance to the issue of second language acquisition.

Frequency of language use seems to be another good candidate for the definition of a person’s native language. By this criterion, a person’s most frequently used language would be her or his native language.
This could also be tested by additional or more specific usage criteria such as a person’s language of counting, of speaking to oneself, of swearing, dreaming and so on. In bilingualism research the term dominant language is used to refer to cases with more or less objective (testable) criteria like the ones mentioned so far (i. e. to a person’s most frequently used language or the language a person knows best). This term is used in contrast to a person’s non-dominant second languages. Still, definitions like these are usually not used explicitly in second language acquisition literature.

In most cases, the criterion used to assign the label native to an individual’s language is the age at which the respective language has been learned or acquired (compare section 1.1.2). The language (or the languages) learned from infancy is said to be the native language. In this case, the terms native language and first language are really synonymous. However, this criterion is not without problems either, even though it may appear to be so at first sight (compare for example the case of bilingual speakers in section 1.1.5).

As mentioned above, it is often the case in studies found in the literature that all these different criteria are met by one and the same language – or at least it seems to be so, as there are often no further considerations regarding the validity of the label native language. Thus, a person’s first learned language might also be his or her most frequently used language as well as the language in which he or she has the highest competence or performance. The criteria taken into account in the various studies can often only be inferred from the descriptions of the examined subjects, due to the lack of explicit information on this matter.

As linguistically motivated as the above-mentioned criteria might be, they presuppose a precise definition of a language.
As Li (2005) points out, “there is no simple answer to the question ‘what is a language?’ ” as there is no strictly linguistic definition of the term, or of the distinction between language and dialect. Beddor and Gottfried (1995) mention some problems that may arise with the designation of a person as a native speaker of a particular language. The language a person learns at home during early childhood might be different from the one used in school – and this will be exactly the case in a lot of societies worldwide. The home language can also be different from the language(s) of the surrounding society a child grows up in. It is almost impossible to determine a person’s language experience precisely and exhaustively.

² The three terms proficiency, competence and performance will be used in an informal sense in this thesis without further formal definition. In general, the terms proficiency and competence both refer to the linguistic knowledge about a specific language or the ability to produce and perceive it, while performance refers to the actual usage of a language.

Many languages in the world show considerable dialectal diversification. The term language will be used in this thesis to refer to any linguistic variety or code in general, i. e. to both a language and to what some might refer to as a dialect. Scovel (1969, p. 249) mentions an interesting operational definition of a dialect as being the second language an adult can, given enough exposure, learn to speak without a foreign accent. Depending on the listeners and the dialectal diversity of a language, dialectal differences might be perceived as foreign and vice versa (Major, 2001). There are cases of experiments where utterances of native speakers were (supposedly incorrectly) rated as having a foreign accent. One such case is described by Scovel; further examples are cited for example by Long (2005).
Considering single instances (not the overall rating of a speaker), Flege et al. (1995) report in one experiment a 2.9 % rate of “misclassifications” of the native speakers’ samples. In some cases the authors attribute such confusion in judgment or other inconclusive results to different dialectal backgrounds of the participants: Mack (1989), for example, examined (among others) speakers of “Canadian-French” and “French from France”. The study found inconclusive results on VOT discrimination, and Mack theorized that a “careful study of cues to the /d-t/ contrast in various French dialects would help clarify the source of the bilinguals’ apparent perceptual confusion”. A study reported by Bongaerts (1999) examined L2 learners who reported having been trained in Received Pronunciation – the supraregional standard variety of British English. The study included a control group of native English speakers from the south of England and from the Midlands. The judges, on the other hand, were native English speakers from the north of England. As a result, the average ratings for the native speakers were rather low, and half of the examined group of highly successful L2 learners received higher ratings than the native speakers (see section 5.5 for a discussion of such methodological issues). However, the general problem of distinguishing between language and dialect will not be further discussed in this work.

Thus, examiners trying to assess the production or perception of a foreign or native language are more or less bound to rely on self-identification as a native speaker as the only practically applicable criterion in their selection of subjects. Another possible identification criterion would be the judgment by other speakers of a person as a native speaker of their language. But this again comes with the above-mentioned problem of not being objectively testable because of the various (more or less unknown) factors influencing speech perception.
However, from a statistical point of view, the identification of a given person as a native speaker by a group of independent judges seems to be of higher validity than the judgment of a single person (namely the speaker her- or himself). An example of contradictory judgments of native-likeness is given by Mack (1989, p. 188). Two of her English-French bilingual subjects rated themselves as “slightly more proficient” in French. They were not judged as being native speakers by other French native speakers and received higher ratings in English than in French – despite their self-declared higher proficiency in French.

With respect to the problem of foreign accents, the last “definition” of native language given above could be reformulated as follows: a speaker’s native language is the language he or she produces without a (perceivable) foreign accent. This, however, is a circular definition, as the term foreign accent presupposes the concept of a native language (or a native language standard).

To conclude this section: for the scope of this thesis the terms native language and first language, short L1, will be used synonymously. An explicit criterion for the designation of a speaker’s native language will not be decided upon. As this section shows, this is not an easy task. The examples cited above show that this is not just a theoretical or terminological problem. A lot of inconsistencies and disagreement on this issue can be found in the literature. As a consequence, I will adopt the terminology and the speaker classifications of the respective cited sources without detailed examination of their underlying views on this issue. However, where needed, problems obviously attributable to vague, inconsistent or varying definitions of native language or native speaker will be addressed. There will also be a discussion of the ramifications for methodology in foreign accent research (see chapter 5).
1.1.4 Foreign language – Second language

Difficulties similar to those with the definition of native language arise with the definition of foreign language. As the term also presupposes the definition of language, the respective considerations mentioned above apply here as well. Without going into too much detail, a foreign language is a language an individual does “not know” (yet). The term second language, L2 for short, is used to designate a particular foreign language which a speaker subsequently learns or acquires, i. e. at a later stage in life after the native language has already been established (at least to some degree). A distinction between second and third language will not be made in this thesis. It is, however, important to note that such distinctions are sometimes made, while most authors use the term second language exclusively. Second language is thus used as a cover term in this thesis for any language or languages acquired after the first. It is also worth noting that the terms foreign language and second language sometimes seem to be used synonymously.

1.1.5 Bilingualism

To make all these considerations somewhat more complicated, it has to be mentioned that many – if not most – people worldwide do not grow up with just one language and can thus be considered bilinguals (Major, 2001; Li, 2005). The term bilingual is far from being used consistently in the literature. There is an important distinction between its usage within the literature on foreign accent and second language acquisition in general and its usage in the context of bilingualism research. In the former, the term is mainly used in its broadest sense to refer to people who have learned a second language or are in the process of doing so – regardless of their level of proficiency or the respective age at onset of learning. Such a usage can be found for example in works by Flege and Fletcher (1992); Flege et al. (1999); Guion et al. (2000) or Levi et al. (2007).
In such cases the labels native language or L1 are easily assigned to one language and L2 to another. Within the field of bilingualism research, however, the term is most often used in a much narrower sense to refer to people who can use two languages in conversational interaction, often at such a proficiency level where code-switching, the systematic change from one language to another in the course of conversation, takes place.³ Subjects of research within this field are often adult speakers who have learned or acquired a second language long before the time of examination or children growing up multilingual (Li, 2005). In these cases the designation of a single native language can be difficult and so the term is often not used at all. To avoid terminological confusion,⁴ the term bilingual will be used in this thesis exclusively to refer to so-called early bilinguals, namely people who have acquired two (or more) languages from early childhood on – either simultaneously or subsequently. As it is an obvious prerequisite for foreign accent research to examine subjects who actually speak more than one language, the label bilingual can be misleading and is not needed to refer to these examined speakers in general.

³ Note that code-switching is a characteristic language mode of bilinguals and not a phenomenon caused by L2 learning deficiencies. In other words, code-switching is not an instance of a “syntactical foreign accent”.

⁴ Li (2005, p. 6) gives in table 0.1 an example of 37 different terms which have been used in the description of bilingualism.

1.1.6 Interlanguage phonology

The term interlanguage is used to denote a “separate linguistic system” which is responsible for a learner’s “attempted meaningful performance” in a target language norm (Selinker, 1972, p. 214).
In other words, interlanguage is used to denote a linguistic system of an L2 learner which is (a) separate from that learner’s L1 system and (b) employed to express a communicative utterance in an L2. This interlanguage system is usually different from the system of an idealized L2 native speaker (i. e. the L2 norm). It is also important to mention that this system is not static but constantly changing over time – as is the case with all language knowledge.

1.1.7 Foreign accent

It is not easy to give a precise definition of foreign accent. There is no generally accepted definition among researchers. The case is comparable to the difficulties of defining such diversely interpreted yet commonly used terms as native language or bilingualism. Flege (1987a) defines foreign accent as “the perceived effect of many discrete and general differences in sounds produced by native and non-native speakers”. Another approach defines foreign accent as “. . . all those features of interlanguage speech which differentiate learners according to their native language backgrounds” (Ioup, 1984, p. 2). This means foreign accent can also be seen as a manifestation of incomplete mastery of an L2 phonetic system. Long (1990, p. 255) defines incomplete mastery of an L2 as the “objectively identifiable differences between the underlying linguistic knowledge systems of SL [= L2] speakers and monolingual native speakers of a language”. The expressions overall foreign accent or degree of foreign accent will be used in this thesis to refer to foreign accent from the Flegean point of view cited above – i. e. foreign accent as the perception of a speaker as non-native. This is important to note, since foreign accent – and especially its degree – is usually measured by the judgments of listeners who are native speakers of the respective language.
Thus, most of the studies cited in this thesis did not explicitly examine all the individual segmental or supra-segmental deviances in the speech of L2 learners. Jilka (2000, p. 9) emphasizes that “only those deviations that are perceived as such can be considered instances of foreign accent”.

1.2 Summary

Not only the vague use of terminology but, more seriously, its practical consequences can lead to unreliable or incomparable results in foreign language studies. The term accent will be used exclusively in reference to the foreign accent phenomenon. It thus refers to the (perceived) differences on the phonetic and phonological level in the speech of non-native speakers. With respect to word stress or emphasis the term accent will not be used, even though it is often used synonymously in the respective literature. The terms native language and first language (L1) will be used synonymously in this thesis. They will be adopted from the cited sources, but terminological problems or their practical consequences will be addressed where necessary. Caution is needed whenever the issue of native speakers and comparisons of different speakers is touched upon. The same holds true for issues related to the concept of native language or language comparison in general. The term second language (L2) is used as a cover term for any language (or languages) acquired after the first. The term bilingual or bilingual speaker will be used to refer only to those individuals who have either acquired two or more languages simultaneously in early childhood, or who have acquired a second “native” language at an early age. Diverging use found in the cited sources will not be adopted. A distinction between language learning and language acquisition will generally not be made in this thesis. It is, however, addressed in section 2.3, where the influence of formal instruction on degree of foreign accent is discussed.
Chapter 2 Factors affecting degree of foreign accent

Not every L2 learner reaches the same level of proficiency. Some attain native-like pronunciation; some retain a strong foreign accent. Such everyday observations of differences in degree of foreign accent lead to the question: what affects foreign accent? Examining the speech of non-native speakers, researchers aim to find out more about the phenomenon of foreign accent and possible factors affecting its degree. This chapter gives an overview of variables in experimental studies on L2 speech production and foreign accent. Various factors influencing degree of foreign accent have been identified in a wide range of studies. Most of these factors correspond to characteristics of the examined speakers like gender, age or their previous language experiences. From the point of view of a phonetician, who usually examines speech production, perception or its acoustic features, such variables could be called extralinguistic, since they correspond primarily to characteristics of the examined speakers and not to acoustic features of the L2 speech utterances. Acoustic characteristics of the speech signal which contribute to the perception of foreign accent are discussed in chapter 3. One of the most important of these external factors seems to be the age of an individual at the beginning of L2 learning or acquisition – abbreviated as age of learning (AOL). Other factors supposed to be important for the discussion of degree of foreign accent are the age at which an L2 learner moves to an L2 speaking community and the length of his or her residence within such a surrounding. These and other factors are discussed in the following sections, beginning with speaker-dependent factors. At the end of this chapter some speaker-independent factors affecting degree of foreign accent are addressed.
2.1 Affective and psychological factors

Among the factors affecting degree of foreign accent, some affective or psychological factors like motivation, language learning aptitude, native language loyalty or the learner’s IQ have been proposed by various researchers.

2.1.1 Motivation

Motivation has been suggested by some authors as one possible factor influencing the degree of foreign accent. Some of the suggested motivational variables are for example “professional motivation”, “integrative motivation”, “instrumental motivation” or “concern for L2 pronunciation accuracy” (Flege, 1987a; Piske et al., 2001). Piske et al. reviewed several earlier studies that examined the influence of motivation on the degree of foreign accent. They conclude from the existing literature that “most studies [. . . ] have reported at least some influence of motivation on the outcome measures”, but they also emphasize that motivation has only little general effect and that it has not been quantified precisely. A factor like a person’s motivation (whether intrinsic or extrinsic) cannot be measured directly (or easily) with the usual methods within the scope of experimental phonetics. A deeper knowledge of the psychological research and theories of motivation is needed for a more comprehensive study of the influence of motivation on a speaker’s foreign accent. Although motivation is often mentioned as one (possible) factor affecting degree of foreign accent, motivation remains in most cases a vague concept that is not as extensively examined as other factors. For example, Piske et al. state that professional motivation “may be a potent factor for groups of subjects who are required by their profession to speak an L2 without a foreign accent, but not so much for ordinary immigrants”. This assumption, however, is then not further examined (Piske et al., 2001, p. 211). Motivation is often measured by using a questionnaire.
In such studies, the subjects are for example required to rate the importance of good or accent-free pronunciation of their L2 (e. g. Flege et al., 1995, 1999). In addition, the factors affecting motivation have to be considered as well. How (if at all) do motivation and its influence on L2 pronunciation change with the age of learning? Are there differences in motivation between early and late learners? Does a speaker’s motivation to speak without an accent decrease or increase with respect to the length of residence or his or her L2 use patterns? How does motivation change with increased proficiency? Piske et al. left these questions open for future research. Whatever the influence of motivation on degree of foreign accent might be, it is assumed – as already mentioned – that it is only of relatively little importance.

2.1.2 Language learning aptitude

Language learning aptitude is another variable that influences the degree of foreign accent. It is also a variable that is more psychological than linguistic in nature. A case study by Novoa et al. (1988) describes a male native English speaker with an above-average “talent” for foreign languages. He has acquired five languages with reported native-like performance, all of them after the age of 15. After learning French, German and Spanish in high school, he learned Moroccan Arabic “with unusual ease relative to his peers” and “picked up” Italian. Novoa et al. explicitly state that native speakers attested the speaker’s lack of a foreign accent in all these languages. One conclusion the authors draw is that “generally superior cognitive functioning is not necessary for exceptional second-language acquisition”. The examined speaker had an average IQ, average musical abilities and an average ability for manipulation of “abstract verbal concepts”. The causes for such L2 learning abilities cannot be determined from this single case study alone.
The authors observed that the reported case is “consistent with some theories of second-language acquisition and contradicts others”. Cases of such highly successful L2 attainment are attributed to individual characteristics, which distinguish these learners from “normal” people. It is argued that such cases are “superexceptional” and, by themselves, provide no counterevidence to theories which predict that a native-like attainment in a second language is only possible below a certain age (Bongaerts, 1999; the relevance of age of learning is discussed in section 2.7). Piske et al. (2001, p. 202) emphasise that studies relying on controlled conditions in their examinations of aptitude factors are not conclusive. Suggested factors affecting degree of foreign accent are for example musical ability or the ability to mimic unfamiliar speech sounds. The latter has been identified by various studies as a significant predictor of foreign accent. The emergence and nature of such abilities is still not fully understood, however.

2.2 Gender

Gender has been found in some studies to affect degree of foreign accent. Flege et al. (1995) for example observed that below the age of learning of 12 (see section 2.7) the examined female speakers received higher ratings for their pronunciation (meaning their speech samples were perceived as less accented) in comparison to male speakers. Above the age of 16, however, the ratings for the male speakers were higher. Whether such results are generalizable is not clear, as earlier studies have provided divergent results. Piske et al. (2001) conclude from a review of the existing literature that the available results do not “lead to any strong conclusions”. In contrast to the above cited findings, some studies suggest that the effect of gender on degree of foreign accent “may vary as a function of AOL and amount of L2 experience”.
In summary, there is conflicting evidence, and the influence of gender on L2 acquisition and foreign accent still remains to be determined.

2.3 Formal L2 instruction

The introductory section 1.1.2 addressed the difference between language acquisition and language learning. Usage of these terms in the literature might sometimes be confusing. According to the definition given in section 1.1.2, research on second language acquisition actually studies language learning, namely the process of consciously learning a second language, usually after early childhood. This concept of second language acquisition can be further divided into naturalistic L2 acquisition (“outside the classroom”) and foreign language teaching (“inside the classroom”) (see e. g. Wode, 1980). Although some authors explicitly differentiate these concepts, this distinction is not made in most of the studies on foreign accent. Piske et al. (2001, p. 200) summarize that the various studies which examined the influence of formal instruction on degree of foreign accent “have not produced encouraging results for language teachers”. According to their review, there is no clear evidence that formal instruction affects degree of foreign accent. In other words, degree of foreign accent does not appear to depend on whether the L2 learner receives formal instruction inside a classroom or learns the language without a language teacher. One explanation they provide for this finding is the little attention that pronunciation receives in most foreign language classrooms. Contrary to these findings, they cite studies that revealed an effect of instructional variables under certain controlled conditions.¹ Such findings lead to the conclusion that “formal instruction” is too general a term to be used without in-depth examination of the methods used and the precise circumstances of the formal instruction provided to the examined learners.

¹ E. g. “intensive training in the perception and production of English sounds” or special “‘prosody-centered’ phonetic training”.

2.4 L1 background and L1-L2 combination

The term L1 background refers to a speaker’s native language(s) and, in a more general sense, to the sum of his or her previous language experience. As foreign accent research necessarily involves two languages, the L1 background of a speaker will be discussed together with the L1-L2 combination as a factor affecting the degree of foreign accent. As opposed to the majority of factors discussed in this chapter (which are extralinguistic in nature), these two are based on the linguistic structures of the languages involved. L1 background and L1-L2 combination are often discussed in the literature on a generally global level (without studying acoustic characteristics of non-native speech). This section gives a short overview of such global considerations on L1 background and L1-L2 combination. The most trivial precondition for the emergence of a foreign accent is that the native language of a person is different from a spoken second language; otherwise a foreign accent will not appear in that person’s speech² (compare section 1.1.3). Questions which need to be addressed with respect to a speaker’s L1 are: (1) Does the L1 background affect a speaker’s L2 foreign accent – and how? (2) Is an L1-L2 combination of closely related, i. e. phonetically similar, languages more likely to result in a foreign accent, or is the opposite the case, namely that the more dissimilar two languages are, the more likely foreign accents become? These are rather non-trivial questions upon which there is still disagreement. It is a popular observation that the origin of non-native speakers, i. e. their L1 background, can be recognized from a characteristic way of speaking. Wode (1980, p. 127) mentions that “there are specific error types that are simply characteristic of a given L2/L1 combination”.
Similar views can be found in a wide range of literature on foreign language teaching about typical errors and difficulties for the learners of a particular language. One such “typical error” that has been frequently examined is the confusion of English /ɹ/ and /l/ by native Japanese speakers; the two sounds form a phonemic contrast in English but not in Japanese (e. g. Iverson et al., 2003; Yamada, 1995). That different native languages lead to language-specific patterns of foreign accents was experimentally demonstrated for example by Ioup (1984). The study she describes aimed to show that the native language of a speaker influences an L2 system mainly at a phonological level (and not at a syntactic one). One of her findings was that native speakers are able to group non-native speakers according to their respective native languages based on phonological cues, i. e. on their foreign accent. Thus, different native languages lead to recognizably different patterns of foreign accents. This effect that a person’s native language can have in shaping his or her foreign accent in an L2 is attributed by some authors to a phenomenon called transfer (Ioup, 1984; see section 4.3 for further discussion). Descriptions of frequently observed mispronunciations of L2 learners come for example from foreign language teachers (see e. g. Ortmann, 1976). However, it is not possible to draw general conclusions from such effects of L1 background on L2 phonetics, i. e. its segmentals, syllable structure or prosody.

² Though extremely rare, there are cases where a person (after serious brain damage) appears to be speaking his or her native language with a “foreign accent”. This frequently discussed, so-called foreign accent syndrome will not be considered here, as it is a pathological phenomenon which goes beyond the scope of this thesis and is generally not in the focus of foreign accent research.

Wode (1980) points out the great pronunciation variability found not only among L2 learners. The observed typical pronunciation errors have to be interpreted as “marking the range of the phonological variation of learners” (Wode, 1980, p. 127) rather than as actual error predictions. So, the answer to the first question, whether a given L1 background affects a speaker’s L2 foreign accent, seems to be yes. However, whether this takes place in a precisely predictable way cannot be answered with certainty. It is not just a popular cliché that the native language of a speaker can often be recognized from a characteristic foreign accent. The L1 background of a speaker affects L2 foreign accent in a way characteristic for all speakers of that L1. It has to be determined to what degree this general observation can be used to predict particular phonetic deviances that a learner of the L2 is likely to make. As Piske et al. (2001, p. 193) emphasize, the effect of L1 on degree of foreign accent remains uncertain.

2.4.1 Language distance

Is there a correlation between the phonetic or phonological similarity of two languages and degree of foreign accent? If there were predictable difficulties or limits on ultimate attainment for the learner of a language, this would provide a means of predicting a characteristic foreign accent. That difficulties in learning to pronounce an L2 are not independent of the native language can be concluded from the finding cited above that speakers of the same native language realize an L2 with similar foreign accents – apparently specific to their L1. Wode compared “typical errors” made by learners of English as an L2 with different L1s, which showed that the mispronunciations made by the L2 learners depend on their L1 (Wode, 1980, p. 133). It is a widespread popular belief that the more similar two languages are, the more easily the foreign language is learned.
And indeed, the view that the typological language distance affects (a) the rate of acquisition and (b) the ultimate attainment is supported by second language research literature (Long, 2005). Brière (1966) showed in an experiment that target language sounds which have close equivalents in the L1 of the learner – either phonetically or on a phonemic level – are easier to learn than sounds that do not have such equivalents. The opposite observation is reported by Flege and Hillenbrand (1984), who conclude from a study with native English learners of French that French /y/, a sound that has no counterpart in English, was produced with relatively great accuracy. On the other hand, the French /u/ sound was produced incorrectly by a group of “inexperienced” learners, and somewhat better, albeit incorrectly as well, by “experienced” learners. Similar results were obtained for French /t/. English learners failed to produce it with short-lag VOT values typical for monolingual native speakers of French. These findings contradict the aforementioned assumption that the more similar the L2 sounds are to L1 sounds, the easier they may be to learn. Flege et al. (1995) cite earlier findings where Chinese speakers of English, with an average age of arrival in the United States of 7.6 years, were rated³ significantly lower for their spoken English sentences compared to native English speakers, whereas native Spanish immigrants, with an average age of arrival of 6, received ratings not significantly different from native speakers (compare section 2.6.2). Some authors attribute such differences between groups of learners according to their L1 to the typological language distance between the two languages (e. g. Long, 2005).
Mildner and Horga (1999) point out that the typological differences between the sound systems of L1 and L2 may tell something about possible areas of difficulty in L2 acquisition, “but not necessarily about the cues in L2 perception or production that non-native speakers may use differently than the native ones”. In a study on the relations between L2 proficiency and the acoustic features of vowels, they examined two languages that are typologically distant with respect to their vowel spaces. The reported study examined the production of English vowels by native Croatian speakers. English, i. e. Received Pronunciation, which was the examined variety of English, has eleven monophthongal vowels while Croatian has only five. One of their conclusions is that native speakers of Croatian reorganize the English vowel space according to Croatian principles. Regardless of their proficiency in English, the Croatian speakers relied heavily on duration in distinguishing between the English vowels /i/, /u/ and /ɪ/, /ʊ/ (where Croatian has only two categories /i/ and /u/) and not only on spectral features like native English speakers. The level of proficiency in English was found to have “no significant effect on either the position of most vowels in the vowel space or the trade-off between duration and quality cues”. In reference to a possible hierarchy of learning difficulty, Brière (1966, p. 795) points out that such hierarchies must be based on “exhaustive information at the phonetic level, rather than on descriptions solely in terms of distinctive features or allophonic memberships of the phoneme classes”. He mentions the problems of comparing “convergent” and “divergent categories” of L1 and L2 based on allophonic descriptions. At the phonetic level, sounds of the L2 “are never really equal” to L1 sounds. The above-mentioned difference between English /t/ and French /t/ is one such example.

³ Compare chapter 5 for methodological issues.
The study of foreign accent (and L2 acquisition in general) in conjunction with contrastive analyses of particular L1-L2 combinations led for example to the contrastive analysis hypothesis and to theoretical explanations which attribute foreign accent phenomena to phonetic transfer and interference (see section 4.3 for further discussion). Flege uses his “speech learning model” to explain findings indicating that similar sounds in L1 and L2 are more difficult for the learner than dissimilar sounds (see section 4.5). Whether an appropriate measure of language distance can be formulated is not clear. Evidence from existing studies is not conclusive as to whether such constructed hierarchies of difficulty would reveal that greater similarity increases or decreases difficulty of learning and thus contributes to degree of foreign accent. If language distance is a factor affecting degree of foreign accent, this influence might be formulated in terms of a function of the variables L1 and L2, permitting predictions for the shape or degree of foreign accent. Existing research on that issue suggests that a relationship between L1, L2 and degree of foreign accent exists. However, whether findings from individual studies are generalizable is debatable, since only a few language combinations have received in-depth attention so far. Most studies examined English as the target L2. Other often examined target languages are French, German, Spanish, Hebrew or Dutch. The list of the examined subjects’ L1s includes, apart from the already mentioned languages, the following ones: Arabic, Chinese (Mandarin, Taiwanese), Italian, Japanese, Korean, Persian, Russian, Swedish, Thai or Turkish (see e. g. Long, 2005; Piske et al., 2001). Compared to the large number of existing natural languages and the range of possible L1-L2 combinations, this is only a small sample.
The sample is even smaller, considering the fact that not even all of the possible combinations from the above-mentioned list of languages have been thoroughly examined so far.

2.4.2 L1 proficiency and influence of L2 on L1

A phenomenon related to the above-mentioned considerations is the influence of an L2 on a speaker’s L1, which might result in a partial or complete loss of L1 proficiency. This phenomenon is called individual first language attrition. Piske et al. (2001) assume that a high level of proficiency in L1 is more likely to be maintained if a speaker uses it frequently, even after living a long time in a predominantly L2 speaking surrounding. They also state, related to this issue, that the self-estimated L1 proficiency is significantly correlated with degree of foreign accent, such that higher self-estimated L1 proficiency correlates with higher degrees of foreign accent. This factor, however, does not affect degree of foreign accent independently of age of learning (see below). On the other hand, Guion et al. (2000) found no differences in L1 proficiency, although their subjects (Quichua-Spanish and Korean-English speakers⁴) varied in amount of L1/L2 use. An L2, i. e. exposure to it in an L2 speaking surrounding or the use of it, can change the production of a speaker’s L1. Such findings are cited for example by Piske et al. (2001). Differences in a speaker’s L1 are by definition not the subject of interest in foreign accent research. However, the effect of L2 on a speaker’s L1 might have general implications that have to be taken into account in studies on foreign accent. This is related to the already mentioned problem of defining criteria for native-likeness. Methodological implications are discussed in section 5.5. The possible effects of L1 use will be discussed in the following section.
2.5 Language use patterns

Besides general L1/L2 language use, several contextual settings of language use can be distinguished: language use at work, at home (i. e. with the family, the partner, spouse etc.), with friends or in other social situations and so on. Flege et al. (1995) report that language use factors account for 15 % of variance in the foreign accent ratings, making language use the second most important factor in their analysis (preceded only by age of learning, see below). The most important language use factor for male subjects was language use at work, followed by social use. For female subjects the most important factor was, according to this report, social use, followed by home use (Flege et al., 1995, p. 3131). Piske et al. (2001) found that continued frequent L1 use contributed to a significantly stronger foreign accent, a result also reported by Flege et al. (1997). The same effect was found for late learners as well as for early learners, i. e. bilinguals. Even temporal changes in language use patterns are reported to affect speech production, which leads to some implications for methodological issues (see chapter 5). Piske et al. (2001) also found that self-estimated L1 use is another independent predictor of foreign accent. Note that the earlier mentioned self-estimated L1 proficiency was not found to be an independent factor in the same study. Guion et al. (2000) confirmed the conclusion that amount of L1 use affects L2 production. The examined Quichua-Spanish speakers who used their L1 more frequently had significantly stronger accents than the examinees with lower L1 use. Flege et al. (1997, p. 184) even theorized that the presence of another language may have been the most important difference between the examined monolingual native English speakers and the Italian-English speakers (and not their age of learning English). Other studies, on the other hand, found no significant effect of language use on degree of foreign accent (cited in Piske et al., 2001).
It has to be mentioned that the hypothesized effects of language use affect not only degree of foreign accent in an L2 (or general L2 proficiency), but also proficiency in L1. The phenomenon of first language attrition has already been briefly addressed in section 2.4. In fact, language use is said to be one major cause for L1 attrition. However, the domain of phonetics and phonology has received only little in-depth attention in the research on first language attrition (Seliger and Vago, 1991).

⁴ Guion et al. (2000) refer to the examined speakers as “bilinguals”, but according to the definition adopted for this thesis, this is not appropriate, as the speakers in the first-mentioned group all “learned Quichua as a first language at home and later learned Spanish as an L2 when they began school or work”, thus, later than “early childhood”. The Korean-English speakers do not meet the criteria either (compare chapter 1.1).

2.6 Exposure to L2 surrounding and amount of L2 experience

In considering the influence of the exposure to an L2 speaking surrounding on L2 learners’ pronunciation, two important factors can be observed: (1) the amount of exposure, and (2) the age at which a learner is first exposed to an L2 surrounding. The amount of exposure is often measured by the length of residence (LOR) of a learner in an L2 speaking surrounding. The LOR is also supposed to be a testable indicator of amount of L2 experience (Piske et al., 2001). The age of first exposure is usually measured by an immigrant’s age of arrival in an L2 speaking country.

2.6.1 Length of residence

The length of residence (LOR) of a non-native speaker in an L2 speaking surrounding is reported by some authors to affect degree of foreign accent. It is the second most frequently studied variable on degree of foreign accent (after AOL, see below).
A significant influence of LOR on foreign accent ratings by native listeners is reported for example by Flege et al. (1995). On the other hand, several studies provide contradicting findings (reviewed e. g. by Long, 1990, or Piske et al., 2001). A possible explanation for contradicting findings regarding the effect of LOR is provided by Flege (quoted in Piske et al., 2001). He hypothesizes that LOR affects L2 learners only in an “initial phase of rapid learning” (see 2.7 below). According to this view, the L2 learner proceeds faster through early stages of learning and is in this phase affected by the language surrounding. Studies which did not find a significant effect of LOR are hypothesized to have examined too narrow a range of LOR values. In summary, the existing literature indicates that the effect of LOR seems to decrease as a learner’s level of proficiency increases. For highly experienced learners, additional years of residence are unlikely to change degree of foreign accent significantly. However, this does not at all mean that there are no changes in L2 (or L1) proficiency over time (Piske et al., 2001).

2.6.2 Age of arrival

The age of first exposure to an L2 is often equated with the age of an individual’s arrival in a predominantly L2 speaking country. Many studies have examined the proficiency of immigrants in the dominant language of their new country. The age of first exposure to an L2 surrounding of immigrants is commonly called “age of arrival”, AOA for short. Sometimes it is assumed (or simplified) that the AOA also marks the onset of L2 acquisition (see next section), so these two variables are equated and used interchangeably (e. g. Flege et al., 1995).
2.7 Age of L2 learning (AOL)

The age of an individual at the beginning of second language learning, age of learning (AOL) for short, is by far the most frequently cited and examined factor on foreign accents, and according to some authors the most important one (Flege et al., 1995; Long, 1990; Piske et al., 2001). A common view is that “earlier is better” and, it can be added, that “later is faster” (Long, 1990, 2005). In other words, AOL affects not only the ultimate attainment but also the rate of L2 learning. In early stages of learning, older learners proceed faster than younger ones. On the other hand, the earlier in life a child starts with L2 acquisition the higher the level of ultimate attainment will be, including the ability to achieve accent-free pronunciation. However, even an early start before the age of four is no guarantee for a native-like pronunciation of an acquired L2.

Flege et al. (1995, p. 3128) examined native listeners’ judgments on the pronunciation of English sentences of 240 native Italians (with a control group of 24 native speakers). AOL⁵ was found to be the most important factor, accounting for an average of 59 % of variance in the foreign accent ratings. With an AOL of less than 4 years the percentage of non-native speakers who received native-like ratings was 78 %. On average, the ratings of the speakers’ pronunciation were significantly lower only above an AOL of 7.4 years and they decreased with increasing AOL. None of the speakers with an AOL above 16 years received native-like ratings. Contradicting evidence of higher pronunciation abilities of late learners compared to early learners is explained by the initial rate advantage of late learners over younger ones. It is argued that studies suggesting better performance of late learners were actually examining learning rate, confusing it with ultimate attainment (Long, 1990, 2005).
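The figure of 59 % of variance reported above can be read as a coefficient of determination. As a minimal sketch of what such a figure expresses, and assuming a simple linear relation between AOL and mean rating (an assumption made here for illustration only, not the analysis used by Flege et al.), the following Python lines compute the share of rating variance explained by AOL. All values and the function name `variance_explained` are invented for this example.

```python
from statistics import mean


def variance_explained(x, y):
    """Return R^2 of a least-squares line y = a + b*x,
    i.e. the share of the variance in y that is explained by x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return (sxy * sxy) / (sxx * syy)


# Invented toy data, NOT taken from any study:
# age of learning in years and mean accent rating (9 = native-like).
aol = [3, 5, 8, 10, 13, 16, 19, 21]
rating = [8.6, 8.9, 7.1, 7.8, 5.2, 5.9, 3.4, 4.1]

r2 = variance_explained(aol, rating)
print(f"AOL accounts for {r2:.0%} of the rating variance in this toy sample")
```

A value of 0.59 would mean that 59 % of the variation in the ratings can be predicted from AOL under the fitted model, which leaves the remainder to other factors such as language use or LOR.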
Long (1990, 2005) argues that only young learners can attain a native-like level of proficiency, but will not necessarily do so. After an AOL of six the achievement of accent-free pronunciation begins to become unlikely for most learners. With an AOL above twelve, native-like attainment in L2 pronunciation is said to be generally impossible. Long derives these age limits from a review of the findings of various studies. Other authors propose different AOL limits above which accent-free pronunciation, i. e. native-like ultimate attainment, is said to be unlikely or impossible. The general effect of AOL on degree of foreign accent (“earlier is better”) is nevertheless accepted by most researchers.

Older learners on the other hand have an initial rate advantage over younger ones. They acquire early stages of an L2 faster than younger learners. According to Long (1990) this effect lasts only for a shorter period in the acquisition of phonology compared to the acquisition of other linguistic domains, and, to be measurable, a minimum amount of exposure to the L2 sounds is needed. This initial advantage does not guarantee successful acquisition, however. Late learners usually do not succeed in acquiring L2 phonology without a foreign accent. Neither does an early start guarantee native-like performance compared to monolingual native speakers (Flege et al., 1997).

AOL is a factor influencing L2 acquisition in general, but the effect is not the same in all linguistic domains. The most often examined domains in L2 acquisition are morphology and syntax. According to Long (1990) the upper limit of native-like attainment in the domains of morphology and syntax is around age 15 (i. e. later than the proposed AOL limit for native-like attainment in pronunciation). The effect of AOL on other domains, like semantics, pragmatics or lexis, still needs to be determined.
Although such findings are important for a theoretical explanation of the observed influence of AOL on foreign accent and L2 acquisition in general, they go beyond the scope of this thesis. An overview of proposed theories explaining the foreign accent phenomenon and its relation to AOL is given in chapter 4, especially in the section on the critical period hypothesis.

⁵ Flege et al. (1995) use AOA and AOL interchangeably (compare previous section).

2.8 Speaker-independent factors

Besides factors dependent on speaker characteristics, some speaker-independent factors have also been suggested to affect degree of foreign accent. However, these factors deal with the perception of the listener rather than the speech production of the speaker. As only the latter is within the focus of this thesis, such factors cannot be discussed here in detail. Only a short overview will be given.

The previous sections focus on speaker-dependent factors, i. e. factors that depend on the language experience of the L2 learner or other individual characteristics. Such factors, like a speaker’s L1 background or age of learning, have been identified to affect a non-native speaker’s speech production and thus to contribute to that speaker’s L2 foreign accent. Besides such speaker-dependent factors, experimental studies have revealed that the perception of foreign accent is not exclusively based on the “many discrete and general differences in sounds produced by native and non-native speakers” (Flege, 1987a). The degree of foreign accent also depends on the listener’s perception.

Levi et al. (2007) for example examined the effects of listening context and lexical frequency on the perception of foreign-accented speech. Speech samples were presented to the listeners in two different contexts: one exclusively auditory and one combined context with additional orthographic display of the spoken samples. They found that high frequency words, i. e.
words occurring more often in a language⁶, are consistently perceived as less accented than low frequency words. This effect was attenuated by additional orthographic presentation of the spoken words to the listeners. The “auditory+orthography context” also had the effect that native speakers were generally perceived as less accented and non-native speakers as more accented. Additional acoustic analysis of the speech samples confirmed the existence of differences between native and non-native speakers. These acoustic differences, however, were not correlated to lexical frequency and thus did not account for the observed perceptual effects.

Other speaker-independent factors that were found to affect the perception of foreign accent are (1) the resolution of the rating scales used, (2) the elicitation techniques, (3) the proportion of native speakers among the examinees, (4) the range of L2 pronunciation proficiency included in the rating set, and, finally, (5) the linguistic experience of the listeners.

Southwood and Flege (1999) suggest that rating scales with fewer intervals may produce ceiling effects. This means that scales which are not sufficiently sensitive, i. e. scales which have too few points, cannot be used to reveal small differences between native and non-native speakers.

Different elicitation techniques also affect the perception of foreign accent. Long (2005, p. 302) questions the validity of data obtained from limited, controlled samples of performance as indicators of the overall L2 abilities. With respect to the measurement of foreign accent on the other hand, it is argued that samples above the level of isolated words would possibly be distorted by phenomena from other linguistic domains like prosody, morphology, syntax or lexis. The general rule is that the more natural and language-like a speech sample is, the less native-like it is likely to be perceived or rated by native language listeners.
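The ceiling effect that Southwood and Flege (1999) attribute to coarse rating scales can be illustrated with a small sketch. It assumes, purely for illustration, that each speaker has an underlying continuous score between 0 and 1 which a listener maps onto an equal-interval scale. The function name `to_scale` and both scores are hypothetical and not taken from any of the cited studies.

```python
def to_scale(score, points):
    """Map a continuous score in [0, 1] onto an equal-interval
    rating scale with the categories 1..points."""
    return min(points, int(score * points) + 1)


# Invented underlying scores (1.0 would be "fully native-like").
native, near_native = 0.97, 0.85

for points in (3, 9):
    print(points, "point scale:",
          "native ->", to_scale(native, points),
          "| near-native ->", to_scale(near_native, points))
```

On the 3-point scale both speakers land in the top category, so the difference between them disappears, whereas the 9-point scale still separates them.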
According to Flege and Fletcher (1992) the proportion of native speakers among the group of speakers under investigation affects the degree of perceived foreign accent. The more native speakers are included, the more accented the non-native speakers are perceived to be. A similar problem, which is stressed by Long (2005), is the varying level of proficiency of the speakers included in the data which are to be rated. He states how judges “may be fooled into accepting some of the near-native samples as native” because of the presence of obviously non-native samples. Flege and Fletcher conclude that ratings of foreign accent are not absolute, but influenced by the range of the talkers’ levels of proficiency included in the data which are to be rated by the listeners.

It has been found that linguistically inexperienced, “naïve”, listeners tend to perceive a stronger degree of foreign accent than linguistically trained listeners like linguists or foreign language teachers (Piske et al., 2001).

⁶ The frequencies of words are usually derived from language corpora. Levi et al. (2007) used the CELEX database.

The above cited results suggest that factors influencing the perception (or the rating) of foreign accent exist which are actually independent of a speaker’s speech production. Even though such speaker-independent factors are not relevant for the acoustic analysis of foreign accented speech, they have implications for the theoretical description and the understanding of the foreign accent phenomenon. They also have implications for methodological issues regarding experiment designs in foreign accent research (see chapter 5).

2.9 Summary

If one thing can be concluded from the various findings in L2 research, it is that multiple speaker-dependent factors affect degree of foreign accent. Gender may be a factor, but its exact influence on foreign accent is a matter of disagreement among L2 acquisition researchers.
Similar conclusions can be drawn for factors like motivation, language learning aptitude or formal instruction. Further, it has been suggested that contrastive analyses of the L1 and L2 might reveal sources of predictable difficulties for the L2 learner which contribute to degree of foreign accent. There is no doubt that a speaker’s L1 affects foreign accent. The exact nature of this influence and the role of language distance or phonetic similarity between L1 and L2 are still a matter of disagreement. Early learners are more likely to speak an L2 accent-free, but do not necessarily do so. In general, overall degree of foreign accent increases with increased AOL. The proportion of L1/L2 language use does also affect degree of foreign accent. The more a speaker uses his or her L2, the less it is likely to be foreign-accented. Another important factor seems to be the age at first exposure to a predominantly L2-speaking surrounding as well as the length of residence of a learner within such a surrounding. In general, the lower the AOA and the longer the LOR, the lower the degree of foreign accent is expected to be. In addition it has been briefly mentioned that there are factors to degree of perceived foreign accent which are speaker-independent. Such factors are for example (in experimental settings): differences in the used rating scale, the elicitation techniques, the context and composition of data which are presented to the listeners, and the listeners’ linguistic experience. Such factors might not be relevant in acoustic examinations of foreign accent but they have to be considered by the experimenter in order to eliminate unwanted influences. In summary, foreign accent has to be seen as a relative phenomenon that changes over time and is dependent on various speaker-dependent variables and characteristics of the L1 and L2, as well as on speaker-independent factors, i. e. on the listeners’ perception and the surrounding circumstances. 
Chapter 3

Phonetic and phonological manifestations of foreign accent

This chapter provides a short overview on acoustic, i. e. phonetic and phonological manifestations of foreign accent and the research on this matter. In section 1.1.7 it has been mentioned that judgments on foreign accent are usually based on the overall impression of a speaker’s pronunciation. Such impressionistic judgments do not explicitly refer to the various deviances in the speech of the non-native speaker. Often, aspects like intelligibility or acceptability are involved in judgments about a speaker’s degree of foreign accent.

Anderson-Hsieh et al. (1992) enumerate the main areas of pronunciation which need to be examined in foreign accent research as follows: segmentals, prosody (suprasegmentals), syllable structure, and voice quality. Of these “areas of pronunciation” only the first is within the scope of this thesis, i. e. the segmentals. The errors¹ that are observed within this domain are usually categorised as either substitutions or modifications of single sounds. There is, however, no strict distinction between these two. What counts as a substitution and what as a modification depends on the examined L1-L2 combination. If a non-native speaker realises an L2 sound in such a way that it resembles another sound from his or her L1 or L2, then it can be called a substitution, for example the pronunciation of English [ð] as [d]. A modification, then, is a deviance from the L2 norm which has no (obvious) corresponding L1 sound, for example the realisation of plosives with a voice onset time (VOT, the time between the release of a stop consonant and the onset of voicing) which corresponds neither to the L1 of the speaker nor to the L2.

Flege (1987b) observed that “the aim of most instrumental studies has not been to establish which dimension(s) contribute(s) most importantly to foreign accent, but to determine to what extent² L2 learners [...] differ from native speakers”.
In other words, the focus of studies where acoustic deviances are measured lies on the magnitude of the examined deviances, and not on the question of which acoustic dimensions exactly contribute (most) to the perception of a foreign accent. Brennan et al. (1975) report on a study which suggests that the degree of foreign accent is correlated to the amount of segmental mispronunciations, i. e. the amount of segmental deviances from the native speakers’ norm (see chapter 5 for a further discussion of this study by Brennan et al.).

¹ The term “error” is widely used in the literature to refer to deviances from the norm of the examined L2. Usage of this term in this thesis does not imply reference to an idealised, prescriptive pronunciation, however.

² Emphasis in the original.

Flege (1987b) emphasizes that acoustic measurements can in some cases reveal information about a non-native speaker’s language knowledge (about a categorical, i. e. phonological, contrast for example) which might not be perceived by the listeners. Learners might produce a systematic, measurable difference between sounds which belong to two different phonological L2 categories in a different way than native speakers do: either along a different phonetic dimension or on a different scale, both of which might be ignored by native listeners or be imperceptible to them.

3.1 Segmentals I: Consonants

One of the most often examined consonant features in foreign accent research is VOT (e. g. Flege and Hillenbrand, 1984; Flege, 1987b; Mack, 1989). Other reported mispronunciations of L2 consonants include all kinds of substitutions, e. g. the substitution of English [ð] with [d] by native Italian speakers (Flege et al., 1995, p. 3132), or the realisation of [θ] as [t], [f] or [s] by native German speakers (Wode, 1980, p. 132).

3.1.1 VOT

Several studies examined VOT values of plosives, e. g. the production of English [p, t, k] by native Spanish speakers.
Flege (1987b) states that these English sounds were realised with VOTs greater than those of the corresponding Spanish sounds, but nevertheless shorter than the VOT values of native speakers. This confirms other observations that the VOT values in the speech of L2 learners often take intermediate values between the L1 and the L2 norm. If the target language sounds have longer VOT values than the L1, overshooting can sometimes be observed (that is, the speakers produce the respective L2 sounds with VOT values that are too large).

The VOT measurements reported by Flege and Hillenbrand (1984) showed that even experienced French-English³ speakers produced VOT values which were higher than those typical for native French speakers (and thus more English-like). The authors speculate that late learners “will never succeed in producing L2 stops with complete accuracy when stops in their native language differ substantially in VOT from those in L2” (p. 717). The French speakers who were classified as proficient speakers of English produced French [t] with higher VOT values than those of monolingual French speakers, revealing an influence of their L2 on their L1.

Mack (1989) examined the speech of a group of ten English-French bilinguals (mean AOL: 4.5) who were all “judged native speakers of English” by native speakers (p. 188). In one experiment, the examinees read English CVC words. The acoustic analysis focused on the VOT values of English [d] and [t]. No significant differences between the English-French speakers and a monolingual English control group were found. A second experiment with the same examinees focused on the English vowels [i] and [ɪ]. The measured features were vowel duration and the first three formant frequency values. The analysis revealed that there were almost no significant differences in pronunciation between the bilingual and the monolingual speakers.
The bilinguals produced significantly more vowels with a decreased F2 value (of at least 50 Hz) from the midpoint of [i] to its offset. Mack concludes that the phonetic system of bilinguals “approximates, but does not match, that of monolinguals” (see following section). The more important finding for the present thesis is that, despite the fact that all of the bilingual speakers in the study by Mack were judged as native speakers of English, there were measurable, significant acoustic differences between the two groups⁴. The speech of the bilinguals was rated on a 10 point scale with a mean rating of 9.3. However, the effect of amount or degree of the acoustic deviances on the ratings (i. e. the degree of foreign accent) was not examined.

³ See notational conventions on page 7.

3.2 Segmentals II: Vowels

Studies on non-native vowel production usually examine vowel duration⁵ or formant frequency values (e. g. Flege and Hillenbrand, 1984; Flege, 1993; Levi et al., 2007; Mack, 1989; Mildner and Horga, 1999). A correlation between vowel quantity and vowel quality can be found in various languages and these two features are usually examined in combination. This correlation is also examined in the experiment presented in this thesis (see chapter 6).

Flege and Hillenbrand (1984) analysed samples of French [ty] and [tu] syllables spoken by American English and French speakers. They measured VOT (see previous section) as well as formant frequencies for F1, F2 and F3. They found that the native American English speakers produced French [y] better than [u] (compare section 4.5). They also found that the native French speakers (with English L2) produced French [u] with higher F2 than monolingual French speakers do, resembling the English [u].

Flege (1993) examined the production and perception of the English word-final /t/–/d/ contrast by speakers from China and Taiwan and a native American English control group.
The measured acoustic dimension was vowel duration. A general observation is that all speakers produced longer vowels in the examined /b d/ carrier context than in the corresponding /b t/ context. However, all non-native speakers produced smaller differences in vowel duration than the native speakers. Although the study revealed a correlation between perception and production, the non-native speakers were in general more similar to native speakers in the perception of vowel duration differences than in the production of these differences.

In their earlier mentioned study, Mildner and Horga (1999) examined the relations between proficiency in English of native speakers of Croatian and their acoustic vowel spaces of English vowels. Recorded speech samples of a group of 20 native speakers of Croatian were rated for proficiency by “10 university professors of English” (with unspecified language backgrounds). The ratings were then compared to the results of acoustic analyses of vowel formant values and durations. Statistically significant differences between F1 or F2 values in the speech of the examinees and the English norm (Received Pronunciation) were found in 8 out of 11 vowels. Vowel duration was significantly different from the English norm, and the non-native speakers used mainly duration as a distinction between English /i/, /u/ and /ɪ/, /ʊ/, regardless of the respective speakers’ level of proficiency.

⁴ There were also significant differences in perception, which are not discussed here. However, the results were similar for both perception and production, and the two groups were “nearly indistinguishable”.

⁵ With respect to vowels, the terms duration, length and quantity will not be distinguished in this thesis. It should be noted however, that the terms duration and length are frequently used to refer to articulatory, auditory or acoustic aspects, while quantity is used to refer to the respective phonological feature of vowels (Ramers, 1988).

3.3 Phonotactics

Mispronunciations or deviances from the language norm in the domain of phonotactics (i. e. the possible sound sequences) and syllable structure involve insertion, deletion or metathesis (reordering) of sounds. Flege et al. (1995) for example mention mispronunciations like the insertion of “schwa-like sounds” at the end of “red”, or omission of word-final consonants in “good” or “carrots” by native Italian speakers.

A series of three experiments by Altenberg (2005) examined judgement, perception and production of English word-initial consonant clusters by native Spanish speakers. The non-native speakers “behaved, overall, like the native English speakers” in the judgment task, which suggests comparable linguistic knowledge in both groups (within this restricted domain of word-initial consonant clusters). As is the case in the other studies cited here, Altenberg (2005) did not examine the contribution of the individual “types of modification” to the overall pronunciation rating. The mispronunciations were determined by phonetic transcription by two linguistically trained judges. Of a total of 88 production errors, 79 were made in consonant clusters that were phonotactically permissible in English (the speakers’ L2) but not in Spanish (the speakers’ L1); the majority of these involved word-initial epenthesis (insertion) of [ɔ], [ɛ], [e] or [ʔ]. Epenthesis of a vowel in English word-initial consonant clusters by native Spanish speakers is a frequently reported phenomenon, e. g. the pronunciation of “school” as [ɛskul]. An interesting finding of Altenberg’s production experiment is that the production (but not the perception) of word-initial consonant clusters correlates with overall pronunciation proficiency, i. e. with the degree of foreign accent.

Another reported error in the domain of syllable structure is devoicing of word-final consonants.
This is a frequently observed phenomenon and is reported for non-native English speakers, for example native German or Italian speakers (Flege et al., 1995).

3.4 Suprasegmentals

Piske et al. (2001) note that most studies on foreign accent have focused on segmental phenomena and that only a few have examined suprasegmentals (e. g. Jilka, 2000).

In their above cited study, Anderson-Hsieh et al. (1992) examined the relation between the ratings which native speakers assigned to L2 speakers’ speech samples and the general deviance in segmentals, prosody and syllable structure. They investigated previously recorded SPEAK Test⁶ data of 60 speakers from varying language backgrounds and of varying levels of proficiency. From the available test data, only a reading passage was used to ensure that only pronunciation skills were evaluated. The samples were rated for pronunciation and phonetically analysed. Only overall ratings for segmentals, prosody and syllable structure were determined. Single deviances from the native speaker norm like wrong VOT or formant values were not explicitly examined. Their conclusion from the statistical analysis can be summarised as follows: of the three examined domains of pronunciation, prosody is most strongly associated with the rating of pronunciation. These results are consistent with two of three earlier studies cited by Anderson-Hsieh et al. in the same report. However, only a few studies so far have examined the contribution of prosody to degree of foreign accent.

Jilka (2000) examined the contribution of intonation to the perception of foreign accent, analysing German speech of native American English speakers and American English speech of native German speakers and the perception of these productions.
He concludes that “intonation is by far the most important prosodic factor contributing to foreign accent in relation to other prosodic factors such as rhythm or speaking rate”. In comparison to segmental foreign accent, intonational aspects were found to be clearly “of lesser importance” (p. 175). Piske et al. (2001, p. 212) on the other hand conclude, after providing a detailed review of the existing literature on foreign accent research, that the examined evidence “does not allow one to quantify the relative contribution of segmental parameters, prosodic parameters and fluency to degree of foreign accent in an L2”. They emphasize the close relation between segmentals and suprasegmentals and how it is “difficult to draw a clear distinction between the two”.

⁶ The Speaking Proficiency English Assessment Kit, or the SPEAK Test for short, as referred to by Anderson-Hsieh et al., is a test that uses forms from the Test of Spoken English (TSE) developed by the Educational Testing Service (http://www.ets.org). It is used to assess a speaker’s English speaking and comprehension skills like comprehensibility, pronunciation, grammar and fluency and it is not explicitly focused on the foreign accents of the examinees.

Despite the fact that most studies have examined segmentals, the exact contribution of individual phonetic deviances in the production of segmentals to (perceived) foreign accent seems to be unclear. Flege et al. (1995, p. 3132f) cite an experiment which presented digitally processed recordings of speech samples to native English listeners for foreign accent ratings. The original recordings contained sentences spoken by native Italian and native English speakers. The processed sentences “preserved only amplitude and F0 variations”. The unprocessed recordings of the native speakers received, as expected, better ratings than the recordings of the non-native speakers. Interestingly, this was also the case for the processed recordings.
Flege et al. conclude that “prosodic dimensions in the NI [native Italian] subjects’ production of English sentences were sufficient to cue foreign accent”.

3.5 Voice quality

Anderson-Hsieh et al. (1992) note that, in comparison to other domains of pronunciation, voice quality has not been well examined in second language or foreign accent research. Aspects of voice quality are (if at all) primarily addressed in the context of vowel distinctions (i. e. the articulatory settings). This issue is further discussed in chapter 6, which describes an experiment on German vowel production by non-native and bilingual speakers. Unfortunately, no literature on voice quality in foreign accent research could be reviewed for this thesis. Besides such general remarks as cited above, voice quality seems to have received only little attention in foreign accent research.

3.6 Summary

Brennan et al. (1975, p. 35) suggest that a study of the correlations between degree of foreign accent and the relative frequency of various deviances in L2 speech “would indicate which features serve as the strongest cues of accentedness to listeners”. Findings like those cited here suggest that not all measurable acoustic deviances from the L2 norm (in the production of segmentals) are necessarily perceived as manifestations of a foreign accent. Which types of mispronunciations contribute to the overall degree of foreign accent cannot be concluded from the reviewed literature. Studies which focus on specific acoustic dimensions seem not to be as much concerned with overall degree of foreign accent as are the numerous studies on “external” factors cited in the previous chapter. Unfortunately, the literature does not provide any more than a vast collection of (too) specific case studies, which is not enough to draw general conclusions as suggested by Brennan et al. above.
Although the domains of segmentals and syllable structure received the most attention in foreign accent research, studies on prosodic deviances in non-native speech indicate that prosody is an important factor affecting the perception of foreign accent. According to some studies, it might even contribute more to degree of foreign accent than the domain of segmentals does.

Chapter 4 Theories on foreign accent

In the previous chapters, various observations and findings of foreign accent research have been discussed. This chapter provides a short overview of various theoretical explanations of how and why foreign accents emerge, and of models and hypotheses associated with the foreign accent phenomenon. As foreign accent can be seen as a specific phenomenon within the broader field of second language acquisition research, there are several theoretical frameworks which do not explicitly focus on foreign accent but are nevertheless important for the theoretical approach toward this phenomenon. First, the most general theoretical frameworks (with respect to foreign accent), namely universal grammar and the critical period hypothesis for language acquisition, are discussed.

4.1 Universal grammar and foreign accent

The interlanguage system of an L2 learner is, according to Major (2001), composed of parts from the learner’s L1 system, parts from the target L2 and linguistic “universals”. Such universals of second language acquisition can be, for example, effects like overgeneralisation, simplification or overdifferentiation. In general, all those errors which an L2 learner makes that cannot be attributed to L1 transfer are called universals (see section 4.3 for a discussion of language transfer). One theoretical framework which is concerned with (second) language acquisition claims that one part of these linguistic universals can be described by the concept of a universal grammar (UG).
This framework addresses such phenomena in early language development as the apparent comprehension of words which the child cannot or does not yet imitate, or the imitation of words which the child apparently does not understand. Children all acquire their first language at approximately the same age and go through the same stages of development, regardless of the varying circumstances they grow up in or the various languages they are exposed to (Lenneberg, 1967; Long, 1990; White, 1989). The competence which children acquire seems to go beyond the input they receive – which is, according to White, underdetermined, often degenerate, and which does not contain negative evidence (White, 1989, p. 4ff). The proposed solution to such problems is that some fundamental language knowledge, i. e. some basic linguistic competence, does not have to be acquired by the child but is innate. These innate fundamental linguistic structures supposedly underlying all natural languages are called universal grammar (see footnote 1).

Of interest for foreign accent research are (1) that part of UG which is concerned with the phonetic form, called universal phonetics (Chomsky, 1967), and (2) the predictions which UG makes for L2 acquisition and the explanations it offers for its problems. The general view in UG research today is that UG is involved in L1 as well as in L2 acquisition but that it operates in different ways, i. e. UG (as an underlying so-called learning device) is only partly accessible in L2 acquisition – either directly or via the L1. One explanation for the differences between child and adult language acquisition is that UG is utilised in competition with other general problem-solving abilities. These general problem-solving abilities are neither powerful enough nor restrictive enough for the acquisition of linguistic structures of a natural language, it is argued.
Restrictiveness has been shown to be important in L1 acquisition. It limits the range of possible language structures, aiding in the acquisition of native language competence from underdetermined input. The reliance of adult learners on general problem-solving abilities and not on UG leads to “inefficient, slower and incomplete learning” (Long, 1990). Meisel points out that “. . . until the relationship between UG and other factors is spelled out more clearly, it is virtually meaningless to state that UG is a learning device”. He concludes, referring, in fact, more to syntax than to phonology, that UG is not directly accessible to the second language learner in the same way as it is accessible in L1 development (Meisel, 1991). Although UG might provide theoretical explanations of foreign accent, the majority of UG research seems to be concerned primarily with syntax and morphology – and not with L2 pronunciation and foreign accent. Long (1990) argues, that cognitive explanations for the differences between child and adult language acquisition, like the hypothesis of UG competing with general problem-solving abilities, fail to account for the different constraints on different linguistic domains (compare sections 2.7 and 4.2). 4.2 The critical period hypothesis Varying theories and explanations for the differences between child and adult language acquisition have been proposed under the notion of the critical period hypothesis for language acquisition (CPH). As formulated by Lenneberg (1967), the CPH states that there is a critical period for the successful acquisition of language. This period is limited by “cerebral immaturity” at its beginning and “termination of a state of organizational plasticity linked with lateralization of function” at its end (Lenneberg, 1967, p. 176). 
The CPH attributes changes in the ability of language acquisition to the biological development of the human being (namely the brain) and in that way it postulates maturational constraints on successful language acquisition. Thus, the CPH predicts a fundamental, biologically determined difference between first and second language acquisition. Lenneberg states that a language is acquired automatically from mere exposure during the critical period, while after the end of that period a language has to be learned consciously. The proposed end of the critical period is subject to much debate in language acquisition research and its literature. The end of the critical period is used to divide L2 learners into two distinct groups: early learners with AOL below the end of the critical period and late learners with AOL beyond that limit. With respect to foreign accent, it is this end of the critical period that has been the focus of many of the studies on ultimate attainment (Bongaerts, 2005). Applied to the area of L2 pronunciation, the prediction of the CPH is that the ability for complete acquisition of a phonological system is irreversibly lost after the critical period has passed, and so a foreign accent will inevitably be a feature of the speech of late learners. Lenneberg points out that “foreign accents cannot be overcome easily after puberty”. Long (1990) concludes that “the ability to attain native-like phonological abilities in an SL [= L2] begins to decline by age 6 in many individuals and to be beyond anyone beginning later than age 12, no matter how motivated they might be or how much opportunity they might have”.

1 Note that the term universal grammar is used to denote both the proposed linguistic system, i. e. the “grammar”, and the corresponding theoretical framework.
Birdsong and Molis (2001) summarise the various versions of the CPH in three basic criteria which experimentally obtained data has to meet in order for the CPH to be true:

• Prior to the end of the critical period the level of attainment in L2 learning should be negatively correlated with the AOL. After the critical period has passed there should be no correlation between AOL and level of attainment, as this would suggest factors other than maturation.
• There should be no late learners who attain a native or near-native level of performance in an L2.
• Maturational constraints on ultimate attainment in L2 acquisition should be independent of L1 and L2.

Bongaerts (2005) summarises the testable predictions of the CPH almost identically:

• Related to the AOL, a discontinuity in the level of L2 proficiency should occur at the end of the critical period.
• There should be no late learners who attain a native or near-native level of performance in an L2.

In summary, these criteria state that the constraints on attainment in L2 acquisition and performance predicted by the CPH should be independent of variables not related to maturation. However, as Long (1990) emphasizes, the CPH “does not explain the phenomena to which it is applied, but is itself to be explained”. Several different explanations for the effects predicted by the CPH can be found in the literature – only a brief overview of which can be given here. Some authors suggest social, psychological or affective factors as an explanation for the observed differences related to the age of a learner. Others attribute differences between child and adult language acquisition to type or amount of input or various cognitive factors.

4.2.1 Nature not nurture

A neurological explanation for the observed ability of children to completely acquire an L2 phonological system and the inability of adults to do so without a foreign accent is provided by Scovel (1969).
His claims are based on the CPH as described by Lenneberg, but he modified it in such a way that its predictions are limited to L2 pronunciation. He refers to the so-called Joseph Conrad phenomenon (see footnote 2) as one of “many instances of adults learning the syntax of a second language completely and yet not being able to lose a foreign accent when speaking”. British writer Joseph Conrad is referred to as one example of a late learner of English as an L2. He acquired the language to a native-like level except for pronunciation, where he is said to have had a strong Polish accent. Scovel claims that it is not nurture that enables children to completely acquire an L2 system and that prevents adults from doing so. Referring to Lenneberg (1967), he points out that “the simultaneous occurrence of brain lateralization and the advent of foreign accents is too great a coincidence to be left neglected”. Thus, Scovel claims that cerebral lateralization and the emergence of foreign accents in L2 speech are correlated. However, in contrast to Lenneberg he concludes that this affects only the area of pronunciation and not other aspects of an L2, like syntax or lexis. Scovel attributes this difference between pronunciation and lexical or syntactic proficiency to the involvement of neurophysiological mechanisms in the production of sound patterns. The lexical and syntactic patterns, on the other hand, lack such “neurophysiological reality”, as Scovel assumes.

2 Although Scovel does not use this terminology with respect to that particular example, it is commonly referred to as the “Joseph Conrad phenomenon”, e.g. by Major (2001).

4.2.2 Problems with the critical period hypothesis

Explanations for the decreasing attainment in L2 acquisition with increasing AOL, like the one stated above, “all have problems” (Long, 1990). One problem is the homogeneity observed in L1 acquisition.
Children go through the same stages in L1 development at around the same ages, regardless of their motivation, attitude, social circumstances, cognitive abilities or the amount or quality of input they receive. Theories attributing differences between child and adult language acquisition to such factors have to explain why these factors are irrelevant in child acquisition but not in adult acquisition. In addition, such explanations generally fail to account for different constraints on the various linguistic domains. The theories would have to explain why L2 phonology is not affected by the proposed maturational constraints in the same way as other linguistic domains are – syntax and morphology for example (Long, 1990; compare section 2.7). Birdsong and Molis (2001) found, in replicating an earlier study which tested grammaticality judgments of learners on L2 morphology and syntax, that the performance of the examinees was related to their ages both before and after the assumed end of the critical period. In his survey of second language research, Bongaerts (2005) cites several studies that found no evidence for a discontinuity but a “quite linear” decline in L2 pronunciation proficiency related to AOL before as well as after the proposed limit(s) of the critical period. Figure 4.1 illustrates the proposed critical period for language acquisition, overlaid with examples of “quite linear” declines of native-likeness as a function of AOL. The horizontal axis represents age. The grey area marks the critical period according to Lenneberg (1967). The three vertical dashed lines mark the various limits of the “sensitive periods” according to Long (2005). The first line marks the end of the ability to attain a native-like accent “for many individuals” at the age of six years. The second line marks the end of the ability to attain accent-free pronunciation “for the remainder” at the age of 12.
The third line marks the end of the sensitive period for morphology and syntax at the “mid-teens” (included for completeness). The overlaid coloured graphs show results from measurements of accentedness. The curve labelled “FMM” is adapted from Flege et al. (1995, p. 3128, figure 2) showing mean ratings of sentences spoken by native English (age 0) and native Italian speakers (ages above 0). The curves “B&M” and “J&N” are regression lines adapted from Birdsong and Molis (2001, p. 240, figure 3). The results (the number of “correct” sentences) are partitioned into two groups: the first representing “early arrivals” and the second “late arrivals” above the age of 16. “B&M” marks results obtained from Birdsong and Molis and “J&N” marks results from a study by Johnson and Newport which was cited and replicated by Birdsong and Molis.

Figure 4.1: Critical periods: The critical period according to Lenneberg (1967) and Long (2005) with overlaid degree of foreign accent as a function of AOL according to Flege et al. (1995) (FMM) and Birdsong and Molis (2001) (B&M and J&N).

According to Lenneberg, the ability to acquire a language (to a certain degree) after the end of the critical period does not contradict the CPH. He attributes this ability to the mechanisms established during L1 acquisition and the existence of similar (or “universal”) fundamental structures in all natural languages (see section 4.1). Possible counterevidence to the CPH for the acquisition of L2 pronunciation seems to come from late learners who achieve a high, native or near-native level of pronunciation proficiency in an L2 despite a late onset of learning. Long (1990) explicates that “a single learner who began learning after the period(s) have closed and yet whose underlying linguistic knowledge [. . .
] was shown to be indistinguishable from that of a monolingual native speaker” could serve as such counterevidence. The fact that there are not just a few “superexceptional” cases of (almost) native-like mastery of an L2 phonological system poses a difficult problem to the CPH. Birdsong and Molis (2001, p. 244) conclude from a review of earlier studies: “In most studies where nativelike attainment is found, subjects who perform at nativelike levels comprise about 5-20% of the sample”. This is a proportion that cannot be dismissed. They refer to studies on various aspects of L2 acquisition and not just those examining L2 pronunciation. However, they point out that native-like performance in L2 “phonetics and phonology [. . . ] has been demonstrated in several studies by Bongaerts and his colleagues”. Bongaerts (1999) reports on a series of three studies examining the L2 pronunciation of very advanced late learners. Each of the studies examined a carefully selected group of “highly successful” learners and compared them to a control group of native speakers and another group of L2 learners with the same L1 background but varying levels of proficiency. The studies show that a native-like performance in an L2 is not impossible to achieve for late learners. While the differences between the groups of native speakers and the highly successful late learners were still significant, some of the non-native speakers received ratings that consistently matched the criterion for native-likeness as used by Flege (1995) (see section 5.5). Findings like these can be interpreted as counterevidence to the CPH for pronunciation – or at least as counterevidence to its strongest version, which predicts that there is an absolute age limit for the acquisition of a native-like accent in an L2. Bongaerts also points out that native-like attainment in L2 pronunciation is only an exceptional phenomenon.
The reported very successful late learners with native-like pronunciation represent obviously only a minority. For example the third study discussed by Bongaerts (1999, p. 143ff) found only three subjects with native-like performance, out of a group of nine pre-selected very successful late learners. From the comparison group of 18 learners at different levels of proficiency on the other hand, no one achieved native-like ratings. Bongaerts (2005) concludes his review of several studies on native-like attainment of late learners with two findings. First, attainment of native-like levels of proficiency of late learners, with AOL “sometimes well beyond” the end of the critical period, is possible. Second, such attainment of native-likeness is possible even for learners of typologically distant languages. Findings like this second one refer to the above quoted third criterion by Birdsong and Molis. It is argued that maturational (biological) constraints on ultimate attainment in L2 acquisition should be language-independent. Long (2005), supporting the CPH, criticises studies restricted to typologically related languages. Bongaerts (2005) points out that there are studies which found native-like speech by learners from (supposedly) typologically distant languages. However, Long (2005) argues that native-like attainment by late learners has typically been found in studies which examined limited samples only and not “more natural language use”. 4.2.3 A sensitive period Instead of a critical period, the modified concept of a sensitive period has been introduced. It is used to explain the gradual increase of foreign accented speech with respect to AOL that shows no sharp discontinuity as predicted by the CPH. However, the terms are often used interchangeably in the literature (Long, 1990; Piske et al., 2001). 
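The contrast between the discontinuity predicted by the CPH and a “quite linear” decline can be made concrete with a small numerical sketch. The following Python fragment is only an illustration under stated assumptions: the data points are invented, the cutoff of 12 years is just one of the limits mentioned above, and the function names are hypothetical. It splits learners at the cutoff and computes the correlation between AOL and attainment separately for early and late learners, which is the comparison implied by the first criterion of Birdsong and Molis (2001).

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def split_correlations(aol, scores, cutoff):
    """Correlate AOL with attainment separately below and above the cutoff."""
    early = [(a, s) for a, s in zip(aol, scores) if a <= cutoff]
    late = [(a, s) for a, s in zip(aol, scores) if a > cutoff]
    return pearson(*zip(*early)), pearson(*zip(*late))

# Invented ages of learning (in years); NOT data from any cited study.
aol = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

# Profile A (hypothetical): attainment declines linearly across all ages.
linear_scores = [9.0 - 0.3 * a for a in aol]

# Profile B (hypothetical): decline up to the cutoff, then an unsystematic plateau.
plateau_scores = [9.0 - 0.3 * a for a in aol[:6]] + [5.4, 5.6, 5.6, 5.4]

linear_early, linear_late = split_correlations(aol, linear_scores, cutoff=12)
plateau_early, plateau_late = split_correlations(aol, plateau_scores, cutoff=12)
```

Under these assumptions, only profile B matches the first criterion (a negative correlation before the cutoff and none after it), whereas profile A shows the same negative relation on both sides, which is the pattern that the sensitive period concept is meant to accommodate.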
4.2.4 Summary

Thus, while results like those reported by Bongaerts (1999) seem to suggest that there is no critical period for the acquisition of native-like L2 pronunciation which ends at around the time of puberty, the differences between early and late learners remain obvious. Wode (1980) criticises theories and approaches to L2 acquisition which are restricted to specific structural domains as “fairly unenlightening for determining the nature of man’s language learning system” (in general). He raises questions such as whether (supposedly innate) mechanisms in L1 acquisition function in L2 acquisition as well or whether they can be manipulated by formal instruction, and he argues that these problems were not investigated apart from the framework of the CPH. Referring to this issue he states that “Lenneberg did not consider appropriate L2 data”. Similar considerations led Bongaerts (2005) to point out the “unfortunate consequences” of this concentration on the CPH in L2 research: the almost exclusive focus on AOL and the basic assumption that ultimate attainment is primarily a function of age. Referring to the various formulations of the CPH and the problems associated with it, Bongaerts quotes Singleton who concludes: “the CPH cannot plausibly be regarded as a scientific hypothesis [. . . ] it is like the mythical hydra, whose multiplicity of heads and capacity to produce new heads rendered it impossible to deal with it”. Long (2005), on the other hand, attributes much of the supposed counterevidence for the CPH to the overgeneralised usage of the term critical period hypothesis and to serious methodological shortcomings, some of which will be discussed in chapter 5 (even though he is concerned with the CPH, some general conclusions for methodological issues in L2 research can be drawn from his remarks).
4.3 Contrastive analysis, phonetic transfer and interference Transfer is a major learning strategy, which is not restricted to language acquisition but can be found in a wide range of learning situations. The question, how a given L1 background affects a speaker’s production of a second language, as discussed in section 2.4, addresses the phenomenon of (phonetic) linguistic transfer. Phonetic or phonological transfer is the process of carrying over certain features or principles from the L1 system to another language L2. The study of L2 acquisition (and foreign accent), in conjunction with contrastive analyses of specific language combinations of L1 and L2 led to theoretical explanations which attribute foreign accent phenomena primarily to phonetic transfer and interference. The basic assumption (or claim) behind this approach is that all foreign accent phenomena as well as other L2 errors can be attributed to linguistic transfer – and that learning difficulties and errors can be predicted based on contrastive analyses of the respective languages. The descriptions of transfer and the theories on this issue are not restricted to foreign accent, but cover all kinds of linguistic domains, like syntax, morphology etc. Two kinds of transfer can be distinguished according to their outcome: negative and positive transfer. Positive transfer is the process of carrying over L1 features to another language which results in correct L2 expressions. Negative transfer takes place when carrying over L1 features to the L2 results in incorrect L2 expressions. The latter case is also called interference. However, the two terms transfer and interference are sometimes used interchangeably. In the domain of second or interlanguage phonology the term transfer is usually used to account for cases of negative transfer of L1 sounds or features into a target language L2, or the transfer of phonological rules from L1 to L2. 
According to such considerations, it should be theoretically possible to formulate a hierarchy of difficulty for any given pair of L1 and L2. Such a hierarchy could be constructed by a contrastive analysis, systematically comparing the sound systems of two languages. As an example, from a structural analysis, native German speakers might be expected to mispronounce word-final voiced consonants in English, because their native language has no voicing contrast in that position. This approach of contrastive analysis and the attribution of all L2 errors to linguistic transfer is (or maybe was) one of the central concepts in research and theories on second language acquisition and bilingualism. There are problems with this approach, however. One question, for example, is why transfer affects L2 phonetics and phonology more than other domains like syntax. Another problem with this approach towards foreign accent is that not all errors can be attributed to transfer effects (Major, 2001). Ioup (1984) for example concludes that “transfer is the major influence on interlanguage phonology” (emphasis in the original). She adds that the question is not whether phonetic transfer occurs or not, but how it affects the process of second language acquisition and why it “is so much more a predominant force in shaping the interlanguage phonology than in shaping the interlanguage syntax” (p. 14). Mack (1989) reports that early bilinguals can show monolingual-like performance in their dominant language without transfer effects from their weaker language. This can be interpreted as an indication that transfer is not an inevitable consequence of bilingualism. It is worth noting that the dominant language is not always the first language of a speaker. This was for example the case with the examined English-French bilinguals in the study by Mack.
Although the subjects showed some differences in comparison to monolinguals, they were all rated as sounding like native speakers. Today, the idea of linguistic transfer being the one major source for all problems in L2 acquisition seems to be discarded. Transfer is seen as one important factor in L2 phonology, besides others. As Major (2001, p. 35) put it, “even though universals are important, transfer exerts a very strong influence in SLA [second language acquisition] and perhaps is a permanent component of IL [interlanguage]”. He points out that in order for transfer to take place, there has to be “a corresponding existing structure”. Transfer is more likely with similar structures than with dissimilar ones (compare sections 4.5 and 4.6).

4.4 Direct realism

Best (1995, p. 173) presents a working model of a direct realist view on cross-language speech perception. She points out that the central premise of direct realism is “that in all cases of perception, the perceiver directly apprehends the perceptual object and does not merely apprehend a representative or ‘deputy’ from which the object must be inferred” (see footnote 4). This means that listeners perceive the relevant information directly from speech input without the involvement of innate linguistic knowledge or acquired abstract mental representations. Direct realism does not assume two separate informational domains for phonology and phonetics. Phonetics and phonology are both assumed to be based on articulatory gestures but “tap different levels of invariant structure” (p. 182). One central point of this assumption is that “phonetic implementations” are language-specific rather than universal in nature. This accounts for observations of language-specific phonetic characteristics of individual sounds that are hard to explain phonologically (e. g. differing VOT values for English and French [t]).
The prediction of this direct realist model of cross-language speech perception is that listeners are attuned to language-specific information. As they become increasingly efficient in detecting crucial acoustic cues they pick up only this reduced, more compact information from the input. In that way the listeners become perceptually attuned to their L1. Here again, the concept of similarity plays a central role. Generally speaking, listeners are expected to be able to detect gestural similarities in non-native sounds to native ones. If the similarity between an L2 and an L1 sound is great, the L2 sound is expected to be assimilated into the respective L1 category. Although this model of speech perception is primarily based on articulatory gestures, the effects on speech production, especially in a second language, are not discussed by Best (1995). This direct realist approach offers interesting explanations relevant to interlanguage speech perception. However, the implications for second language speech need to be examined. Unfortunately, no sources regarding this issue were reviewed for this thesis.

4 Emphasis in the original.

4.5 The Speech Learning Model

The Speech Learning Model (SLM) as defined by Flege posits that the phonetic vowel and consonant system of a speaker’s L1 influences the L2 system and vice versa (Flege, 1995). This interaction imposes constraints on the accuracy in both languages. The SLM links speech production to speech perception and incorporates the concept of similarity and dissimilarity of speech sounds. Its primary focus is ultimate attainment, and not the beginning L2 learner. Deviances in non-native speakers’ L2 pronunciation from the native speaker norm are attributed to an age-related decline of the learners’ ability to recognise certain audible acoustic differences between L1 and L2 sounds as phonetically relevant (Flege et al., 1995).
As a consequence, new phonetic categories for the respective L2 sounds are not established when the perceived dissimilarity is too small. Late learners do not establish new phonetic categories because of this equivalence classification. In other words, they do not establish new categories for L2 sounds which they perceive as instances of similar L1 sounds. The underlying assumption behind such considerations is that L1 and L2 sounds are stored mentally within one common phonological space. One of the SLM’s hypotheses is that L1 and L2 sounds “are related to one another at a position-sensitive allophonic level, rather than at a more abstract phonemic level” (Flege, 1995, p. 239). This also predicts that L1 sounds may be “deflected away” from neighbouring L2 sounds, which represent new categories, in order to maintain sufficient phonetic contrast in the common space. The ability to discriminate L2 sounds from similar L1 sounds decreases with increasing age. This is caused by reduced attention that is paid to subtle phonetic cues once the phonetic categories are well established. The perception of L1 sounds and L2 sounds as instances of the same phonetic category takes place even in such cases where acoustic differences between those sounds are auditorily detectable. It is a “fundamental aspect of human speech perception” to identify acoustically different sounds as members of the same linguistic category (Flege and Hillenbrand, 1984). Those L2 sounds which are not recognised as belonging to a different category than the closest L1 sounds are predicted to be pronounced incorrectly. In addition, the production of the L1 sounds to which such L2 sounds are linked perceptually is predicted to gradually resemble that of the L2 sound. This is attributed to the need to maintain sufficient phonetic contrast within the supposed common phonological space.
The SLM predicts that for some L2 learners the perception of an L2 sound may be more accurate than its production (Flege, 1999, p. 109). Flege (1993) cites cases where this effect was observed and states that generally “productive abilities lag behind the development of perceptual abilities”. Another prediction of the SLM is that even when a new category for an L2 sound is established, this sound “might not be produced exactly as it is produced by native speakers” due to the presence of L1 sounds in the supposed common phonological space (Flege, 1995, p. 243). In their earlier cited study, Flege and Hillenbrand found that American English speakers were able to pronounce French [y] better than [u]. Findings like this one support the hypothesis that “new” L2 sounds are produced more accurately than “similar” ones. Another example supporting the predictions of the SLM is the earlier mentioned observation of VOT values in L1 sounds (e. g. in chapter 3) which lie between the values typical of the respective speakers’ L2 and those of the L1 sounds typical of monolingual speakers. This can be interpreted as a result of mutual influence of L1 and L2 sounds in cases where those two sounds are identified as instances of one and the same phonetic category (as in the cited experiments on the VOT distinctions between English and French [t]). Additional support for such predictions can be seen in observations reported by Flege et al. (1995), for example: the self-reported pronunciation proficiency of the examinees in their L2 was inversely related to their self-reported proficiency in their L1. Supporting evidence for the existence of motoric output constraints on L2 speech, as suggested by Flege, is reported by Altenberg (2005).
In the earlier cited series of three experiments examining judgement, perception and production of English word-initial consonant clusters by native Spanish speakers, she observed that L1 transfer affects L2 production but not necessarily perception. The results showed no correlation between the examinees’ scores on production and perception. The examinees performed at a native-like level in the judgment task. Altenberg states that these correct judgments of permissible English consonant clusters “must be based on input”. She thus concludes, in contrast to the SLM, that “it seems unlikely that difficulties with the production task are due to problems in perception” (p. 75). She admits, however, that the observed differences may have resulted from different task effects in the respective experiments. It has to be added that the study by Altenberg examined the phonotactic system, while Flege explicitly states that the SLM is concerned with vowels and consonants, as mentioned above.

4.5.1 Summary

The SLM predicts that “new” L2 sounds, i. e. those L2 sounds which are perceived as dissimilar with respect to L1 sounds, will be mentally represented by new phonetic categories and that they will be produced more accurately. Those sounds which are perceived as similar to L1 sounds will be produced less accurately. As it is assumed that L1 and L2 sounds are represented in a common phonological space, those L1 and L2 sounds which are linked to the same category will eventually be produced alike. Thus, a major prediction of the SLM is that a speaker’s L1 influences his or her L2 and vice versa. This mutual influence has been observed and described by numerous researchers. One implication of this fact for experimenters is that the speech perception, and especially the speech production, of monolingual individuals cannot be compared to those of people who have acquired another language (see chapter 5).
Flege points out that the SLM is a “working model” which can serve “as a useful heuristic for planning research” on L2 pronunciation and foreign accent (Flege, 1995, p. 238).

4.6 The perceptual magnet effect

Sections 2.4 and 2.4.1 address the questions of whether a learner’s language experience and whether language distance affect the degree of foreign accent. This section describes the so-called perceptual magnet effect and the corresponding model explaining it, the native language magnet model (NLM). Kuhl and Iverson (1995) argue that “language experience alters the mechanisms underlying speech perception, and thus, the mind of the listener”. The NLM accounts for early language development up to the age of around one year. It states that language exposure in that period plays a critical role in the development of language-specific speech perception. The theory predicts that the “distances” between sounds and a phonetic “prototype” are perceptually decreased in the vicinity of that prototype. The term prototype refers to an ideal exemplar or an abstract mental representation of a sound which is judged by listeners to be the best exemplar representing a specific phonetic category. In other words, the NLM claims that such prototypes serve as “perceptual magnets” for other sounds around them. The ability to partition the continuous sound signal into categories is claimed to be innate. This categorical perception is part of general auditory processing mechanisms. It was demonstrated even for monkeys. However, they do not show a perceptual magnet effect. The perceptual magnet effect seems to be specific to humans. It is an effect of language experience on phonetic perception which is measurable as early as the age of six months. It has been demonstrated for both six-month-old infants and adults (Kuhl and Iverson, 1995).
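The claim that perceived distances shrink in the vicinity of a prototype can be made concrete with a small numerical sketch. The following Python code is a toy illustration only: the one-dimensional acoustic continuum, the Gaussian-shaped pull and the parameters `pull` and `width` are assumptions made here for illustration and are not taken from Kuhl and Iverson (1995) or from the NLM literature.

```python
import math


def perceived(x, prototype=0.0, pull=0.5, width=1.0):
    """Map an acoustic position x to a toy 'perceived' position.

    Stimuli close to the prototype are drawn toward it; stimuli far away
    are left almost unchanged. The functional form and all parameter
    values are illustrative assumptions, not part of the NLM itself.
    """
    d = x - prototype
    return x - pull * d * math.exp(-(d * d) / (2.0 * width * width))


def perceived_distance(a, b, **kwargs):
    """Distance between two stimuli after the toy perceptual mapping."""
    return abs(perceived(a, **kwargs) - perceived(b, **kwargs))


# Two stimulus pairs with the same acoustic spacing (0.5 units):
near = perceived_distance(0.0, 0.5)  # pair next to the prototype
far = perceived_distance(5.0, 5.5)   # pair far away from the prototype

print(f"near the prototype: {near:.3f}")
print(f"far from the prototype: {far:.3f}")
```

Under these assumptions the pair next to the prototype is perceived as closer together than its acoustic spacing, while the distant pair keeps practically its full spacing.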
In six-month-olds, this effect in perception is affected by exposure to their L1 and results from the infant’s analysis of the received language input. According to the NLM theory, speech representations are initially auditory in nature. Speech perception then changes from the initial “language-general” mode to a “language-specific” one with increasing age (Iverson et al., 2003). Infants and young children are able to perceive phonetic contrasts in foreign language sounds, whether or not adults from the same language background can discriminate them. This ability to partition spoken language into categories is assumed to be innate. At around six months of age, infants begin to reanalyse the acoustic space according to their respective ambient language input. Sounds from an L2 which are similar to an already existing L1 prototype are difficult to perceive as being different from the respective L1 sound. The general principle of the change from “language-general” to “language-specific” speech perception is illustrated on the basis of a two-formant vowel space in figure 4.2. The initial auditory boundaries within this abstract vowel space are changed according to the infants’ exposure to a specific language. By the age of six months, the perceptual magnet effect emerges and unneeded phonetic boundaries disappear.

Iverson et al. (2003) used 18 synthesized English [ra] and [la] tokens which varied in their F2 and F3 frequencies. The study tested the perception of these sounds by native English, Japanese and German adults with “identification and goodness”, “similarity scaling” and “discrimination” tasks. One of their findings is that American English listeners were more sensitive near the categorical boundary, i. e. the phoneme boundary between [r] and [l], than within each category. The German listeners showed perceptual patterns similar to those of the American listeners.
The Japanese listeners, on the other hand, showed no such heightened sensitivity at the categorical boundary but showed higher sensitivity within the English [r] category. Iverson et al. conclude that the Japanese listeners assimilated the sounds into their [r] category. Their perceptual spaces seem to be “mis-tuned” for the perception of the contrast between English [r] and [l]. Acoustic variations which are critical for the discrimination of sound categories by American English listeners are irrelevant to Japanese listeners, or no more salient than other acoustic cues. This distortion of the perceptual space does not represent a total lack of perceptual sensitivity. Decreased perceptual sensitivity around L1 prototypes can lead to reduced sensitivity to critical acoustic cues in the acquisition of non-native sounds. Increased sensitivity according to a learner’s L1, on the other hand, can lead to increased attention to irrelevant acoustic cues in L2 sounds. This subsequently changes not only perception but also speech production. The NLM holds that the speech representations which are developed in the first year of life under the influence of the language surrounding an infant play “a crucial role in guiding their initial attempts at speech production” (Kuhl and Iverson, 1995, p. 139). The same arguments about the influence of an infant’s exposure to a specific language are applied to adults. Difficulties in the acquisition of an L2 phonology are attributed (in part) to the perceptual magnet effect.

[Figure 4.2: three panels (A, B, C) in an F1 (Hz) by F2 (Hz) vowel space; labels in the figure: “Infants’ Natural Auditory Boundaries”, Swedish, English, Japanese.]
Figure 4.2: NLM Theory: (A) Infants partition the acoustic space in a language-general way. (B) By the age of 6 months infants exhibit language-specific magnet effects induced by ambient language input. (C) Unneeded phonetic boundaries disappear and magnet effects alter the perceived distance between stimuli. (Adapted from Kuhl and Iverson, 1995, fig.
10)

4.6.1 Summary

The NLM addresses the problem of decreasing phonetic abilities in second language acquisition with increasing age, which is of great relevance to the foreign accent phenomenon. Exposure to a language in early life changes speech perception in such a way that the perceived contrasts between acoustic cues are altered according to one’s L1 sound system. The perceived distances between “good” exemplars of a sound are shrunk, and near phonetic boundaries the distances are stretched. This has consequences for second language acquisition. A learner can have difficulties in perceiving differences between distinct L2 sounds in the vicinity of an L1 prototype. On the other hand, the learner might rely on acoustic cues which are irrelevant or only secondary in an L2. The NLM describes how language experience affects and shapes speech perception, and it is argued that this language-specific mode of perception has consequences for speech production as well. The concept of similarity, by means of phonetic prototypes, is employed in the explanation of the perceptual magnet effect. With respect to second language acquisition, it is predicted that L2 sounds which are more similar to L1 sounds will be perceived as less distinct than those L2 sounds which are farther away from L1 phonetic prototypes. Thus, phonetic language distance affects second language acquisition.

Chapter 5 Methodological issues

So far, a wide range of experimental studies on foreign accent has been discussed. Methodological issues mentioned in previous chapters are summarised and supplemented in this chapter. Research on foreign accent faces the same problems as every empirical or experimental study in linguistics. This chapter addresses primarily those issues that are specific to subject selection in studies on foreign accent.
5.1 Subject selection and control group

Some experimental variables can change over time in longitudinal examinations, for example LOR (length of residence), the amount of exposure to an L2 or the L1 vs. L2 usage patterns of a speaker. Most speaker-dependent factors, however, cannot be changed or manipulated directly by an experimenter without replacing the examined speakers (Levi et al., 2007). Speakers with varying levels of proficiency can influence the ratings given by listeners (Long, 2005; this issue is addressed in section 2.8). The more speakers with lower levels of proficiency are included in the group of examined subjects, the more speakers with near-native proficiency will be rated as native. Flege et al. (1995) point out the importance of examining non-native subjects who have reached their individual level of ultimate attainment in an L2. In any case, the experimenter should be aware of the differences between measuring learning rate as opposed to measuring ultimate attainment (as discussed in section 2.7).

Some authors require that speakers of a single native language should be examined (Flege et al., 1995). This obviously depends on the examined variable(s), and with respect to the problems of distinguishing between language and dialect it is too general a statement. The more precise requirement is that speakers from different dialectal backgrounds should not be compared to a single norm or variety. Dialectal differences or any other varieties of a language should never be underestimated if misclassifications of native speakers are to be minimized. In the previous chapters, examples of studies have been cited where dialectal differences led to “confusing” results (e. g. Bongaerts, 1999; Mack, 1989, compare section 1.1.3). As a consequence, the experimenter has to examine possible regional or social variations of the object of investigation in the target language.
At least, the possibility of differing “standards” or linguistic systems within the subjects should be considered.

Piske et al. (2001) specify two problems associated with the omission of a control group of native speakers: the performance of native speakers under the same conditions of the experiment remains uncertain, as does the ability of the recruited judges (if included in the study) to distinguish between native and non-native speakers. Flege (1987b) stipulates the following requirements on subject selection:

• Subject groups should be as homogeneous as possible (especially with respect to their language backgrounds).
• Subjects with hearing problems should be excluded.
• Subjects who do not speak their native language “normally” should be excluded.
• Both male and female subjects should be included.
• A minimum of 6-12 subjects should be examined.

Flege (1987b, p. 289) proposes a “repeated measures design” which reduces the number of subjects that need to be recruited. The speech productions in L1 and L2 of each subject can be directly compared, so each subject might “serve as his/her own control”. This approach is supposed to minimize the influence of subject selection biases.

5.2 Obtaining data: The task

The subjects should be given enough time for acclimatisation to the situation before speech samples are recorded, especially if the recordings take place in an anechoic chamber. It is noted in section 2.5 that even temporary changes in language use patterns might affect speech production. Piske et al. (2001) stress the possibility that individuals’ language behaviour in an experiment could be influenced not only by “the conditions under which they had been exposed to their L1 or L2 in the months preceding the experiment but also by the conditions under which they had been exposed to the L1 and the L2 in the hours or even minutes preceding the experiment”. For the recording procedure, Piske et al.
(2001) suggest a delayed repetition technique as a way to obtain “more reliable measures of degree of L2 foreign accent”, a technique also used by Flege et al. (1995). In such a procedure, a list of sentences is presented to the subject in both written form and aurally via a recording. The sentences of interest are preceded and followed by a context sentence, creating a mini-dialogue as in the following example:

Voice 1: What did Paul eat?
Voice 2: Paul ate carrots and peas.
Voice 1: What did Paul eat?
Subject: [repeats Voice 2]
(example taken from Piske et al., 2001, p. 205)

The delay between the sentence (or word) which is to be spoken by the subjects and its repetition is assumed to prevent direct imitation from sensory memory. Aural presentation of the material should prevent influence from differences in reading abilities. On the other hand, presentation of the written form might prevent interference from perceptual difficulties. Learners might be well aware of features of an L2 phonetic system and nevertheless be unable to perceive (or produce) them correctly (Altenberg, 2005; Flege, 1993). In order to avoid such problems, the studies reported by Flege et al. (1995) and Piske et al. (2001) presented the test material both aurally and in written form to the subjects.

Long’s proposal of “natural language use” or spontaneous speech as the ideal indicator of an individual’s L2 abilities does not seem to be widely followed in studies on foreign accent.

The justification for use of limited, language-like samples (in phonology studies, at least) is that judgments of pronunciation ability based on anything above isolated words, and especially on natural speech, are vulnerable to bias from cues from other linguistic domains than the one supposedly in focus [...] (Long, 2005, p.
302)

He questions the validity of results obtained from studies examining “controlled, elicited, often rehearsed” speech samples as a measure of a speaker’s general L2 abilities. This constitutes a difficult methodological problem in data collection on foreign accents. An opposing view is put forward by Major (2001). He argues that isolated words are not as unnatural as is often claimed. There are more than just a few every-day examples of single-word utterances or word lists. Two slightly modified versions of the above-stated mini-dialogue can illustrate this:

Example 1: word-list utterances
A: What did Paul eat?
B: Carrots, peas, beans and broccoli.
A: What?
B: Carrots, peas, beans and broccoli. [repeating more carefully]

Example 2: single-word utterances
A: Did Paul eat carrots or peas?
B: Carrots.
A: What?
B: Carrots. [repeating more carefully]

These two examples illustrate how “isolated words” or utterances consisting of simple word lists can be encountered in every-day situations and can thus safely be regarded as “natural language use” as well.

Some examiners use picture naming tasks, where the subjects are asked to name a given picture without written or aural presentation of the respective target word (e. g. Altenberg, 2005). Such an approach can be fruitful when phenomena like spelling pronunciations or hyperarticulation have to be avoided and spontaneous speech is the focus of interest. However, a drawback of such an approach is the possible interference of problems in other linguistic domains, like lacking syntactic or lexical knowledge. Another disadvantage of spontaneous speech is rooted in general communication strategies, for example the avoidance of words with difficult sounds.

5.3 FA rating by native speaker judges

Most studies reviewed for this thesis use judgments of listeners who are native speakers of the examined target L2 to assess degree of foreign accent. This approach is justified on the basis of the definition of foreign accent as a phenomenon of L2 speech which is perceived by native-speaker listeners (compare section 1.1.7).

5.3.1 Scaling foreign accent

Usually, listeners indicate the degree of foreign accent in speech samples on a rating scale. Despite numerous studies on foreign accent, there still is no standard scale for measuring degree of foreign accent (Piske et al., 2001; Southwood and Flege, 1999). The scale used should be sufficiently sensitive to reveal even small differences between individual speakers and between these and native speakers. The end points of the rating scales are usually labelled with “no accent”, “native-like pronunciation” or “native speaker” at one end and “definite” or “heavy foreign accent” at the other (Piske et al., 2001). Points between these two extremes are used to mark varying degrees of foreign accent. It is not known how many distinct degrees of foreign accent listeners are actually able to distinguish. Flege and Fletcher (1992) and Flege et al. (1995), for example, used a “continuous scale” (which in fact was a 256-point scale); others used a nine-point scale (e. g. Guion et al., 2000; Flege et al., 1999), a seven-point scale (e. g. Levi et al., 2007), a five-point scale (e. g. Altenberg, 2005) or a four-point scale (e. g. Asher and García, 1969; Flege et al., 1997). Piske et al. (2001) point out the lack of a standardised scale for measuring foreign accent and raise the question of whether the various utilised rating scales “ensure equally valid and reliable measures of degree of L2 foreign accent”. Usually equal-appearing interval scales are used. Southwood and Flege (1999) compared direct magnitude estimation and interval scaling methods for measuring degree of foreign accent. They found that both methods can “provide valid indices of accentedness” and that degree of foreign accent represents a “metathetic continuum”, i. e.
a continuum which can be partitioned into equal intervals. This suggests that equal-interval rating scales are appropriate in foreign accent studies.

Another study which compared two different methods of scaling foreign accent is described by Brennan et al. (1975): (a) magnitude estimation and (b) sensory modality matching. The first method required the judges to rate the presented speech samples with a number “that seemed appropriate for the amount of accentedness”. The second method employed a “Lafayette hand dynamometer” to measure the force of hand grip. The judges had to squeeze this hand dynamometer “with a force matching the accentedness of each speaker”. As a reference, the speech samples were auditorily analysed by two judges who assessed “the frequency of occurrence of specific accented pronunciations” (of segments). Neither of these two judges had formal training in phonetic transcription. The results revealed a strong agreement among the judges about the degree of foreign accent. They also showed that both methods can be used to scale degree of foreign accent, as the ratings obtained from both methods were correlated with one another. Additionally, the ratings were found to be highly correlated with the amount of accented pronunciations, i. e. the number of segmental mispronunciations.

The rating of degree of foreign accent is a non-trivial problem. Southwood and Flege (1999) speculate that “response biases” in ratings of foreign accent are likely to occur “because perceptual dimension, such as foreign accent, have no known physical units” (p. 344).

5.3.2 The judges

The recruited judges are usually native speakers of the examined L2. Some studies relied on linguistically inexperienced, “naïve” judges (e. g. Flege et al., 1995), others on linguistically trained ones, such as linguists or foreign language teachers (e. g. Altenberg, 2005; Mack, 1989). Diverging results can be found in the literature regarding the effects of the judges’ linguistic experience. As a consequence, Piske et al. (2001) suggest recruiting a representative group and not only linguistically naïve judges or only experts. Although linguistically naïve judges might not be able to identify which mispronunciations contribute (to what degree) to their perception of foreign accent, they are, according to Brennan et al. (1975), nevertheless able to accurately scale the overall degree of foreign accent. Brennan et al. have demonstrated that linguistically naïve judges are able to “give reliable judgments of the accentedness of speech samples, and that they are in agreement as to what constitutes various levels of accentedness”.

Another issue regarding the language experiences of judges is addressed by Long (1990) and Southwood and Flege (1999). Individuals from linguistically heterogeneous areas like cosmopolitan cities may have a higher tolerance for language variations and deviances from their L1 norm. In general, individuals who are familiar with foreign accent may give less rigorous ratings than individuals who have less or no experience with non-native speakers of their native language. More research is needed to determine the effect of familiarity with foreign accent on listeners’ perceptions and ratings.

Long (1990, 2005) also points out the importance of the instructions given to the judges. Misleading instructions to the judges must be avoided. If the judges are to rate a set of speech samples of foreign speakers which includes control samples of native speakers, they have to be told to expect an unspecified number of samples from native and non-native speakers. It must also be clear that the samples will include (mainly) recordings of non-native speakers.
Although these statements might seem trivial, Long cites examples where misleading instructions might have led the judges to false assumptions about what they would have to rate, which resulted in inappropriate ratings. In one case, he assumes the judges might have expected recordings of native speakers with more or less accurate pronunciations. Such misleading instructions then lead the judges to rate more of the samples as native-like than they might have done otherwise. The instructions have to be unambiguous without influencing the judges by any misleading assumptions.

As mentioned earlier, the composition of the data which the judges have to rate poses another problem. The listeners might be influenced by obviously non-native samples in such a way that they give higher ratings to speech samples from speakers with higher, near-native proficiency. The experimenter should be aware of the existence of such speaker-independent factors affecting the perceived degree of foreign accent (as discussed in section 2.8).

An additional unanswered question is the total number of judges that is needed for reliable ratings of degree of foreign accent. Piske et al. (2001) hypothesise that a larger number of judges might be needed for reliable detection of smaller differences between speakers.

5.3.3 Native speakers’ judgments and acoustic features of foreign accent

Which phonetic cues contribute to the perception of a foreign accent? This question was discussed in chapter 3. It is still largely unknown which acoustic cues contribute to what degree to the perception of a foreign accent. Southwood and Flege (1999) emphasise the importance of examining which acoustic variables affect listeners’ judgments of degree of foreign accent. They suggest that “identification of the potential acoustic cues used by listeners may help provide a physical referent to assist in interpreting judgments of degree of perceived foreign accent” (p. 347).

5.4 Foreign accent detection by acoustic measurements

Typical segmental acoustic phenomena that are examined in foreign accent experiments are voice onset time (VOT) values of stop consonants and formant values of vowels. Since the precise relation between specific acoustic cues and the perception of foreign accent is still more or less unknown, acoustic measurements on their own cannot provide a measure for degree of foreign accent. However, several studies indicate that there is, for example, a correlation between the number of segmental errors and the degree of perceived foreign accent. Acoustic measurements can provide insight into the underlying linguistic knowledge of a learner about the examined L2. As discussed in chapter 3, speakers might for example produce acoustic contrasts in an L2 which are, according to the “norm”, irrelevant or even wrong. The mere fact that they produce a contrast, however, indicates an underlying awareness (or assumption) of an L2 phonological contrast.

In combination with listener judgments of degree of foreign accent, acoustic measurements can help to identify the acoustic cues responsible for the perception of foreign accents. This, of course, requires studies where both acoustic measurements and listener judgments are carried out. In the majority of studies reviewed for this thesis, this was not the case. Either acoustic measurements or listener judgments are usually used to determine foreign accents, but rarely both. Some studies rely on the number of segmental errors without measuring specific acoustic parameters.

To conclude this section, measurements of specific acoustic parameters should be carried out in addition to the evaluation of speech samples by listener judgments. Only the combination of both the instrumental and the impressionistic approach toward foreign accent can provide a complete insight into the phenomenon.
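The kind of relation mentioned in this section, between the number of segmental errors and the perceived degree of foreign accent, is typically quantified with a correlation coefficient. The following Python sketch computes Pearson's r for two lists of per-speaker values. The function name and the numbers are hypothetical and serve only to show the computation; they are not data from any of the cited studies, and the studies themselves may have used other statistics.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-speaker values (invented for illustration only):
# number of segmental errors in a read passage, and mean accent rating
# on a scale where higher values mean a stronger foreign accent.
segmental_errors = [0, 2, 5, 9, 14]
mean_accent_rating = [1.2, 2.0, 4.5, 6.1, 8.3]

r = pearson_r(segmental_errors, mean_accent_rating)
print(f"r = {r:.2f}")
```

A high positive r in such a comparison would only show that the two measures vary together; it would not by itself identify which acoustic cues the listeners relied on.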
5.5 Criteria for native-likeness of speech

Studies relying on listener judgments and instrumental studies are both based on the idea of an existing phonetic norm. It is important to document the norms, or, more generally, the systems of both the L1 and the L2, and to compare them. Usually this is accomplished by comparing the speech of native speakers of both languages in question (Flege, 1987b). Sometimes the examiners refer to previously published descriptions or documented “standards” of the respective languages. Establishing the norms of languages which are not as thoroughly examined as English might sometimes be difficult. One possibility is to examine not only the subjects’ L2 speech but also their L1. Comparisons of sounds from two (or more) languages should be made within the same or similar phonetic contexts. However, examining the L1 speech of the same speakers as the ones whose L2 speech is examined has a disadvantage which needs to be taken into account. As mentioned earlier, a speaker’s L2 influences his or her L1. Thus, the L1 norm of such speakers may not be comparable to the norm of monolingual speakers of that language.

Flege et al. (1995, p. 3129) proposed a statistical criterion of native-likeness which is also used in other studies (e. g. Guion et al., 2000; Piske et al., 2001). The criterion is defined as a mean rating (of speech samples judged by native language listeners) that falls within 2.0 standard deviations of the mean rating assigned to a control group of native speakers. However, Flege et al. emphasize that such a criterion does “not provide direct evidence for foreign accent detection”, the most direct criterion being “a paired comparison task”.

After discussing several studies which found native-like performance of L2 learners, Birdsong and Molis (2001, p.
245) explicitly “point out that many factors could invalidate demonstrations of nativelike attainment by artificially elevating subjects’ performance”. Among the possible causes for such artificially elevated performance they mention tasks that do not cover the full grammar of the L2. Tasks could be challenging only to inexperienced learners and result in no difference above a certain level of proficiency (but still below that of native speakers). As another cause for elevated performance they identify careful screening of subjects.

A fundamental problem for the application of native-likeness criteria is the question whether the performance of monolingual native speakers is appropriate data in bilingualism research at all. In section 1.1.3 the problems associated with the definition of what constitutes a native language were briefly discussed. Whatever definition is applied (provided it is appropriate), the L2 learner can possibly never become a native speaker of the L2 in that sense. Considering the differences in the L1 between monolingual and bilingual speakers, it seems inappropriate to insist on the idealised monolingual native speaker as the only reference for L2 learners. As Bongaerts (2005) points out, no researcher would attribute L2-induced deviances in L1 usage to limited language learning abilities.

According to Long (2005, p. 305f) the tested areas of language should be those where “clear native norms can be reliably documented”. This is important with respect to possible variations within the control group of native speakers and also within the samples of individual native speakers. Long questions the validity of measures in cases where even native speakers disagree. He cites studies where native speakers received ratings below the proposed level of native-likeness or where non-native speakers received better ratings than native speakers.
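The statistical criterion of native-likeness described earlier in this section (a mean rating within 2.0 standard deviations of the mean rating of the native control group) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, the use of the sample standard deviation is an assumption (the text above does not say which estimator is meant), and the ratings are invented values on a hypothetical nine-point scale where 9 stands for “no accent”.

```python
from statistics import mean, stdev


def native_like_range(control_means, n_sd=2.0):
    """Range of mean ratings counted as native-like.

    control_means: one mean rating per native control speaker.
    Assumption: the sample standard deviation is used.
    """
    centre = mean(control_means)
    spread = stdev(control_means)
    return centre - n_sd * spread, centre + n_sd * spread


def is_native_like(speaker_mean, control_means, n_sd=2.0):
    """True if a speaker's mean rating lies inside the control range."""
    low, high = native_like_range(control_means, n_sd)
    return low <= speaker_mean <= high


# Invented mean ratings of five native control speakers (9 = "no accent").
control = [8.6, 8.9, 8.2, 8.8, 8.5]

low, high = native_like_range(control)
print(f"native-like range: {low:.2f} to {high:.2f}")
print(is_native_like(8.1, control))  # a learner rated close to the controls
print(is_native_like(6.4, control))  # a learner rated clearly lower
```

Because the range is derived from the control group itself, a heterogeneous control group widens it, and nothing in the computation guarantees that every individual native speaker falls inside it.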
Thus, the same problem as already discussed above with respect to subject selection or native-speaker judges arises again: dialectal (or other) within-language variations have to be considered when criteria of native-likeness are to be applied. As a consequence, it might not be sufficient to refer to an (idealised) standard which in practice is not spoken by the examined speakers or the recruited listeners.

Chapter 6 Experimental study

This chapter provides an overview of an experimental study on the production of German monophthongal vowels and the realisation of a phonological contrast by native and non-native speakers. The first section describes the linguistic background relevant to the present study: a brief introduction to the German vowel system is followed by a more detailed description of the German phonological vowel opposition.

6.1 Linguistic background: German vowels

Following the conclusions from section 5.5, the first step in the experimental study described in this chapter is an examination of the phonetic and phonological system of the target L2, in this case German. It is concluded in chapter 5 that dialectal or regional differences should not be underestimated. This means that the relevance of regional or dialectal diversification of the target language has to be considered as well as the regional background of the native speakers from the control group. In order to achieve this, not only the norm of standard German has to be analysed but also the possible regional variations¹. Additionally, a contrastive analysis of the phonetic systems of the target L2 and the L1 of the respective subjects is needed.

Since the beginning of the standardisation of German orthography there have been attempts to standardise German pronunciation along with orthography. Although there are prescriptive norms for standard German pronunciation, they are usually not realised thoroughly by the speakers and not even generally accepted.
The vowel system presented in this section follows descriptive presentations and acoustic studies by Becker (1998), Claßen et al. (1998), Kohler (1977), Ramers (1988), Sendlmeier and Seebode (2007) and Wängler (1968). The German vowel space comprises 15 monophthongal vowels with an additional “reduction vowel” (schwa). There are also three diphthongal phonemes, and an additional vowel [ɐ] along with several [ɐ]-diphthongs which correspond to various /r/-vocalisations. Figure 6.1 shows a phonological representation of the German monophthongal vowel phonemes.

¹ This does not imply that various “dialects” of the target language are examined in the present study, since there can naturally be regional variations within the “standard language”.

[Figure 6.1: The German monophthongs (according to Kohler, 1977)]

The set of monophthongs can be divided into two subsets (a) {/ɪ/, /ʏ/, /ɛ/, /œ/, /a/, /ɔ/, /ʊ/} and (b) {/iː/, /yː/, /eː/, /ɛː/, /øː/, /aː/, /oː/, /uː/}. Each vowel from one subset is contrasted with one vowel from the other subset. This binary relation constitutes the so-called vowel opposition, which is a central feature of German vowel phonology (Becker, 1998). The two vowels within such a relation will be referred to as a contrast pair in the following text. According to Becker these two sets or classes of vowels are defined by the following phonological (phonotactic) constraints: vowels from set (a) cannot appear in open stressed syllables, and vowels from set (b) cannot appear before ambisyllabic consonants.

The denomination of these two sets of vowels is not uniform and often depends on the individual author’s underlying theory about the primary acoustic correlate of the opposition. Common denominations found in the literature are shown in table 6.1 (not all of which can be discussed in this thesis). Becker (1998) points out: “Die Bezeichnung Kurzvokale bzw.
Langvokale ist dabei am neutralsten, weil sich diese Bezeichnungen auf den unumstrittenen phonetischen Dauerunterschied beziehen können, ohne daß dabei präjudiziert wird, daß es sich um einen phonologischen Quantitätsunterschied handelt. . . ” (Becker, 1998, p. 31)

The denomination short or long vowels is the most neutral, as these denominations can refer to the undisputed phonetic duration difference, without prejudging any phonological quantity difference. . . (translation DD)

The terminology of short vs. long vowels is also used by Kohler (1977) and Ramers (1988). I too will use the term short vowels to refer to set (a) above and the term long vowels to refer to set (b), without implying any phonological function of vowel duration.

(a) ɪ, ʏ, ɛ, œ, a, ɔ, ʊ               (b) iː, yː, eː, ɛː, øː, aː, oː, uː
short vowels                     ∼   long vowels
open vowels                      ∼   close vowels
lax vowels                       ∼   tense vowels
centralised vowels               ∼   decentralised vowels
abruptly cut vowels/syllables    ∼   smoothly cut vowels/syllables

Table 6.1: Denominations for the German vowel classes

The focus of the experiment presented in this chapter lies only on this vowel opposition between short and long vowels and thus on those monophthongs which are part of a phonological contrast pair. This excludes schwa, which is a result of vowel reduction in unstressed syllables and does not belong to a contrast pair. The above-mentioned vowel [ɐ] does not belong to a contrast pair either and is not examined further. The seven commonly described contrast pairs are exemplified by the minimal pairs² shown in table 6.2.

The list of contrast pairs as presented in table 6.2 suggests that the only monophthongal vowel phoneme which does not belong to a contrast pair is /ɛː/. The position of /eː/, /ɛ/ and /ɛː/ within the phonological system, as in stehlen∼stellen∼stählen³, is a matter of disagreement among Germanists.

/ɪ/  ∼  /iː/    e.g. Mitte   ∼  Miete
/ʏ/  ∼  /yː/    e.g. Hütte   ∼  Hüte
/ɛ/  ∼  /eː/    e.g. Bett    ∼  Beet
/œ/  ∼  /øː/    e.g. Hölle   ∼  Höhle
/a/  ∼  /aː/    e.g. Stadt   ∼  Staat
/ɔ/  ∼  /oː/    e.g. Pollen  ∼  Polen
/ʊ/  ∼  /uː/    e.g. Busse   ∼  Buße

Table 6.2: The vowel contrast pairs.

According to some authors, the realisation of /ɛː/ is a matter of (regional) variation. Wängler (1968) notes that usually /eː/ is used instead of /ɛː/⁴. Kohler (1977, p. 175) states that the distinction between /eː/ and /ɛː/ is just a spelling pronunciation (“Schriftaussprache”) which reflects the difference between orthographic <e> and <ä>, and that these two vowels are often merged to /eː/, especially in the speech of northern Germany. The contrast /eː/∼/ɛ/ stated above might thus in some cases be replaced or supplemented by a pair /ɛː/∼/ɛ/, depending on the norm and realisations of the examined native speakers.

The common explanation is that if /ɛː/ is used at all by a speaker, it is usually the realisation of orthographic <ä>. Becker (1998) argues that this view is too simple. He points out that the realisation of orthographic <ä> as /ɛː/ is not just a spelling pronunciation, but the result of a process of mutual adjustment of both pronunciation and orthography along with the realisation of orthographic <e> as /eː/. He claims that this process is (almost) completed and that today /eː/ corresponds to <e> and /ɛː/ corresponds to <ä>.

“Der Ausgleichsprozeß, der sich am Anfang dieses Jahrhunderts abzeichnete, ist jetzt in der überregionalen Standardaussprache vollständig durchgeführt. Lediglich in einem Gebiet im Südwesten (Stuttgart, Ulm, Tuttlingen) werden die historischen Distinktionen auch von den gebildetsten Sprechern gemacht, sie sind jedoch inzwischen sehr auffällige Kennzeichen einer Regionalsprache.” (Becker, 1998, p. 18)

The process of adjustment of the differences, which emerged at the beginning of this century, is now completely accomplished in the supraregional standard pronunciation.
Only in an area in the southwest (Stuttgart, Ulm, Tuttlingen) are the historic distinctions realised even by the most educated speakers. They are by now, however, very noticeable features of a regional pronunciation. (translation DD)

Note that this study was carried out in Stuttgart, and that 18 of the examined speakers were living in Stuttgart at the time of recording or are natives of the region (compare section 6.2). Such marginal remarks about regional (or other) variations in descriptions of the linguistic system of the examined target language may prove crucial in an analysis of pronunciation or perception, as pointed out in chapter 5.

² [ˈmɪtə]–[ˈmiːtə] (“middle” vs. “rent”), [ˈhʏtə]–[ˈhyːtə] (“hut” vs. “hat (pl.)”), [bɛt]–[beːt] (“bed” (furniture) vs. “bed” (in the garden)), [ˈhœlə]–[ˈhøːlə] (“hell” vs. “cave”), [ʃtat]–[ʃtaːt] (“town” vs. “state”), [ˈpɔlən]–[ˈpoːlən] (“pollen” vs. “Poland”), [ˈbʊsə]–[ˈbuːsə] (“bus (pl.)” vs. “repentance”)
³ [ˈʃteːlən]–[ˈʃtɛlən]–[ˈʃtɛːlən] (“to steal” – “to put” – “to steel”)
⁴ “Anstelle des /ɛː/ hört man heute im Deutschen überwiegend /eː/.” (Wängler, 1968, p. 36) (“In place of /ɛː/ one mostly hears /eː/ in German today.”)

6.1.1 Acoustic correlates of the German vowel opposition

The various denominations of the two distinct vowel classes reflect various underlying theories about the (primary) acoustic correlate of the observed phonological vowel opposition. Similarly, the graphic notation of the contrast pairs as shown above can be seen as a visualisation of the underlying acoustic correlates of this contrast relation. The symbols /iː/ and /ɪ/, for example, imply an opposition in quantity (indicated by the length mark “ː”) as well as an opposition in quality (indicated by the different vowel symbols “i” and “ɪ”). This study does not provide an exhaustive examination of all the possible acoustic correlates of the vowel opposition, but concentrates on those most often described and referred to in the literature.
Vowel quantity

The denomination of the two vowel classes as long vs. short vowels can be interpreted as a reference to a phonological function associated with vowel quantity. There is no agreement on whether vowel quantity constitutes the primary distinctive feature of the vowel opposition. It is undisputed, however, that there is a contrast in vowel duration in stressed syllables. The acoustic correlate of phonological vowel quantity is time, an acoustic parameter which is rather easy to measure. Ramers (1988, p. 73) cites several sources that give ratios of 1:2 or 2:5 for the duration of short vs. long vowels⁵. Measured in milliseconds, long vowels have a duration well above 100 ms, and short vowels well below 100 ms. Ramers, for example, found mean ratios of 1:2.1, 1:2.0, 1:2.58 and 1:1.65 for four male speakers, and Claßen et al. (1998, p. 226) observed a mean duration of approximately 140 ms for long vowels and approximately 80 ms for short vowels in stressed positions.

According to Kohler (1977) the only contrast in the pair /aː/∼/a/ is vowel quantity, and Wängler (1968) states that the only difference between /ɛː/ and /ɛ/ is vowel quantity. Ramers concludes that the obvious contrast of vowel duration under primary stress means that quantity cannot be excluded as a possible correlate of the phonological vowel opposition. Despite the disagreements among phonologists, a contrast in vowel quantity between the two classes of short vs. long vowels can be expected to be a feature of a native German pronunciation.

Vowel quality

The term vowel quality is sometimes used to refer to all features of vowels which are not attributable to quantity. The term is used in this thesis in its narrower sense, referring only to supralaryngeal articulatory settings and the corresponding acoustic characteristics. In the acoustic analysis of this experimental study, vowel quality will refer primarily to the vowels’ formant structure.
Vowel tenseness, which is sometimes subsumed under the term vowel quality, will be treated separately (see below).

⁵ The possibility of three distinct levels of vowel quantity and the existence of overlong vowels in the German language will not be considered in this thesis, as it is treated only marginally in the literature and seems not to be generally accepted in descriptions of German phonology (see e.g. Ramers, 1988, p. 76).

The values for the first two formants F1 and F2 taken from various sources (as cited above) are shown in table A.4 on page 85. Values for the third formant F3 were available only from Ramers (1988), who lists formant values for four male speakers. In general, short vowels have a higher F1 than long vowels. The value of the second formant shows a split relation: long front vowels have higher F2 values, and long back vowels have lower F2 values than their short counterparts. These differences were found to be systematic for stressed and unstressed vowels; however, they were significant only for stressed vowels, the only exception being the opposition of the two a-vowels (see e.g. Claßen et al., 1998, p. 224, or Ramers, 1988).

Tenseness

The phonological vowel opposition is often associated with the distinction between tense and lax vowels. However, the nature of the acoustic correlate of tenseness is still a matter of debate. The proposed acoustic correlates of vowel tenseness which will be of interest here are those associated with voice quality in its narrower sense, namely those acoustic characteristics caused by different laryngeal settings. Some researchers see the articulatory correlate of vowel tenseness in higher tension of the tongue, the walls of the vocal tract or the vocal folds (Ramers, 1988, p. 123f). According to Claßen et al. (1998, p. 223), the most prominent acoustic correlates of vowel tenseness are the spectral tilt parameters skewness and rate of closure.
Skewness (SK) refers to the slope of the glottal closure and indicates how abruptly the glottis closes. The rate of closure (RC) refers to the speed at which the air stream is cut off. Tense (long) vowels show somewhat higher SK values and considerably higher RC values than lax (short) vowels. The speed of glottal closure, which correlates most closely with the acoustic parameter RC, is the major voice quality correlate of vowel tenseness. In addition to the parameters SK and RC, Claßen et al. observed that tense vowels show higher OQ values and lax vowels higher GO values, although these differences were not significant. The open quotient (OQ) represents the time during which the glottis is open in relation to the entire duration of a glottal period. Glottal opening (GO) refers to the degree of glottal opening during the entire glottal period.

With respect to word stress, tenseness is a more stable feature of the vowel opposition than vowel length. This means that this feature is not neutralised in unstressed positions. The only exceptions are (again) /aː/ and /a/, which show no significant differences in unstressed positions, neither in quality (including tenseness) nor in quantity. Claßen et al. (1998, p. 226) summarise their findings as follows: tense vowels are long (in duration) only if they are stressed, and stressed vowels are long (in duration) only if they are tense. Another exception is the vowel /ɛː/, which is assigned the feature lax but is nevertheless categorised as a long vowel.

6.1.2 German vowels: Summary

Since the aim of the present study is not to examine the nature of the German phonological vowel system, but its realisation by non-native speakers (in comparison to native speakers), most of the questions raised above can be left open. It is important to point out, however, that the acoustic features proposed as primary correlates of the vowel opposition have to be taken into account in the phonetic analysis of the speech samples collected in this study. To summarise these considerations, the (possible) acoustic correlates of the German phonological vowel opposition which seem most promising for such an examination are: vowel quantity (duration), vowel quality (formant structure) and tenseness (spectral tilt). These are the acoustic features which will be examined in the experimental study described in this chapter.

6.2 The participants

The speakers for the production experiment were recruited primarily through a circular email at the Institute for Natural Language Processing, and some by personal contact. There was no pre-selection of the participants according to any linguistic or non-linguistic criteria: every speaker who wanted to participate was included in the study. The speakers were not paid for their participation; the only possible reward was the results of the later analysis of their own speech. The speakers were not informed about the precise nature of the study prior to the recordings. However, the general focus on foreign accent was known to the majority of the speakers, and some of them were also aware of the experiment’s focus on the German vowel opposition. No further attempts were made to mislead the participants into believing that something other than their pronunciation was being examined.

18 speakers were recorded at the Institute for Natural Language Processing at the University of Stuttgart, Germany, and two additional speakers were recorded with help from Mateusz Wiącek in Kielce, Poland. All except two (speakers A02 and B07) were living in the region of Stuttgart at the time of recording or had lived there in the past.
The speakers were afterwards divided into three groups according to their ages of learning German (AOL): group A comprising the speakers with AOL beyond early childhood (ranging from 7 to a maximum of 22 years, with a mean value of 16.8 and a standard deviation of 4.73), group B comprising bilingual speakers⁶ with AOL not above early childhood (AOL from 0 to 3, with a mean of 1.13 and a standard deviation of 1.55), and group C comprising the non-bilingual speakers of German (AOL 0). Informally speaking, group A represents the non-native speakers, group C the native German speakers⁷, and group B the bilingual speakers, who are in general not easy to classify.

There are some problems with this categorisation. First, some of the speakers in group B would not consider themselves bilinguals. For example, speaker B07 was put in group B because of an Italian-speaking nursemaid. C02 was put in group C despite “German and Swabian” speaking parents.

An overview of the demographic characteristics of the speakers is shown in table A.1 on page 84. The column “List” shows which of the four word lists was given to the speaker. The column “Age” shows the age of the speaker at the time of recording. Speakers who reported having learned German from their parents are given an AOL of 0. Correspondingly, speakers who were born in a German-speaking environment are given an AOA of 0. LOR is shown in months, as one speaker reported only one month of residence in a German-speaking environment. Speakers who did not report any longer residence outside a German-speaking environment are given an LOR according to the formula: Age × 12. The first languages are shown in column L1. Speakers who reported having acquired two languages simultaneously are given a compound L1 value (the language codes are listed on page 7). The lower part of table A.1 shows a summary of the respective variables.

⁶ Compare the definition of bilingualism in section 1.1.5.
⁷ Compare the introductory sections, esp. the problems with the definition of the concept of a speaker’s native language in section 1.1.3. The term native language is not only avoided in the description of the three groups of speakers as given above, but it was also avoided throughout the entire experiment. The word Muttersprache (native language) was used neither in the circular email nor during the recording procedure (see section 6.4) and the preceding informal conversations with the participants.

6.3 The speech material

The carrier words for this experiment containing the examined long and short vowels were selected according to the following restrictions:

• no “doubtful cases”
• no vowels in a V+/r/ context
• at least one minimal pair for each vowel contrast
• only vowels in stressed syllables
• orthographically marked and unmarked vowels

Tröster-Mutz (2004) cites several “doubtful cases” (“Zweifelsfälle”) of words with varying designations of vowel quantity or quality in different pronunciation dictionaries or phonetic descriptions of German vowels. Some of these words are, for example: Distel, Gas, Geruch, Krebs, Magd, Obst, Ost or schon⁸ (the alternative vowel realisations are given in footnote 8). As stressed in section 5.5, such cases without “clear native norms” should be excluded from examinations comparing native and non-native speakers.

As sequences of vowel + /r/ are usually realised as [ɐ]-diphthongs, such words were excluded, as were the other German diphthongs described above. At least one minimal pair was included in the list of carrier words for each contrast pair. This provides (at least one) instance of a vowel opposition within an otherwise identical context. To obtain vowels in comparable contexts, the additional carrier words were selected such that the vowels appear in stressed, non-final syllables.
In addition to a controlled phonetic context, the carrier words were selected to cover not only “orthographically marked” but also “unmarked” instances of each vowel⁹. Short vowels may be marked orthographically, for example, by doubling of the following consonant letter as in Spott¹⁰. Long vowels can be marked orthographically, for example, by a following <h> letter or by doubling the corresponding vowel letter as in Stahl or Staat¹¹. Orthographically unmarked, i.e. not marked explicitly as being short or long, are vowels which are represented by single vowel letters, as for example in Koch or hoch¹². The purpose of this consideration of orthography was to minimise a possible influence of the written form, i.e. to avoid spelling pronunciations. This was especially important considering the way the word lists were presented to the speakers (see following section).

⁸ [ˈd{ɪ/iː}stəl] (“thistle”), [ˈg{a/aː}s] (“gas”), [ˈgəʁ{ʊ/uː}x] (“smell”), [ˈkʁ{ɛ/eː}ps] (“crab, cancer”), [ˈm{a/aː}kt] (“maidservant”), [ˈ{ɔ/oː}pst] (“fruit”), [ˈ{ɔ/oː}st] (“east”), [ˈʃ{ɔ/oː}n] (“already”).
⁹ Note that this terminology is used informally, as German orthography is of course far more complex than the simple examples given in this section.
¹⁰ [ʃpɔt] (“mockery”).
¹¹ [ʃtaːl], [ʃtaːt] (“steel”, “state”).
¹² [kɔx], [hoːx] (“cook”, “high”).

Four randomised versions of the compiled word list were created and manually edited to ensure the presence of at least two other vowels between any repetitions of instances of the same vowel group. The four final word lists are shown in appendix B. Each comprises the same 88 carrier words, with each vowel included at least five times.

6.4 Procedure

I carried out the recordings at the Institute for Natural Language Processing myself, while a friend acted as experimenter for the two speakers who were recorded in Kielce, Poland. The recording sessions were all carried out entirely in German.
The experimenter and the two speakers in Poland were explicitly instructed to speak exclusively German during the experiment. In addition, the experiments in Poland were preceded by about 10 to 15 minutes of informal German conversation. These measures were taken to ensure comparable conditions for all participants and to allow the speakers to “switch to a German mode” in order to avoid short-term influences of the speakers’ L1 (compare the considerations by Piske et al., 2001, discussed in section 5.2).

6.4.1 Part I: Interview

At the beginning of each recording session a short interview was conducted with the participant already sitting inside the anechoic recording chamber. This served two purposes: first, the collection of demographic data, and second, the acclimatisation of the speakers to the unusual situation within such a chamber. The interview was carried out verbally. The experimenter asked the questions via headphones from outside the anechoic chamber and wrote down the answers.

The speakers were asked about their age, place of birth and the language or languages they had learned from their parents, and the respective ages of learning (if the languages were acquired successively). They were also asked about their parents’ first languages. These first questions refer to the speakers’ L1. Note that the word Muttersprache (“native language”) was not used during the experiment.

A second block of questions addressed additional language experience, especially that of the non-native speakers with German. The speakers were asked about any additionally learned languages and the respective ages of learning. They were also asked about their current place of living and previous places of residence within a German-speaking environment, the respective lengths of residence, and the age of first arrival in such an environment. The experimental variables corresponding to these questions are AOL, LOR and AOA.
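The way the interview answers map onto the experimental variables and the three speaker groups of section 6.2 can be illustrated with a small sketch. It is only an approximation under stated assumptions: the record layout (`Speaker`), the function names and the cut-off `early_childhood_max` are hypothetical, and the actual grouping also involved case-by-case judgement (compare speakers B07 and C02 in section 6.2).

```python
from dataclasses import dataclass


@dataclass
class Speaker:
    """Interview answers for one participant (hypothetical record layout)."""
    age: int         # age in years at the time of recording
    aol: int         # age of learning German in years (0 = learned from the parents)
    bilingual: bool  # True if a second language was present in early childhood


def default_lor_months(age_years: int) -> int:
    """LOR in months for speakers without longer residence abroad: Age x 12."""
    return age_years * 12


def assign_group(speaker: Speaker, early_childhood_max: int = 3) -> str:
    """Return 'A', 'B' or 'C'.

    The cut-off of 3 years is an assumption derived from the AOL range
    reported for group B; it is not a criterion stated in the text.
    """
    if speaker.aol > early_childhood_max:
        return "A"  # German learned after early childhood
    if speaker.bilingual or speaker.aol > 0:
        return "B"  # early AOL with another language present
    return "C"      # German only, AOL 0
```

For instance, a 25-year-old speaker who never lived outside a German-speaking environment would receive an LOR of 300 months under this rule.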
6.4.2 Part II: Production experiment

In the main part of the experiment the speech samples were recorded using a dialogue technique as suggested in section 5.2. The carrier words were embedded within a short dialogue as in the following example:

Examiner: Hütte bitte als nächstes! (“Hütte – hut – next please!”)
Subject: Er hat Hütte gesagt. (“He said Hütte.”)

The exact instructional sentence spoken by the experimenter is not important and was not always the same as stated above. However, the target word was in all cases followed by other words spoken by the examiner (e.g. Hütte ist das nächste Wort – “Hütte is the next word” – or something similar). The sentences spoken by the subjects, on the other hand, were always the same and varied only in one word. The word list was printed on a single sheet of paper (DIN A4 size), containing the carrier sentences which the speakers had to produce. A sample of such a printed version is shown in figure B.1 on page 118.

This technique was used to avoid direct imitation of the verbally presented target word. The additional presentation of the words in written form was chosen to avoid mistakes caused by difficulties in perception. It is stated in section 5.2 that a learner might be well aware of a phonological distinction in an L2 and able to produce it, but nevertheless unable to perceive it consistently. Since the focus of the present study is on production and not perception, it seemed reasonable to avoid perception mistakes by additionally displaying the target words orthographically. On the other hand, the verbal presentation by the examiner within an interactive dialogue was employed to avoid unnatural spelling pronunciations or hyperarticulation. A prerecorded version of the instructional sentences was presented to the two speakers in Poland to ensure the same recording conditions for all speakers, i.e. the same spoken examples.
The only difference worth mentioning might be that the pauses between sentences were somewhat longer for the speakers in Poland because of the fixed nature of the recorded “dialogue”.

The dialogue was repeated in six instances due to non-linguistically caused mispronunciations (“slips of the tongue”), hesitations or the like. Nevertheless, one utterance had to be excluded afterwards from further analysis because the speaker did not say “zerstößt” as required, but “zerstört”. In another case eight target words could not be analysed and had to be removed from the recordings due to a technical error.

Only the speech of the subjects was recorded. The subjects were all recorded continuously, directly to hard disk on a Linux machine, in wave format at a sampling rate of 48,000 Hz with a resolution of 16 bits (mono). The recording procedure resulted in 20 audio files, one for each speaker. The longest recording had a duration of 10:05 min (speaker A02), and the shortest 4:03 min (speaker B07).

6.5 Acoustic analysis: method

The recordings were analysed in several steps to measure the proposed acoustic correlates of the phonological vowel opposition. The first step in the acoustic analysis of the speech recordings was the labelling of the audio files. The vowels were labelled manually using the WaveSurfer software version 1.8.5 (see figure 6.2). Two label files were produced for each recording. One file contained labels for the beginning and end points of the vowels (with the labels placed at zero crossings at the beginning and the end of a period), and the other file contained four labels near the temporal midpoint of each vowel on four successive periods (only two instances had fewer than four complete periods; these were labelled accordingly at three points). The first label file was used for formant and duration measurements and the second for the voice quality parameter measurements.
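How duration and the temporal midpoint follow from a pair of boundary labels can be shown with a minimal sketch. It is not the Praat procedure used in this study: the function names are hypothetical, and label times are assumed to be given in seconds from the start of the recording.

```python
SAMPLE_RATE_HZ = 48_000  # sampling rate of the recordings


def vowel_duration_ms(start_s: float, end_s: float) -> float:
    """Duration in milliseconds between the start and end label of a vowel."""
    if end_s <= start_s:
        raise ValueError("end label must lie after start label")
    return (end_s - start_s) * 1000.0


def temporal_midpoint_s(start_s: float, end_s: float) -> float:
    """Time (in seconds) halfway between the two boundary labels."""
    return start_s + (end_s - start_s) / 2.0


def sample_index(time_s: float, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Nearest sample index for a label time at the given sampling rate."""
    return round(time_s * sample_rate_hz)
```

A vowel labelled from 1.00 s to 1.14 s would thus have a duration of about 140 ms and a midpoint at about 1.07 s.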
Vowel duration and the values of the first three formants F1, F2 and F3 (in Hertz, at the temporal midpoint) were measured automatically, according to the labels at the beginning and at the end of each vowel, using Praat version 4.6.34¹³.

[Figure 6.2: Screenshot of labels in WaveSurfer. The picture shows the [teː] part of Steg [ʃteːk] in the recording of speaker A03. The topmost pane shows the waveform; below that are the spectrogram, the time axis and the two panes used for the labels: the pane with the labels marking beginning and end of a vowel (the highlighted area) and, at the bottom, the pane marking four points for voice quality analysis.]

Voice quality parameters were measured with the harmonics-tools (HI) developed by Wolfgang Wokurek at the Institute for Natural Language Processing at the University of Stuttgart. Despite its broader functionality, HI was employed only for the measurement of voice quality parameters. The two voice quality parameters of special interest in this experiment are skewness and rate of closure. Skewness can be measured acoustically by the difference between the amplitude of the first harmonic H1 and the amplitude of the second formant A2. The skewness gradient (SKG) is defined internally in HI as follows:

\[ \mathrm{SKG} = \frac{\tilde{H}_1 - \tilde{A}_2}{1^{-10} + (\mathrm{Bark}(F_{2p}) - \mathrm{Bark}(F_0))} \tag{6.1} \]

An acoustic measure for the rate of closure is the difference between the amplitude of the first harmonic and the amplitude of the third formant (A3). The rate of closure gradient (RCG) is defined in HI as:

\[ \mathrm{RCG} = \frac{\tilde{H}_1 - \tilde{A}_3}{1^{-10} + (\mathrm{Bark}(F_{3p}) - \mathrm{Bark}(F_0))} \tag{6.2} \]

Besides these two parameters, which are expected to show a correlation with vowel class, some additional parameters were measured with HI: the open quotient, glottal opening, incompleteness of closure (IC) and T4G. The latter two are not mentioned in section 6.1.1 where the acoustic correlates of the vowel opposition are discussed.
They were nevertheless included in the analysis of voice quality since they both refer to phonatory settings and might provide interesting results. The incompleteness of closure is defined as the bandwidth of the first formant B1 divided by F1. T4G is another spectral tilt parameter and is defined as the difference between the amplitude of the first harmonic H1 and the amplitude of the fourth formant A4. The four corresponding formulae as defined in HI are as follows:

\[ \mathrm{OQG} = \frac{\tilde{H}_1 - \tilde{H}_2}{1^{-10} + \mathrm{Bark}(2F_0) - \mathrm{Bark}(F_0)} \tag{6.3} \]

\[ \mathrm{GOG} = \frac{\tilde{H}_1 - \tilde{A}_1}{1^{-10} + \mathrm{Bark}(F_1) - \mathrm{Bark}(F_0)} \tag{6.4} \]

\[ \mathrm{IC} = \frac{B_1}{F_1} \tag{6.5} \]

\[ \mathrm{T4G} = \frac{\tilde{H}_1 - \tilde{A}_4}{1^{-10} + \mathrm{Bark}(F_4) - \mathrm{Bark}(F_0)} \tag{6.6} \]

¹³ http://www.praat.org

As can be seen from equations 6.1 to 6.6, the frequency values of the formants given in Hertz are converted to the Bark scale by HI. The measurements were carried out according to the second label file with four labels for each vowel near its temporal midpoint. This resulted in a maximum of 352 values for each speaker (four measurements for each of the 88 spoken vowels).

Statistical analysis of the measured data was carried out with R version 2.3.0¹⁴. The output produced by Praat and HI was loaded into R. Although the amount of collected data is considerable for a thesis like this one, the number of instances of each individual vowel is rather low. Measurement errors or simple slips of the tongue¹⁵ in just one instance can lead to considerable differences. As a consequence, in cases with a restricted number of samples, as in this experiment, the automatically measured values should be inspected and, where necessary, corrected manually. Unfortunately, this could not be done within the scope of this thesis. The statistical analysis of this experiment must therefore be treated with caution. In any case, the limited amount of data does not allow generalisations.
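The gradient parameters in equations 6.1 to 6.6 share one pattern: an amplitude difference divided by a frequency distance on the Bark scale. The following sketch illustrates that pattern only. The Hertz-to-Bark conversion shown is Traunmüller's (1990) approximation, which is an assumption here since the conversion used inside HI is not specified in this chapter; likewise, the small term in the denominator is read as a guard against division by zero (`eps`), and all function names are hypothetical.

```python
def hz_to_bark(f_hz: float) -> float:
    """Traunmueller (1990) approximation; HI may use a different conversion."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def spectral_gradient(h1_db: float, a_db: float,
                      f_upper_hz: float, f0_hz: float,
                      eps: float = 1e-10) -> float:
    """Amplitude difference per Bark between the first harmonic and a
    higher spectral component (the shared shape of the gradient formulae).

    eps is assumed to be a small constant that prevents division by zero.
    """
    bark_distance = hz_to_bark(f_upper_hz) - hz_to_bark(f0_hz)
    return (h1_db - a_db) / (eps + bark_distance)


def incompleteness_of_closure(b1_hz: float, f1_hz: float) -> float:
    """IC: bandwidth of the first formant divided by its frequency."""
    return b1_hz / f1_hz
```

Dividing by the Bark distance normalises the amplitude difference for how far apart the two spectral components lie, so that values remain comparable across vowels with different formant frequencies.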
6.6 Acoustic analysis: results

Because the examined speakers are both few in number and rather heterogeneous with respect to the various demographic variables (as shown in table A.1), a strictly statistical analysis of the results cannot reasonably be performed.

6.6.1 Vowel quantity

The realisation of a contrast between short-class and long-class vowels was measured by the ratio of long-class vowel duration to short-class vowel duration. The ratios cited in section 6.1.1 could not be reproduced by the measurements of the two supposedly native speakers in group C. Speaker C01 has means of 95.24 ms and 58.83 ms for long-class and short-class vowels respectively (standard deviations: 16.54 ms and 11.49 ms), and speaker C02 has means of 99.46 ms and 53.73 ms respectively (standard deviations: 14.84 ms and 10.84 ms). This corresponds to a mean ratio of 1:1.71 for short-class versus long-class vowels. Figure 6.3 shows a box-and-whisker plot of the mean vowel durations for each group in milliseconds. As can be seen from this figure, the speakers in group A tend to realise long-class vowels with considerably longer durations (mean: 110.93 ms, sd: 15.27 ms) than do the speakers in group B (mean: 100.87 ms, sd: 6.06 ms) or group C. Figure 6.3 also depicts the higher variability within group A as compared to the other two.

¹⁴ http://www.r-project.org/
¹⁵ As mentioned earlier, though great care has been taken to reduce such mispronunciations, some might have remained undetected.

The speaker who on average realises the greatest difference between long-class and short-class vowels is A09. The mean duration for long-class vowels is 122.69 ms (sd: 18.8 ms), and for short-class vowels 60.01 ms (sd: 15.4 ms). The two speakers with the smallest difference are A10 and A05.
The mean durations for speaker A10 are 84.82 ms (sd: 14.63 ms) for long-class vowels and 68.21 ms (sd: 15.49 ms) for short-class vowels; for speaker A05 they are 114.95 ms (sd: 19 ms) and 90.52 ms (sd: 19.32 ms) respectively.

[Figure 6.3: Vowel duration (per group). Box-whisker plot of vowel duration in ms for the long-class (l) and short-class (s) vowels of groups A, B and C.]

The ratios for each vowel pair, each speaker individually and each group are shown in table A.5 on page 86. Note that the table includes ratios for the pair [E:]∼[E] but not [e:]∼[E:]. With respect to the e-vowels [e:], [E] and [E:], the results show that all speakers realise [e:] longer than [E], and [E:] longer than [E]. The speaker with the lowest [e:]/[E] ratio is A05 with a value of 1.20. He realises [e:] with a mean duration of 112.51 ms (sd: 38.02 ms) and [E] with a mean duration of 93.87 ms (sd: 18.56 ms). The mean ratio of [E:]/[e:] for all speakers is 1.03. The speaker with the lowest ratio is A02 with a value of 0.67 (mean duration of [E:]: 97.80 ms, sd: 20.86 ms; mean duration of [e:]: 145.35 ms, sd: 19.51 ms), and the speaker with the highest ratio, 1.29, is B05 (mean [E:]: 114.43 ms, sd: 17.73 ms; mean [e:]: 88.48 ms, sd: 9.99 ms).

Another interesting case is the pair [a:]∼[a], which according to the literature should be distinguishable primarily in duration. The results show that all speakers realise [a:] longer than [a]. The speaker with the lowest ratio is speaker A05, with a ratio of 1.30 (mean duration of [a:]: 139.08 ms, sd: 18.44 ms; and for [a]: 106.81 ms, sd: 31.86 ms). The mean ratio for group A is 1.6 (sd: 0.25), exactly the same mean value as observed for group C.

In table A.39 on page 116 the results of the vowel duration measurements are summarised together with the results for vowel quality and voice quality measurements (see below). The symbols in the rows labelled “quan.
opp” mark the significance level of t-tests performed on the durations of the respective vowels of each vowel pair. A • marks a p-value at the 0.001 level. This means that the difference between the durations of the two vowels within a pair is highly significant. A ◦ marks a p-value at the 0.01 level and a ∗ a value at the 0.05 level. P-values above that level are indicated by a dash. This marks cases where a speaker does not realise the two vowels with significantly differing durations.

6.6.2 Vowel quality

A summary of the results of the formant measurements for each speaker is shown in tables A.6 to A.25 on pages 87-93. Because the numerical data are not easy to interpret, even when presented in these readable tables, further evaluation of the data was necessary. Some formant values for [u:] and [U], e. g. for speakers A01 or B04, include obvious errors of measurement (see figures A.1(a) and A.5(c) on pages 94 and 98). A visual inspection of the spectrogram of some of these instances in the recordings of speaker A01 confirms this assumption. As a manual inspection and correction of all measurements could not be done, the respective [u:] and [U] values of speakers A01 and B04 were not generally excluded and removed from further analysis. Nor were the data corrected manually in selected instances merely to obtain more convenient results. The tables on pages 87-93 thus show a summary of all the measured values, including some of these extreme outliers. The F1/F2 vowel diagrams for each speaker are shown in figures A.1(a) to A.7(c) on pages 94 to 100. Note that the axes in these diagrams are reversed and rotated to resemble the traditional vowel diagram. The equivalent IPA signs for the symbols used in the diagrams are shown in table 1 on page 7.

Two questions are of interest in the present study: first, the realisation of the vowel opposition and second, the native-likeness of this realisation.
The second point includes the native-likeness of the individual vowels as well. In order to make the formant values comparable across speakers, the values measured in Hertz were normalised using Lobanov’s z-score transformation shown in equation 6.7 (Lobanov, 1971; Adank et al., 2004).

\[ F^{N}_{ti} = \frac{F_{ti} - M_{ti}}{\delta_{ti}} \tag{6.7} \]

where $M_{ti}$ is the average value of formant $i$ across all vowels of speaker $t$ and $\delta_{ti}$ is the corresponding standard deviation for that speaker. This normalisation was chosen because it supposedly provides the best means to remove anatomical influences from the acoustic data while at the same time preserving linguistic and social information.

To determine the native speaker norm, reference points within the vowel space were defined for each of the examined vowels by computing the mean values for the two speakers from group C, the “monolingual native speakers”, and the respective values taken from the literature (see footnote 16). This was done because of the relatively small number of speakers in group C, a group which normally serves as the control group in a larger study under more professional conditions. Missing values for a specific vowel and/or formant in the data from the literature were ignored in computing the normalised values for the reference points. The resulting F1/F2 vowel space for these reference points is shown in figure A.7(e) (in Hertz) and figure A.7(f) (normalised) on page 100.

Footnote 16: It has to be pointed out that the term monolingual is used here informally to refer to those speakers who are not bilinguals in the sense defined in section 1.1.5. Strictly speaking, speakers C01 and C02 cannot be called monolinguals. Both speakers reported having learned English and French. In addition, speaker C01 reported a nine-month stay in Great Britain and speaker C02 even stated that he had learned Swabian from his parents. Furthermore, the language backgrounds of the speakers who served as sources for the values taken from the literature are unknown.
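As an illustration of the normalisation in equation 6.7, the following Python sketch z-scores one formant of one speaker across all of that speaker's vowel tokens. It is a minimal sketch under stated assumptions, not the procedure actually run in R for this study: the function name and the input values are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev


def lobanov_normalise(formant_values_hz):
    """z-score one formant of one speaker across all of that speaker's vowels.

    formant_values_hz: measurements of a single formant (e.g. F1) in Hz.
    Uses the sample standard deviation (an assumption; the text does not
    say whether the sample or population form was used).
    """
    m = mean(formant_values_hz)
    d = stdev(formant_values_hz)
    return [(f - m) / d for f in formant_values_hz]


# Invented F1 values (Hz) for one hypothetical speaker, not data from the study.
f1_hz = [290.0, 350.0, 400.0, 520.0, 710.0, 750.0]
f1_norm = lobanov_normalise(f1_hz)
print([round(z, 2) for z in f1_norm])
```

After the transformation every speaker's formant has mean 0 and standard deviation 1, so positions in the vowel space can be compared between speakers with different vocal tract sizes.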
First, to test the validity of the reference points, the F1 and F2 values of long and short vowels are compared with respect to the discussion in section 6.1.1. Table 6.2(a) shows a summary of the mean F1 and F2 values of the reference points (group C and the values from the literature). Short vowels should have a higher F1 than their long counterparts. The only vowel pair where this is not the case is [a:]∼[a], but this was expected, as it was explicitly pointed out in the literature as an exception. With respect to F2, the long back vowels all have lower values. The long front vowels should have higher F2 values than their short counterparts according to the literature. The only vowel pair where this is not the case is [ø:]∼[œ]. This is due to the values of speaker C01, since the values for speaker C02 and the values taken from the literature agree with the rule. The mean values from the literature are 1485 Hz for [ø:] and 1476 Hz for [œ]. The exception in the values of speaker C01 can most likely be attributed to a measurement error, as the vowel diagram suggests (see figure A.7(a) on page 100). Thus, the constructed reference points generally agree with the literature.
Table 6.3: Reference F1/F2 values and standard deviations (in Hz)

(a) group C + literature

V     F1 mean  F1 sd  F2 mean  F2 sd
ø:    405      57     1526     175
œ     518      68     1548     130
a     712      88     1398     161
a:    752      91     1303     168
E     557      103    1833     255
e:    396      60     2224     299
E:    549      112    1932     309
I     395      73     1962     170
i:    279      23     2370     295
O     583      74     1104     169
o:    403      41     727      130
U     394      53     1038     189
u:    305      49     786      222
Y     385      61     1557     139
y:    289      19     1655     152

(b) group B

V     F1 mean  F1 sd  F2 mean  F2 sd
ø:    394      60     1700     170
œ     502      82     1679     238
a     668      95     1496     269
a:    732      167    1303     245
E     536      77     1853     223
e:    381      47     2270     225
E:    476      116    2038     289
I     397      58     1993     237
i:    300      63     2333     279
O     561      68     1068     196
o:    396      49     735      193
U     399      56     1112     367
u:    339      73     845      420
Y     379      56     1793     185
y:    294      46     1912     217

The mean values for the speakers from group B served as a secondary set of reference points for the speech of group A. A comparison of the F1 and F2 values according to the rules stated above reveals that the mean values of group B show no unexpected exceptions (see table 6.2(b)).

Realisation of the vowel opposition

The differences in F1 and F2 between short and long vowels are discussed in section 6.1.1. The realisation of a contrast in quality between two vowels of a given contrast pair will be assumed to be measurable by a pairwise comparison of the two vowels’ formant values. If there is at least one formant (F1, F2 or F3) which is “sufficiently” distinguishable from the corresponding formant value of the other vowel, it will be assumed that this is an acoustic indication of an intended realisation of a contrast between these two. The problem is finding a measure for this distinction. Although the formant frequencies alone might not in all cases be sufficient to describe vowel quality, they will be used here as the only examined acoustic correlate (voice quality parameters possibly correlating with the vowel opposition are examined separately – see following section).
As a working hypothesis, it will be assumed that such a comparison can be computed using t-tests to compare the vowel formants pairwise. Once again, it has to be mentioned that the following analyses will be carried out and interpreted despite the very limited amount of data. The following example should illustrate this proposed ad hoc method: t-tests performed on the three formants (see table on the upper left in figure 6.4) of [i:] and [I] as realised by speaker A03 yield the following results: for F1, t = −4.30, df = 9.85 and p = 0.0016; for F2, t = 5.31, df = 6.61 and p = 0.0013; and finally for F3, t = 9.00, df = 10 and p = 4.15 × 10^−6. The p-values for F1 and F2 below the 0.01 level, and the p-value for F3 below the 0.001 level, indicate that the two vowels might have formants with pairwise different mean values. The vowels [y:] and [Y] from the same speaker, on the other hand, have formant values which are much closer. The t-test yields t = −3.18, df = 8.35 and p = 0.01 for F1, t = 0.17, df = 6.17 and p = 0.87 for F2, and t = −2.04, df = 5.53 and p = 0.09 for F3. The relatively high p-values, all above the 0.01 level, suggest that the mean values are probably not different. This numerical analysis suggests that the speaker probably distinguishes [i:] from [I] by vowel quality. The evidence for a similar distinction of [y:] from [Y] by the same speaker is less conclusive. Provided the small p-value in F1 is not just a consequence of the very limited amount of data, it is possibly an indication that the speaker does not distinguish the two vowels by quality (as measured by formant frequencies). A comparison with the vowel diagram shown on the right in figure 6.4 confirms these conclusions. Table A.26 on page 101 shows the results of such a pairwise application of t-tests to all vowel pairs for all examined speakers. The symbol • indicates that the p-value is below the 0.001 level.
◦ indicates a p-value below the 0.01 level, and the symbol ∗ a p-value below the 0.05 level. A dash in the table indicates a p-value above the 0.1 level. This construction can be interpreted as a kind of “proximity score” for the formant values of each vowel pair. If interpreted that way, a designation of “–” for all three formants represents the maximum score, meaning that the vowels have the most similar formant values (according to this method). A designation of • for all three formants, on the other hand, represents the minimum score, meaning that the formant values are the most dissimilar. The corresponding entries for the example of speaker A03 given above are shown in the table on the lower left in figure 6.4. It seems safe to assume that two vowels are distinct if their mean values differ in the way described above. This does not, however, imply that vowels which are not considered distinct according to this method cannot be classified as two (phonetically) distinct vowels by other criteria. Thus, this method is used only as an indicator of dissimilarity of pairwise compared formant values of two vowels. Interestingly, the results obtained by such an evaluation agree with the interpretations of the visual appearance of the vowel diagrams. Note that the characteristic F1 and F2 patterns associated with the vowel opposition are not of interest here but are discussed in the following section on the native-likeness of the realisation of the vowel opposition.

Figure 6.4: [i:]∼[I] and [y:]∼[Y] of speaker A03.

Upper left: formant means and standard deviations (in Hz)

Vowel   F1 mean  sd   F2 mean  sd    F3 mean  sd
[i:]    288      22   2165     52    3277     137
[I]     347      25   1860     130   2561     139
[y:]    287      16   1641     63    2251     56
[Y]     319      17   1637     20    2368     117

Lower left: p-values of the pairwise t-tests for speaker A03

Fi    [i:]∼[I]    [y:]∼[Y]
F1    .002 ◦      .012 ∗
F2    .001 ◦      –
F3    <.001 •     .092

Right: normalised F1/F2 vowel diagram for speaker A03 (axes F1n and F2n) showing the individual tokens of [i:], [I], [y:] and [Y].

The i-vowels [i:] and [I] are realised with formant values which show a distinction for at least one formant with a p-value below the 0.001 level by speakers A01, A02, A03, A04, A08 and A09 from group A. Of the remaining speakers, all but one (speaker B08) realise that distinction as well, with p-values below the 0.001 level or only slightly above it. Speakers A05, A07 and A10 realise [i:] and [I] with very close formant values in all three formants.

The ü-vowels [y:] and [Y] are realised with considerably high p-values by all the speakers in group A. Only speaker B01 realises the two vowels with all three formants’ p-values well beyond 0.001. All the other speakers in group B and speaker C01 realise a distinction in vowel quality. Speaker C02 displays some proximity in the formant frequencies; however, a comparison of the statistical analysis with the graphic display in figure A.7(c) on page 100 indicates that the vowels are, despite their proximity, very likely distinguished by quality by that speaker.

The e-vowels [e:], [E] and [E:] were compared in the three combinations [e:]∼[E], [E:]∼[E] and [E:]∼[e:]. The first of these corresponds to the traditional description of the German vowel system as described in section 6.1. The second pair, [E:]∼[E], should display no difference in vowel quality, as these two vowels are described as differing only in duration (if [E:] is realised at all by a speaker). The last pair, [E:]∼[e:], should show no difference in vowel quality for those speakers who identify orthographic <ä> (as a long vowel) with [e:]. This means that only the distinction [e:]∼[E] can be tested unambiguously, as the other two pairs are expected to display variation even for native speakers.
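The fractional degrees of freedom reported for the example of speaker A03 (e.g. df = 9.85) suggest that the unequal-variance form of the t-test (Welch's test, the default of `t.test` in R) was used. The following Python sketch recomputes the t statistic and the degrees of freedom for F1 of [i:] versus [I] from the rounded summary values in figure 6.4. The function name is hypothetical, and the number of tokens per vowel (n = 6) is an assumption inferred from df = 10 for F3; the sketch does not compute the p-value.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    computed from summary statistics of two samples."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# F1 of [i:] and [I] for speaker A03, rounded values from figure 6.4.
# n = 6 tokens per vowel is an assumption inferred from df = 10 for F3.
t, df = welch_t(288, 22, 6, 347, 25, 6)
print(round(t, 2), round(df, 2))
```

With these rounded inputs the sketch gives values close to, but not identical with, the reported t = −4.30 and df = 9.85; the small difference is expected because the means and standard deviations in figure 6.4 are rounded to whole Hertz.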
The pair [e:]∼[E] is realised with clearly differing formants by speakers A01, A02, A03, A06, A08 and A09. The other speakers in group A show closer formant values, i. e. higher p-values for the t-tests. However, no one realises these two vowels with almost identical mean values for all three formants. All the speakers of groups B and C realise these two vowels with significantly differing formants. Notably, speakers B03 and B05 realised [e:] and [E] with highly significant differences in all three formants. Except for speaker A07 (and possibly A10), a tendency to distinguish [e:]∼[E] can be seen in all vowel diagrams. [E:] and [E] are realised differently by speakers A01, A02, A03, A08 and A09, as well as by B01, B03, B04, B05 and speaker C02. In contrast, [E:] is realised differently from [e:] by speakers A01, A02, B02 and B07. This comparison shows that speaker A01 realises the vowel pairs [e:]∼[E], [E:]∼[E] and [E:]∼[e:] with formant values which show a distinction for at least one formant at a highly significant level. Thus, the three respective vowels are mutually distinguished from one another and take three different places within the vowel space of that speaker. The same effect can be observed with speaker A02, and to a lesser degree with speakers A03 and C02. The highest overlap in all three pairs can be observed with speaker A07, who seems to realise all three German e-vowels alike.

The ö-vowels [ø:] and [œ] are distinguished only by speakers A03 and A09 from group A. They are not clearly distinguished by any of the other speakers in group A, nor by speakers B01, B06 and B07 (although the latter two cases could be due to measurement errors or slips of the tongue). Despite the statistical similarity of the formant values, the visual display in the vowel diagrams shows a possible tendency towards a realisation as two different vowels by speaker A01.
The a-vowels [a:] and [a] are realised with two significantly different vowel qualities only by speaker B01, and to a lesser degree by speakers A03, A07 and A09. None of the other speakers realise these two as different vowels. A mutual overlap in all three formants can be observed in the realisations of speakers A01, A02, A05, A06, A10 and C01.

The o-vowels [o:] and [O] are distinguished considerably by three of the ten speakers in group A (A04, A06 and A09) and to a lesser degree by speakers A02, A03, A07 and A08. All speakers in groups B and C produce these two vowels with significantly different qualities. For speaker A02, the minimum F1 of [O] (492 Hz) is lower than the maximum F1 of [o:] (514 Hz) and the formants show a high dispersion of the individual instances of these vowels (compare figure A.1(c) on page 94); the t-test nevertheless yielded a p-value of 0.003.

The u-vowels [u:] and [U] could not be analysed for all speakers due to some instances of extreme outliers in linguistically unlikely positions within the F1/F2 vowel space. The u-vowels could therefore not be analysed reliably for the speakers A01, A02, A05, B02, B04, B05, B07 and B08. The data for speaker A03 include one such outlier which can be interpreted as a measurement error. The remaining instances show a distribution which indicates a distinction between the two vowels. The same is true for speaker B02, provided the two instances of [u:] with the highest F2 are interpreted as measurement errors (see figure A.4(e) on page 97). Of the remaining speakers in group A, no one realised the two u-vowels as significantly distinct vowels, and in group B only speaker B06 distinguishes [u:] and [U]. Despite a considerable computed overlap for the two vowels by speaker C02, the distributional pattern indicates the existence of a distinction. Disregarding the one extreme [u:] outlier at 1516 Hz, the remaining values show a distribution as can be expected for a native German speaker.
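The significance marks used for the pairwise formant comparisons, and the rule that one sufficiently distinct formant counts as evidence of a contrast, can be sketched in Python as follows. This is a hedged illustration rather than the procedure used to build table A.26: the function names are hypothetical, the empty mark for p-values between 0.05 and 0.1 is an assumption (the text defines no mark for that range), and the `alpha` cut-off for a “sufficiently” distinct formant is likewise an assumption.

```python
def significance_mark(p_value):
    """Map a p-value to the mark used in the result tables."""
    if p_value < 0.001:
        return "•"
    if p_value < 0.01:
        return "◦"
    if p_value < 0.05:
        return "∗"
    if p_value > 0.1:
        return "–"
    # Between 0.05 and 0.1 no mark is defined in the text; returning an
    # empty string is an assumption of this sketch.
    return ""


def distinct_in_any_formant(p_values, alpha=0.001):
    """True if at least one formant differs at the given level
    (alpha is a hypothetical cut-off for 'sufficiently' distinct)."""
    return any(p < alpha for p in p_values)


# p-values reported for speaker A03: [i:]~[I] and [y:]~[Y] (F1, F2, F3).
i_pair = [0.0016, 0.0013, 4.15e-6]
y_pair = [0.01, 0.87, 0.09]
print([significance_mark(p) for p in i_pair])
print([significance_mark(p) for p in y_pair])
```

Applied to the reported p-values of speaker A03, the marks agree with the entries shown in figure 6.4, and only the pair [i:]∼[I] counts as distinct at the 0.001 level.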
Native-likeness of the vowel opposition

Native-likeness was computed in two different ways. The patterns described in section 6.1.1 are examined first. The general rule that short vowels have a higher F1 than their long counterparts can be observed in almost all vowels of all examined speakers. This means that there is either too little data, or that a comparison like this is not an adequate measure of native-likeness of the realisation of the vowel opposition. Assuming the latter, the next step in this analysis was a comparison of the individual vowels with the target values. A method corresponding to the ad hoc measure using pairwise t-tests as described above was employed. This time, each of the vowels of the individual speakers was compared individually to its corresponding reference point. For this between-speakers comparison the normalised formant values were used and not the values as measured in Hertz.

The native reference points are also shown in the example of speaker A03 in figure 6.4. The mean native speaker values of [i:] and [I] are shown in the background of the vowel diagram. The respective vowel symbol marks the mean value and the surrounding boxes show one (solid lines) and two (dotted lines) standard deviations of the normalised reference value. The complete diagrams with the normalised vowels for each speaker are shown in figures A.1(b) to A.7(d) on pages 94 to 100. The reference points for each vowel are shown in the background in each graph (for the sake of readability without standard deviations). This corresponds to the traditional approach of comparing the speech of non-native speakers to the speech of “monolingual” native speakers. In addition, the vowels of each speaker in group A were also compared to the mean values of the corresponding vowels from the speakers in group B (see below).
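The boxes drawn around each reference point amount to a simple interval check per formant. The following Python sketch shows such a check for one normalised formant value. It is an illustration only: the function name and all numbers are hypothetical, and treating the box borders as inclusive is an assumption.

```python
def sd_band(speaker_mean, reference_mean, reference_sd):
    """Return 1 or 2 if speaker_mean lies within one or two standard
    deviations of the reference mean, otherwise None.
    Borders are treated as inclusive (an assumption of this sketch)."""
    distance = abs(speaker_mean - reference_mean)
    if distance <= reference_sd:
        return 1
    if distance <= 2 * reference_sd:
        return 2
    return None


# Invented normalised F1 values, not data from the study.
reference_mean, reference_sd = -1.10, 0.20
for speaker_value in (-1.00, -0.75, -0.40):
    print(speaker_value, sd_band(speaker_value, reference_mean, reference_sd))
```

A value inside the solid box of the diagram corresponds to band 1, a value between the solid and the dotted box to band 2, and a value outside both boxes to `None`.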
The interpretation of the computed proximities is reversed in comparison to the evaluation discussed above: the higher the score is, i. e. the closer two vowels are, the more native-like they are. Whereas in the above considerations it was enough if at least one formant was sufficiently distinct from its counterpart, a single significantly distinct formant value can now be interpreted as an indication of considerable deviation from the native speaker norm. Table A.27 sums up the results of this numerical comparison (see page 102). The symbol • indicates that the mean of one formant of a given vowel falls within one standard deviation of the native speakers’ mean value. ◦ marks instances where the mean of a speaker’s vowel production falls within two standard deviations and the dash “–” marks instances where the mean of a vowel is further away from the native mean value than two standard deviations. The symbol • indicates that the t-test yielded a p-value smaller than 0.001. ◦ marks instances where the p-value is below the 0.01 level, and an asterisk ∗ indicates a p-value smaller than 0.05. Values above the level of 0.1 are not shown and the respective cells in the table are marked with a dash –. This comparison of each individual vowel, together with the characteristic native speaker patterns of F1 and F2 in correlation with vowel class, now gives a more differentiated picture of the native-likeness of the realisation of the vowel opposition.

The i-vowels are realised native-like by most of the examined speakers. All speakers realise [I] with a higher F1 than [i:], although the difference is quite small for speakers A05 (339 Hz vs. 354 Hz) and A08 (367 Hz vs. 373 Hz). The difference for speaker A07 is negligible (382.11 Hz vs. 382.43 Hz). The differences for the remaining speakers in group A range from 22 Hz (speaker A10) to 105 Hz (speaker A01).
The speakers in groups B and C realise [I] and [i:] with differences in F1 in the range from 68 Hz (speaker B08) to 194 Hz (speaker C01). All speakers without exception realise [i:] with a higher F2 than [I], with differences in the range from 63 Hz (A05) to 736 Hz (C01). With respect to the individual vowels, [I] is realised with mean formant values (F1, F2 and F3) which show no deviance from the native speakers’ norm by speakers A03, A06, B02, B03, B04, C01 and C02. A greater deviation in only one of the three compared formants is realised by four speakers in group A and five in group B. There are only two speakers (A09 and A10) who realise [I] with a highly significant deviance in one formant. Such an indicator of non-native-likeness for [i:] is found in speakers A01, A05, A08, A09 and B05. An overlap in all three formants is observed only for speakers A02, A04, B03, B07 and B08.

The ü-vowels are realised with higher F1 for [Y] by all speakers except A10, and with higher F2 for [y:] by all but three speakers. A01 realises [y:] and [Y] with mean F2 values of 1710 Hz and 1799 Hz respectively. B08 realises these two vowels with mean F2 values of 2003 Hz and 2043 Hz respectively. Speaker A10 realises the ü-vowels non-native-like in both F1 and F2. A strong indication of non-native-likeness for the individual vowels is found in the productions of [y:] by speakers A04 and A08, and in the productions of [Y] by speaker A10. Native-like values for all three formants are realised by speakers A09, B01, B04 and C02 for [y:] and by speakers A04, A07, A09, C01 and C02 for [Y].

The e-vowels [e:] and [E] are realised by all speakers with a higher F1 for [E] and a higher F2 for [e:]. The F2 difference is quite small for A10, but this might be due to measurement errors, as two unusual outliers in the vowel diagram imply (figure A.4(a) on page 97).
Examined individually, [e:] is realised with native-like values for all three formants by speakers A03, A09, B02, B03, B06 and C02. Accordingly, [E] is realised native-like by speakers A01, A05, A06, A08, A09, A10, B03, B04, B05, B06, B07, B08, C01 and C02, and [E:] is realised native-like by speakers A06, A10, B06, B07, B08, C01 and C02. Strikingly, out of the three e-vowels, [E] is realised most native-like, with only small deviations from the formants’ reference mean values even by the non-native speakers.

The ö-vowels have a native-like relation of F1 in the samples of all speakers except A05, who realises [ø:]∼[œ] with mean F1 values of 369 Hz and 365 Hz respectively. The difference is also quite small for speakers A02 (414 Hz vs. 431 Hz) and A10 (550 Hz vs. 556 Hz). The differences in F2 are more noticeable. Four speakers in group A and three speakers in group B have a lower F2 for [ø:]. Definitely native-like are seven instances of [ø:] (speakers A03, A04, A09, B01, B04, B08 and C02) and five instances of [œ] (speakers B01, B06, B08, C01 and C02). Strong evidence for non-native-likeness is found in five instances of [œ] (speakers A02, A04, A05, A06 and A07). Note that the values of the reference point for [ø:] might be distorted due to unusual values (or measurement errors) in the recordings of C01, which were nevertheless included in the computation of the reference values.

The a-vowels [a:] and [a] are realised native-like in all three formants by six speakers in the case of [a:], and by nine speakers in the case of [a]. Only speakers A01, A07, A09, B01 and B05 realise [a:] with a highly significant deviation from the reference values, and for [a] only speaker A01 shows a highly significant deviation in the formants’ mean values. The difference F1([a:]) − F1([a]) has a mean value of 17.69 Hz for group A, with a minimum of -41 Hz, a maximum of 149 Hz and a standard deviation of 52.48 Hz.
This means that six out of ten speakers in group A realise [a] with a lower F1. For group B the mean value is 50.54 Hz (min. -46 Hz, max. 270 Hz and standard deviation 91.44 Hz). Seven out of eight speakers realise [a] with a lower F1. The speakers C01 and C02 both realise [a] with a lower F1 (with differences of 39 Hz and 43 Hz respectively). Thus, a higher F1 for [a:] can be considered a native-like realisation. Hence, the difference between [a:] and [a] is not realised native-like by only five speakers: A01, A02, A09, A10 and B06. In addition, the difference seems negligible for speakers A06 (6 Hz) and A08 (2 Hz). The difference F2([a:]) − F2([a]) has a mean value of -42.39 Hz for group A (min. -215 Hz, max. 138 Hz, s.d. 101.95 Hz) and a mean value of -193.3 Hz for group B (min. -369 Hz, max. -97 Hz, s.d. 92.37 Hz). Speakers C01 and C02 realise [a:] with a lower F2 (with differences of -108 Hz and -116 Hz). Only four speakers realise [a:] with a higher F2: A03, A05, A07 and A10.

The o-vowels [o:] and [O] are realised with higher F1 and F2 for [O] by all speakers. Looking at the native-likeness of the individual formant values reveals, however, that four speakers realise [O] (A01, A07, B01 and B04) and at least two speakers realise [o:] (A01 and A02) definitely not according to the acoustic norm.

The u-vowels [u:] and [U] are realised with higher F1 for [U] by all speakers (excluding A01 and B04), and only one speaker (A07) realises [U] with a lower F2. There are, however, some instances with quite small differences between the two vowels. Speakers A05, A07, A08 and A10 realise the difference F1([u:]) − F1([U]) with values of -10 Hz, -12 Hz, -7 Hz and -12 Hz respectively. The difference F2([u:]) − F2([U]) amounts to only -11 Hz and -13 Hz for speakers A05 and A10 respectively. The comparison of the individual vowels with their reference points is less conclusive.
All speakers (excluding A01 and B04) realise [u:] and [U] with formant values rather close to the native speaker means. Only speaker A10 has a highly significant deviance in F3 of [u:].

Group B as control

Applying the suggestion discussed in section 5.5 that second language learners cannot be compared to monolingual speakers of the respective target language, group B served as the control group in a second comparison of the individual vowel values. A summary is shown in table A.28 on page 103, where an improved value (in comparison to the values in table A.27) is marked in blue, and deteriorations in red. The differences are only marked if the new p-value lies at another significance level (according to the marks indicated above, with the levels 0.1, 0.05, 0.01 and 0.001). In general, there were only slightly more p-values which improved, i. e. were higher, in comparison to the tests with the reference values. Overall 453 p-values were higher, and 447 lower. However, regarding only those changes which resulted in a higher or a lower level of significance as indicated in table A.28, the majority of differences is marked red in all three groups, which implies in general more differences between the individual values and the reference (in this case group B). These results have to be treated with caution. Due to the different sizes of the three groups and the overall small sample sizes, no definite conclusions or generalisations can be drawn. Group B comprises eight speakers, as opposed to two speakers in group C plus the values taken from the literature (which were rather incomplete, especially with respect to F3). In any case, the validity of such a comparison of non-native speakers with bilingual speakers of the L2 cannot be determined by an acoustic analysis.
Since the whole idea of using bilinguals as a control group is based on the observation that a speaker’s L1 is affected by any of his or her L2s, a comparison between bilingual and monolingual native speakers will obviously result in differing values. Thus, the native-likeness of group B cannot be determined by any acoustic measure, but has to be tested in a perception experiment with native listeners.

6.6.3 Tenseness

Unfortunately, the voice quality measurements yielded no results which would indicate a relation between vowel class and one of the examined voice quality parameters. The results from Claßen et al. (1998) for German speakers could not be reproduced in general for either speaker C01 or speaker C02. Table 6.4 shows the p-values of t-tests performed for each parameter on an overall comparison of long and short vowels for each speaker and the respective groups. The p-values suggest that there are no highly significant differences for the parameters with respect to vowel class in general.

Figure 6.5: Screenshot of a recording with a disturbing signal. The section shown in this figure is the [a:k^t] part of gesagt ([g@za:kt] – “said”) with adjacent “silence” after the cursor position. As the spectrogram shows, the distortion ranges from 0 up to about 500 Hz. This part of the signal covers (at least) the first harmonic, which is used for calculating voice quality parameters.

Although every vowel was labelled four times for the voice quality measurements, there were many obvious measurement errors, which led to a further reduction of the already small amount of available data. This was especially important for group C, with only two speakers as opposed to ten in group A. A lack of data is marked in tables A.28(a) to A.37(b) (on pages 104 to 113) by “na”.
Of the remaining data, a part of the measurements might be distorted as well due to a disturbing signal in some of the recordings, which remained undetected during the recording procedure. A repetition of the respective recordings was not possible, so the analysis was carried out with reservations. Unfortunately, there are several such cases where the respective analysis could not be performed. The results are nevertheless presented in this section, and included in the summary as well (table A.39).

The two speakers C01 and C02 as a group show no general effect of vowel class on any of the six measured voice quality parameters. The voice quality measurements for the two speakers are summarised in tables A.37(a) and A.37(b) on page 113. For each vowel pair and each parameter, the corresponding mean values with standard deviations (in parentheses) are given. Additionally, the p-values of a t-test with the two vowels for the respective parameters are given (if the p-value is above 0.1, only a dash is drawn – see tables A.28(a) to A.37(b) for all speakers). The parameter SK was found to have significantly higher values for [e:] than for [E] in stressed syllables (Claßen et al., 1998, table 4 (a)). The same effect was observed for the i- and u-vowels. With respect to RC, the same effect was observed for e-, o- and u-vowels. Figure 6.6 shows box-whisker plots for the two parameters RCG and SKG. The ticks on the left and on the right of the plots indicate the distributions of long and short vowels respectively. However, if compared for each speaker and vowel pair separately, some parameters show significant differences between the long-class and short-class vowels.

Table 6.4: P-values of t-tests of long and short vowels

ID      OQG     GOG     SKG     RCG     T4G     IC
A01     0.7928  0.7183  0.1494  0.9579  0.962   0.5732
A02     0.6104  0.9072  0.522   0.1638  0.4428  0.9517
A03     0.6229  0.0089  0.0439  0.4381  0.3009  0.0147
A04     0.6731  0.6145  0.7052  0.7862  0.8103  0.6715
A05     0.7096  0.4819  0.6357  0.3068  0.2419  0.7212
A06     0.281   0.7725  0.5747  0.1988  0.8458  0.3133
A07     0.6849  0.9785  0.902   0.4035  0.692   0.9821
A08     0.5561  0.6058  0.9373  0.8972  0.849   0.9209
A09     0.4834  0.9813  0.2585  0.1049  0.1071  0.7605
A10     0.2429  0.5705  0.3126  0.3653  0.8329  0.6659
gr. A   0.0899  0.3822  0.7399  0.3798  0.9092  0.1801
B01     0.3955  0.4605  0.933   0.2051  0.1915  0.2138
B02     0.4539  0.5701  0.6227  0.3409  0.5246  0.2632
B03     0.9994  0.2667  0.5582  0.4294  0.4613  0.285
B04     0.8692  0.0915  0.9688  0.4823  0.9424  0.286
B05     0.8853  0.4719  0.2479  0.7766  0.091   0.769
B06     0.9977  0.0768  0.3151  0.6371  0.6248  0.7918
B07     0.5195  0.0308  0.1389  0.1882  0.253   0.4739
B08     0.593   0.6025  0.6462  0.8672  0.4208  0.2707
gr. B   0.3828  0.0471  0.1761  0.3281  0.0884  0.1724
C01     0.9798  0.3281  0.2784  0.402   0.2732  0.0052
C02     0.7692  0.1955  0.1428  0.0588  0.2264  0.5532
gr. C   0.9717  0.8053  0.1568  0.1476  0.1626  0.0068

As a method to detect a contrast in the various voice quality parameters with respect to vowel class, pairwise t-tests were performed on the measured values for each speaker. This method was also employed by Claßen et al. (1998). For the pair [e:]∼[E], a highly significant difference in SKG can be observed with speakers A01, A09, B02, B06 and C01 (determined by t-tests with a p-value at the 0.001 level). Less significant values are measured for B01 (p < 0.01), and A04, A05, A06, A08 and A10 (p < 0.05). There is also a difference between [e:] and [E:] in the samples of speakers B06 (p < 0.001), A08, B01 and C02 (p < 0.01), and A05, A09 and C01 (p < 0.05). Interestingly, a highly significant difference in SKG between [E:] and [E] is realised by speakers A03, A09, B02 and C02, at the 0.01 level by speaker A01 as well, and at the 0.05 level additionally by speakers A06, A10 and B07.
For the pair [i:]∼[I], a highly significant difference in SKG is realised only by speaker A08. Additionally, the t-test for speakers B03 and C02 yielded a p-value at the 0.05 level. And finally, the pair [u:]∼[U] is not realised with a highly significant difference in SKG by any of the examined speakers. The only significant differences, at the 0.05 level, can be observed in the samples of speakers A08, A09 and B01. The pair [e:]∼[E] is realised with highly significant differences in RCG by speakers A09, B01, B07 and C01, at the 0.01 level by speakers A06, B04 and B08 and at the 0.05 level by speakers A02, A07, A08, B02 and B05. The pair [e:] and [E:] is significantly distinguished by speakers C02 (p < 0.001), A05, A08, C01 (p < 0.01), and B01 and B07 (p < 0.05). [E:] and [E] have distinct mean values in the samples of A09 and C02 (p < 0.001), and A05, B02, B07 and B08 (p < 0.01). A significant difference in RCG for the pair [o:]∼[O] can be seen in the results of speakers A03, A08, B01, B05, B06, B07 (p < 0.001), A06, B08, C02, A01 (p < 0.01), and B03, B04 and C01 (p < 0.05).

[Figure: two box-whisker panels, (a) and (b), titled “RCG ~ Speaker group : Vowel class” and “SKG ~ Speaker group : Vowel class”; categories on the horizontal axis: gr. A (l), gr. A (s), gr. B (l), gr. B (s), gr. C (l), gr. C (s).]

Figure 6.6: Voice quality parameters RCG and SKG by speaker group and vowel class

Looking at the remaining vowel pairs, it can be observed that the parameters SKG and RCG do not show consistently significant p-values. For speaker C01, for example, there are no significant differences between RCG values for the pairs [ø:]∼[œ], [a:]∼[a] and [i:]∼[I], and for speaker C02 there are no significant differences for [ø:]∼[œ], [a:]∼[a] and [y:]∼[Y].
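The significance levels used in this section (0.001, 0.01 and 0.05, with a dash for p-values above 0.1) can be illustrated with a short sketch. The following Python snippet is not part of the original analysis: the function name `significance_level` and the label strings are assumptions made for illustration, and only the thresholds are taken from the text. It is applied to the group C row of table 6.4.

```python
def significance_level(p):
    """Map a p-value to the reporting levels used in this section.

    The label strings are illustrative assumptions; only the thresholds
    (0.001, 0.01, 0.05, and a dash above 0.1) come from the text.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p < 0.001:
        return "p < 0.001"
    if p < 0.01:
        return "p < 0.01"
    if p < 0.05:
        return "p < 0.05"
    if p <= 0.1:
        return "p <= 0.1"
    return "-"


# p-values for group C, copied from table 6.4
GROUP_C = {"OQG": 0.9717, "GOG": 0.8053, "SKG": 0.1568,
           "RCG": 0.1476, "T4G": 0.1626, "IC": 0.0068}

levels = {name: significance_level(p) for name, p in GROUP_C.items()}
for name, level in levels.items():
    print(f"{name}: {level}")
```

For this row, only IC falls below the 0.01 level; the remaining five parameters lie above 0.1 and would be shown as a dash.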
Taking into account these observations for SKG and RCG, as well as the remaining voice quality parameters, it seems obvious that no reasonable statistical analysis can be performed with the present data. The distributional patterns of individual significant findings over the various speakers, vowel pairs and parameters do not show obvious regularities. Especially for the groups B and C, there are no general relations between vowel class (i. e. tenseness) and the respective parameters. In order to compare non-native speakers to the native-speaker norm, this norm first has to be determined precisely. A tendency for a higher mean number of highly significant values for speakers in groups B and C can be observed. The mean number of p-values at the 0.001 level over all vowel pairs and parameters for the speakers in group A is 10.5. The means for groups B and C are 15.63 and 18 respectively. These differences are, however, not significant and, due to the unbalanced size of the groups (10 versus 8 versus 2 speakers), not conclusive.

6.6.4 Effects of L1

The factors affecting degree of foreign accent discussed in chapter 2 cannot be examined in this study, because a foreign accent “score” is missing for each speaker. As discussed in chapter 5, degree of foreign accent can be determined only by the perception of native speakers and their judgments about the native-likeness of the non-native speakers’ pronunciation. The total number of speakers is already low, so the number of speakers for each individual L1 represented in this study is definitely too low for a statistical analysis to be reasonable. Nevertheless, a brief theoretical overview of the speakers’ L1 is given in this section, and speculations about possible effects of the speakers’ L1 on their German pronunciation are made without an extensive statistical analysis.

An in-depth analysis of the phonetic and phonological systems of the respective languages – including regional or other variation within the language – is required in any examination of possible effects of the speakers’ L1 on their L2. This could be done in order to determine sources of interference, for example. In the following sections, some languages are discussed based on phonological descriptions or brief remarks found in the literature, without the required phonetic analyses. Such in-depth analyses – though necessary for any thorough examination – would go far beyond the scope of this thesis.

An analysis of a survey among teachers of German as a foreign language reported by Ortmann (1976) is used as one source to illustrate which vowels are considered to be “difficult” and are reported to be mispronounced by non-native speakers of German (at least at the beginning of the learning process).

Polish

Polish was specified by three speakers as their single L1 (A01, A02 and A05) [17]. The Polish vowel system comprises six monophthongal vowel phonemes. A phonetic comparison of Polish [18] and German vowel formants (the reference values prepared for this thesis) shows that five of the six Polish vowels have F1 and F2 values close to those of German vowels. Polish [i] is close to German [i:], Polish [a] lies between German [a] and [a:], Polish [e] between German [E:] and [E], and Polish [o] is slightly higher (i. e. has a slightly lower F1) than German [O]. Polish [u] is somewhat lower than German [u:]. According to the predictions of the speech learning model or the native language magnet model, these vowels might be identified with the corresponding German vowels. According to Ortmann, the “difficult” German vowels for native Polish speakers are all long-class vowels. Of the short-class vowels, only [Y] and [œ] are reported by the majority of the informants to be difficult for native Polish speakers.
Speakers A01 and A05 show, in comparison to the normalised reference points, a more retracted position of [i:], which might be affected by the corresponding position of Polish [i]. The realisation of German [o:] and [O] appears to be affected by interference from Polish [o] in the speech of all three Polish subjects. All three realise neither a significant contrast in the two vowels’ formants nor a native-like vowel quality. Only speaker A02 shows a distinction between the two German o-vowels by vowel quantity (and probably in F1). The “central” vowel pairs corresponding to <ü> and <ö> are not distinguished significantly by any of the Polish subjects. However, speakers A01 and A02 realise [ø:] and [œ] with more native-like formant values than A05. Figure A.2(c) (page 95) shows an empty central area within the vowel space that resembles the Polish distribution of vowels. Polish has no phonological vowel quantity. Speakers A01 and A02 both realise significant differences in vowel duration for most vowels. Speaker A05 has a significant contrast only in the pair [E:]∼[E] (the only pair that this speaker distinguishes in quantity and quality), and to a lesser degree as well in [u:]∼[U] and [y:]∼[Y].

[17] The subjects were not asked explicitly to specify their L1, but were asked about the languages they have learned from their parents and other language experience together with the respective ages of learning (compare section 6.4.1).
[18] Polish vowel formants are taken from Majewski and Hollien (1967).

Croatian

Three speakers specified Croatian as their L1 (A04, B03 and B04), and two speakers specified it as one of their two L1s besides German (B01 and B06). There are only five monophthongal vowels – [i], [e], [a], [o] and [u] – in the Croatian standard language, and no vowel opposition like in German. Vowel quantity is morphologically conditioned and vowels with differing quantities are not considered different phonemes (as they are in German).
The short vowels are also described as having the same quality (in terms of their formants) as their long equivalents. A phonetic comparison of the vowels’ formant values shows that the Croatian vowels are all close to German vowels, except for Croatian [o], which lies between German [O] and [o:], though somewhat closer to the latter [19]. Croatian [e] is much closer to German [E] and [E:] than to German [e:]. The central area of the F1/F2 vowel space is not occupied by any of the Croatian vowels. This implies that – according to the SLM or NLM – the German vowels in this central area of the vowel space might be pronounced more accurately, since there are no Croatian vowels in the vicinity that could be identified with the German ones. There are no front rounded vowels in Croatian. According to the German language teachers cited by Ortmann, the “difficult” German vowels for native Croatian speakers are [y:], [Y], [e:], [E:], [ø:] and [œ]. This is an interesting overlap with those vowels which are predicted to be pronounced more accurately due to their dissimilarity (namely, the ü- and ö-vowels). This seeming contradiction can be explained by considering the different sources of these assertions. The SLM, for example, is focused on the ultimate attainment of second language learners, while language teachers obviously observe learners at earlier stages of learning. Another difference is, of course, that theoretical models like the SLM or the NLM are concerned with perception and/or production of speech sounds, while the focus of a teacher might be something completely different (intelligibility, for example). Speaker A04, for example, realises [ø:] and [œ] with a significant difference in duration but not in vowel quality. With respect to [y:] and [Y] there is – besides a less significant difference in F3 – not even a distinction in duration. The other Croatian speakers (all categorised as bilinguals) mostly realise a distinction for both pairs in quality and in quantity.
German [o:] and [O] are both realised almost native-like by speaker A04, but are not distinguished in duration. The remaining Croatian speakers realise a significant distinction in duration. The vowels [e:] and [E] are distinguished as well by speaker A04 and the other Croatian speakers. All bilingual Croatian speakers have mean ratios of long versus short vowels minimally above the mean ratio of 1.71 of group C, but speaker A04 in contrast has a mean ratio of only 1.34. Although this is not a generalisable conclusion with only one “monolingual” Croatian speaker, it can be said that there are no obvious effects of interference from Croatian on the German vowels in the examined speech material.

[19] Formant values for Croatian are taken from Bakran (1996).

Bulgarian

Only a few remarks about Bulgarian, as well as the remaining languages, will be made here. No detailed acoustic, phonetic or phonological literature on these various languages could be reviewed for this thesis. Two speakers (A06 and A08) specified Bulgarian as their single L1. Bulgarian has two front vowels, one high [i] and one open-mid [E], one central open vowel [a], and two back vowels [u] and [O] (IPA, 1999, p. 55). The latter two can be neutralised to [o] in unstressed positions. According to the predictions of the SLM or the NLM, these are the vowels which might be identified with their German counterparts. This would be an expected error, if the Bulgarian vowels are acoustically (perceptually) close to the German vowels. According to Ortmann [20], all German vowels are “difficult” for native Bulgarian speakers. At first sight, it seems likely that some speakers might have difficulties in producing a stressed [o:]. This could explain the observed formant values for speaker A06.
The speaker clearly distinguishes [o:] (mean F1: 441 Hz, sd: 21; F2: 739 Hz, sd: 86) neither from [U] (mean F1: 419 Hz, sd: 41; F2: 1043 Hz, sd: 446) nor from [u:] (mean F1: 369 Hz, sd: 41; F2: 892 Hz, sd: 155). The three vowels are realised alike, but notably, [o:] and [O] are clearly distinguished both in quality and quantity. In comparison, the other Bulgarian speaker (A08) distinguishes the two o-vowels from each other. Neither pair of rounded front vowels is distinguished by speakers A06 and A08, either in quality or in quantity. While speaker A06 seems to distinguish at least the ü-vowels from the ö-vowels, speaker A08 realises the four vowels with very close F1 and F2 values (compare figures A.3(d) and A.3(d) on pages 96 and 96). Bulgarian also has two additional central vowels, [5] and [7]. The latter has no direct counterpart in German, but might interfere with German [@], and [5] might be identified with German [5]. Both these vowels are unlikely to interfere with the examined German vowels, as neither schwa nor [5] is included in the examined data.

Hungarian

Two speakers (A09 and B02) specified Hungarian as their L1. Hungarian is described as having seven basic vowel qualities, which appear as both short and long vowels. Except for the pairs [a:]∼[A] and [e:]∼[E], the short vowels are only a little lower and more centralised than their long counterparts (IPA, 1999, p. 104). There are additional vowels, but these are described as being of only marginal/regional relevance. According to Ortmann, the “difficult” German vowels for native Hungarian speakers are [E:], and [I] or [Y], which supposedly have no Hungarian equivalents. Due to the large number of vowels and a similar opposition of long and short vowels, interference effects can be expected for all examined German vowels.
Provided the primary distinction in the Hungarian vowels is quantity (a fact that was not further examined), speakers might rely more on this feature and pay less attention to vowel quality with the German vowel pairs. This describes exactly the results for speaker A09. On the one hand, all vowel pairs have highly significant differences in duration for the two respective vowels; on the other hand, the formants show no significant distinction for the pair [u:]∼[U], and only a small one for the pair [y:]∼[Y]. The two i-vowels are much closer to each other than their two native reference values.

[20] The data for Bulgarian is based on the answers of only one informant, however.

Turkish

Two speakers (A03 and B08) specified Turkish as their single L1. Turkish has eight vowels: [i], [y], [e], [œ], [a], [o], [u] and [W] (IPA, 1999, p. 154). As far as phonology is concerned, only the last one has no counterpart in German. The other vowels might be identified with the respective German counterparts and therefore the German vowels might be affected by the speakers’ L1 system. Only [i:], [e:], [a:] and [u:] are reported as long vowels. However, according to the remarks given by Ortmann, there are no long vowels in Turkish – except for loanwords. The precise phonetic nature and phonological function of vowel quantity in Turkish should be considered in an experiment with Turkish speakers. According to Ortmann, the “difficult” German vowels for native Turkish speakers are [e:] and [E:]. These two are nevertheless realised native-like by speakers A03 and B08. Although the statistical comparison resulted in non-native-like formant values for [E:] of speaker A03, a closer look at the data reveals that the speaker realises [E:] very close to [e:] (see e. g. table A.8 on page 87 or figure A.1(e) on page 94).
According to the considerations in section 6.1, this can be seen as native-like pronunciation (since the respective values for [e:] are native-like in all three formants). The opposition in vowel quantity is realised consistently by both speakers A03 and B08. With respect to vowel quality, speaker B08 realises a distinction in all vowel pairs but [a:]∼[a] and [E:]∼[E] – both cases of lacking distinctions in vowel quality can be considered native-like. Speaker A03, on the other hand, lacks a clear distinction only for the pairs [u:]∼[U] and [y:]∼[Y]. The former case of the u-vowels might be attributed to measurement errors. As mentioned earlier, there were generally problems with the measurements of these vowels. A look at the vowel diagram indicates that the speaker might nevertheless realise a distinction between [u:] and [U] in the domain of vowel formants F1 and F2. The very close mean values for [y:] and [Y] can probably be explained by an interference effect of that speaker’s L1. The speaker relies on vowel duration to distinguish the two vowels, but might identify them in quality with the Turkish equivalent.

6.6.5 Age effects

The effects of age of learning (AOL), length of residence in a German-speaking environment (LOR) or the respective age of first exposure to such an environment (AOA – age of arrival) can only be hypothesised about. The speakers were grouped according to their respective AOL: groups B and C have a maximum age of learning of 3, and group A has a minimum AOL of 7. Looking at the mean values of the three speaker groups can therefore reveal some tendencies possibly related to this variable. As an example, both the absolute mean values of vowel duration in milliseconds and the long/short vowel ratio are similar for groups B and C and stand in opposition to the values of group A. A more detailed effect of AOL cannot be seen in the individual results of the measurements of the realised oppositions in vowel quantity.
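The AOL-based grouping described above can be checked against the speaker data in table A.1. The following Python sketch is an illustration only and not part of the original analysis: the threshold constant `EARLY_AOL_MAX` and the helpers `is_early_learner` and `group_of` are hypothetical names, while the AOL values are copied from table A.1.

```python
from statistics import mean

# Age of learning (AOL, in years) per speaker, copied from table A.1.
AOL = {
    "A01": 15, "A02": 12, "A03": 22, "A04": 19, "A05": 16,
    "A06": 18, "A07": 22, "A08": 16, "A09": 7, "A10": 21,
    "B01": 0, "B02": 0, "B03": 3, "B04": 3,
    "B05": 0, "B06": 0, "B07": 0, "B08": 3,
    "C01": 0, "C02": 0,
}

# Hypothetical threshold: groups B and C have a maximum AOL of 3.
EARLY_AOL_MAX = 3


def is_early_learner(aol):
    """Return True if the age of learning lies within the early range."""
    return aol <= EARLY_AOL_MAX


def group_of(speaker_id):
    """The group letter is the first character of the speaker ID."""
    return speaker_id[0]


# Collect the AOL values of each speaker group.
aol_by_group = {}
for speaker, aol in AOL.items():
    aol_by_group.setdefault(group_of(speaker), []).append(aol)

for group, values in sorted(aol_by_group.items()):
    print(group, "min:", min(values), "max:", max(values))

overall_mean = round(mean(AOL.values()), 2)
print("overall mean AOL:", overall_mean)
```

With these values, every speaker in groups B and C falls at or below the threshold and every speaker in group A lies above it, and the overall mean agrees with the summary row of table A.1. Note that the distinction between groups B and C is not based on AOL, so a threshold of this kind can only separate group A from the other two.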
There are in total more speakers in group A who do not consistently realise a significant distinction in vowel quantity. For example, A03 and A09, the two speakers with the most extreme AOL values (22 versus 7), have almost identical results for the statistical examination of vowel duration. In comparison, speakers A05 and A07 have the least native-like patterns of vowel duration – the former with an AOL of 16 and the latter with an AOL of 22. The mean formant values for group B are similar to those of group C, but the values for group A show a greater deviation. The realisation of the vowel opposition by quality shows a tendency for better results with lower AOL – but there are again speakers who contradict this observation. Speaker A03 (with an AOL of 22) realises a consistent distinction in vowel quality between long- and short-class vowels. Speakers A05 and A08 (both with an AOL of 16) produce many more vowel pairs with equal formant values. Such observations seem to confirm the view that there is such a thing as a critical period for language (or just pronunciation) learning, and that therefore there should be no correlation between AOL and foreign accent after the end of this period. However, the observation of “better” results up to an AOL of three and no obvious further relation between AOL and the results above that age is neither based on a sufficiently large group and a large amount of data, nor is it the result of native-speaker judgments about the degree of foreign accent. Therefore it is on the one hand an interesting observation that a speaker with one of the highest AOL values can have such “good” results in comparison to speakers with an AOL of 16, but on the other hand these results are not generalisable and need further examination.

6.7 Discussion

This chapter provided a presentation of an experimental study on the realisation of the German vowel opposition by non-native speakers.
First, the German vowel system and especially the German vowel opposition between the long- and short-class vowels was discussed. Various differing descriptions of the vowel system can be found in the literature. There is substantial disagreement among linguists about the status or existence of a long [E:] in the German language. Another relevant aspect of German phonology and phonetics is the question about the acoustic correlates of the vowel opposition. Diverging terminologies denoting the two vowel classes as either long and short, tense and lax, or centralised and decentralised and so on all relate to different theories about the (primary) acoustic correlate of the vowel opposition. The German vowel opposition is generally described as having three phonetic/acoustic correlates: duration, vowel quality as represented by the formant structure, and vowel tenseness. The latter has its acoustic correlate in voice quality parameters and was examined in addition to the traditional approach of determining vowel quality primarily by formant measurements. Therefore, the two most prominent acoustic parameters (vowel duration and vowel quality in its narrower sense, referring to the vowels’ first three formant values) were chosen for analysis, and in addition, a further acoustic correlate of the vowel opposition – one which is usually not examined in foreign accent research – was added: voice quality (measured by acoustic spectral tilt parameters). The reasons for examining voice quality parameters were twofold. First, voice quality seems to have received only marginal attention in foreign accent research. Besides some remarks in the literature, no closer examination of voice quality features in non-native speech could be reviewed for this thesis. The second reason was that vowel “tenseness” – one of the most often cited correlates of the German vowel opposition – is seen to have its primary acoustic correlate in voice quality parameters.
If a phonological distinction in a language has its correlate in certain voice quality parameters, it is obviously of interest to foreign accent research to examine the realisation of these parameters by non-native speakers. Besides the review of the linguistic background of the examined phenomenon, the experiment comprised the recruitment of subjects, the recording of speech material, acoustic measurements and a statistical analysis of the results – as far as possible with the limited amount of data. Twenty speakers were recruited for this experiment. Half of them are non-native speakers with ages of learning German above early childhood. The other half comprised eight speakers who were categorised as bilinguals – due to an early acquisition of German and another language – and two speakers who were categorised as non-bilinguals, i. e. native speakers (in the traditional sense). Unfortunately, the results of the voice quality measurements cannot be reliably interpreted due to technical problems with the recordings. The experiment should be repeated to determine if and how the realisation of the vowel opposition by non-native speakers deviates from the native speakers’ norm in the domain of voice quality.

The results for the vowel duration and vowel quality measurements revealed that the speakers use vowel duration and vowel quality in various ways to realise a phonetic contrast between the two German vowel classes. Some speakers, e. g. A05, A07 or A10, seem to make no distinction whatsoever between the two vowel classes (with respect to pronunciation at least, since perception was not examined). Other speakers show relatively good results, for example speakers A02 or A03, who despite their late onset of learning German not only distinguish most vowel pairs in both quality and quantity but also realise the individual vowels in a mainly native-like way.
This examination revealed further that bilingual speakers can be compared to monolingually raised speakers – the speakers traditionally referred to as native speakers. The results for group B did not show any major deviances from the two speakers in group C or the “standard” values taken from the literature. In order to reveal systematic phonetic differences between speakers with one native language or L1 and those with more than one L1, more detailed examinations are needed. In addition, an evaluation of the pronunciation of such speakers by judgments of native speakers is needed. Finally, the results were also discussed with regard to theoretical issues and commonly studied experimental variables in foreign accent research. For this reason, possible effects of the speakers’ L1 or their ages of learning German were briefly discussed, although a formal (statistical) analysis was not performed. A summary of the results is given in table A.39 by means of a symbolic representation of the data for easier readability. Whether these results are generalisable or not is questionable given the small group of examined speakers and the limited amount of data. Then again, the generalisability of every experimental and empirical study is questionable. In any case, a statistical evaluation is not more than a summary of the analysed data. As Southwood and Flege (1999) explicitly pointed out in their paper: “From this study, generalizations about the nature of the accentedness continua of other non-native speakers of English or speakers with stronger foreign accents are inappropriate”. How far these measured deviances affect the degree of perceived foreign accent is an open question which needs to be further examined. It should be further examined how foreign a non-native-like realisation of the vowel opposition is perceived to be and what acoustic cues affect this perception. This, however, is beyond the scope of this thesis.
Chapter 7 Summary and conclusions

This thesis examined foreign accent with a special focus on segmental phenomena. An overview of current and earlier research on this broad topic was presented and an experimental study was performed to examine foreign accent from a phonetic perspective. First, it was pointed out that the term accent is commonly used in various senses – two of which are relevant to the topics discussed in this thesis. In this thesis accent refers to a characteristic way of speaking and not to emphasis placed on a certain word or syllable (in which case the term stress was used). Then, concepts like native and foreign language and bilingualism were discussed. It was shown that these concepts and the associated terms cannot be defined precisely without great difficulty. The fact that most of the human population does not speak just one language collides severely with the traditional concept of the idealised monolingual native speaker. This is of great importance in research on foreign accent, where the speech of a native speaker is usually regarded as the ultimate goal of language learning and non-native speakers are usually judged against this ideal norm. In the second chapter, titled “factors affecting degree of foreign accent”, experimental variables like the age of learning a language, the speakers’ gender, their linguistic background and experiences were discussed. Affective and psychological factors like the speakers’ motivation or language learning aptitude are often discussed in the literature. There is no general agreement among researchers whether such factors affect foreign accent or not. The same disagreement among researchers was found in the literature regarding the speakers’ gender. A possibly surprising finding is that formal language instruction seems to have no effect on the degree of foreign accent. The fact that a speaker’s L1 affects his or her L2 is undisputed.
The precise nature and degree of this influence is, however, still largely unknown and a matter of disagreement among researchers. One important point is that most conclusions about general L1 effects on a speaker’s L2 are based on specific language pairs. Despite the large number of natural languages worldwide, researchers have so far examined only a small number of language pairs. The amount of exposure to the L2 and the age of learning were identified as strong factors affecting the degree of foreign accent. The length of residence within an L2-speaking environment affects a speaker’s foreign accent such that pronunciation in general becomes less accented with time. More important, however, are the age of arrival within such an environment or the age of learning an L2. The everyday observation that young children of immigrants usually learn a language faster and, more importantly, “better” than their parents has been confirmed by researchers. The general rule is that the earlier an individual starts to learn a language, the higher his or her level of proficiency can ultimately be. There is considerable dispute in the literature about the reasons for this effect of age on language acquisition (these were discussed in chapter four). The review of factors affecting degree of foreign accent was completed with a brief discussion of speaker-independent factors affecting the perceived degree of foreign accent. Although factors completely independent of the respective speakers are not in the focus of researchers examining the speech of non-native speakers, the existence of such factors has important consequences an experimenter has to be aware of when designing and interpreting experiments on foreign accent. This also points to the fact that foreign accent is a phenomenon of both speech production and speech perception. Chapter three focused on segmental characteristics and acoustic manifestations of foreign accent.
The two most often examined acoustic parameters are the VOT of stop consonants and the formants of vowels. Most studies examining segmental manifestations of foreign accent concentrated on one or both of these features. Other segmental features are treated only marginally. Often, segmental deviances in foreign-accented speech are not acoustically measured, but determined auditorily by phoneticians or linguistically naïve listeners. The literature linking certain acoustic deviances and the perceived degree of foreign accent is rather scarce. In chapter four, theories and models which are used to explain the foreign accent phenomenon were reviewed. The concept of an innate universal grammar and the critical period hypothesis were briefly discussed. Neither theory is concerned primarily with foreign accent, but both have implications for foreign accent research and are often referred to in the literature. The critical period hypothesis of language acquisition is of particular interest as it identifies age as the most important factor in language acquisition. A great part of foreign accent research is concerned with age as a factor affecting degree of foreign accent. With respect to the various versions of the critical period hypothesis, researchers often compare speakers who learned the examined L2 prior to the supposed end of this critical period with speakers who learned it afterwards. The concept of interference and linguistic transfer was discussed, as it addresses the relation between a speaker’s L1 and L2. For a long time in linguistics it was assumed that transfer could explain all effects observable in foreign-accented speech (or the language of non-natives in general). Today this view is generally regarded as too simplistic, but the effects of linguistic transfer are still seen as an adequate model explaining certain foreign accent phenomena.
Other approaches often referred to in the literature (direct realism, the speech learning model and the native language magnet model) were reviewed as well. These models all incorporate the concept of perceptual similarity of speech sounds. The basic idea central to all these models is that a speaker becomes attuned to his or her L1 and with time concentrates only on the most salient acoustic cues characteristic of the respective L1. The perception of non-native speech sounds is affected severely by this.

Chapter five discussed methodological issues. The problems of selecting an appropriate group of speakers or listeners for an experimental study were addressed, as well as possible problems caused by various elicitation techniques.

Chapter six described an experimental study examining the realisation of the phonological opposition between German long- and short-class vowels by non-native speakers in comparison to native speakers. One important finding was that speakers who learned German in early childhood (classified as bilinguals) could not be distinguished substantially from speakers who learned only German until school age. Another interesting finding was that speakers with a relatively high age of learning are able to acquire the vowel opposition and produce it consistently.

Foreign accent has received considerable attention in recent years. This thesis has summarised the findings from various studies and pointed out that there are still several open questions which need further examination.
Appendix A
Tables and figures

ID   Group  List  Age  Sex  AOL  AOA  LOR  L1
A01  A      B     29   m    15   22   89   pl
A02  A      A     29   m    12   18   1    pl
A03  A      A     32   m    22   22   120  tr
A04  A      C     24   f    19   18   72   hr
A05  A      B     24   m    16   19   54   pl
A06  A      A     26   f    18   18   96   bg
A07  A      C     28   f    22   24   48   ka
A08  A      D     26   f    16   19   84   bg
A09  A      A     52   f    7    0    624  hu
A10  A      A     28   f    21   21   84   ru+uk
B01  B      B     24   f    0    18   72   de+hr
B02  B      B     29   f    0    0    348  de+hu
B03  B      C     30   m    3    0    360  hr
B04  B      B     27   m    3    0    324  hr
B05  B      C     22   f    0    10   144  de+ro
B06  B      D     28   m    0    0    336  de+hr
B07  B      C     23   m    0    0    276  de+it
B08  B      B     28   f    3    0    336  tr
C01  C      D     27   f    0    0    315  de
C02  C      A     29   m    0    0    348  de

Group: A: 10, B: 8, C: 2
List:  A: 6, B: 6, C: 5, D: 3
Age:   min: 22.00, max: 52.00, mean: 28.25
Sex:   f: 11, m: 9
AOL:   min: 0.00, max: 22.00, mean: 8.85
AOA:   min: 0.00, max: 24.00, mean: 10.45
LOR:   min: 1.00, max: 624.00, mean: 206.60
L1:    hr: 3, pl: 3, bg: 2, de: 2, de+hr: 2, tr: 2, other: 6

Table A.1: Demographic speaker characteristics

-front vowels
Vowel    [K]     [S]m    [S]f    [W]     [R]B1   [R]M1   [R]H1   [R]P1   [C]a    [C]u
a:  F1   680     737     896     750^a   850^a   762^a   622^a   749^a   831     676
a:  F2   1150    1275    1517    1150^a  1221^a  1172^a  1139^a  1176^a  1302    1577
a   F1   680     694     836     800     768     706     593     618     795     676
a   F2   1280    1372    1586    1400    1192    1237    1048    1224    1425    1572
@   F1           517     572     500     391     358     316     343     -       -
@   F2           1447    1763    1200    1409    1550    1530    1400    -       -
5   F1                                   540     482     534     514     -       -
5   F2                                   1009    1022    1081    1126    -       -
o:  F1   400     383     440     375     354     358     381     365     353     372
o:  F2   680     841     889     850     589     599     677     592     707     1150
O   F1   550     537     605     500     502     531     521     501     536     440
O   F2   980     1074    1200    900     921     931     928     931     1100    1246
u:  F1   250     310     345     275     293     228     310     290     271     336
u:  F2   650     854     956     750     606     567     612     573     653     1009
U   F1   400     391     442     325     383     361     358     375     377     348
U   F2   850     1010    1081    850     885     983     1042    863     989     1228

+front vowels
Vowel    [K]     [S]m    [S]f    [W]     [R]B1   [R]M1   [R]H1   [R]P1   [C]a    [C]u
i:  F1   250     263     302     275     290     228     260     248     280     333
i:  F2   2450    2171    2533    2400    1930    1986    2148    2181    2382    2199
I   F1   375     369     433     325     358     319     352     270     388     345
I   F2   2200    1902    2095    2200    1940    1888    1868    1914    2028    2065
y:  F1   260     302     320     275     306     300     286     261     -       -
y:  F2   1550    1722    1810    2000    1689    1452    1615    1491    -       -
Y   F1   450     373     426     325     339     358     387     306     -       -
Y   F2   1400    1543    1670    1800    1510    1530    1393    1478    -       -
e:  F1   400     348     434     375     365     306     330     308     359     370
e:  F2   2250    2126    2461    2100    2103    2044    2090    2109    2293    2004
E:  F1           482     584     500^e   495     482     514     515     -       -
E:  F2           1902    2166    1900^e  1992    1835    1892    1868    -       -
E   F1   490     489     608     500     501     482     417     495     569     403
E   F2   1990    1817    2040    1900    1702    1801    1697    1764    1892    1886
ø:  F1   420     371     440     375^o   345     349     339     319     -       -
ø:  F2   1500    1501    1605    1700^o  1387    1426    1400    1361    -       -
œ   F1   550     474     564     500     514     436     490     462     -       -
œ   F2   1500    1477    1654    1550    1400    1400    1452    1374    -       -

Table A.4: Formant values of German monophthongs. Sources and notes: [K] = Kohler (1977, p. 54), approximations taken from figure 7; [S] = Sendlmeier and Seebode (2007), m: male speakers, f: female speakers; [W] = Wängler (1968, p. 23); [R] = Ramers (1988), male speakers B, M, H and P, context 1: [b t@n]; [C]a = Claßen et al. (1998, p. 219), accented. . . ; [C]u = Claßen et al. (1998, p. 219), unaccented. . . ; a: [R] and [W] use the symbol A:; e: [W] does not give a numerical value for E: but notes that it is distinguished from E only by length (“Hinsichtlich der Bildung wird auf [E] verwiesen, von dem dieser Laut nur durch Dauer verschieden ist.”, p. 46); o: [W] does not use the length mark for ø.

ID     [ø:]∼[œ]     [a:]∼[a]     [e:]∼[E]     [E:]∼[E]     [i:]∼[I]     [o:]∼[O]     [u:]∼[U]     [y:]∼[Y]     mean (sd)
A01    1.07         1.55         1.91         1.65         2.04         1.35         1.37         1.22         1.52 (0.33)
A02    1.31         1.96         2.52         1.69         2.42         1.55         1.66         1.67         1.85 (0.42)
A03    1.62         1.97         1.63         1.70         2.42         1.99         1.97         2.48         1.97 (0.33)
A04    1.38         1.46         1.31         1.35         1.51         1.28         1.26         1.21         1.34 (0.10)
A05    0.88         1.30         1.20         1.50         1.29         1.18         1.58         1.37         1.29 (0.22)
A06    1.22         1.46         1.79         1.58         2.14         1.55         1.83         1.34         1.61 (0.30)
A07    1.25         1.47         1.28         1.47         1.40         1.21         1.14         0.96         1.27 (0.18)
A08    1.23         1.66         1.64         1.57         2.66         1.24         1.29         1.36         1.58 (0.47)
A09    1.78         1.85         2.21         2.22         2.80         1.79         2.11         1.99         2.10 (0.34)
A10    0.91         1.36         1.67         1.31         1.51         1.05         1.21         1.25         1.28 (0.24)
gr. A  1.27 (0.28)  1.60 (0.25)  1.72 (0.42)  1.60 (0.25)  2.02 (0.56)  1.42 (0.30)  1.54 (0.34)  1.48 (0.45)  1.58 (0.41)
B01    1.42         1.93         1.72         1.64         2.04         1.96         1.79         1.69         1.77 (0.20)
B02    1.54         2.03         1.42         1.70         1.97         1.67         1.93         1.76         1.75 (0.21)
B03    1.67         2.15         1.50         1.68         1.98         1.70         2.06         1.82         1.82 (0.22)
B04    1.68         2.11         2.00         1.85         2.42         1.79         2.33         2.05         2.03 (0.26)
B05    1.57         2.09         1.39         1.80         1.86         1.46         1.65         2.09         1.74 (0.27)
B06    1.45         1.54         1.64         1.59         2.47         1.52         1.89         2.03         1.77 (0.35)
B07    1.45         2.09         1.42         1.72         1.27         1.78         1.49         1.36         1.57 (0.27)
B08    1.33         1.70         1.74         1.73         1.97         1.70         2.28         1.90         1.79 (0.27)
gr. B  1.51 (0.12)  1.96 (0.22)  1.60 (0.21)  1.72 (0.09)  2.00 (0.37)  1.70 (0.16)  1.93 (0.29)  1.84 (0.24)  1.78 (0.27)
C01    1.53         1.69         1.33         1.66         1.65         1.48         1.78         1.59         1.59 (0.14)
C02    1.59         2.02         1.55         1.65         2.01         1.83         2.12         1.85         1.83 (0.22)
gr. C  1.56 (0.04)  1.86 (0.23)  1.44 (0.16)  1.65 (0.01)  1.83 (0.26)  1.66 (0.25)  1.95 (0.24)  1.72 (0.18)  1.71 (0.21)
all    1.39 (0.25)  1.77 (0.28)  1.64 (0.33)  1.65 (0.19)  1.99 (0.45)  1.55 (0.27)  1.74 (0.36)  1.65 (0.38)  1.67 (0.36)

Table A.5: Vowel duration ratios of long-class vowels divided by short-class vowels. The values in parentheses show the standard deviations of the respective mean values.
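The group entries in table A.5 can be recomputed from the speaker rows. The following is a minimal sketch in Python, not the procedure used in the study: the function names and the durations passed to `duration_ratio` are hypothetical, and the use of the sample standard deviation (n − 1) is an assumption, although it reproduces the group C entry for [ø:]∼[œ] in the table. How the per-speaker ratios were derived from the individual tokens is not restated in this appendix.

```python
from statistics import mean, stdev


def duration_ratio(long_ms, short_ms):
    """Ratio of a long-class vowel duration to its short-class counterpart."""
    return long_ms / short_ms


def summarise(values):
    """Return (mean, sample standard deviation), each rounded to two decimals."""
    values = list(values)
    return round(mean(values), 2), round(stdev(values), 2)


# Hypothetical durations in milliseconds, for illustration only.
example_ratio = duration_ratio(180.0, 100.0)

# [ø:]~[œ] ratios of the two group C speakers, copied from table A.5.
ratios_group_c = {"C01": 1.53, "C02": 1.59}
group_mean, group_sd = summarise(ratios_group_c.values())

print(f"example ratio: {example_ratio:.2f}")
print(f"gr. C: {group_mean:.2f} ({group_sd:.2f})")  # table A.5 lists 1.56 (0.04)
```

A ratio close to 1 indicates that a speaker does not separate the long-class and the short-class vowel by duration, whereas larger values indicate a clearer durational contrast.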
A01      ø:    œ     a     a:    E     e:    E:    I     i:    O     o:    U     u:    Y     y:
F1 min   340   455   823   798   552   350   458   316   228   483   442   376   296   287   236
F1 max   459   529   883   861   643   436   558   406   272   636   574   528   461   437   309
F1 mean  406   486   853   833   601   396   512   358   253   576   507   440   380   368   275
F1 sd    52    30    30    25    35    35    43    34    16    56    53    59    73    70    32
F2 min   1462  1425  1238  1261  1676  2120  1916  1836  2200  773   668   706   797   1685  1526
F2 max   1674  1906  1371  1340  1975  2234  2131  2223  2265  1011  909   2466  2627  1863  1883
F2 mean  1554  1599  1305  1304  1871  2169  2046  2001  2227  888   796   2079  1762  1799  1710
F2 sd    83    174   58    35    116   50    72    132   25    84    86    769   755   83    156
F3 min   2152  2424  2594  2592  2464  2535  2325  2431  3000  2784  2793  2867  2340  1940  2013
F3 max   2420  2514  2768  2816  2826  2756  2712  3072  3477  3148  3038  3163  3207  2607  2357
F3 mean  2268  2477  2698  2712  2612  2644  2535  2623  3293  2979  2957  3085  2992  2270  2166
F3 sd    115   36    79    91    153   87    128   244   174   156   93    123   327   302   142

Table A.6: Formant values speaker A01 (in Hertz)

A02      ø:    œ     a     a:    E     e:    E:    I     i:    O     o:    U     u:    Y     y:
F1 min   386   406   641   704   514   341   450   333   272   492   391   243   338   306   251
F1 max   445   463   848   768   647   413   529   392   295   612   514   486   428   337   309
F1 mean  414   431   734   729   568   381   470   362   283   556   447   384   377   325   292
F1 sd    20    23    68    26    48    28    29    25    9     49    47    81    37    13    22
F2 min   1383  1284  1092  1215  1531  2090  1668  1736  2299  684   557   553   462   1547  1672
F2 max   1573  1738  1541  1298  1758  2359  1968  2467  2469  982   866   1521  2386  1818  1956
F2 mean  1481  1458  1283  1255  1699  2244  1879  1954  2359  885   715   1021  945   1678  1799
F2 sd    76    186   164   35    84    93    106   267   61    113   130   359   724   97    97
F3 min   2277  2344  2286  2274  2287  2673  2448  2468  2872  2631  2967  2495  2491  2307  2116
F3 max   2641  2545  3035  2872  3293  3009  2935  3534  3810  3238  3180  3403  3498  2549  2876
F3 mean  2416  2433  2712  2634  2787  2856  2651  2791  3403  2980  3067  2932  2867  2393  2375
F3 sd    121   86    300   220   334   135   191   374   336   202   85    342   418   99    274

Table A.7: Formant values speaker A02 (in Hertz)

A03      ø:    œ     a     a:    E     e:    E:    I     i:    O     o:    U     u:    Y     y:
F1 min   351   419   609   653   442   332   354   299   254   468   229   354   295   299   268
F1 max   403   481   648   703   515   360   399   371   316   586   475   431   441   339   306
F1 mean  375   456   630   673   482   354   384   347   288   527   374   381   336   319   287
F1 sd    20    24    15    18    24    11    16    25    22    38    82    32    53    17    16
F2 min   1465  1448  1257  1328  1731  2054  1926  1686  2109  840   515   780   734   1611  1568
F2 max   1595  1617  1535  1515  1847  2124  2084  2079  2255  1130  881   1966  892   1662  1719
F2 mean  1516  1541  1431  1431  1771  2085  2007  1860  2165  970   745   1192  811   1637  1642
F2 sd    47    77    93    68    42    27    69    130   53    93    129   407   59    20    63
F3 min   2114  2315  2382  2356  2385  2498  2558  2389  3127  2276  2393  2379  2371  2236  2147
F3 max   2415  2406  2603  2697  2556  2944  2697  2772  3460  2795  2587  2652  2628  2522  2294
F3 mean  2290  2360  2481  2512  2494  2756  2618  2561  3277  2506  2493  2526  2423  2368  2251
F3 sd    101   33    71    125   66    182   58    139   137   180   71    114   101   117   56

Table A.8: Formant values speaker A03 (in Hertz)

A04      ø:    œ     a     a:    E     e:    E:    I     i:    O     o:    U     u:    Y     y:
F1 min   465   462   732   786   567   483   445   445   298   558   450   463   437   434   416
F1 max   509   536   809   811   679   613   621   487   430   675   537   502   470   503   448
F1 mean  490   512   765   800   637   526   540   470   369   632   500   479   456   472   433
F1 sd    20    29    31    12    41    51    68    15    46    41    31    15    15    29    13
F2 min   1514  1550  1522  1471  2042  2234  1787  2241  2047  942   761   742   709   1619  1616
F2 max   1834  1920  1906  1804  2232  2476  2485  2475  2744  1268  1098  1282  905   1928  2067
F2 mean  1668  1716  1696  1641  2131  2323  2223  2331  2564  1121  927   993   840   1760  1816
F2 sd    110   133   148   136   69    86    233   99    260   125   134   183   76    127   176
F3 min   2380  2288  1880  1718  2845  2989  2974  2849  2655  2225  2550  2532  2533  2605  2236
F3 max   2770  2747  2925  2831  3067  3157  3135  3286  3411  2848  2704  2830  2975  2743  2750
F3 mean  2581  2578  2469  2472  2968  3094  3069  3046  3217  2697  2632  2690  2707  2701  2497
F3 sd    125   163   393   509   102   68    63    178   284   241   67    102   167   65    205

Table A.9: Formant values speaker A04 (in Hertz)

A05      ø:    œ     a     a:    E     e:    E:    I     i:    O     o:    U     u:    Y     y:
F1 min   358   331   598   645   440   401   414   299   306   442   442   355   369   321   287
F1 max   389   399   729   686   566   438   466   414   363   543   550   514   481   357   365
F1 mean  369   365   647   665   491   421   436   354   339   521   496   411   401   338   326
F1 sd    13    27    48    18    55    16    22    42    23    39    45    65    41    17    28
F2 min   1345  1232  1123  1222  1730  1914  1842  1525  1833  755   585   625   446   1247  1435
F2 max   1793  1922  1612  1532  1941  2167  2023  2177  2105  1123  1005  1832  1755  1772  2033
F2 mean  1566  1518  1351  1357  1828  2013  1935  1939  2002  935   823   1008  997   1552  1684
F2 sd    153   273   180   116   91    97    61    224   95    150   146   454   539   207   232
F3 min   2092  2120  2399  2327  2335  2409  2409  2217  2331  2296  2414  2294  2083  2061  2088
F3 max   2360  2768  2636  2600  2542  2668  2623  2722  2743  2824  2841  2741  3029  2313  2191
F3 mean  2232  2292  2465  2466  2456  2534  2513  2436  2496  2592  2644  2527  2621  2167  2155
F3 sd    95    242   87    117   83    103   86    194   185   193   171   154   325   98    39

Table A.10: Formant values speaker A05 (in Hertz)

A06      ø:    œ     a     a:    E     e:    E:    I     i:    O     o:    U     u:    Y     y:
F1 min   443   439   809   799   502   420   434   368   315   588   407   347   326   361   300
F1 max   479   511   933   889   649   451   645   482   365   684   464   454   413   434   437
F1 mean  460   472   853   859   607   441   546   424   341   629   441   419   369   387   367
F1 sd    16    30    48    35    56    13    93    43    20    33    21    41    41    29    46
F2 min   1892  1941  1254  1174  1069  2536  1482  2216  2689  907   659   663   629   2045  1977
F2 max   2184  2070  1773  1529  2292  2754  2766  2584  2785  1148  884   1713  1038  2253  2292
F2 mean  2040  2008  1483  1356  2008  2668  2188  2479  2738  994   739   1043  892   2132  2148
F2 sd    109   45    194   164   467   87    580   143   35    105   86    446   155   81    126
F3 min   2664  2608  2197  1517  2658  3075  2115  2766  3084  2554  2618  2357  2368  2592  2583
F3 max   2846  2757  2832  2664  3096  3197  3209  3272  3379  2828  2880  2858  2840  2777  2789
F3 mean  2757  2711  2561  2140  2945  3136  2840  3022  3280  2691  2723  2683  2523  2691  2697
F3 sd    72    57    217   493   153   47    432   168   103   96    96    157   198   83    74

Table A.11: Formant values speaker A06 (in Hertz)

A07      ø:    œ     a     a:    E     e:    E:    I     i:    O     o:    U     u:    Y     y:
F1 min   391   422   744   951   485   426   417   312   304   569   413   393   375   320   305
F1 max   461   479   958   1065  638   586   615   420   425   651   605   454   442   436   431
F1 mean  431   450   860   1009  557   491   539   382   382   606   508   418   406   388   384
F1 sd    34    24    80    48    63    59    79    42    47    27    76    20    26    53    49
F2 min   1776  1544  1019  1620  1943  2097  1965  762   2112  1021  745   620   747   1754  1872
F2 max   1970  1997  1909  1705  2197  2243  2252  2746  2717  1494  1212  1301  1527  2148  2184
F2 mean  1858  1767  1514  1653  2115  2176  2108  2270  2462  1295  942   1123  1127  1914  2032
F2 sd    81    194   297   33    110   64    120   744   235   187   165   258   265   159   108
F3 min   2538  2052  1809  2095  2938  3057  3081  2655  3047  2166  2900  2696  2497  2513  2204
F3 max   2817  2751  3015  2952  3329  3227  3187  3509  3510  3173  3327  3070  3150  2928  2760
F3 mean  2712  2466  2359  2603  3124  3171  3125  3079  3344  2881  3075  2818  2850  2662  2610
F3 sd    119   242   509   324   132   62    39    365   171   374   152   137   241   168   210

Table A.12: Formant values speaker A07 (in Hertz)

A08      ø:    œ     a     a:    E     e:    E:    I     i:    O     o:    U     u:    Y     y:
F1 min   365   378   641   678   429   372   381   352   355   488   436   370   351   382   352
F1 max   422   507   752   761   532   391   464   419   374   571   572   408   404   405   395
F1 mean  395   431   705   707   498   382   411   373   367   537   486   392   385   394   381
F1 sd    22    54    39    33    41    7     32    24    7     36    47    16    18    9     17
F2 min   1592  1352  1244  1258  1803  1991  1892  1939  2190  936   647   794   736   1636  1429
F2 max   1703  1712  1731  1474  1924  2203  2107  2239  2513  1209  995   1480  1151  1769  1814
F2 mean  1640  1602  1517  1357  1881  2080  2004  2099  2335  1078  809   1096  921   1705  1712
F2 sd    49    129   168   104   47    95    84    116   137   103   129   220   155   58    141
F3 min   2224  2166  2343  2170  2354  2461  2448  2552  3111  2171  2431  2318  2290  2211  2201
F3 max   2412  2498  2844  2776  2869  2887  2818  3021  3427  2648  2707  2640  2760  2504  2398
F3 mean  2292  2288  2540  2413  2611  2743  2682  2793  3249  2379  2558  2490  2482  2314  2276
F3 sd    68    124   178   270   198   155   145   172   117   167   96    139   170   112   85

Table A.13: Formant values speaker A08 (in Hertz)

A09      ø:    œ     a     a:    E     e:    E:    I     i:    O     o:    U     u:    Y     y:
F1 min   382   446   630   651   418   366   378   364   353   475   386   369   356   372   299
F1 max   427   536   825   770   634   452   429   420   393   550   414   449   412   424   379
F1 mean  407   489   726   717   541   397   392   396   373   506   395   414   386   402   344
F1 sd    15    36    78    53    76    29    19    19    15    30    10    29    20    22    38
F2 min   1514  1451  1256  1134  1754  2450  2337  1952  2481  904   746   596   513   1600  1714
F2 max   1753  1837  1561  1305  2124  2649  2539  2547  2686  1216  861   1263  912   1863  1965
F2 mean  1668  1618  1427  1212  1903  2528  2442  2306  2609  1042  804   947   757   1768  1849
F2 sd    95    128   118   76    138   76    80    202   87    119   51    225   172   100   112
F3 min   2516  2578  2552  2416  2690  2596  2781  2632  3498  2530  2744  2673  2494  2550  2492
F3 max   2792  2799  2780  2828  2971  3268  3111  2856  3751  3462  3037  3009  3135  2794  2707
F3 mean  2613  2679  2651  2662  2820  3059  2913  2730  3629  2900  2877  2820  2810  2667  2648
F3 sd    95    84    76    153   106   252   115   78    103   321   100   108   263   106   81

Table A.14: Formant values speaker A09 (in Hertz)

A10      ø:    œ     a     a:    E     e:    E:    I     i:    O     o:    U     u:    Y     y:
F1 min   505   528   703   696   508   482   535   357   334   547   497   342   376   359   319
F1 max   592   633   1047  906   653   565   756   471   473   656   593   485   488   407   509
F1 mean  550   556   845   803   602   521   606   423   401   601   537   431   419   377   398
F1 sd    33    39    124   86    53    28    82    41    61    45    40    55    43    21    64
F2 min   1500  1565  1031  1434  1886  1573  1343  2277  2547  927   792   740   646   1453  1362
F2 max   1837  2097  1765  1796  2226  2621  2306  2773  2809  1342  1534  1382  1440  1801  1971
F2 mean  1732  1778  1538  1557  2043  2048  1982  2525  2697  1213  1106  1066  1052  1701  1689
F2 sd    125   179   269   150   116   407   355   188   92    151   246   212   304   144   223
F3 min   2450  2474  1810  1898  2255  2282  2142  3002  2965  2298  2520  2482  2899  2697  2273
F3 max   2784  2755  2912  2546  3057  3378  3325  3341  3615  3230  2863  3383  3206  2892  3027
F3 mean  2645  2617  2351  2226  2675  2722  2835  3191  3276  2581  2686  2895  2998  2783  2766
F3 sd    125   114   379   274   271   418   403   122   275   373   143   313   111   83    262

Table A.15: Formant values speaker A10 (in Hertz)

B01      ø:    œ     a     a:    E     e:    E:    I     i:    O     o:    U     u:    Y     y:
F1 min   399   436   741   1059  548   370   369   393   226   484   351   307   158   337   242
F1 max   478   714   959   1187  618   433   467   461   382   627   456   490   432   450   418
F1 mean  435   538   839   1109  574   413   417   435   294   560   413   409   353   399   303
F1 sd    27    108   75    55    30    24    33    24    58    49    42    60    108   43    70
F2 min   1544  1766  1495  1553  2056  2070  1102  2069  2096  1056  418   927   367   1872  1749
F2 max   1986  2264  2301  1712  2193  2814  2619  2617  3140  1448  1415  1358  1753  2095  2585
F2 mean  1859  1963  1825  1659  2100  2592  2201  2297  2784  1282  813   1214  980   1980  2100
F2 sd    165   180   268   64    52    269   553   193   370   167   352   148   492   84    352
F3 min   2460  2892  2493  2401  2781  3253  2382  2929  3478  2625  2979  2221  2233  2802  2708
F3 max   3204  3086  3092  2647  3177  3569  3310  3428  3943  2976  3444  3143  3086  3481  3839
F3 mean  2841  2976  2720  2544  3004  3412  3034  3188  3635  2809  3231  2722  2812  3061  3061
F3 sd    269   74    215   93    130   121   341   216   175   165   170   344   342   308   481

Table A.16: Formant values speaker B01 (in Hertz)

B02      ø:    œ     a     a:    E     e:    E:    I     i:    O     o:    U     u:    Y     y:
F1 min   414   518   652   706   608   436   629   451   313   565   416   405   294   438   311
F1 max   460   657   764   827   675   455   672   474   404   701   450   452   471   474   356
F1 mean  443   587   719   774   646   442   656   461   348   638   431   423   352   454   328
F1 sd    18    53    41    48    30    7     17    9     38    46    13    20    64    17    17
F2 min   1555  1636  1562  1038  1534  2322  1270  1909  2445  1084  647   709   473   1728  1821
F2 max   1730  1887  1825  1569  2105  2600  2222  2404  2589  1549  1148  1378  1883  1877  2155
F2 mean  1663  1742  1686  1388  1959  2454  2003  2091  2535  1256  770   1123  965   1818  1970
F2 sd    78    86    104   204   220   103   363   170   63    186   190   286   572   71    149
F3 min   2208  2718  2661  1510  1967  3022  2023  2728  3405  2552  2223  2309  2281  2145  2117
F3 max   2320  3072  3175  2915  3112  3324  3212  3249  3731  3299  2947  2862  2805  2674  2653
F3 mean  2272  2856  2911  2355  2865  3192  2977  3061  3570  2985  2609  2592  2522  2454  2431
F3 sd    52    132   216   627   445   104   469   189   121   250   281   229   231   227   174

Table A.17: Formant values speaker B02 (in Hertz)

B03      ø:    œ     a     a:    E     e:    E:    I     i:    O     o:    U     u:    Y     y:
F1 min   352   409   532   604   457   346   366   350   257   483   353   353   281   325   266
F1 max   378   483   646   707   507   379   402   389   310   553   390   403   456   375   299
F1 mean  368   453   585   654   478   361   382   369   281   510   366   381   345   344   280
F1 sd    10    25    44    45    18    13    12    16    22    24    13    19    59    23    11
F2 min   1483  1353  1312  1273  1606  1985  1854  1659  2137  848   651   752   615   1569  1646
F2 max   1629  1546  1588  1378  1760  2151  2083  2051  2265  1196  810   1347  845   1711  1952
F2 mean  1574  1494  1415  1318  1697  2077  1968  1806  2198  1066  698   1136  717   1619  1764
F2 sd    57    72    109   39    66    54    89    152   56    120   59    219   86    58    121
F3 min   2060  2198  1843  2023  2241  2463  2414  2297  2811  2128  2364  2252  2082  2160  2069
F3 max   2244  2313  2345  2402  2483  2652  2568  2717  3241  2318  2584  2967  2668  2354  2317
F3 mean  2176  2238  2155  2293  2378  2589  2489  2521  3016  2255  2489  2488  2469  2247  2193
F3 sd    69    43    180   154   82    66    62    152   167   67    91    286   227   75    86

Table A.18: Formant values speaker B03 (in Hertz)

B04      ø:    œ     a     a:    E     e:    E:    I     i:    O     o:    U     u:    Y     y:
F1 min   303   358   505   540   362   287   314   286   189   432   300   235   128   255   201
F1 max   343   439   600   598   503   339   362   322   320   490   351   325   369   301   263
F1 mean  322   405   542   569   447   318   332   303   224   457   317   301   283   283   232
F1 sd    16    31    33    26    54    19    18    16    49    21    18    31    96    17    22
F2 min   1380  1073  766   1025  1741  2016  2045  1722  2119  647   448   503   533   1549  1657
F2 max   1883  2001  1546  1269  2023  2328  2142  2213  2371  817   691   2181  2383  2039  1951
F2 mean  1655  1518  1301  1141  1812  2157  2089  1895  2258  751   559   1144  1028  1751  1765
F2 sd    170   313   283   91    108   122   49    172   119   74    94    704   780   184   114
F3 min   2107  2091  2171  2308  2392  2655  2593  2449  2914  2298  2593  2235  2463  2131  1956
F3 max   2692  2455  2406  2581  2716  2929  2791  2998  3643  2612  2973  2857  3958  2394  2930
F3 mean  2335  2276  2315  2473  2523  2758  2713  2610  3257  2471  2703  2476  2818  2314  2383
F3 sd    195   158   83    104   116   91    73    197   279   128   138   201   642   105   402

Table A.19: Formant values speaker B04 (in Hertz)

B05      ø:    œ     a     a:    E     e:    E:    I     i:    O     o:    U     u:    Y     y:
F1 min   327   386   539   650   438   340   358   346   236   476   343   367   241   355   237
F1 max   372   519   734   731   517   378   601   388   357   559   402   396   336   376   348
F1 mean  345   454   640   687   495   359   408   370   292   517   362   381   302   363   286
F1 sd    17    53    64    37    30    12    95    14    45    28    23    12    36    8     36
F2 min   1774  1631  988   926   1882  2215  2116  1848  2209  819   690   784   435   1658  2018
F2 max   2040  2219  1739  1157  2170  2393  2411  2238  2402  1237  940   1877  1735  1961  2257
F2 mean  1891  1943  1405  1035  2000  2326  2271  2029  2285  994   777   1293  809   1845  2157
F2 sd    118   209   295   96    109   64    109   140   67    162   99    384   473   121   85
F3 min   2188  2163  1860  2188  2605  2942  2437  2529  3520  2020  2229  2323  2132  2233  2087
F3 max   2460  2494  2653  2425  2670  3208  3167  3024  3919  2414  2456  2467  2539  2535  2890
F3 mean  2309  2364  2303  2282  2643  3119  2922  2703  3781  2198  2359  2383  2347  2435  2464
F3 sd    113   112   321   112   24    100   277   180   151   152   92    56    143   134   300

Table A.20: Formant values speaker B05 (in Hertz)
B06      ø:    œ     a     a:    E     e:    E:    I     i:    O     o:    U     u:    Y     y:
F1 min   361   468   634   450   410   355   389   344   250   541   387   356   279   368   285
F1 max   596   545   682   698   568   415   592   404   311   629   432   449   352   394   313
F1 mean  411   501   656   610   507   377   499   382   274   588   409   411   313   378   293
F1 sd    83    27    22    97    65    22    81    23    20    40    19    37    25    10    11
F2 min   1441  1412  1174  1163  1572  1904  1627  1612  1966  1016  630   846   652   1566  1648
F2 max   1636  1532  1506  1286  1856  2031  1992  1964  2082  1139  851   1285  922   1814  1871
F2 mean  1550  1487  1341  1213  1715  1972  1788  1806  2016  1075  734   990   753   1633  1766
F2 sd    66    50    127   51    106   41    122   117   41    46    75    168   110   103   94
F3 min   1971  2149  2436  2497  2381  2543  2405  2241  2838  2425  2460  2287  2121  2125  1957
F3 max   2495  2637  2691  2789  2732  2822  2822  2679  3227  2706  2699  2839  2574  2491  2225
F3 mean  2161  2301  2528  2617  2549  2726  2578  2470  3039  2573  2605  2530  2398  2249  2084
F3 sd    185   178   101   122   169   103   152   157   139   91    79    230   175   142   109

Table A.21: Formant values speaker B06 (in Hertz)

B07      ø:    œ     a     a:    E     e:    E:    I     i:    O     o:    U     u:    Y     y:
F1 min   343   373   616   637   447   338   443   355   244   547   357   389   288   359   267
F1 max   383   555   680   707   547   356   554   411   327   654   449   468   517   394   287
F1 mean  361   486   651   669   509   345   499   382   280   592   401   430   357   372   274
F1 sd    13    66    21    28    34    7     37    23    36    35    33    31    81    14    8
F2 min   1502  1488  1034  912   1662  2133  1863  1642  2130  896   451   654   526   1586  1599
F2 max   1874  1597  1414  1273  1917  2286  1951  2019  2447  1167  1311  1380  1664  1841  1903
F2 mean  1637  1553  1308  1126  1807  2195  1896  1795  2233  994   709   984   796   1658  1771
F2 sd    136   45    142   131   96    62    34    133   112   94    306   240   440   105   111
F3 min   2066  1986  1887  1953  2058  2573  2420  2246  2763  1958  2163  2177  2244  2143  2108
F3 max   2234  2213  2190  2160  2593  2845  2513  2578  3414  2222  2925  2402  3172  2313  2322
F3 mean  2167  2135  2054  2059  2374  2718  2470  2371  3003  2060  2401  2266  2486  2245  2202
F3 sd    66    97    128   81    182   107   34    123   218   98    268   91    341   67    83

Table A.22: Formant values speaker B07 (in Hertz)

B08      ø:    œ     a     a:    E     e:    E:    I     i:    O     o:    U     u:    Y     y:
F1 min   432   524   696   697   587   424   467   362   335   587   446   443   321   390   337
F1 max   521   630   755   846   682   463   694   512   459   692   489   479   453   474   400
F1 mean  469   592   711   783   627   447   588   471   403   630   469   469   397   442   359
F1 sd    34    40    23    63    35    15    96    55    45    40    16    14    60    35    22
F2 min   1491  1556  1505  1017  949   2317  1632  1962  1749  993   650   591   592   1912  1835
F2 max   1986  1859  1965  1724  1994  2520  2443  2416  2585  1207  945   1802  859   2315  2089
F2 mean  1798  1735  1688  1542  1710  2403  2093  2224  2356  1127  821   1007  743   2044  2003
F2 sd    168   129   169   301   413   75    286   169   308   74    104   453   94    165   93
F3 min   2409  2332  1798  1982  1944  2498  2348  2655  2690  2053  2496  2267  2517  2346  2402
F3 max   2564  2789  2778  2793  2907  3015  3106  3205  3386  2585  2643  2699  2719  2674  2625
F3 mean  2491  2542  2428  2310  2556  2800  2669  2897  3131  2352  2544  2523  2624  2496  2516
F3 sd    64    165   364   321   392   246   323   195   246   230   53    148   85    123   79

Table A.23: Formant values speaker B08 (in Hertz)

C01      ø:    œ     a     a:    E     e:    E:    I     i:    O     o:    U     u:    Y     y:
F1 min   469   516   729   781   609   459   463   359   268   645   427   387   286   374   268
F1 max   502   654   842   901   797   503   777   532   310   704   473   517   310   503   307
F1 mean  477   599   790   828   695   479   674   486   291   667   452   456   296   451   280
F1 sd    12    46    45    45    62    15    113   64    15    22    20    50    10    50    14
F2 min   1032  1500  1219  946   962   2491  900   1668  2631  1091  619   819   724   1360  1552
F2 max   1835  1796  1689  1585  2230  2769  2555  2234  2854  1505  802   1579  1169  1872  1746
F2 mean  1588  1679  1466  1357  1910  2622  1972  2026  2763  1235  711   1079  840   1623  1631
F2 sd    295   116   176   266   472   105   563   198   89    157   71    280   166   192   76
F3 min   2324  2501  1720  1488  2310  3017  2126  2590  3330  1557  2721  2191  2693  1962  1947
F3 max   3481  3229  2622  2764  3030  3442  3139  3319  3490  3116  3135  3277  3051  2907  2500
F3 mean  2770  2821  2160  2086  2801  3208  2884  3065  3420  2604  2891  2743  2841  2504  2303
F3 sd    396   251   346   618   271   154   382   274   60    549   153   378   138   445   189

Table A.24: Formant values speaker C01 (in Hertz)

C02      ø:    œ     a     a:    E     e:    E:    I     i:    O     o:    U     u:    Y     y:
F1 min   342   433   535   538   469   337   436   313   264   527   339   320   274   299   274
F1 max   400   509   663   721   585   409   558   401   291   672   430   389   480   386   325
F1 mean  378   464   626   669   521   374   468   373   277   601   397   363   337   343   298
F1 sd    23    31    48    75    43    26    47    32    11    53    34    28    76    32    17
F2 min   1438  1377  1280  1282  1592  1648  1768  1646  2045  812   539   777   574   1417  1283
F2 max   1592  1606  1540  1341  1826  2095  1977  1993  2441  1365  746   1224  1516  1607  1767
F2 mean  1517  1514  1436  1320  1730  1934  1888  1818  2199  1111  691   1096  838   1516  1664
F2 sd    51    93    103   24    81    174   91    136   178   179   76    177   342   72    189
F3 min   1925  2166  2167  2250  2265  2095  2229  2251  3010  2095  2002  1995  2062  2065  2108
F3 max   2270  2446  2722  2659  2621  2922  2739  2552  3340  2922  2329  2289  2393  2391  2546
F3 mean  2094  2249  2358  2393  2434  2435  2438  2440  3167  2373  2149  2162  2224  2240  2217
F3 sd    117   102   188   164   121   301   184   109   160   343   130   109   108   121   174

Table A.25: Formant values speaker C02 (in Hertz)

[Figure A.1, panels (a), (b): Speaker A01; (c), (d): Speaker A02; (e), (f): Speaker A03. Left panels: F1 (Hz) against F2 (Hz); right panels: F1n against F2n.]
Figure A.1: The F1/F2 vowel spaces of speakers A01, A02 and A03. The graph on the left shows the formant values in Hertz with the mean values for the individual sounds of the respective speaker in the background. The graph on the right shows the normalised formant values with the reference points in the background. See table 1 on page 7 for an explanation of the symbols.

[Figure A.2, panels (a), (b): Speaker A04; (c), (d): Speaker A05; (e), (f): Speaker A06. Left panels: F1 (Hz) against F2 (Hz); right panels: F1n against F2n.]
Figure A.2: The F1/F2 vowel spaces of speakers A04, A05 and A06.

[Figure A.3, panels (a), (b): Speaker A07; (c), (d): Speaker A08; (e), (f): Speaker A09. Left panels: F1 (Hz) against F2 (Hz); right panels: F1n against F2n.]
Figure A.3: The F1/F2 vowel spaces of speakers A07, A08 and A09.

[Figure A.4, panels (a), (b): Speaker A10; (c), (d): Speaker B01; (e), (f): Speaker B02. Left panels: F1 (Hz) against F2 (Hz); right panels: F1n against F2n.]
Figure A.4: The F1/F2 vowel spaces of speakers A10, B01 and B02.

[Figure A.5, panels (a), (b): Speaker B03; (c), (d): Speaker B04; (e), (f): Speaker B05. Left panels: F1 (Hz) against F2 (Hz); right panels: F1n against F2n.]
Figure A.5: The F1/F2 vowel spaces of speakers B03, B04 and B05.

[Figure A.6, panels (a), (b): Speaker B06; (c), (d): Speaker B07; (e), (f): Speaker B08. Left panels: F1 (Hz) against F2 (Hz); right panels: F1n against F2n.]
Figure A.6: The F1/F2 vowel spaces of speakers B06, B07 and B08.
−2 y y yyy mean y Y 400 2 Y I 9 E 3E 0 mean 0 A E a aA a mean A A a 9 U U ooo ooo U2U U o E 0 00 0 0 A A A 3 a a aA a A A E 3E E3 3 E 0 A aa 3 a 2000 0 −1 y u u yY yy y y u mean U y U e Y2Y Y 2mean U e II U mean 2 Y emeanmean e I 2 U I 2 2 I mean e 9 9 3 3 9 3 E3 E u9 mean mean 9 9 0 E E E A mean a 0 3 Y 2 ou o 0 0 0mean A U uu u u u o u Uo oo o o o 0 A a 0 mean AA Y2 3E 9 2 600 A y e I 1 E E y yyy Y y y U U e Y2Y e 2Y U e II U 2 e Y e I U I 22 2 e I 9 3 9 3 9 3 E3 E u99 9 0 EE E A a 0 3 E 0 A 0 AA A A 0 aa aa iiiiI o U oo mean o o 9 3 i ii meanu U 0 eI u −1 I −2 −2 (b) Speaker C01 i 400 1 (a) Speaker C01 ii i 500 2 1000 F2n mean i e 1500 F2 (Hz) F1n 300 ii 0 aA A 0 3 700 aa mean aa 2500 2000 1500 1000 500 2 1 0 −1 −2 F2 (Hz) F2n (c) Speaker C02 (d) Speaker C02 mean F1/F2 (speaker group C + literature) mean normalised F1/F2 (speaker group C + literature) 300 −2 200 I U a 2500 F1 (Hz) A mean 900 A I u U Y 9 3E 99999 0 3 E 0 A Y Y e ee e 3e I 2 222Y ee I Y Y2 I I u u uu u u yy I 000 0 9 EE 3 mean E3 mean 3 700 U U 9 mean 800 U U 999 3 E o 2 mean mean mean 9 oo U o o oo mean 0 I mean I i 2 I I 600 F1 (Hz) 500 e U Y Y22Y Y 2 I 2 y yyy ii ii i i U 3 e e 3 e e e mean e mean −1 I uuu uuu u 1 i F1n i ii i imean i −2 APPENDIX A. TABLES AND FIGURES 300 100 [i:] ● [y:] ● [u:] −1 ● [y:] ● [U] ● [o:] ● [u:] ● [Y] ● [2:] ● [U] ● [o:] 0 500 [e:] [I] ● [9] ● [O] ● [9] ● [O] 1 600 [E:] [E] F1n [E:] [E] 700 [a] [a:] [a] [a:] 900 3 800 2 F1 (Hz) 400 [i:] ● [Y] ● [2:] [e:] [I] 2500 2000 1500 1000 500 F2 (Hz) (e) 2 1 0 −1 −2 F2n (f) Figure A.7: The F1 /F2 vowel spaces of speakers C01 and C02 and the reference points (mean values from speakers C01 and C02 and values taken from literature). 
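The normalised panels of the figures above use axes F1n and F2n whose tick marks run from about −2 to 3. Such a range would be consistent with a per-speaker z-score (Lobanov-style) normalisation, although the normalisation procedure itself is not stated in this appendix. The following Python sketch illustrates that assumed procedure only; the function name `z_normalise` and the sample formant values are hypothetical and are not taken from the thesis data.

```python
from statistics import mean, stdev


def z_normalise(values):
    """Return z-scores (x - mean) / sd for one speaker's formant values.

    Assumption: the sample standard deviation is used; the appendix
    does not say which normalisation the thesis applied.
    """
    m = mean(values)
    s = stdev(values)
    return [(x - m) / s for x in values]


# Hypothetical F1/F2 measurements in Hz for a single speaker
# (illustrative values only, not measurements from the thesis).
f1_hz = [300.0, 350.0, 500.0, 650.0, 800.0]
f2_hz = [2200.0, 1900.0, 1500.0, 1200.0, 900.0]

# Normalise each formant separately, per speaker.
f1n = z_normalise(f1_hz)
f2n = z_normalise(f2_hz)
```

Under this assumption, each speaker's normalised values have mean 0 and standard deviation 1, so vowel spaces of speakers with different vocal tract sizes can be drawn on common axes.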
101 ID Fi A01 F1 F2 F3 F1 F2 F3 F1 F2 F3 F1 F2 F3 F1 F2 F3 F1 F2 F3 F1 F2 F3 F1 F2 F3 F1 F2 F3 F1 F2 F3 .022 – .013 – – – <.001 – – – – – – – – – – – – – .061 – – – .001 – – – – – F1 F2 F3 F1 F2 F3 F1 F2 F3 F1 F2 F3 F1 F2 F3 F1 F2 F3 F1 F2 F3 F1 F2 F3 .067 – – .001 – <.001 <.001 .06 .099 <.001 – – .003 – – .028 .077 – .005 – – <.001 – – F1 F2 F3 F1 F2 F3 .001 • – – <.001 • – .035 ∗ A02 A03 A04 A05 A06 A07 A08 A09 A10 B01 B02 B03 B04 B05 B06 B07 B08 C01 C02 [ø:]∼[œ] ∗ ∗ • ◦ • • • • ◦ ∗ ◦ • [a:]∼[a] – – – – – – .003 – – .041 – – – – – – – – .005 – – – .088 – – .006 – – – – <.001 – – .077 .027 – .032 .085 – – – .026 – .027 – – .06 – – .056 – .062 – – ◦ ∗ ◦ ◦ • ∗ ∗ ∗ ∗ – – – – .038 ∗ – [e:]∼[E] <.001 .003 – <.001 <.001 – <.001 <.001 .015 .002 .002 .034 .026 .007 – .001 .017 .027 .094 – – .001 .002 – .004 <.001 .071 .011 – – • ◦ • • • • ∗ ◦ ◦ ∗ ∗ ◦ • ∗ ∗ • ◦ ◦ • ∗ <.001 .006 <.001 <.001 .002 – <.001 <.001 .001 .001 <.001 .003 <.001 <.001 <.001 .009 .004 .085 <.001 <.001 .004 <.001 .009 – • ◦ • • ◦ <.001 .013 .013 <.001 .035 – • ∗ ∗ • ∗ • • • ◦ • ◦ • • • ◦ ◦ • • ◦ • ◦ [E:]∼[E] .001 .009 – <.001 <.001 .06 .005 .037 – – – – – – – .04 .099 – – – – .076 – – – .086 – .05 – – – – .041 <.001 .027 – .016 .032 .022 – – – – – – .013 .012 .08 <.001 <.001 .002 .015 .044 – • ◦ • • ◦ ∗ ∗ ∗ • ∗ ∗ ∗ ∗ ∗ ∗ • • ◦ ∗ ∗ .008 ◦ .036 ∗ .098 .003 ◦ – – [E:]∼[e:] .004 .024 – .003 .009 – <.001 <.001 .006 .017 – .073 .061 .042 – – – – – – – .002 .014 – .004 <.001 – – – – <.001 – – – – – <.001 <.001 .026 .002 .001 .01 .077 .002 .056 – – – – .075 – – .095 – ◦ ∗ ◦ ◦ • • ◦ ∗ ∗ ◦ ∗ ◦ • • • • ∗ ◦ • ◦ ◦ – – – .069 .01 ∗ – [i:]∼[I] [o:]∼[O] <.001 .008 <.001 <.001 .013 .014 .002 .001 <.001 .002 .083 – – – – .004 .006 .012 – – – – .009 <.001 .046 .012 <.001 – .082 – • ◦ • • ∗ ∗ ◦ ◦ • ◦ .001 .023 .003 .001 .001 <.001 <.001 .001 <.001 .009 .002 .001 .006 .005 <.001 <.001 .006 <.001 <.001 <.001 <.001 .042 – .1 ◦ ∗ ◦ • • • • • • ◦ ◦ ◦ ◦ ◦ • • ◦ • • • • ∗ <.001 .021 .001 <.001 .001 .035 <.001 <.001 .001 <.001 
.003 .013 <.001 .023 .057 <.001 <.001 – <.001 .072 .025 <.001 <.001 .098 • ∗ ◦ • ◦ ∗ • • • • ◦ ∗ • ∗ <.001 <.001 .024 <.001 .002 <.001 • • ∗ • ◦ • <.001 <.001 – <.001 .001 – • • ◦ ◦ ∗ ◦ • ∗ ∗ • .054 .088 – .003 .036 – .004 .007 – <.001 .027 – – – – <.001 .001 – .024 .006 – .063 .003 .052 <.001 .003 – .028 – – ◦ ∗ ◦ ◦ • ∗ [u:]∼[U] na na na na na na – .07 – .031 ∗ .079 – na na na • ◦ ∗ ◦ ◦ • ◦ ∗ • • • ∗ • • • ◦ .067 – – – – – – – – .079 – – – – – – – – na na na – .004 ◦ – na na na na na na <.001 • .019 ∗ – na na na na na na <.001 • – – – – – [y:]∼[Y] .07 – – .014 .07 – .012 – .092 .068 – .042 – – – – – – – – – – – – .014 – – – – – .023 – – <.001 .059 – .002 .034 – .002 – – .003 .002 – <.001 .055 .069 <.001 – – .003 – – ∗ ∗ ∗ ∗ ∗ • ◦ ∗ ◦ ◦ ◦ • • ◦ .001 ◦ – – .029 ∗ – – Table A.26: P-values of t-tests of within-speaker comparisons of formant values. Only p-values below the 0.1 level are shown. 102 APPENDIX A. TABLES AND FIGURES ID Fi [ø:] [œ] [a] [a:] A01 F1 F2 F3 A02 F1 F2 F3 A03 F1 F2 F3 A04 F1 F2 F3 A05 F1 F2 F3 A06 F1 F2 F3 A07 F1 F2 F3 A08 F1 F2 F3 A09 F1 F2 F3 A10 F1 F2 F3 B01 F1 F2 F3 B02 F1 F2 F3 B03 F1 F2 F3 B04 F1 F2 F3 B05 F1 F2 F3 B06 F1 F2 F3 B07 F1 F2 F3 B08 F1 F2 F3 – .053 .006 .074 – .073 – – – – – – .003 – .078 .078 .001 .027 – .054 – – – .082 – – – .002 – – .002 – .001 <.001 – <.001 – – .001 .001 – – <.001 – – <.001 <.001 – <.001 – .017 .029 – .007 – – .044 .08 – – ◦ <.001 <.001 ◦ .06 • .087 – • – .006 – ◦ – • .016 – – • – – .04 • .005 • – – • – – ∗ – ∗ .005 – ◦ – .06 – ∗ – – – – • – • <.001 – – – – ◦ .014 .045 – ∗ .003 – – – – ∗ – ◦ .069 .085 – <.001 .005 – ◦ .066 – – – .001 – – – .059 – – – – – .002 – .067 .081 – – – .02 .004 – – .016 .044 .001 .057 – – – – C01 F1 F2 F3 C02 F1 F2 F3 .034 ∗ – – – – – ◦ ◦ ◦ ∗ ◦ ◦ ∗ ◦ ∗ ∗ ◦ – – – – – .063 – – .002 – – .015 – .037 .065 – – – – – .014 – – – – – – – – – ◦ ∗ ∗ ∗ [E] • ∗ ∗ ◦ • ◦ • [e:] [E:] – <.001 • .029 ∗ – <.001 • – – – – – – .028 ∗ – .064 – – – .025 ∗ – .002 ◦ .042 ∗ – – – – .017 ∗ – – – – – .093 – – 
<.001 • – – – – – – – – – – – – – – <.001 • – .015 ∗ – .044 ∗ – – – – – – – .096 – – – – <.001 • – .028 ∗ – – – – .014 ∗ – .079 – – – – – .04 ∗ – – – .027 ∗ .016 ∗ – – – – – – <.001 • – – .037 ∗ – – – – – – .017 ∗ – – – – .023 ∗ – – – – – – – – – – – – – .063 – – – – – – – – – [I] – – .023 ∗ .022 ∗ – <.001 • – – – .001 • .003 ◦ .069 – – .011 ∗ .082 .051 – – – – .1 .061 – – – <.001 • – – – .086 – .081 – – – – .094 .092 – – .023 ∗ – .015 ∗ .012 ∗ .074 <.001 • – – .003 ◦ .001 • .091 – .087 – – – – – – – – – – – – – – – .02 ∗ – – .032 ∗ .087 .062 .021 ∗ – <.001 • – – .01 ∗ – .009 ◦ – – <.001 • <.001 • .02 ∗ – .028 ∗ – .055 – – – – – – <.001 • – – – .008 ◦ – – – – <.001 • – .007 ◦ – .03 ∗ – .074 – .003 ◦ – – – .001 • .087 – – – – – .008 ◦ .014 ∗ – – – – .065 – – – – – – – – – – – – .099 – – – .093 – .092 – – – .085 – – – – – – – – – – – – [i:] [O] [o:] .087 <.001 .075 – – – – .039 .034 – – – .018 <.001 .001 .014 .002 .017 .005 – .004 <.001 – – <.001 – .026 – .04 .059 .078 <.001 .01 – .006 .051 – .002 – – .033 – – .022 – – .001 – .004 – – – .022 – .005 .072 – .019 – – .007 – <.001 .048 – <.001 – – – – – – .003 – .016 – – – – – .005 .011 – – .01 .006 – .018 .074 – .075 – – – – .082 – – – – .006 – – <.001 .039 .096 .001 – – – – – – – • ∗ ∗ ∗ • ◦ ∗ ◦ ∗ ◦ ◦ • • ∗ ∗ ◦ • ∗ ◦ .028 ∗ .028 ∗ .001 • – – .097 .001 – – – – – – – .09 – <.001 – – .021 .015 .055 – – .038 .063 .008 – .052 .098 • ◦ ◦ ◦ ∗ ∗ • ◦ ∗ ◦ ∗ • • ∗ ∗ ∗ ◦ – – – .037 ∗ – – – – .006 .002 – – – .021 .022 – – .017 – – .092 – – .005 – – – – – – [U] [u:] ◦ • ∗ • na na na na na na na na na na na na – – – – – – – .004 ◦ .021 ∗ – – – ◦ ∗ ◦ ∗ ∗ ◦ ∗ ◦ ◦ na na na na na na – – – – – – – – – – – – – – .096 – – .081 .001 – – .002 – – .003 – – – – .001 – – – – – – na na na na na na .075 ∗ – ∗ – ∗ ◦ – – .025 ∗ – .016 ∗ .047 ∗ ◦ ◦ • – .033 ∗ – na na na na na na na na na na na na – – – – – – na na na na na na na na na na na na – – – – – – ◦ .004 ◦ – .092 – – – [Y] [y:] – .074 – <.001 .026 .036 .008 .001 – – – – <.001 – .009 .052 
<.001 – – – – – .024 – – – – <.001 – .007 – – .02 – .01 – – – – <.001 – – – – .014 .047 .018 .002 .009 .019 – <.001 – – – – – – – .025 – .056 – – .049 – – .023 – – .073 – – .058 – – .021 – – .043 – – .007 – – – – – – – • ∗ ∗ ◦ ◦ • ◦ • ∗ • ◦ ∗ ∗ ∗ ∗ ◦ – – – .088 .073 – – .027 – – – – – <.001 .083 – .001 .015 .014 .021 .038 – .004 .004 ∗ ∗ • ∗ ∗ ∗ ◦ ◦ ∗ • ∗ ∗ • • ∗ ∗ ∗ ∗ ◦ ◦ .001 • .016 ∗ – – – – Table A.27: Between-speaker comparison of formant differences (against the reference values). P-values of the corresponding t-tests are shown below the 0.1 level. 103 ID Fi [ø:] [œ] A01 F1 F2 F3 A02 F1 F2 F3 A03 F1 F2 F3 A04 F1 F2 F3 A05 F1 F2 F3 A06 F1 F2 F3 A07 F1 F2 F3 A08 F1 F2 F3 A09 F1 F2 F3 A10 F1 F2 F3 B01 F1 F2 F3 B02 F1 F2 F3 B03 F1 F2 F3 B04 F1 F2 F3 B05 F1 F2 F3 B06 F1 F2 F3 B07 F1 F2 F3 B08 F1 F2 F3 – .002 .023 .048 .007 – – .005 – – .018 – .005 – – .045 .042 <.001 – – .082 – – – – .042 – .001 – .02 .001 – .014 <.001 – .003 – – .021 .001 – – <.001 – – <.001 .001 .027 <.001 – .059 .022 – .049 – .071 – .042 – – C01 F1 F2 F3 C02 F1 F2 F3 ◦ ∗ ∗ ◦ ◦ ∗ ◦ ∗ ∗ • ∗ ◦ ∗ [a] • ∗ • ◦ ∗ • • • • ∗ • ∗ ∗ ∗ .001 <.001 .003 – – – .015 – .044 .034 – – – – .01 .01 .099 – – – – .009 – – .08 – – – – – • • ◦ ∗ ∗ ∗ ◦ ◦ ◦ [a:] [E] [e:] [E:] [I] – .052 .002 – – – – .001 .057 .078 .029 – – .079 .033 – – – .002 <.001 – – – – – .073 – – .096 .071 – – – – – – – – – – .001 .003 – .021 – – – .047 .019 .04 .003 – .015 – – – – – – – – .001 <.001 – – .005 – – – .06 .061 – .001 – .054 .043 – .016 .047 <.001 – .001 .076 – – .031 – .003 – .082 – – .001 – – .021 <.001 .006 – – – <.001 .081 – – – – – – – .005 .001 – – <.001 <.001 – – – – .01 – .082 .027 – – – – – .04 .001 .081 .064 – – – .02 – .013 – – .01 .012 – – .064 <.001 .019 .002 .001 ◦ ◦ ∗ ∗ ◦ • • ◦ ∗ ∗ ∗ ∗ ◦ ∗ • • ◦ ◦ ∗ ∗ ∗ • • ∗ ◦ • ∗ • ◦ • ◦ ◦ • • ◦ – – .007 ◦ – – .002 ◦ – .093 .047 ∗ – – .084 – .07 – – – – – – – – – – – <.001 • .001 ◦ .073 <.001 • – .001 ◦ – – – – .002 ◦ – – – – – .03 ∗ – <.001 • .061 – – – – .007 ◦ – – – – – – – – <.001 • 
– .02 ∗ – – .056 – – – – – – – – .071 .002 ◦ – – – – .034 ∗ – .076 – .018 ∗ – – – – – – – <.001 • – .057 .045 ∗ – – – .024 ∗ – – – – – – – – – – – .004 ◦ .002 ◦ – – – – – – <.001 • – – – – .038 ∗ – .067 – – .059 – .074 – – .096 – – – – – – – – – – – .017 ∗ – .068 – – – – – .042 ∗ – – – – – .061 – – – .08 – – – .006 <.001 – – – – – – .035 .07 – – – – .002 – – – – .004 ◦ • ∗ ◦ .033 ∗ – – – .003 ◦ .033 ∗ – – – – – – <.001 • .002 ◦ – – – – .063 – – – – – [i:] [O] .008 <.001 .095 – .003 – – – .013 – – – .053 .024 .003 – – .011 .011 – .006 <.001 – – <.001 .051 .003 – <.001 .08 ◦ .04 • <.001 <.001 – ◦ – .004 – .051 ∗ – – – – – ∗ – ◦ .029 – .052 ∗ – ∗ <.001 – ◦ – • – – – • .002 – ◦ – .01 • – – ∗ • • .058 – – – – – – – – – – – – – – – – – – – – – – .066 – – – – .02 .065 – .029 – – – – – <.001 .004 .023 – – – – – – – – <.001 – – – – .012 – – – – <.001 – – – .023 .064 .095 .001 .041 – .008 – – – • – – – – – – .002 <.001 <.001 .086 – .038 ∗ ∗ ∗ ◦ ∗ ∗ ∗ ∗ • ∗ ◦ • ∗ ∗ • ◦ ∗ ◦ • • ∗ [o:] ◦ ∗ • ◦ ∗ ∗ • ∗ ◦ ∗ ◦ – .018 ∗ – .047 ∗ – – .007 – .001 .04 – <.001 – – .091 – .087 <.001 .003 – .052 – – – – – .044 .01 – – .009 <.001 – .016 .047 – – – .048 .002 – – – .094 – – – – – – <.001 .072 – .063 – – – – – .016 ◦ • ∗ • [U] [u:] [Y] [y:] na na na na na na na na na na na na – – – <.001 – .007 .01 .061 – – .093 .077 .001 – .005 .061 .053 – – – – – – – – .069 – <.001 .034 .002 – .019 .002 – – – – .001 .002 <.001 .015 – – – <.001 .035 – .036 .008 – – <.001 .054 .045 – .011 – – .014 – – – – .085 .03 ∗ • – ◦ na na na ∗ ∗ ◦ • ∗ ∗ ∗ ◦ • ∗ – – – .005 ◦ – – .014 ∗ – – – – – .058 – – – – – .076 – – na na na – – .061 .03 – – .041 – – .039 – – – – .001 – – – – – – na na na na na na – – – – .051 – na na na na na na na na na na na na – – – .098 – – na na na na na na na na na na na na ∗ ∗ ∗ • – – <.001 • .062 – – – – – – .084 – .053 – – .004 ◦ .043 ∗ .08 • ◦ ∗ • ◦ • ∗ ◦ – – – .072 – – – – – – – – .028 ∗ – – – – – – – – – .09 – ∗ ◦ ◦ ◦ • ∗ • ∗ ∗ ◦ • ∗ ∗ ∗ – – – .093 – – – – – – – – – .005 ◦ – – – .001 • 
.003 ◦ – – – – .081 – <.001 • .057 <.001 • – .082 – – .015 ∗ – – –

Table A.28: Between-speaker comparison of formant differences (against group B). Blue marks “improved” values and red marks “worse” values.

(a) speaker A01
V      SKG            RCG            OQG            GOG            IC             T4G
[ø:]   -0.62 (0.42)   -0.94 (0.34)   4.22 (0.64)    -2.77 (0.83)   -1.91 (0.37)   0.17 (0.09)
[œ]    -0.45 (0.4)    -1.03 (0.24)   0.98 (1.56)    -1.38 (1.12)   -1.7 (0.38)    0.24 (0.15)
p      –              –              <.001          <.001          .0602          .0689
[a:]   -0.59 (0.45)   -0.64 (0.49)   2.21 (2.24)    -0.04 (0.38)   -1.34 (0.55)   0.24 (0.06)
[a]    -1.08 (0.54)   -1.19 (0.19)   2.81 (1.05)    -0.58 (0.51)   -1.33 (0.28)   0.2 (0.07)
p      .0034          <.001          –              <.001          –              .0701
[e:]   -1.21 (0.32)   -0.83 (0.53)   1.7 (1.38)     -2.63 (0.99)   -1.63 (0.71)   0.19 (0.06)
[E]    -0.73 (0.38)   -0.83 (0.39)   1.67 (1.81)    -0.82 (0.8)    -1.54 (0.4)    0.37 (0.15)
p      <.001          –              –              <.001          –              <.001
[E:]   -1.06 (0.29)   -0.94 (0.18)   2.55 (2.12)    -1.51 (1.23)   -1.4 (0.31)    0.23 (0.11)
[e:]   -1.21 (0.32)   -0.83 (0.53)   1.7 (1.38)     -2.63 (0.99)   -1.63 (0.71)   0.19 (0.06)
p      –              –              –              .0016          –              .0992
[E:]   -1.06 (0.29)   -0.94 (0.18)   2.55 (2.12)    -1.51 (1.23)   -1.4 (0.31)    0.23 (0.11)
[E]    -0.73 (0.38)   -0.83 (0.39)   1.67 (1.81)    -0.82 (0.8)    -1.54 (0.4)    0.37 (0.15)
p      .0026          –              –              .031           –              .0013
[i:]   -1.07 (0.57)   -1.17 (0.44)   3.13 (1.41)    1.24 (1.93)    -1.53 (0.35)   0.49 (0.12)
[I]    -0.91 (0.55)   -0.9 (0.43)    3.15 (1.35)    -1.72 (1.37)   -1.71 (0.37)   0.33 (0.1)
p      –              .0366          –              .0027          .0852          <.001
[o:]   -1.4 (0.7)     -1.46 (0.23)   1.35 (1.72)    -1.26 (0.67)   -1.97 (0.44)   0.27 (0.08)
[O]    -0.44 (1.32)   -0.86 (0.98)   2.65 (1.74)    -0.57 (0.67)   -1.73 (1.31)   0.32 (0.18)
p      .0033          .0067          .0296          <.001          –              –
[u:]   -0.58 (1.26)   -1.31 (0.99)   2.49 (2.27)    -2.95 (1.18)   -2.56 (0.93)   0.32 (0.1)
[U]    -0.18 (0.97)   -1.49 (0.62)   2.56 (1.57)    -1.95 (0.93)   -2.63 (0.28)   0.39 (0.14)
p      –              –              –              .0178          –              .0824
[y:]   -0.42 (0.37)   -1.07 (0.42)   2.89 (1.78)    -3.93 (1.34)   -2.21 (0.43)   0.24 (0.07)
[Y]    -0.41 (0.31)   -0.98 (0.25)   3.31 (1.12)    -3.28 (2.23)   -2.18 (0.21)   0.22 (0.12)
p      –              –              –              –              –              –
l      -0.87 (-0.87)  -1.05 (-1.05)  2.57 (2.57)    -1.73 (-1.73)  -1.82 (-1.82)  0.27 (0.27)
s      -0.6 (-0.6)    -1.04 (-1.04)  2.45 (2.45)    -1.47 (-1.47)  -1.83 (-1.83)  0.29 (0.29)
mean   -0.74 (0.36)   -1.04 (0.24)   2.51 (0.83)    -1.61 (1.36)   -1.82 (0.42)   0.28 (0.09)

(b) speaker A02
V      SKG            RCG            OQG            GOG            IC             T4G
[ø:]   -0.35 (0.85)   -1.03 (0.65)   2.58 (1.21)    -3.11 (0.87)   -2.02 (0.55)   0.14 (0.05)
[œ]    -1.07 (0.49)   -1.11 (0.32)   1.92 (0.86)    -2.86 (0.47)   -2.31 (0.41)   0.18 (0.06)
p      <.001          –              .0387          –              .0477          .0221
[a:]   -1.51 (0.49)   -1.2 (0.43)    0.66 (2.42)    -1.07 (0.48)   -2.25 (0.48)   0.28 (0.14)
[a]    -1.52 (0.34)   -1.16 (0.82)   1.36 (1.07)    -1.04 (0.45)   -2.3 (0.63)    0.31 (0.1)
p      –              –              –              –              –              –
[e:]   -1.13 (0.52)   -0.73 (0.38)   1.26 (0.98)    -2.89 (1.17)   -1.97 (0.32)   0.2 (0.05)
[E]    -1.08 (0.4)    -1.02 (0.49)   2.43 (2.14)    -1.18 (1.12)   -1.91 (0.43)   0.3 (0.08)
p      –              .0298          .0292          <.001          –              <.001
[E:]   -1.18 (0.43)   -0.89 (0.39)   2.92 (0.77)    -1.76 (0.87)   -2.31 (0.47)   0.28 (0.08)
[e:]   -1.13 (0.52)   -0.73 (0.38)   1.26 (0.98)    -2.89 (1.17)   -1.97 (0.32)   0.2 (0.05)
p      –              –              <.001          .0011          .0056          <.001
[E:]   -1.18 (0.43)   -0.89 (0.39)   2.92 (0.77)    -1.76 (0.87)   -2.31 (0.47)   0.28 (0.08)
[E]    -1.08 (0.4)    -1.02 (0.49)   2.43 (2.14)    -1.18 (1.12)   -1.91 (0.43)   0.3 (0.08)
p      –              –              –              .0499          .0034          –
[i:]   -0.79 (0.52)   -0.67 (0.31)   4.01 (1.13)    -1.5 (1.88)    -1.98 (0.54)   0.27 (0.09)
[I]    -1 (0.41)      -1.35 (0.57)   3.88 (0.94)    -2.98 (1.24)   -2.45 (0.69)   0.23 (0.07)
p      –              <.001          –              .0071          .0116          .0939
[o:]   -1.09 (1.01)   -1.67 (0.62)   3.09 (1.5)     -1.7 (1.2)     -2.19 (0.45)   0.34 (0.16)
[O]    -1.3 (0.88)    -1.33 (0.64)   2.27 (1.28)    -1.07 (1.51)   -2.19 (0.33)   0.32 (0.11)
p      –              .0646          .0522          –              –              –
[u:]   0.31 (0.96)    -0.51 (0.74)   3.19 (1.14)    -1.92 (0.81)   -1.18 (0.95)   0.37 (0.12)
[U]    -0.25 (1.32)   -1.26 (1.01)   3.84 (1.17)    -1.99 (1.5)    -1.84 (1.12)   0.29 (0.07)
p      –              .0054          .0706          –              .0333          .0083
[y:]   -0.4 (0.32)    -1.25 (0.39)   2.29 (1.68)    -2.3 (0.82)    -2.25 (0.61)   0.23 (0.05)
[Y]    -0.41 (0.35)   -1.25 (0.22)   3.81 (1)       -2.74 (0.95)   -2.01 (0.82)   0.23 (0.07)
p      –              –              .0011          –              –              –
l      -0.77 (-0.77)  -1 (-1)        2.5 (2.5)      -2.03 (-2.03)  -2.02 (-2.02)  0.26 (0.26)
s      -0.95 (-0.95)  -1.21 (-1.21)  2.79 (2.79)    -1.98 (-1.98)  -2.14 (-2.14)  0.27 (0.27)
mean   -0.85 (0.52)   -1.1 (0.3)     2.63 (1.04)    -2.01 (0.76)   -2.08 (0.31)   0.26 (0.06)

Table A.29: Voice quality parameters (means and p-values of t-tests). The rows l and s show means for long-class and short-class vowels.

(a) speaker A03
V      SKG            RCG            OQG            GOG            IC             T4G
[ø:]   -0.79 (0.34)   -1.15 (0.17)   0.24 (1.49)    -2.66 (0.87)   -1.78 (0.31)   0.22 (0.04)
[œ]    -0.65 (0.46)   -0.93 (0.38)   0.69 (1.2)     -0.35 (1.74)   -1.68 (0.4)    0.36 (0.15)
p      –              .0154          –              <.001          –              <.001
[a:]   -0.1 (0.21)    -0.4 (0.25)    -0.09 (1.03)   -1.04 (0.56)   -0.93 (0.29)   0.16 (0.04)
[a]    0.18 (0.52)    -0.39 (0.33)   -0.02 (1.54)   -0.68 (0.7)    -0.67 (0.56)   0.22 (0.08)
p      .0206          –              –              .0635          .0527          .0022
[e:]   -0.94 (0.85)   -0.7 (0.62)    -0.22 (0.65)   -3.11 (0.96)   -1.44 (0.49)   0.26 (0.37)
[E]    -0.63 (0.38)   -0.62 (0.45)   0.04 (0.97)    -0.91 (1.34)   -1.11 (0.55)   0.3 (0.13)
p      –              –              –              <.001          .0318          –
[E:]   -1.09 (0.22)   -0.8 (0.37)    -0.23 (0.84)   -2.98 (0.74)   -1.27 (0.41)   0.19 (0.04)
[e:]   -0.94 (0.85)   -0.7 (0.62)    -0.22 (0.65)   -3.11 (0.96)   -1.44 (0.49)   0.26 (0.37)
p      –              –              –              –              –              –
[E:]   -1.09 (0.22)   -0.8 (0.37)    -0.23 (0.84)   -2.98 (0.74)   -1.27 (0.41)   0.19 (0.04)
[E]    -0.63 (0.38)   -0.62 (0.45)   0.04 (0.97)    -0.91 (1.34)   -1.11 (0.55)   0.3 (0.13)
p      <.001          –              –              <.001          –              <.001
[i:]   -0.79 (0.36)   -0.19 (0.42)   4.25 (0.72)    -3.27 (0.93)   -1.39 (0.65)   0.15 (0.09)
[I]    -0.77 (0.35)   -0.75 (0.29)   1.66 (2.44)    -1.95 (1.62)   -1.5 (0.28)    0.31 (0.14)
p      –              <.001          <.001          .0296          –              <.001
[o:]   -1.43 (0.5)    -1.51 (0.53)   1.72 (1.17)    -3.04 (1.45)   -1.96 (0.39)   0.24 (0.03)
[O]    -0.61 (0.89)   -0.94 (0.56)   0.62 (1.52)    -0.1 (1.1)     -1.51 (0.56)   0.51 (0.27)
p      <.001          <.001          .0079          <.001          .0026          <.001
[u:]   -1.06 (1.3)    -1.39 (0.54)   4.27 (0.99)    -4.32 (0.81)   -2.6 (0.57)    0.15 (0.05)
[U]    -0.14 (1.01)   -0.8 (0.77)    2.2 (2.76)     -2.95 (1.53)   -1.61 (0.85)   0.32 (0.12)
p      .0083          .0037          .0037          .006           <.001          <.001
[y:]   -0.97 (0.66)   -1.06 (0.44)   3.77 (2.74)    -4.27 (0.72)   -1.56 (0.41)   0.11 (0.03)
[Y]    -0.7 (0.6)     -0.86 (0.54)   3.66 (0.98)    -2.55 (1.98)   -1.54 (0.47)   0.19 (0.08)
p      –              –              –              .0394          –              <.001
l      -0.9 (-0.9)    -0.9 (-0.9)    1.71 (1.71)    -3.09 (-3.09)  -1.62 (-1.62)  0.18 (0.18)
s      -0.47 (-0.47)  -0.75 (-0.75)  1.27 (1.27)    -1.36 (-1.36)  -1.37 (-1.37)  0.31 (0.31)
mean   -0.7 (0.42)    -0.83 (0.36)   1.5 (1.72)     -2.28 (1.36)   -1.5 (0.45)    0.24 (0.1)

V SKG RCG [ø:] [œ] p [a:] [a] p [e:] [E] p [E:] [e:] p [E:] [E] p [i:] [I] p [o:] [O] p [u:] [U] p [y:] [Y] p -0.28 (1.24) -0.69 (0.99) – -0.03 (1.03) -0.17 (0.78) – -0.36 (0.36) -0.81 (0.9) .0282 -0.39 (0.8) -0.36 (0.36) – -0.39 (0.8) -0.81 (0.9) .0889 -1.53 (1.86) -1.28 (1.42) – -0.63 (0.88) 0.11 (1.34) .0303 0.18 (1.1) -0.44 (2.22) – -0.38 (1.08) -0.38 (0.66) – -0.66 (1.09) -0.9 (0.73) – -0.17 (0.52) -0.09 (0.55) – -0.29 (0.18) -0.52 (0.83) – -0.34 (0.8) -0.29 (0.18) – -0.34 (0.8) -0.52 (0.83) – -1.76 (1.45) -1.15 (1.16) – -0.9 (0.45) -0.79 (0.51) – -0.81 (0.43) -1.03 (0.79) – -0.71 (0.95) -0.89 (0.63) – – 4.45 (0.29) 3.52 (0.36) <.001 2.8 (1.21) 1.74 (1.1) .0069 1.83 (1.23) 2.8 (1.21) .0266 1.83 (1.23) 1.74 (1.1) – -1.46 (2.14) 2.64 (2.18) <.001 3.27 (0.76) 1.46 (0.76) <.001 2.45 (1.37) 4.42 (0.54) <.001 3.37 (0.5) 2.75 (1.71) – l s mean -0.43 (-0.43) -0.52 (-0.52) -0.47 (0.47) -0.71 (-0.71) -0.77 (-0.77) -0.73 (0.43) 2.39 (2.39) 2.75 (2.75) 2.56 (1.52) (b) speaker A04 OQG na na GOG IC T4G -2.5 (0.71) -2.69 (0.99) – -1.56 (1.07) -1.93 (1.26) – -2.7 (1.5) -1.54 (1.93) .0343 -2.01 (1.76) -2.7 (1.5) – -2.01 (1.76) -1.54 (1.93) – – -2.01 (1.28) -1.4 (1.72) – -0.68 (1.08) -1.6 (0.76) .0064 0.6 (0.29) -1.65 (1.84) .0014 -1.97 (0.85) -2.21 (0.5) – -1.13 (0.72) -1.17 (0.45) – -1.19 (0.19) -1.64 (0.72) .0073 -1.33 (0.59) -1.19 (0.19) – -1.33 (0.59) -1.64 (0.72) – -2.82 (1.29) -2.38 (1) – -1.95 (0.37) -1.6 (0.84) .0692 -1.67 (0.2) -1.99 (0.86) .0654 -1.99 (0.77) -1.74 (0.53) – 0.1 (0.05) 0.16 (0.05) <.001 0.29 (0.17) 0.24 (0.07) – 0.24 (0.11) 0.41 (0.25) .0054 0.27 (0.15) 0.24 (0.11) – 0.27 (0.15) 0.41 (0.25) .0218 0.3 (0.13) 0.39 (0.31) – 0.29 (0.08) 0.41 (0.35) .0944 0.3 (0.16) 0.19 (0.08) .016 0.29 (0.08) 0.17 (0.12) <.001 -1.55 (-1.55) -1.8 (-1.8) -1.67
(0.88) -1.76 (-1.76) -1.82 (-1.82) -1.79 (0.48) 0.26 (0.26) 0.28 (0.28) 0.27 (0.09) na na Table A.30: Voice quality parameters (means and p-values of t-tests) 106 APPENDIX A. TABLES AND FIGURES (a) speaker A05 V SKG RCG OQG GOG IC T4G [ø:] [œ] p [a:] [a] p [e:] [E] p [E:] [e:] p [E:] [E] p [i:] [I] p [o:] [O] p [u:] [U] p [y:] [Y] p 0.48 (0.38) 0.28 (0.43) – 0.09 (0.62) 0.02 (0.37) – -0.74 (0.42) -0.46 (0.4) .0261 -0.52 (0.24) -0.74 (0.42) .0388 -0.52 (0.24) -0.46 (0.4) – 0.13 (0.45) 0.01 (0.62) – -0.25 (0.49) -0.38 (0.45) – -0.57 (1.2) -0.12 (1.4) – 0.42 (0.33) 0.51 (0.5) – -0.66 (0.43) -0.15 (0.57) .001 -0.4 (0.43) -0.43 (0.61) – -0.87 (0.36) -0.86 (0.36) – -0.6 (0.25) -0.87 (0.36) .004 -0.6 (0.25) -0.86 (0.36) .0048 -0.28 (0.32) -0.47 (0.48) – -0.94 (0.29) -0.81 (0.37) – -1.07 (0.52) -0.45 (1.17) .0245 -0.64 (0.42) -0.61 (0.21) – 4.19 (0.75) 4.25 (0.53) – – 3.67 (1.14) 3.78 (1.67) – 3.83 (0.84) 3.67 (1.14) – 3.83 (0.84) 3.78 (1.67) – 3.9 (0.68) 3.82 (0.83) – 4.02 (0.96) 4.11 (0.41) – 3.99 (0.64) 4.37 (0.74) .08 4.69 (0.59) 4.3 (1.12) – -1.95 (1.15) -1.32 (2.08) – -0.27 (0.83) -0.74 (0.47) .0343 -3.1 (1.04) -2.57 (0.75) .0495 -1.59 (1.43) -3.1 (1.04) <.001 -1.59 (1.43) -2.57 (0.75) .0054 -0.39 (1.57) -1.28 (2.2) – -1.81 (0.82) -1.63 (0.81) – -1.21 (0.84) -1.96 (1.18) .0154 -0.73 (1.53) -2.26 (1.55) .0021 -1.91 (0.41) -1.09 (0.63) <.001 -0.82 (0.9) -1.08 (0.68) – -1.4 (0.32) -1.56 (0.35) .0926 -1.56 (0.23) -1.4 (0.32) .048 -1.56 (0.23) -1.56 (0.35) – -1.2 (0.41) -1.09 (0.54) – -2.17 (0.38) -1.66 (0.52) <.001 -1.72 (0.43) -1.13 (1.07) .0172 -1.65 (0.53) -1.68 (0.53) – 0.2 (0.08) 0.14 (0.06) .0022 0.12 (0.02) 0.11 (0.04) .0886 0.09 (0.05) 0.08 (0.03) – 0.12 (0.03) 0.09 (0.05) .0058 0.12 (0.03) 0.08 (0.03) <.001 0.17 (0.04) 0.18 (0.06) – 0.16 (0.08) 0.13 (0.03) – 0.29 (0.15) 0.3 (0.11) – 0.27 (0.09) 0.22 (0.06) .0391 l s mean -0.12 (-0.12) -0.02 (-0.02) -0.07 (0.4) -0.68 (-0.68) -0.54 (-0.54) -0.62 (0.26) 4.04 (4.04) 4.1 (4.1) 4.07 (0.29) -1.38 (-1.38) -1.68 (-1.68) 
-1.52 (0.8) -1.55 (-1.55) -1.33 (-1.33) -1.45 (0.37) 0.18 (0.18) 0.16 (0.16) 0.17 (0.07) V SKG RCG OQG GOG IC T4G [ø:] [œ] p [a:] [a] p [e:] [E] p [E:] [e:] p [E:] [E] p [i:] [I] p [o:] [O] p [u:] [U] p [y:] [Y] p -0.28 (0.94) -0.5 (0.43) – -0.04 (1.02) -0.76 (1.79) – -0.27 (0.4) -0.93 (1.19) .0153 -0.34 (0.46) -0.27 (0.4) – -0.34 (0.46) -0.93 (1.19) .0303 -0.87 (0.54) -0.42 (0.45) .0027 -0.76 (1.19) -0.79 (0.51) – -0.2 (0.97) 0.29 (1.46) – -0.2 (0.48) -0.25 (0.38) – -0.66 (0.72) -0.86 (0.39) – -0.3 (0.64) -0.99 (1.11) .0132 -0.47 (0.28) -1.18 (1.22) .0098 -0.74 (0.77) -0.47 (0.28) – -0.74 (0.77) -1.18 (1.22) – -0.75 (0.4) -0.57 (0.57) – -0.82 (0.49) -1.18 (0.27) .0033 -0.88 (0.69) -0.62 (0.87) – -0.77 (0.5) -0.52 (0.38) .0648 4.24 (0.97) 1.68 (1.66) <.001 1.93 (2.28) 1.5 (2.42) – – 0.64 (1.61) 0.15 (1.1) – -2.05 (3) -0.42 (4.22) – 5.05 (0.3) 1.05 (0.76) <.001 1.07 (0.92) -0.69 (3.05) .0416 0.82 (2.68) 0.08 (3.49) – -4.66 (0.52) -4.38 (0.94) – -0.54 (0.85) -1.42 (2) .0628 -4.57 (0.8) -1.64 (1.45) <.001 -2.79 (0.64) -4.57 (0.8) <.001 -2.79 (0.64) -1.64 (1.45) .0015 1.65 (0.43) -0.73 (1.64) .0013 -3.66 (1.01) -1.45 (1.56) <.001 -2.16 (0.08) -3.04 (1.8) – -0.03 (0.84) 0.03 (0.13) – -1.6 (0.58) -1.53 (0.47) – -1.23 (1.17) -1.46 (1.08) – -1.06 (0.29) -1.77 (1.16) .0073 -1.49 (0.88) -1.06 (0.29) .0297 -1.49 (0.88) -1.77 (1.16) – -1.94 (0.51) -1.52 (0.37) .0024 -1.75 (0.43) -1.72 (0.42) – -1.91 (0.62) -1.34 (0.57) .0021 -1.69 (0.4) -1.57 (0.59) – 0.08 (0.06) 0.17 (0.08) <.001 0.33 (0.3) 0.41 (0.32) – 0.05 (0.01) 0.3 (0.16) <.001 0.19 (0.12) 0.05 (0.01) <.001 0.19 (0.12) 0.3 (0.16) .0143 0.35 (0.19) 0.23 (0.11) .0119 0.09 (0.01) 0.28 (0.1) <.001 0.3 (0.13) 0.27 (0.15) – 0.38 (0.16) 0.3 (0.11) .0712 l s mean -0.37 (-0.37) -0.48 (-0.48) -0.42 (0.34) -0.67 (-0.67) -0.85 (-0.85) -0.75 (0.25) 1.67 (1.67) 0.53 (0.53) 1.15 (1.9) -2.09 (-2.09) -1.8 (-1.8) -1.96 (1.88) -1.58 (-1.58) -1.56 (-1.56) -1.57 (0.24) 0.22 (0.22) 0.28 (0.28) 0.25 (0.11) na na (b) speaker A06 na na – na na 
Table A.31: Voice quality parameters (means and p-values of t-tests) 107 (a) speaker A07 V SKG RCG OQG GOG IC T4G [ø:] [œ] p [a:] [a] p [e:] [E] p [E:] [e:] p [E:] [E] p [i:] [I] p [o:] [O] p [u:] [U] p [y:] [Y] p -0.9 (0.32) -0.32 (0.35) <.001 -1.12 (0.52) -0.66 (0.73) .0172 -0.06 (0.58) -0.17 (0.54) – -0.34 (0.43) -0.06 (0.58) .0566 -0.34 (0.43) -0.17 (0.54) – 0.02 (0.37) 0.06 (0.87) – -0.94 (0.82) -0.77 (0.7) – 0.83 (1.83) 0.24 (1.36) – 0.29 (0.64) -0.58 (0.94) .0015 -0.3 (0.39) -0.09 (0.43) .0988 -1.4 (0.55) -0.39 (1.03) <.001 -0.58 (0.26) -0.74 (0.23) .0336 -0.7 (0.43) -0.58 (0.26) – -0.7 (0.43) -0.74 (0.23) – -0.14 (0.66) -0.15 (0.86) – -1.05 (0.52) -0.79 (0.6) – -0.34 (1.19) -0.36 (0.63) – -0.1 (0.44) -0.35 (0.42) .0612 4.57 (0.51) 2.12 (1.58) <.001 0.88 (0.5) 0.52 (0.98) – 1.06 (2.25) 2.04 (1.05) .0827 1.92 (1.72) 1.06 (2.25) – 1.92 (1.72) 2.04 (1.05) – 2.45 (2.65) 3.79 (1.04) .0796 1.57 (1.05) 1.3 (2.24) – 1.85 (1.71) 3.13 (2.43) – 0.54 (2.73) 1.84 (2.09) – -5.15 (0.13) -4.42 (0.66) <.001 0.2 (0.95) 0.13 (0.83) – -2.96 (1.19) -2.12 (1.33) .0273 -2.47 (0.65) -2.96 (1.19) .0898 -2.47 (0.65) -2.12 (1.33) – -0.13 (0.76) -0.89 (1.62) – -3.31 (1.46) -1.86 (1.2) <.001 -0.55 (1.12) -1.99 (2.57) .0406 -1.71 (1.56) -2.75 (1.88) – -1.16 (0.34) -1.07 (0.43) – -1.73 (0.65) -1.08 (1.16) .0233 -0.76 (0.5) -0.82 (0.34) – -0.88 (0.45) -0.76 (0.5) – -0.88 (0.45) -0.82 (0.34) – -1.06 (0.79) -0.83 (0.68) – -1.21 (0.49) -1.21 (0.51) – -1.16 (0.85) -1.2 (0.59) – -1.17 (0.59) -1.42 (0.3) .0763 0.05 (0.02) 0.13 (0.07) <.001 0.88 (0.21) 0.62 (0.25) <.001 0.17 (0.08) 0.31 (0.29) .0301 0.14 (0.07) 0.17 (0.08) – 0.14 (0.07) 0.31 (0.29) .0135 0.25 (0.15) 0.16 (0.09) .0107 0.14 (0.09) 0.29 (0.17) <.001 0.45 (0.52) 0.33 (0.35) – 0.18 (0.11) 0.16 (0.12) – l s mean -0.28 (-0.28) -0.31 (-0.31) -0.29 (0.54) -0.58 (-0.58) -0.41 (-0.41) -0.5 (0.38) 1.86 (1.86) 2.11 (2.11) 1.97 (1.15) -2.01 (-2.01) -1.99 (-1.99) -2 (1.59) -1.14 (-1.14) -1.09 (-1.09) -1.12 (0.25) 0.28 (0.28) 0.28 (0.28) 0.28 
(0.22) V SKG RCG OQG GOG IC T4G [ø:] [œ] p [a:] [a] p [e:] [E] p [E:] [e:] p [E:] [E] p [i:] [I] p [o:] [O] p [u:] [U] p [y:] [Y] p -0.7 (0.34) -0.98 (0.5) .0293 0.29 (0.25) 0.03 (0.32) .0036 -1.16 (0.2) -0.88 (0.61) .0405 -0.87 (0.46) -1.16 (0.2) .0089 -0.87 (0.46) -0.88 (0.61) – 0.52 (0.4) 0 (0.39) <.001 -1.11 (0.72) -0.11 (0.48) <.001 1 (1.62) 0.01 (1.09) .0174 -0.18 (0.4) -0.2 (0.29) – -0.7 (0.35) -0.85 (0.5) – 0.04 (0.26) 0.02 (0.33) – -1.05 (0.41) -0.69 (0.71) .0384 -0.74 (0.38) -1.05 (0.41) .0098 -0.74 (0.38) -0.69 (0.71) – -0.1 (0.31) -0.2 (0.32) – -1.03 (0.31) -0.49 (0.23) <.001 -0.06 (0.71) -0.59 (0.69) .0129 0 (0.15) -0.21 (0.4) .0393 4.05 (1.12) 3.25 (2.3) – 1.22 (0.9) 2.1 (1.19) .0086 4.65 (0.76) 1.04 (1.49) <.001 2.86 (1.01) 4.65 (0.76) <.001 2.86 (1.01) 1.04 (1.49) <.001 – -3.61 (1.1) -3.99 (1.33) – -0.2 (0.53) -0.13 (0.51) – -4.79 (0.43) -1.95 (0.85) <.001 -2.79 (1.34) -4.79 (0.43) <.001 -2.79 (1.34) -1.95 (0.85) .0409 -1.5 (0.86) -1.32 (1.31) – -2.62 (1.67) -0.89 (1.1) <.001 -1.64 (1.34) -1.77 (1.1) – -2.7 (1.53) -4.42 (0.9) <.001 -2.09 (0.4) -2.33 (0.45) .0499 -0.98 (0.75) -0.9 (0.34) – -1.74 (0.26) -1.62 (0.49) – -1.59 (0.54) -1.74 (0.26) – -1.59 (0.54) -1.62 (0.49) – -0.95 (0.3) -1.28 (0.4) .0026 -2.4 (0.37) -1.78 (0.29) <.001 -1.71 (0.92) -1.69 (0.36) – -1.47 (0.3) -1.38 (0.21) – 0.18 (0.1) 0.16 (0.07) – 0.28 (0.06) 0.3 (0.13) – 0.11 (0.05) 0.24 (0.08) <.001 0.2 (0.12) 0.11 (0.05) .0025 0.2 (0.12) 0.24 (0.08) – 0.13 (0.05) 0.15 (0.09) – 0.37 (0.16) 0.33 (0.11) – 0.19 (0.08) 0.11 (0.06) <.001 0.1 (0.08) 0.05 (0.02) .005 l s mean -0.28 (-0.28) -0.3 (-0.3) -0.29 (0.64) -0.46 (-0.46) -0.43 (-0.43) -0.44 (0.39) 3.22 (3.22) 2.68 (2.68) 2.98 (1.39) -2.48 (-2.48) -2.07 (-2.07) -2.29 (1.45) -1.62 (-1.62) -1.57 (-1.57) -1.59 (0.46) 0.2 (0.2) 0.19 (0.19) 0.19 (0.09) (b) speaker A08 na na – 2.16 (1.08) 2.03 (1.17) – 4.37 (0.75) 5 (0.58) .0342 na na Table A.32: Voice quality parameters (means and p-values of t-tests) 108 APPENDIX A. 
TABLES AND FIGURES (a) speaker A09 V SKG RCG OQG GOG IC T4G [ø:] [œ] p [a:] [a] p [e:] [E] p [E:] [e:] p [E:] [E] p [i:] [I] p [o:] [O] p [u:] [U] p [y:] [Y] p -0.16 (0.94) -1.2 (0.99) <.001 -0.87 (0.73) 0.19 (0.73) <.001 -0.44 (0.34) -1.12 (0.79) <.001 -0.21 (0.44) -0.44 (0.34) .0443 -0.21 (0.44) -1.12 (0.79) <.001 0.1 (0.72) -0.17 (0.94) – -1.37 (0.59) -0.92 (0.45) .0047 0.73 (2.37) -0.93 (2.09) .017 0 (0.56) -0.3 (0.47) .0609 -0.73 (0.71) -1.25 (0.7) .0136 -0.78 (0.4) -0.07 (0.62) <.001 -0.07 (0.37) -0.8 (0.48) <.001 -0.01 (0.45) -0.07 (0.37) – -0.01 (0.45) -0.8 (0.48) <.001 -0.11 (0.5) -0.27 (0.73) – -0.86 (0.38) -0.93 (0.58) – 0.15 (1.06) -1.13 (0.47) <.001 -0.59 (0.46) -0.85 (0.58) – 3.05 (0.94) 1.82 (0.95) .0043 -0.06 (1.46) 1.14 (2.63) .0921 3.89 (0.89) 2.05 (0.59) <.001 3.65 (1.34) 3.89 (0.89) – 3.65 (1.34) 2.05 (0.59) <.001 – 4.03 (0.77) 1.9 (1.49) <.001 2.62 (1.32) 0.91 (2.03) – 1.77 (1.9) 4.92 (0.48) .0017 -3.43 (1.76) -2.04 (1.22) .0175 1.05 (1.67) 0 (1.05) .0402 -3.42 (1.15) -2.36 (1.88) .0361 -4.05 (0.58) -3.42 (1.15) .0364 -4.05 (0.58) -2.36 (1.88) <.001 -1.3 (2.31) -2.74 (0.7) .0205 -4.09 (0.97) -2.05 (1.19) <.001 -1.91 (2.39) -4.1 (0.74) .0013 -2.7 (1.67) -3.95 (1.54) .0778 -2.07 (0.71) -2.33 (0.55) – -1.34 (0.4) -1 (0.54) .0229 -1.37 (0.38) -1.99 (0.38) <.001 -1.45 (0.3) -1.37 (0.38) – -1.45 (0.3) -1.99 (0.38) <.001 -0.9 (0.49) -1.91 (0.5) <.001 -2.31 (0.4) -2.16 (0.42) – -1.45 (0.94) -2.47 (0.47) <.001 -1.8 (0.41) -2.18 (0.42) .004 0.15 (0.11) 0.25 (0.1) .0026 0.89 (0.28) 0.62 (0.54) .0376 0.11 (0.07) 0.2 (0.08) <.001 0.18 (0.06) 0.11 (0.07) <.001 0.18 (0.06) 0.2 (0.08) – 0.28 (0.27) 0.11 (0.08) .0046 0.16 (0.05) 0.29 (0.12) <.001 0.23 (0.18) 0.64 (1.22) .0938 0.17 (0.1) 0.07 (0.05) <.001 l s mean -0.28 (-0.28) -0.64 (-0.64) -0.44 (0.6) -0.37 (-0.37) -0.76 (-0.76) -0.55 (0.45) 2.71 (2.71) 2.12 (2.12) 2.44 (1.42) -2.48 (-2.48) -2.46 (-2.46) -2.47 (1.52) -1.59 (-1.59) -2.01 (-2.01) -1.78 (0.5) 0.27 (0.27) 0.31 (0.31) 0.29 (0.23) V SKG RCG OQG GOG 
IC T4G [ø:] [œ] p [a:] [a] p [e:] [E] p [E:] [e:] p [E:] [E] p [i:] [I] p [o:] [O] p [u:] [U] p [y:] [Y] p -1.62 (1.09) -0.5 (0.28) <.001 -0.85 (1.1) -0.9 (2.01) – -0.1 (0.99) -0.76 (1.12) .0354 -0.11 (0.75) -0.1 (0.99) – -0.11 (0.75) -0.76 (1.12) .0242 -0.26 (1.67) -0.95 (1.05) – -1.65 (1.24) -1.45 (1.54) – -0.86 (2.18) -1.44 (2.12) – -1 (0.75) -1.62 (1.73) – -1.31 (0.92) -0.44 (0.36) <.001 0 (1.16) -0.32 (1.58) – -0.54 (0.73) -0.87 (0.76) – -0.45 (0.97) -0.54 (0.73) – -0.45 (0.97) -0.87 (0.76) – -0.72 (1.45) -0.98 (1.01) – -0.96 (0.46) -1.05 (1.09) – -1.12 (1.47) -1.42 (0.73) – -0.98 (0.72) -1.93 (1.27) .0059 4.48 (0.71) 3.76 (1.2) .0729 1.89 (0.79) 1.62 (1.08) – 4.8 (0.36) 0.57 (1.74) <.001 3.28 (1.18) 4.8 (0.36) <.001 3.28 (1.18) 0.57 (1.74) <.001 -0.59 (3.12) 0.08 (2.14) – 3.13 (0.82) 0.94 (0.99) <.001 1.32 (2.13) 1.28 (1.54) – -0.83 (2.23) -1.01 (1.3) – -3.75 (0.16) -3.32 (0.88) .0341 -0.48 (2) -0.06 (0.56) – -3.43 (0.95) -2.61 (1.9) – -2.09 (1.43) -3.43 (0.95) .0014 -2.09 (1.43) -2.61 (1.9) – 1.77 (0.13) -1.31 (2.11) .0305 -2.96 (0.52) -1.85 (0.65) <.001 2.05 (0.38) -2.08 (1.16) <.001 – -2.15 (0.94) -0.94 (0.55) <.001 -1.72 (1.02) -1.42 (1.31) – -1.1 (0.71) -1.66 (0.74) .0108 -1.62 (0.64) -1.1 (0.71) .0113 -1.62 (0.64) -1.66 (0.74) – -1.6 (1.34) -2.41 (1.58) .0748 -2.32 (0.76) -2.13 (1.14) – -2.87 (1.3) -2.63 (1.07) – -1.95 (0.83) -2.7 (1.33) .038 0.13 (0.11) 0.12 (0.07) – 0.58 (0.24) 0.28 (0.14) <.001 0.06 (0.03) 0.37 (0.21) <.001 0.21 (0.12) 0.06 (0.03) <.001 0.21 (0.12) 0.37 (0.21) .0027 0.31 (0.2) 0.3 (0.12) – 0.16 (0.04) 0.4 (0.17) <.001 0.36 (0.14) 0.28 (0.14) .0724 0.26 (0.13) 0.26 (0.06) – l s mean -0.8 (-0.8) -1.09 (-1.09) -0.94 (0.54) -0.76 (-0.76) -1 (-1) -0.87 (0.48) 2.19 (2.19) 1.04 (1.04) 1.65 (1.89) -1.27 (-1.27) -1.87 (-1.87) -1.55 (1.89) -1.92 (-1.92) -1.98 (-1.98) -1.95 (0.58) 0.26 (0.26) 0.29 (0.29) 0.27 (0.13) na na (b) speaker A10 na na Table A.33: Voice quality parameters (means and p-values of t-tests) 109 (a) speaker B01 V SKG RCG 
[ø:] [œ] p [a:] [a] p [e:] [E] p [E:] [e:] p [E:] [E] p [i:] [I] p [o:] [O] p [u:] [U] p [y:] [Y] p 0.49 (0.59) -0.19 (0.43) <.001 -1.37 (0.3) -0.76 (0.55) <.001 0.63 (0.59) 0.1 (0.61) .0034 0.1 (0.6) 0.63 (0.59) .003 0.1 (0.6) 0.1 (0.61) – -0.7 (0.37) 0.11 (0.7) <.001 -0.61 (2.9) 0.1 (1.11) – 1.07 (1.78) -0.17 (1.1) .0358 -0.75 (0.71) 0.02 (1.34) .0312 0.04 (0.62) -0.36 (0.46) .0135 -1.68 (0.48) -0.75 (0.77) <.001 0.37 (0.39) -0.16 (0.52) <.001 -0.03 (0.68) 0.37 (0.39) .0178 -0.03 (0.68) -0.16 (0.52) – -1.83 (1.33) 0.13 (0.53) <.001 -0.48 (0.58) 0.08 (0.47) <.001 -0.73 (1.25) -0.7 (0.76) – -1.14 (0.9) -0.01 (0.93) <.001 OQG – 1.01 (0.76) 1.78 (1.28) .0454 5.03 (0.14) 2.14 (2.43) <.001 2.06 (1.63) 5.03 (0.14) <.001 2.06 (1.63) 2.14 (2.43) – -3.68 (1.33) 0.99 (3.79) <.001 3.64 (0.93) 2.61 (1.63) .02 -1.58 (2.6) 0.87 (3.6) – -3.09 (1.45) 1.6 (2.16) <.001 l s mean -0.14 (-0.14) -0.11 (-0.11) -0.13 (0.63) -0.68 (-0.68) -0.25 (-0.25) -0.48 (0.66) 0.48 (0.48) 1.67 (1.67) 1.03 (2.49) V SKG RCG OQG GOG IC T4G [ø:] [œ] p [a:] [a] p [e:] [E] p [E:] [e:] p [E:] [E] p [i:] [I] p [o:] [O] p [u:] [U] p [y:] [Y] p 0.04 (0.34) -0.21 (1.02) – 0.44 (0.9) 0.53 (0.74) – 0.53 (0.55) -0.33 (0.71) <.001 0.43 (0.7) 0.53 (0.55) – 0.43 (0.7) -0.33 (0.71) <.001 -0.69 (0.41) -0.35 (0.96) – 0.15 (2.17) 0.26 (0.54) – -2.01 (2.52) 0.75 (0.7) .0013 -0.54 (0.75) -0.8 (0.88) – 0.06 (0.33) 0.02 (0.84) – 0.5 (0.43) 0.6 (0.5) – 0.08 (0.3) -0.2 (0.51) .027 0.31 (0.56) 0.08 (0.3) .078 0.31 (0.56) -0.2 (0.51) .0017 -1.2 (0.41) -0.38 (0.66) <.001 -0.71 (0.73) -0.44 (0.28) – -1.45 (1.07) 0.18 (0.3) <.001 -0.51 (0.56) -0.31 (0.95) – 4.89 (0.35) 3.51 (0.99) <.001 4.22 (0.65) 4.12 (0.56) – – 4.25 (0.84) 1.6 (2.33) <.001 -2.1 (1.1) 0.09 (3) .0311 2.58 (1.82) 1.59 (1.72) – -2.69 (1.27) 1.83 (2.8) <.001 0.13 (1.21) 5.17 (0.11) <.001 -1.97 (0.45) 0.63 (1.11) <.001 1.73 (1.31) 0.06 (1.14) <.001 -2.19 (0.32) -1.18 (2.04) .0665 0.03 (1.6) -2.19 (0.32) <.001 0.03 (1.6) -1.18 (2.04) .0564 -3.33 (1.47) -1.86 (1.4) – 
-1.91 (1.4) -1.42 (1.51) – – -1.63 (0.25) -1.63 (0.6) – -0.73 (0.39) -0.74 (0.35) – -1.16 (0.38) -1.32 (0.56) – -0.65 (0.51) -1.16 (0.38) <.001 -0.65 (0.51) -1.32 (0.56) <.001 -1.99 (0.43) -1.7 (0.66) .077 -2.4 (0.8) -1.45 (0.39) <.001 -2.84 (0.71) -1.37 (0.7) <.001 -1.7 (0.46) -1.85 (0.67) – 0.12 (0.04) 0.66 (0.49) <.001 0.67 (0.36) 0.33 (0.14) <.001 0.08 (0.01) 0.43 (0.33) <.001 0.21 (0.07) 0.08 (0.01) <.001 0.21 (0.07) 0.43 (0.33) .0042 0.3 (0.03) 0.31 (0.22) – 0.35 (0.13) 0.56 (0.4) .021 0.22 (0.07) 0.36 (0.09) <.001 0.37 (0.06) 0.14 (0.03) <.001 l s mean -0.21 (-0.21) -0.02 (-0.02) -0.12 (0.71) -0.37 (-0.37) -0.08 (-0.08) -0.23 (0.58) 1.61 (1.61) 2.72 (2.72) 2.12 (2.6) -1.27 (-1.27) -0.75 (-0.75) -1.04 (1.48) -1.64 (-1.64) -1.44 (-1.44) -1.54 (0.6) 0.29 (0.29) 0.4 (0.4) 0.34 (0.18) na na GOG IC T4G -2.42 (1.03) -1.02 (1.86) .0593 – -1.28 (0.67) -1.55 (0.49) – -2.21 (0.35) -1.76 (0.57) .0025 -0.41 (0.49) -0.93 (0.47) <.001 -0.91 (0.6) -0.41 (0.49) .0032 -0.91 (0.6) -0.93 (0.47) – -2.46 (1.2) -1.06 (0.56) <.001 -1.68 (0.73) -1.15 (0.51) .0064 -2.06 (1.52) -1.44 (0.91) .092 -2.32 (0.76) -0.95 (0.83) <.001 0.08 (0.03) 0.38 (0.2) <.001 0.76 (0.22) 1.31 (0.88) .0067 0.13 (0.08) 0.54 (0.35) <.001 0.33 (0.23) 0.13 (0.08) <.001 0.33 (0.23) 0.54 (0.35) .017 0.11 (0.06) 0.28 (0.13) <.001 0.57 (0.46) 0.67 (0.28) – 0.27 (0.13) 0.16 (0.06) <.001 0.14 (0.06) 0.34 (0.19) <.001 -1.3 (-1.3) -0.6 (-0.6) -0.99 (1.28) -1.67 (-1.67) -1.26 (-1.26) -1.48 (0.6) 0.3 (0.3) 0.52 (0.52) 0.4 (0.33) na na – -1.57 (1.33) 0.36 (1.19) <.001 -1.89 (1.44) -1.57 (1.33) – -1.89 (1.44) 0.36 (1.19) <.001 na na – 0.83 (2.63) 0.64 (1.57) – -1.45 (0.65) -2.38 (2.03) – na na (b) speaker B02 na na – na na na na – na na Table A.34: Voice quality parameters (means and p-values of t-tests) 110 APPENDIX A. 
TABLES AND FIGURES (a) speaker B03 V SKG RCG OQG GOG IC T4G [ø:] [œ] p [a:] [a] p [e:] [E] p [E:] [e:] p [E:] [E] p [i:] [I] p [o:] [O] p [u:] [U] p [y:] [Y] p -0.36 (0.36) -0.17 (0.61) – 0.63 (0.5) 0.42 (0.55) – -0.34 (0.64) -0.24 (0.39) – -0.39 (0.43) -0.34 (0.64) – -0.39 (0.43) -0.24 (0.39) – -0.3 (0.47) -0.61 (0.39) .0179 -0.73 (0.44) 0.83 (0.64) <.001 0.13 (1.08) 0.4 (1.11) – 0.21 (0.38) -0.54 (0.52) <.001 -0.64 (0.43) -0.47 (0.39) – 0.34 (0.55) 0.45 (0.4) – -0.12 (0.55) -0.17 (0.35) – -0.09 (0.48) -0.12 (0.55) – -0.09 (0.48) -0.17 (0.35) – 0.26 (0.26) -0.48 (0.59) <.001 -0.61 (0.54) -0.2 (0.53) .0102 -0.09 (0.6) -0.5 (0.48) .0129 -0.28 (0.34) -0.85 (0.36) <.001 2.29 (0.59) 3.52 (0.98) <.001 4.08 (0.66) 4.43 (0.74) – 2.41 (0.81) 3.57 (1.06) <.001 3.21 (1.19) 2.41 (0.81) .0095 3.21 (1.19) 3.57 (1.06) – 3.45 (0.63) 2.7 (1.06) .0576 2.59 (1.93) 3.18 (0.88) – 4.22 (0.74) 2.55 (1.53) <.001 4.54 (0.52) 3.5 (1.44) .03 -2.43 (1.17) -1.15 (0.8) <.001 0.38 (0.64) 0.2 (0.69) – -2.8 (1.04) -1.08 (1.14) <.001 -2.3 (1.15) -2.8 (1.04) – -2.3 (1.15) -1.08 (1.14) <.001 -1.78 (1.05) -2.53 (1.18) .045 -2.66 (0.88) -0.62 (1.11) <.001 -1.62 (1.32) -1.76 (1.67) – -2.93 (1.2) -2.74 (1.54) – -1.3 (0.43) -1.24 (0.57) – -0.67 (0.45) -1.08 (0.41) .0038 -1.18 (0.3) -1.32 (0.21) .059 -1.32 (0.33) -1.18 (0.3) – -1.32 (0.33) -1.32 (0.21) – -1.34 (0.44) -1.57 (0.54) – -1.33 (0.51) -1.28 (0.58) – -1.46 (0.72) -1.57 (0.36) – -1.66 (0.39) -1.59 (0.34) – 0.12 (0.05) 0.18 (0.04) <.001 0.15 (0.05) 0.18 (0.06) – 0.11 (0.08) 0.2 (0.09) <.001 0.14 (0.05) 0.11 (0.08) – 0.14 (0.05) 0.2 (0.09) .0081 0.22 (0.1) 0.2 (0.06) – 0.19 (0.06) 0.28 (0.09) <.001 0.31 (0.07) 0.29 (0.13) – 0.18 (0.04) 0.17 (0.04) – l s mean -0.14 (-0.14) 0.01 (0.01) -0.07 (0.48) -0.15 (-0.15) -0.32 (-0.32) -0.23 (0.38) 3.35 (3.35) 3.35 (3.35) 3.35 (0.74) -2.02 (-2.02) -1.38 (-1.38) -1.72 (1.07) -1.28 (-1.28) -1.38 (-1.38) -1.33 (0.24) 0.18 (0.18) 0.21 (0.21) 0.19 (0.06) (b) speaker B04 V SKG RCG [ø:] [œ] p [a:] [a] p [e:] [E] p 
[E:] [e:] p [E:] [E] p [i:] [I] p [o:] [O] p [u:] [U] p [y:] [Y] p 1.35 (0.59) 1.51 (0.5) – 2.01 (0.34) 1.66 (0.7) .0357 0.85 (0.63) 0.73 (0.53) – 0.74 (0.27) 0.85 (0.63) – 0.74 (0.27) 0.73 (0.53) – 0.77 (0.57) 0.76 (0.41) – 1.08 (1.79) 1.51 (0.52) – 1.89 (0.49) 1.31 (1.61) .0834 1.03 (0.66) 0.98 (0.44) – 0.8 (0.5) 0.67 (0.34) – 0.45 (0.35) 0.54 (0.36) – 0.57 (0.42) 0.26 (0.4) .009 0.45 (0.29) 0.57 (0.42) – 0.45 (0.29) 0.26 (0.4) .0741 0.53 (0.71) 0.2 (0.47) .0646 -0.1 (1.63) 0.81 (1.21) .0349 1.47 (0.57) 0.66 (1.04) .0014 0.9 (0.42) 0.37 (0.3) <.001 l s mean 1.22 (1.22) 1.21 (1.21) 1.21 (0.43) 0.63 (0.63) 0.5 (0.5) 0.57 (0.36) OQG GOG IC T4G – 0.7 (1.16) 0.32 (0.91) – 1.57 (0.67) 0.89 (0.97) .0097 -0.2 (2) 0.21 (1.44) – -0.17 (1.25) -0.2 (2) – -0.17 (1.25) 0.21 (1.44) – 1.96 (2.02) -0.3 (2.35) .0017 1.3 (2.01) 0.19 (1.01) .0709 2.26 (2.03) 0.89 (1.49) .0538 1.87 (1.46) 0.96 (1.7) .0799 -0.34 (0.55) 0.03 (0.31) .008 0.26 (0.21) 0.25 (0.55) – 0.1 (0.51) 0 (0.46) – -0.27 (0.21) 0.1 (0.51) .0015 -0.27 (0.21) 0 (0.46) .0162 -0.43 (0.6) -0.81 (0.34) .0092 -0.58 (1.29) 0.29 (0.81) .0083 0.44 (0.59) -0.31 (1.02) .0031 -0.55 (0.51) -0.75 (0.4) – 0.1 (0.03) 0.12 (0.04) .0413 0.21 (0.06) 0.16 (0.06) .0081 0.15 (0.09) 0.12 (0.05) – 0.18 (0.1) 0.15 (0.09) – 0.18 (0.1) 0.12 (0.05) .0175 0.42 (0.23) 0.17 (0.09) <.001 0.49 (0.25) 0.2 (0.06) <.001 0.64 (0.35) 0.56 (0.48) – 0.28 (0.14) 0.13 (0.04) <.001 3.93 (3.93) 3.73 (3.73) 3.84 (1.45) 1.16 (1.16) 0.45 (0.45) 0.83 (0.83) -0.17 (-0.17) -0.19 (-0.19) -0.18 (0.4) 0.31 (0.31) 0.21 (0.21) 0.26 (0.18) na na – na na – 3.34 (1.35) 4.65 (1.03) .0532 4.73 (0.39) 3.34 (1.35) .0503 4.73 (0.39) 4.65 (1.03) – 5.21 (0.34) 4.92 (0.46) – 3.99 (0.22) 4.57 (0.51) .0131 2.35 (2.83) 0.78 (1.53) – na na Table A.35: Voice quality parameters (means and p-values of t-tests) 111 (a) speaker B05 V [ø:] [œ] p [a:] [a] p [e:] [E] p [E:] [e:] p [E:] [E] p [i:] [I] p [o:] [O] p [u:] [U] p [y:] [Y] p SKG 0.25 (0.25) 0.59 (0.67) .0273 1.35 (0.58) 0.92 (0.54) 
.0157 -0.53 (0.47) -0.5 (0.6) – -0.41 (1.12) -0.53 (0.47) – -0.41 (1.12) -0.5 (0.6) – -0.62 (0.69) -0.12 (0.64) .0123 -0.21 (1.08) 0.5 (0.32) .0046 -0.47 (1.83) 0.51 (0.89) .0243 -0.12 (0.63) -0.05 (0.6) – RCG -0.05 (0.33) 0.06 (0.39) – 0.12 (0.69) 0.09 (0.55) – -0.08 (0.3) -0.41 (0.66) .0318 -0.11 (1.06) -0.08 (0.3) – -0.11 (1.06) -0.41 (0.66) – -0.4 (0.78) -0.01 (0.81) .0934 -1.11 (0.27) -0.64 (0.22) <.001 -0.64 (1.25) -0.43 (0.75) – -0.01 (0.45) -0.29 (0.34) .0225 l s mean -0.09 (-0.09) 0.26 (0.26) 0.07 (0.59) -0.28 (-0.28) -0.23 (-0.23) -0.26 (0.35) OQG 4.51 (0.5) 2.29 (1.37) <.001 1.32 (1.08) 1.33 (1.58) – – – -2.86 (1.74) -1.26 (1.45) .0033 -4.48 (1.31) -1.61 (1.32) .0459 -0.35 (2.88) -1.58 (0.63) – IC -1.11 (0.34) -0.57 (0.55) <.001 -0.25 (0.4) -0.24 (0.83) – -0.85 (0.38) -1.21 (0.51) .009 -1.39 (0.75) -0.85 (0.38) .0031 -1.39 (0.75) -1.21 (0.51) – -1.15 (0.31) -0.9 (0.39) .0168 -2.04 (0.69) -0.83 (0.31) <.001 -1.77 (1.46) -0.96 (0.88) .0264 -1.18 (0.62) -0.89 (0.43) .0776 T4G 0.13 (0.06) 0.25 (0.09) <.001 0.19 (0.07) 0.23 (0.07) – 0.07 (0.03) 0.23 (0.15) <.001 0.18 (0.26) 0.07 (0.03) .0521 0.18 (0.26) 0.23 (0.15) – 0.29 (0.08) 0.15 (0.13) <.001 0.17 (0.1) 0.26 (0.16) .0207 0.32 (0.16) 0.22 (0.1) .0093 0.41 (0.14) 0.1 (0.04) <.001 2.03 (2.03) 2.2 (2.2) 2.11 (1.89) -2.18 (-2.18) -1.61 (-1.61) -1.92 (1.43) -1.22 (-1.22) -0.8 (-0.8) -1.02 (0.49) 0.22 (0.22) 0.2 (0.2) 0.21 (0.09) na na – na na – 4.12 (0.66) 0.2 (1.47) <.001 -1.68 (1.69) 2.45 (1.2) <.001 3.95 (0.92) 1.45 (1.75) <.001 -0.02 (2.86) 3.48 (1.1) <.001 na na GOG -0.58 (2.2) -2.02 (1.64) .0258 -0.17 (0.91) -0.3 (1.04) – -4.15 (0.7) -2.87 (1.26) <.001 -2.69 (3.49) -4.15 (0.7) – -2.69 (3.49) -2.87 (1.26) – na na (b) speaker B06 V SKG RCG OQG GOG IC T4G [ø:] [œ] p [a:] [a] p [e:] [E] p [E:] [e:] p [E:] [E] p [i:] [I] p [o:] [O] p [u:] [U] p [y:] [Y] p 0.04 (0.5) 0.26 (0.71) – 1.17 (0.72) 0.21 (0.71) <.001 -0.5 (0.4) 0.15 (0.52) <.001 0.14 (0.41) -0.5 (0.4) <.001 0.14 (0.41) 0.15 (0.52) – 0.06 (0.56) 0.04 
(0.24) – -0.68 (0.72) -0.04 (0.5) <.001 0.07 (1.37) 0.63 (0.78) .0937 -0.1 (0.53) 0.53 (0.35) <.001 -0.37 (0.32) 0.04 (0.75) .0199 0.28 (0.59) -0.42 (0.48) <.001 0.06 (0.31) 0.15 (0.23) – 0.23 (0.31) 0.06 (0.31) .0614 0.23 (0.31) 0.15 (0.23) – -0.31 (0.35) 0 (0.29) .0019 -1.36 (0.34) -0.78 (0.44) <.001 -0.57 (0.57) -0.27 (0.63) .0916 -0.41 (0.43) -0.08 (0.34) .0097 2.3 (0.6) 3.47 (1.01) <.001 3.88 (1.02) 3.69 (0.94) – 2.59 (0.75) 3.02 (0.86) – 2.77 (1.01) 2.59 (0.75) – 2.77 (1.01) 3.02 (0.86) – 4.6 (0.37) 4.25 (0.69) – 3.67 (0.62) 2.41 (1.24) <.001 4.23 (1.04) 4 (0.82) – 4.71 (0.22) 4.33 (1.09) – -1.91 (0.89) -0.16 (0.83) <.001 1.31 (1.13) 0.16 (0.54) <.001 -1.74 (0.77) 0.39 (1.39) <.001 -0.32 (0.69) -1.74 (0.77) <.001 -0.32 (0.69) 0.39 (1.39) .0467 -2.67 (0.79) -0.93 (1.23) <.001 -1 (0.74) -0.67 (0.65) – -1.2 (1.56) -0.1 (1.43) .0161 -2.81 (1.49) -0.64 (1.9) <.001 -1.28 (0.31) -1.15 (0.4) – -0.62 (0.41) -0.89 (0.33) .0235 -1.04 (0.28) -0.97 (0.39) – -0.79 (0.31) -1.04 (0.28) .0053 -0.79 (0.31) -0.97 (0.39) .0993 -0.76 (0.24) -0.81 (0.38) – -1.44 (0.44) -1.29 (0.28) – -1.6 (0.4) -1.19 (0.8) .029 -1.45 (0.58) -1.06 (0.38) .0142 0.12 (0.05) 0.18 (0.05) <.001 0.36 (0.21) 0.18 (0.05) .0019 0.11 (0.05) 0.29 (0.25) .0064 0.14 (0.05) 0.11 (0.05) .0669 0.14 (0.05) 0.29 (0.25) .0191 0.19 (0.1) 0.14 (0.05) .0656 0.33 (0.08) 0.19 (0.08) <.001 0.32 (0.15) 0.31 (0.21) – 0.18 (0.06) 0.15 (0.07) – l s mean 0.03 (0.03) 0.25 (0.25) 0.13 (0.44) -0.3 (-0.3) -0.2 (-0.2) -0.25 (0.43) 3.6 (3.6) 3.6 (3.6) 3.6 (0.8) -1.29 (-1.29) -0.28 (-0.28) -0.82 (1.13) -1.12 (-1.12) -1.05 (-1.05) -1.09 (0.29) 0.22 (0.22) 0.21 (0.21) 0.21 (0.08) Table A.36: Voice quality parameters (means and p-values of t-tests) 112 APPENDIX A. 
TABLES AND FIGURES (a) speaker B07 V SKG RCG OQG GOG IC T4G [ø:] [œ] p [a:] [a] p [e:] [E] p [E:] [e:] p [E:] [E] p [i:] [I] p [o:] [O] p [u:] [U] p [y:] [Y] p -0.45 (0.48) 0.05 (0.5) <.001 0.84 (0.76) 0.91 (0.53) – -0.78 (0.37) -0.5 (0.59) .0522 -0.78 (0.22) -0.78 (0.37) – -0.78 (0.22) -0.5 (0.59) .0329 -0.07 (1.1) -0.17 (0.29) – -0.58 (0.86) 0.13 (0.9) .0073 -0.86 (0.81) -0.05 (0.39) <.001 -0.22 (0.38) -0.09 (0.43) – -0.86 (0.32) -0.61 (0.59) .0727 -0.15 (0.58) -0.06 (0.48) – -0.86 (0.17) -0.3 (0.59) <.001 -0.69 (0.27) -0.86 (0.17) .0144 -0.69 (0.27) -0.3 (0.59) .005 0.01 (0.79) -0.42 (0.37) .0216 -1.04 (0.3) -0.17 (0.85) <.001 -1.45 (0.36) -0.98 (0.32) <.001 -0.72 (0.29) -0.53 (0.32) .0752 2.99 (0.73) 3.38 (1.59) – – 2.09 (1.65) 3.5 (0.88) <.001 4.09 (0.92) 2.09 (1.65) <.001 4.09 (0.92) 3.5 (0.88) .0314 3.85 (0.86) 4.46 (0.61) – 2.84 (1.19) 3.67 (0.96) .0201 4.54 (0.62) 4.53 (0.53) – 4.51 (0.04) 3.46 (0.82) .0049 -3.24 (0.96) -1.34 (1.11) <.001 0.37 (0.39) -0.34 (0.72) <.001 -3.75 (0.92) -1.1 (1.1) <.001 -1.59 (0.79) -3.75 (0.92) <.001 -1.59 (0.79) -1.1 (1.1) .0816 -2.24 (2.75) -1.85 (0.61) – -1.94 (0.88) -0.41 (0.88) <.001 -2.64 (1.62) -1.21 (0.61) <.001 -3.56 (1.55) -0.94 (0.73) <.001 -1.82 (0.63) -1.37 (0.51) .0093 -0.84 (0.49) -0.79 (0.43) – -2.21 (0.27) -1.49 (0.56) <.001 -1.42 (0.32) -2.21 (0.27) <.001 -1.42 (0.32) -1.49 (0.56) – -1.11 (0.8) -1.55 (0.45) .023 -2.48 (0.44) -0.88 (1.08) <.001 -2.47 (0.38) -2.19 (0.52) .0417 -1.89 (0.42) -1.79 (0.34) – 0.1 (0.04) 0.19 (0.08) <.001 0.26 (0.13) 0.25 (0.12) – 0.1 (0.02) 0.19 (0.06) <.001 0.15 (0.05) 0.1 (0.02) <.001 0.15 (0.05) 0.19 (0.06) .026 0.22 (0.11) 0.15 (0.06) .0071 0.25 (0.06) 0.31 (0.11) .0304 0.32 (0.09) 0.27 (0.06) .0225 0.13 (0.03) 0.18 (0.07) .0292 l s mean -0.36 (-0.36) 0.04 (0.04) -0.18 (0.53) -0.72 (-0.72) -0.44 (-0.44) -0.59 (0.41) 3.56 (3.56) 3.83 (3.83) 3.68 (0.75) -2.32 (-2.32) -1.02 (-1.02) -1.72 (1.21) -1.78 (-1.78) -1.44 (-1.44) -1.62 (0.57) 0.19 (0.19) 0.22 (0.22) 0.2 (0.07) V SKG RCG 
OQG GOG IC T4G [ø:] [œ] p [a:] [a] p [e:] [E] p [E:] [e:] p [E:] [E] p [i:] [I] p [o:] [O] p [u:] [U] p [y:] [Y] p 0.23 (1.02) -0.86 (0.91) <.001 -0.18 (1.5) -0.53 (1.69) – -0.86 (0.57) -1.51 (1.85) – -0.59 (1.39) -0.86 (0.57) – -0.59 (1.39) -1.51 (1.85) .057 -3.53 (1.59) -0.45 (1.17) <.001 -0.36 (2.27) -0.45 (0.57) – -2.65 (2.28) -1.13 (0.84) .0055 -0.19 (0.68) -0.52 (0.95) – -0.73 (0.75) -1.07 (0.97) – -0.46 (0.83) -0.59 (1.47) – -0.44 (0.65) -1.46 (1.29) .0014 -0.26 (1.62) -0.44 (0.65) – -0.26 (1.62) -1.46 (1.29) .0068 -3.01 (0.75) -0.42 (1.29) <.001 -1.13 (0.77) -0.42 (0.77) .0025 -1.81 (0.66) -2.56 (1.08) .0065 -1.08 (0.85) -0.78 (0.99) – 5.36 (0.09) 1.34 (1.77) <.001 2.01 (0.8) 2.73 (0.5) .0058 4.03 (0.36) 0.62 (0.8) <.001 1.95 (1.08) 4.03 (0.36) <.001 1.95 (1.08) 0.62 (0.8) <.001 0.14 (1.33) 3.48 (3) .0069 4.86 (0.49) 1.28 (2.1) <.001 -2.66 (0.92) -1.94 (1.71) – -0.71 (1.19) -2.14 (2.06) .0072 -2.53 (1.59) -1.79 (2.23) – -1.73 (1.42) -2.53 (1.59) – -1.73 (1.42) -1.79 (2.23) – – -0.98 (0.88) -1.28 (1.18) – -1.15 (1.31) -1.34 (1.52) – -0.96 (0.59) -1.16 (1.86) – -1.19 (1.32) -0.96 (0.59) – -1.19 (1.32) -1.16 (1.86) – -3.12 (0.78) -1.07 (1.23) <.001 -1.83 (0.55) -0.74 (0.55) <.001 -2.8 (0.49) -2.92 (0.88) – -1.65 (0.88) -1.15 (0.8) .0577 0.07 (0.04) 0.25 (0.14) <.001 0.35 (0.24) 0.35 (0.38) – 0.09 (0.05) 0.58 (0.46) <.001 0.15 (0.09) 0.09 (0.05) .0026 0.15 (0.09) 0.58 (0.46) <.001 0.43 (0.13) 0.26 (0.21) .0016 0.1 (0.05) 0.23 (0.09) <.001 0.38 (0.13) 0.46 (0.2) .0998 0.35 (0.09) 0.17 (0.05) <.001 l s mean -1.02 (-1.02) -0.78 (-0.78) -0.91 (0.99) -1.11 (-1.11) -1.04 (-1.04) -1.08 (0.82) 2.79 (2.79) 2.27 (2.27) 2.55 (1.69) -1.76 (-1.76) -2.23 (-2.23) -1.97 (1.34) -1.71 (-1.71) -1.38 (-1.38) -1.56 (0.77) 0.24 (0.24) 0.33 (0.33) 0.28 (0.15) na na (b) speaker B08 na na – 1.15 (1.1) 4.17 (0.55) <.001 na na – -3 (1.22) -0.47 (1.28) <.001 0.08 (0.87) -4.8 (0.16) .0091 na na Table A.37: Voice quality parameters (means and p-values of t-tests) 113 (a) speaker C01 V SKG 
RCG OQG GOG IC T4G [ø:] [œ] p [a:] [a] p [e:] [E] p [E:] [e:] p [E:] [E] p [i:] [I] p [o:] [O] p [u:] [U] p [y:] [Y] p 0.53 (0.87) -0.22 (0.51) <.001 0.06 (0.77) -0.01 (1.15) – 0.47 (0.42) -0.41 (0.97) <.001 -0.14 (1.09) 0.47 (0.42) .0164 -0.14 (1.09) -0.41 (0.97) – -1.59 (0.35) -0.98 (1.14) .0304 -0.56 (1.33) -0.05 (1.13) – -3.99 (0.64) -0.16 (1.57) <.001 -1.5 (0.65) 0.48 (0.68) <.001 0.05 (0.96) -0.11 (0.71) – -0.07 (1.24) 0.11 (0.64) – 0.25 (0.52) -0.62 (0.93) <.001 -0.6 (1.14) 0.25 (0.52) .0022 -0.6 (1.14) -0.62 (0.93) – -1.37 (0.41) -1.51 (1.54) – -0.86 (0.52) -0.36 (0.95) .0289 -2.53 (0.76) -0.74 (1.01) <.001 -1.35 (0.36) -0.08 (0.31) <.001 3.16 (1.43) 0.92 (1.41) .0012 2.1 (0.62) 1.87 (1.34) – 4.98 (0.26) 1.06 (0.85) <.001 2.07 (0.62) 4.98 (0.26) <.001 2.07 (0.62) 1.06 (0.85) <.001 -4.77 (0.56) -3.96 (0.72) .0067 3.36 (0.64) 1.4 (1.43) <.001 -3.93 (0.67) 0.45 (1.69) <.001 -3.47 (2.67) 1.04 (2.84) <.001 -1.5 (1.09) -1.88 (1.61) – -0.02 (0.56) -0.73 (1.6) .0498 -1.6 (0.61) -2.45 (1.38) .0169 -1.02 (1.37) -1.6 (0.61) .0685 -1.02 (1.37) -2.45 (1.38) .0014 – -0.98 (0.78) -1.02 (0.36) – -0.88 (1.05) -0.65 (0.57) – -0.56 (0.53) -1.01 (0.98) .0561 -0.94 (0.93) -0.56 (0.53) .0914 -0.94 (0.93) -1.01 (0.98) – -2.16 (0.36) -2.2 (1.33) – -1.87 (0.24) -1.07 (0.87) <.001 -3.5 (0.56) -1.63 (1.09) <.001 -2.62 (0.44) -0.89 (0.5) <.001 0.22 (0.18) 0.46 (0.29) .0018 0.37 (0.22) 0.28 (0.27) – 0.17 (0.05) 0.32 (0.2) .001 0.32 (0.32) 0.17 (0.05) .0255 0.32 (0.32) 0.32 (0.2) – 0.06 (0.03) 0.58 (0.25) <.001 0.33 (0.18) 0.31 (0.24) – 0.08 (0.04) 0.51 (0.22) <.001 0.05 (0.03) 0.41 (0.28) <.001 l s mean -0.84 (-0.84) -0.19 (-0.19) -0.54 (1.15) -0.81 (-0.81) -0.47 (-0.47) -0.66 (0.77) 0.44 (0.44) 0.4 (0.4) 0.42 (3.01) -0.9 (-0.9) -1.46 (-1.46) -1.15 (0.77) -1.69 (-1.69) -1.21 (-1.21) -1.46 (0.84) 0.2 (0.2) 0.41 (0.41) 0.3 (0.16) V SKG RCG OQG GOG IC T4G [ø:] [œ] p [a:] [a] p [e:] [E] p [E:] [e:] p [E:] [E] p [i:] [I] p [o:] [O] p [u:] [U] p [y:] [Y] p 0.04 (0.29) -0.15 (0.3) .0338 0.28 
(0.36) 0.36 (0.69) – 0 (0.32) 0.19 (0.37) .071 -0.35 (0.4) 0 (0.32) .0014 -0.35 (0.4) 0.19 (0.37) <.001 0.32 (0.49) -0.02 (0.47) .0164 -0.22 (0.85) 0.87 (0.51) <.001 -0.35 (0.93) 0.46 (0.47) <.001 0.28 (0.9) 0.09 (0.19) – -0.09 (0.29) -0.23 (0.32) – 0.01 (0.39) -0.26 (0.62) .0859 0.04 (0.41) 0.07 (0.36) – -0.41 (0.36) 0.04 (0.41) <.001 -0.41 (0.36) 0.07 (0.36) <.001 -0.66 (0.4) -0.02 (0.31) <.001 -0.51 (1.1) 0.27 (0.54) .0039 -0.97 (0.68) 0.2 (0.43) <.001 -0.17 (0.53) -0.24 (0.45) – 4.22 (0.87) 3.47 (0.45) .001 3.94 (0.59) 3.96 (0.84) – 4.41 (0.6) 3.94 (0.48) .0043 3.26 (0.42) 4.41 (0.6) <.001 3.26 (0.42) 3.94 (0.48) <.001 2.44 (2.36) 4.32 (0.44) .0794 4.43 (0.53) 3.68 (0.91) .0018 2.66 (1.83) 3.96 (2.18) .0716 5.4 (0.01) 4.36 (1.01) .0101 -0.81 (1.18) -0.29 (0.93) .0964 0.21 (0.65) 0.31 (1.33) – -0.44 (0.87) 0.61 (0.87) <.001 -0.42 (0.4) -0.44 (0.87) – -0.42 (0.4) 0.61 (0.87) <.001 -0.84 (1.17) -0.86 (1.78) – -0.07 (0.67) 0.38 (0.83) .0455 -1.57 (0.91) -0.6 (1.73) .029 -1.95 (0.63) -1.21 (1.37) .0335 -1.15 (0.36) -1.23 (0.29) – -0.82 (0.44) -0.61 (0.34) .0898 -0.88 (0.3) -0.85 (0.23) – -0.76 (0.28) -0.88 (0.3) – -0.76 (0.28) -0.85 (0.23) – -1.03 (0.34) -0.84 (0.42) .0913 -1.3 (1.22) -0.7 (0.65) .0398 -1.79 (0.55) -1.01 (0.63) <.001 -0.94 (0.36) -1.05 (0.39) – 0.22 (0.05) 0.34 (0.12) <.001 0.18 (0.04) 0.22 (0.09) .0577 0.22 (0.05) 0.34 (0.16) .0033 0.3 (0.1) 0.22 (0.05) .0014 0.3 (0.1) 0.34 (0.16) – 0.29 (0.2) 0.22 (0.04) – 0.39 (0.08) 0.29 (0.1) <.001 0.33 (0.16) 0.3 (0.13) – 0.16 (0.05) 0.27 (0.09) <.001 l s mean 0 (0) 0.26 (0.26) 0.12 (0.33) -0.34 (-0.34) -0.03 (-0.03) -0.2 (0.33) 3.84 (3.84) 3.96 (3.96) 3.9 (0.74) -0.74 (-0.74) -0.24 (-0.24) -0.5 (0.73) -1.08 (-1.08) -0.9 (-0.9) -1 (0.29) 0.26 (0.26) 0.28 (0.28) 0.27 (0.07) na na – -0.36 (1.19) -0.78 (1.18) – na na – na na (b) speaker C02 Table A.38: Voice quality parameters (means and p-values of t-tests) 114 APPENDIX A. 
TABLES AND FIGURES

Table A.39: Summary

ID   Pair       dur.  F opp.  F nl.    SKG  RCG  OQG  GOG  T4G  IC
A01  [ø:]∼[œ]   –     ∗/–/∗   •◦–/–•–  –    –    •    •    –    –
A01  [a:]∼[a]   ◦     –/–/–   •–•/––◦  ◦    •    –    •    –    –
A01  [e:]∼[E]   •     •/◦/–   •––/•••  •    –    –    •    –    •
A01  [e:]∼[E:]  –     •/◦/–   •––/∗•–  –    –    –    ◦    –    –
A01  [E:]∼[E]   ◦     ◦/∗/–   ∗•–/•••  ◦    –    –    ∗    –    ◦
A01  [i:]∼[I]   •     •/◦/•   ◦–◦/∗•◦  –    ∗    –    ◦    –    •
A01  [o:]∼[O]   –     –/–/–   –•–/◦––  ◦    ◦    ∗    •    –    –
A01  [u:]∼[U]   ∗     na      na       –    –    –    ∗    –    –
A01  [y:]∼[Y]   –     –/–/–   ••∗/•◦•  –    –    –    –    –    –

A02  [ø:]∼[œ]   –     –/–/–   ◦•◦/–•–  •    –    ∗    –    ∗    ∗
A02  [a:]∼[a]   ◦     –/–/–   •••/◦••  –    –    –    –    –    –
A02  [e:]∼[E]   •     •/•/–   ••◦/•◦•  –    ∗    ∗    •    –    •
A02  [e:]∼[E:]  ◦     •/•/–   ••◦/∗•◦  –    –    •    ◦    ◦    •
A02  [E:]∼[E]   ◦     ◦/◦/–   ∗•◦/•◦•  –    –    –    ∗    ◦    –
A02  [i:]∼[I]   ◦     •/∗/∗   •••/◦••  –    •    –    ◦    ∗    –
A02  [o:]∼[O]   ◦     ◦/∗/–   ∗•–/•–◦  –    –    –    –    –    –
A02  [u:]∼[U]   ◦     na      na       –    ◦    –    –    ∗    ◦
A02  [y:]∼[Y]   ∗     ∗/–/–   •∗•/–∗∗  –    –    ◦    –    –    –

A03  [ø:]∼[œ]   ∗     •/–/–   •••/••–  –    ∗    –    •    –    •
A03  [a:]∼[a]   •     ◦/–/–   ∗∗•/–••  ∗    –    –    –    –    ◦
A03  [e:]∼[E]   ◦     •/•/∗   •••/••◦  –    –    –    •    ∗    –
A03  [e:]∼[E:]  –     ◦/∗/–   •••/–◦•  –    –    –    –    –    –
A03  [E:]∼[E]   ◦     •/•/◦   –◦•/••◦  •    –    –    •    –    •
A03  [i:]∼[I]   •     ◦/◦/•   •∗∗/•••  –    •    •    ∗    –    •
A03  [o:]∼[O]   •     ◦/◦/–   •••/•–•  •    •    ◦    •    ◦    •
A03  [u:]∼[U]   ◦     –/–/–   •••/•••  ◦    ◦    ◦    ◦    •    •
A03  [y:]∼[Y]   •     ∗/–/–   •••/––•  –    –    –    ∗    –    •

A04  [ø:]∼[œ]   ◦     –/–/–   •••/–••  –    –    na   –    –    •
A04  [a:]∼[a]   ◦     ∗/–/–   –••/∗••  –    –    •    –    –    –
A04  [e:]∼[E]   ∗     ◦/◦/∗   ◦∗◦/••∗  ∗    –    ◦    ∗    ◦    ◦
A04  [e:]∼[E:]  –     –/–/–   ◦∗◦/◦•–  –    –    ∗    –    –    –
A04  [E:]∼[E]   ∗     ∗/–/–   ◦•–/••∗  –    –    –    –    –    ∗
A04  [i:]∼[I]   ◦     ◦/–/–   •••/•∗•  –    –    •    na   –    –
A04  [o:]∼[O]   –     •/∗/–   •••/•∗•  ∗    –    •    –    –    –
A04  [u:]∼[U]   ∗     ∗/–/–   –••/•∗•  –    –    •    ◦    –    ∗
A04  [y:]∼[Y]   –     –/–/∗   –••/•••  –    –    –    ◦    –    •

A05  [ø:]∼[œ]   –     –/–/–   –•◦/–••  –    ◦    –    –    •    ◦
A05  [a:]∼[a]   –     –/–/–   •••/••∗  –    –    na   ∗    –    –
A05  [e:]∼[E]   –     ∗/◦/–   –◦•/•••  ∗    –    –    ∗    –    –
A05  [e:]∼[E:]  –     –/–/–   –◦•/–••  ∗    ◦    –    •    ∗    ◦
A05  [E:]∼[E]   ◦     –/∗/–   –••/•••  –    ◦    –    ◦    –    •
A05  [i:]∼[I]   –     –/–/–   ∗––/◦••  –    –    –    –    –    –
A05  [o:]∼[O]   –     –/–/–   –•∗/•∗•  –    –    –    –    •    –
A05  [u:]∼[U]   ∗     na      na       –    ∗    –    ∗    ∗    –
A05  [y:]∼[Y]   ∗     –/–/–   ••∗/–•–  –    –    –    ◦    –    ∗

A06  [ø:]∼[œ]   –     –/–/–   ◦–∗/––•  –    –    •    –    –    •
A06  [a:]∼[a]   ∗     –/–/–   ◦◦•/–••  –    ∗    –    –    –    –
A06  [e:]∼[E]   •     •/∗/∗   ••∗/•••  ∗    ◦    na   •    ◦    •
A06  [e:]∼[E:]  –     ∗/–/–   ••∗/•••  –    –    na   •    ∗    •
A06  [E:]∼[E]   ∗     –/–/–   •••/•••  ∗    –    –    ◦    –    ∗
A06  [i:]∼[I]   •     ◦/◦/∗   ∗–∗/•••  ◦    –    –    ◦    ◦    ∗
A06  [o:]∼[O]   •     •/◦/–   •••/•–•  –    ◦    •    •    –    •
A06  [u:]∼[U]   ◦     –/–/–   ••◦/•••  –    –    ∗    –    ◦    –
A06  [y:]∼[Y]   ∗     –/–/–   ∗∗–/◦–•  –    –    –    –    –    –

A07  [ø:]∼[œ]   –     –/–/–   •◦•/–•∗  •    –    •    •    –    •
A07  [a:]∼[a]   ◦     ◦/–/–   ––•/•••  ∗    •    –    –    ∗    •
A07  [e:]∼[E]   –     –/–/–   ◦–•/∗•∗  –    ∗    –    ∗    –    ∗
A07  [e:]∼[E:]  –     –/–/–   ◦–•/◦•–  –    –    –    –    –    –
A07  [E:]∼[E]   ◦     –/–/–   ◦•–/∗•∗  –    –    –    –    –    ∗
A07  [i:]∼[I]   ∗     –/–/–   –•–/∗••  –    –    –    –    –    ∗
A07  [o:]∼[O]   –     ∗/◦/–   ••–/–••  –    –    –    •    –    •
A07  [u:]∼[U]   –     –/–/–   –••/•••  –    –    –    ∗    –    –
A07  [y:]∼[Y]   –     –/–/–   –∗•/•••  ◦    –    –    –    –    –

A08  [ø:]∼[œ]   –     –/–/–   ••◦/∗•–  ∗    –    –    –    ∗    –
A08  [a:]∼[a]   •     –/–/–   ◦••/–••  ◦    –    ◦    –    –    –
A08  [e:]∼[E]   •     •/◦/–   –∗•/•••  ∗    ∗    •    •    –    •
A08  [e:]∼[E:]  –     –/–/–   –∗•/–••  ◦    ◦    •    •    –    ◦
A08  [E:]∼[E]   •     ◦/∗/–   –••/•••  –    –    •    ∗    –    –
A08  [i:]∼[I]   •     –/◦/•   –••/∗◦•  •    –    na   –    ◦    –
A08  [o:]∼[O]   ∗     –/◦/–   ∗••/•∗•  •    •    –    •    •    –
A08  [u:]∼[U]   –     –/–/–   –••/•••  ∗    ∗    ∗    –    –    •
A08  [y:]∼[Y]   –     –/–/–   –••/•∗•  –    ∗    na   •    –    ◦

A09  [ø:]∼[œ]   ◦     ◦/–/–   •••/••∗  •    ∗    ◦    ∗    –    ◦
A09  [a:]∼[a]   •     –/◦/–   •–•/◦••  •    •    –    ∗    ∗    ∗
A09  [e:]∼[E]   •     ◦/•/–   •••/•••  •    •    •    ∗    •    •
A09  [e:]∼[E:]  –     –/–/–   •••/––•  ∗    –    –    ∗    –    •
A09  [E:]∼[E]   •     ◦/•/–   ––•/•••  •    •    •    •    •    –
A09  [i:]∼[I]   ◦     ∗/∗/•   –•∗/••–  –    –    na   ∗    •    ◦
A09  [o:]∼[O]   •     •/◦/–   ∗–•/–◦•  ◦    –    •    •    –    •
A09  [u:]∼[U]   ◦     –/–/–   –••/•••  ∗    •    –    ◦    •    –
A09  [y:]∼[Y]   ∗     ∗/–/–   •••/•••  –    –    ◦    –    ◦    •

A10  [ø:]∼[œ]   –     –/–/–   –••/◦••  •    •    –    ∗    •    –
A10  [a:]∼[a]   –     –/–/–   ••◦/•••  –    –    –    –    –    •
A10  [e:]∼[E]   ◦     ∗/–/–   –◦•/•••  ∗    –    •    –    ∗    •
A10  [e:]∼[E:]  –     –/–/–   –◦•/•••  –    –    •    ◦    ∗    •
A10  [E:]∼[E]   ∗     –/–/–   •••/•••  ∗    –    •    –    –    ◦
A10  [i:]∼[I]   ◦     –/–/–   •∗◦/∗––  –    –    –    ∗    –    –
A10  [o:]∼[O]   –     ∗/–/–   ∗◦•/∗••  –    –    •    •    –    •
A10  [u:]∼[U]   –     –/–/–   ••–/••◦  –    –    –    •    –    –
A10  [y:]∼[Y]   –     –/–/–   ••∗/–•–  –    ◦    –    na   ∗    –

B01  [ø:]∼[œ]   –     –/–/–   •••/•••  •    ∗    na   –    –    •
B01  [a:]∼[a]   ◦     •/–/–   –•∗/•••  •    •    ∗    na   ◦    ◦
B01  [e:]∼[E]   ◦     •/◦/•   ••◦/∗••  ◦    •    •    •    •    •
B01  [e:]∼[E:]  –     –/–/∗   ••◦/–••  ◦    ∗    •    –    ◦    •
B01  [E:]∼[E]   ◦     •/–/–   –••/∗••  –    –    –    •    –    ∗
B01  [i:]∼[I]   ◦     ◦/∗/◦   ◦••/◦••  •    •    •    na   •    •
B01  [o:]∼[O]   ◦     •/∗/◦   ••–/–••  –    •    ∗    –    ◦    –
B01  [u:]∼[U]   ∗     –/–/–   •••/•••  ∗    –    –    –    –    •
B01  [y:]∼[Y]   ∗     ∗/–/–   •••/•◦•  ∗    •    •    na   •    •

B02  [ø:]∼[œ]   ◦     •/–/•   ••–/••◦  –    –    •    •    –    •
B02  [a:]∼[a]   •     –/∗/–   •••/•∗∗  –    –    –    •    –    •
B02  [e:]∼[E]   ∗     •/◦/–   •••/∗••  •    ∗    na   –    –    •
B02  [e:]∼[E:]  ∗     •/∗/–   •••/–••  –    –    na   •    •    •
B02  [E:]∼[E]   ◦     –/–/–   –••/∗••  •    ◦    •    –    •    ◦
B02  [i:]∼[I]   •     •/•/•   ••◦/•••  –    •    ∗    –    –    –
B02  [o:]∼[O]   ◦     •/◦/∗   –••/•••  –    –    –    –    •    ∗
B02  [u:]∼[U]   ∗     na      na       ◦    •    •    na   •    •
B02  [y:]∼[Y]   •     •/–/–   ◦◦•/•∗•  –    –    •    na   –    •

B03  [ø:]∼[œ]   ◦     •/–/–   •◦◦/••–  –    –    •    •    –    •
B03  [a:]∼[a]   •     ∗/–/–   ◦••/•••  –    –    –    –    ◦    –
B03  [e:]∼[E]   ◦     •/•/•   •••/•••  –    –    •    •    –    •
B03  [e:]∼[E:]  –     ∗/∗/∗   •••/–••  –    –    ◦    –    –    –
B03  [E:]∼[E]   ◦     •/•/∗   –••/•••  –    –    –    •    –    ◦
B03  [i:]∼[I]   ◦     •/•/•   •••/•••  ∗    •    –    ∗    –    –
B03  [o:]∼[O]   •     •/•/•   •∗∗/••◦  •    ∗    –    •    –    •
B03  [u:]∼[U]   •     –/◦/–   •∗•/◦••  –    ∗    •    –    –    –
B03  [y:]∼[Y]   •     ◦/∗/–   •∗•/•∗•  •    •    ∗    –    –    –

B04  [ø:]∼[œ]   ∗     •/–/–   •••/••∗  –    –    na   –    ◦    ∗
B04  [a:]∼[a]   •     –/–/∗   •∗•/•••  ∗    –    na   ◦    –    ◦
B04  [e:]∼[E]   •     ◦/•/◦   •∗•/•••  –    ◦    –    –    –    –
B04  [e:]∼[E:]  –     –/–/–   •∗•/–••  –    –    –    –    ◦    –
B04  [E:]∼[E]   ◦     ◦/•/◦   –••/•••  –    –    –    –    ∗    ∗
B04  [i:]∼[I]   •     ◦/◦/◦   •–•/•••  –    –    –    ◦    ◦    •
B04  [o:]∼[O]   ◦     •/◦/∗   ••∗/•–•  –    ∗    ∗    –    ◦    •
B04  [u:]∼[U]   •     na      na       –    ◦    –    –    ◦    –
B04  [y:]∼[Y]   ∗     ◦/–/–   •••/•◦•  –    •    na   –    –    •

B05  [ø:]∼[œ]   ∗     ◦/–/–   ∗–•/•∗◦  ∗    –    •    ∗    •    •
B05  [a:]∼[a]   •     –/∗/–   ◦–•/•••  ∗    –    –    –    –    –
B05  [e:]∼[E]   ◦     •/•/•   •∗∗/•••  –    ∗    na   •    ◦    •
B05  [e:]∼[E:]  ∗     –/–/–   •∗∗/•••  –    –    na   –    ◦    –
B05  [E:]∼[E]   •     –/◦/–   •••/•••  –    –    •    –    –    –
B05  [i:]∼[I]   •     ◦/◦/•   •–∗/•◦•  ∗    –    •    na   ∗    •
B05  [o:]∼[O]   ◦     •/∗/–   ••◦/•∗∗  ◦    •    •    ◦    •    ∗
B05  [u:]∼[U]   ∗     na      na       ∗    –    •    ∗    ∗    ◦
B05  [y:]∼[Y]   ◦     ◦/◦/–   •–◦/•◦•  –    ∗    na   –    –    •

B06  [ø:]∼[œ]   ∗     ∗/–/–   •∗∗/•••  –    ∗    •    •    –    •
B06  [a:]∼[a]   ◦     –/–/–   •∗∗/◦•∗  •    •    –    •    ∗    ◦
B06  [e:]∼[E]   ◦     ◦/◦/–   •••/•••  •    –    –    •    –    ◦
B06  [e:]∼[E:]  –     ∗/∗/–   •••/•••  •    –    –    •    ◦    –
B06  [E:]∼[E]   ◦     –/–/–   •••/•••  –    –    –    ∗    –    ∗
B06  [i:]∼[I]   •     •/◦/•   ◦–•/••◦  –    ◦    –    •    –    –
B06  [o:]∼[O]   ◦     •/•/–   ••–/◦••  •    •    •    –    –    •
B06  [u:]∼[U]   ∗     •/∗/–   •••/•••  –    –    –    ∗    ∗    –
B06  [y:]∼[Y]   ◦     •/–/–   •–∗/•∗•  •    ◦    –    •    ∗    –

B07  [ø:]∼[œ]   ◦     ◦/–/–   –◦•/••∗  •    –    –    •    ◦    •
B07  [a:]∼[a]   •     –/–/–   •∗•/•••  –    –    na   •    –    –
B07  [e:]∼[E]   ∗     •/•/◦   –•∗/•••  –    •    •    •    •    •
B07  [e:]∼[E:]  –     •/•/◦   –•∗/•••  –    ∗    •    •    •    •
B07  [E:]∼[E]   ◦     –/–/–   •••/•••  ∗    ◦    ∗    –    –    ∗
B07  [i:]∼[I]   ∗     •/•/•   •••/•◦•  –    ∗    –    –    ∗    ◦
B07  [o:]∼[O]   ◦     •/–/∗   •••/∗◦–  ◦    •    ∗    •    •    ∗
B07  [u:]∼[U]   ◦     na      na       •    •    –    •    ∗    ∗
B07  [y:]∼[Y]   –     •/–/–   ∗∗∗/•∗•  –    –    ◦    •    –    ∗

B08  [ø:]∼[œ]   ◦     •/–/–   •••/•••  •    –    •    –    –    •
B08  [a:]∼[a]   ◦     –/–/–   •••/•••  –    –    ◦    ◦    –    –
B08  [e:]∼[E]   •     •/◦/–   ∗••/•••  –    ◦    •    –    –    •
B08  [e:]∼[E:]  –     ∗/∗/–   ∗••/•••  –    –    •    –    –    ◦
B08  [E:]∼[E]   •     –/–/–   •••/•••  –    ◦    •    –    –    •
B08  [i:]∼[I]   ◦     ∗/–/–   •••/••◦  •    •    ◦    na   •    ◦
B08  [o:]∼[O]   ◦     •/•/–   •••/•◦◦  –    ◦    •    •    •    •
B08  [u:]∼[U]   •     na      na       ◦    ◦    na   ◦    –    –
B08  [y:]∼[Y]   •     ◦/–/–   •––/•–•  –    –    •    na   –    •

C01  [ø:]∼[œ]   ∗     •/–/–   ∗••/•••  •    –    ◦    –    –    ◦
C01  [a:]∼[a]   ◦     –/–/–   •••/••◦  –    –    –    ∗    –    –
C01  [e:]∼[E]   ◦     •/∗/∗   –∗•/◦••  •    •    •    ∗    –    •
C01  [e:]∼[E:]  –     ◦/∗/–   –∗•/•••  ∗    ◦    •    –    –    ∗
C01  [E:]∼[E]   ◦     –/–/–   •••/◦••  –    –    •    ◦    –    –
C01  [i:]∼[I]   •     •/•/∗   ∗∗–/•••  ∗    –    ◦    na   –    •
C01  [o:]∼[O]   ∗     •/•/–   ••∗/•••  –    ∗    •    –    •    –
C01  [u:]∼[U]   ◦     •/–/–   –•◦/•••  •    •    •    na   •    •
C01  [y:]∼[Y]   ◦     ◦/–/–   –∗•/•••  •    •    •    na   •    •

C02  [ø:]∼[œ]   ∗     •/–/∗   •••/•••  ∗    –    •    –    –    •
C02  [a:]∼[a]   ◦     –/∗/–   •••/•••  –    –    –    –    –    –
C02  [e:]∼[E]   –     •/∗/–   •••/•••  –    –    ◦    •    –    ◦
C02  [e:]∼[E:]  –     ◦/–/–   •••/•••  ◦    •    •    –    –    ◦
C02  [E:]∼[E]   ∗     –/∗/–   •••/•••  •    •    •    •    –    –
C02  [i:]∼[I]   •     •/◦/•   ••◦/•••  ∗    •    –    –    –    –
C02  [o:]∼[O]   ◦     •/◦/–   •∗∗/∗••  •    ◦    ◦    ∗    ∗    •
C02  [u:]∼[U]   ◦     –/–/–   •••/•••  •    •    –    ∗    •    –
C02  [y:]∼[Y]   ∗     ∗/–/–   •••/•••  –    –    ∗    ∗    –    •

Summary of the acoustic measurements for each speaker. The column "dur." shows the realisation of a distinction in vowel quantity (duration). A • marks a highly significant difference at the 0.001 level, a ◦ a difference at the 0.01 level and a ∗ a difference at the 0.05 level. The column "F opp." shows the realisation of a distinction in the three formant values and "F nl." shows the native-likeness of the two vowels of the respective vowel pair. The remaining columns show the realisation of a distinction in the various measured voice quality parameters (note that the voice quality results might be distorted).
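The marker scheme described for Table A.39 can be written down as a small function. The following Python sketch is only an illustration and not part of the original analysis. It assumes that the three levels are applied as strict upper bounds, that "–" stands for a difference that is not significant at the 0.05 level, and that "na" marks a comparison for which no value is available. The name significance_marker is hypothetical.

```python
from typing import Optional


def significance_marker(p: Optional[float]) -> str:
    """Map a p-value to the marker used in the summary table.

    Assumptions (not stated explicitly in the text): strict inequalities,
    "–" for non-significant results, "na" for missing values.
    """
    if p is None:
        return "na"   # no measurement available
    if p < 0.001:
        return "•"    # highly significant, 0.001 level
    if p < 0.01:
        return "◦"    # significant at the 0.01 level
    if p < 0.05:
        return "∗"    # significant at the 0.05 level
    return "–"        # not significant at the 0.05 level


if __name__ == "__main__":
    # Example p-values; None stands for a missing comparison.
    for p in (0.0005, 0.0014, 0.0349, 0.0532, None):
        print(p, significance_marker(p))
```

Under these assumptions, the p-values .0349 and .0014 reported for the RCG parameter of speaker B04 in Table A.35 map to ∗ and ◦, which agrees with the corresponding entries in the summary.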
Table A.39: Appendix B Wordlists Word list A: Schiff, stellen, Bühne, Stadt, muss, offen, steht, spuken, Gewalt, Hölle, dehnen, fühle, gewöhne, Geld, gezackt, Köter, dämlich, kippen, zwölf, Spott, Lawine, Töchter, vertuschen, Beet, Gespött, kaputt, Stall, Verstoß, Schutt, Stahl, dünsten, Tuch, egal, Pollen, dumm, Staat, tief, Häftling, Laden, Blume, Wohl, Täler, Danke, zerstößt, Blümchen, erspäht, böse, Bogen, Bude, Gepäck, höflich, dünn, Ofen, beginnen, Gestüt, Draht, Diebe, dem, Stück, Höhle, Tochter, stehlen, gönnen, Miete, Frosch, Bett, Pappe, Donner, Mitte, Teller, gewönne, Polen, Dänen, Tisch, gebuhlt, Hütte, biete, Steg, Hüte, Hof, stählen, Bitte, spucken, Düse, schief, gewählt, Mus, Fülle Word list B: Bett, dünsten, gezackt, steht, Schutt, zerstößt, Düse, vertuschen, dehnen, fühle, gewönne, Draht, Bogen, kaputt, Mitte, Steg, Blümchen, egal, erspäht, Hölle, Tochter, Diebe, böse, Gestüt, Verstoß, stehlen, Lawine, dünn, Häftling, Frosch, gebuhlt, gönnen, Gewalt, Bude, dem, Fülle, Stall, Dänen, Polen, Schiff, Laden, Gespött, stählen, Hüte, schief, zwölf, dämlich, Tuch, Danke, Gepäck, kippen, muss, stellen, Köter, Staat, Stück, Geld, Ofen, Miete, Stadt, Blume, Wohl, Hütte, Höhle, dumm, Bitte, höflich, offen, Stahl, spuken, gewählt, Pollen, Tisch, Teller, tief, Spott, Mus, Täler, beginnen, Pappe, Töchter, Donner, Bühne, spucken, gewöhne, Beet, Hof, biete Word list C: zerstößt, gewählt, Diebe, Höhle, kaputt, Verstoß, Fülle, gewöhne, Staat, spucken, Geld, beginnen, dumm, Häftling, Köter, Mus, Dänen, Draht, Hölle, dämlich, Danke, Wohl, Bett, Stück, Gespött, Steg, spuken, tief, höflich, Teller, kippen, gebuhlt, Pollen, Hüte, dem, Donner, Hütte, Täler, Lawine, stellen, Hof, Schutt, Bitte, fühle, Frosch, Tisch, Blümchen, zwölf, Gepäck, Polen, Stahl, Bühne, Spott, vertuschen, gönnen, Ofen, dünsten, muss, steht, schief, erspäht, Pappe, Tochter, Miete, Tuch, Bogen, Schiff, dünn, offen, Gewalt, Gestüt, Blume, Stall, egal, biete, Bude, dehnen, Mitte, gewönne, stehlen, 
Laden, böse, stählen, Stadt, Düse, Töchter, gezackt, Beet Word list D: kippen, Gestüt, Blume, Gewalt, höflich, Bett, Frosch, Tuch, gewählt, Spott, tief, gönnen, Stahl, Steg, schief, gebuhlt, dämlich, Miete, böse, dünsten, Lawine, Teller, Wohl, Draht, dem, dumm, biete, gewönne, Hüte, Staat, spucken, Gespött, fühle, muss, stellen, Hof, Düse, Stall, Tochter, Köter, stehlen, Mitte, Hütte, erspäht, Pollen, Bude, Diebe, egal, Bogen, Töchter, Stück, Danke, Donner, Täler, Blümchen, zwölf, Schiff, Fülle, Schutt, Bitte, Hölle, Laden, stählen, Ofen, spuken, Stadt, Verstoß, dehnen, gezackt, beginnen, Häftling, Höhle, Pappe, Beet, Bühne, kaputt, Dänen, gewöhne, Mus, Polen, steht, dünn, Tisch, Gepäck, offen, zerstößt, Geld, vertuschen 117 118 APPENDIX B. WORDLISTS Er hat Schiff gesagt. Er hat dünsten gesagt. Er hat Tochter gesagt. Er hat stellen gesagt. Er hat Tuch gesagt. Er hat stehlen gesagt. Er hat Bühne gesagt. Er hat egal gesagt. Er hat gönnen gesagt. Er hat Stadt gesagt. Er hat Pollen gesagt. Er hat muss gesagt. Er hat dumm gesagt. Er hat offen gesagt. Er hat Staat gesagt. Er hat steht gesagt. Er hat tief gesagt. Er hat spuken gesagt. Er hat Häftling gesagt. Er hat Gewalt gesagt. Er hat Laden gesagt. Er hat Hölle gesagt. Er hat Blume gesagt. Er hat dehnen gesagt. Er hat Wohl gesagt. Er hat fühle gesagt. Er hat Täler gesagt. Er hat gewöhne gesagt. Er hat Danke gesagt. Er hat Polen gesagt. Er hat Geld gesagt. Er hat zerstößt gesagt. Er hat Dänen gesagt. Er hat gezackt gesagt. Er hat Blümchen gesagt. Er hat Tisch gesagt. Er hat Köter gesagt. Er hat erspäht gesagt. Er hat gebuhlt gesagt. Er hat dämlich gesagt. Er hat böse gesagt. Er hat Hütte gesagt. Er hat kippen gesagt. Er hat Bogen gesagt. Er hat biete gesagt. Er hat zwölf gesagt. Er hat Bude gesagt. Er hat Spott gesagt. Er hat Gepäck gesagt. Er hat Lawine gesagt. Er hat höflich gesagt. Er hat Töchter gesagt. Er hat dünn gesagt. Er hat vertuschen gesagt. Er hat Ofen gesagt. Er hat Beet gesagt. Er hat beginnen gesagt. 
Er hat Gespött gesagt. Er hat Gestüt gesagt. Er hat kaputt gesagt. Er hat Draht gesagt. Er hat Stall gesagt. Er hat Diebe gesagt. Er hat Verstoß gesagt. Er hat dem gesagt. Er hat gewählt gesagt. Er hat Schutt gesagt. Er hat Stück gesagt. Er hat Mus gesagt. Er hat Stahl gesagt. Er hat Höhle gesagt. Er hat Fülle gesagt. Er hat Miete gesagt. Er hat Frosch gesagt. Er hat Bett gesagt. Er hat Pappe gesagt. Er hat Donner gesagt. Er hat Mitte gesagt. Er hat Teller gesagt. Er hat gewönne gesagt. Er hat Steg gesagt. Er hat Hüte gesagt. Er hat Hof gesagt. Er hat stählen gesagt. Er hat Bitte gesagt. Er hat spucken gesagt. Er hat Düse gesagt. Er hat schief gesagt.

Figure B.1: Printed version of word list A, as handed out to the speakers

Bibliography

Handbook of the International Phonetic Association. Cambridge University Press, 1999.

Adank, Patti, Roel Smits, and Roeland van Hout. A comparison of vowel normalization procedures for language variation research. Journal of the Acoustical Society of America, 116(5):3099–3107, 2004.

Altenberg, Evelyn P. The judgment, perception, and production of consonant clusters in a second language. IRAL – International Review of Applied Linguistics in Language Teaching, 43:53–80, 2005.

Anderson-Hsieh, Janet, Ruth Johnson, and Kenneth Koehler. The relationship between native speaker judgments of nonnative pronunciation and deviance in segmentals, prosody and syllable structure. Language Learning, 42(4):529–555, 1992.

Asher, James J. and Ramiro García. The optimal age to learn a foreign language. The Modern Language Journal, 53(5):334–341, 1969.

Bakran, Juraj. Zvučna slika hrvatskog govora. Ibis grafika, Zagreb, 1996.

Becker, Thomas. Das Vokalsystem der deutschen Standardsprache. Peter Lang, 1998.

Beddor, Patrice Speeter and Terry L. Gottfried. Methodological issues in cross-language speech perception research with adults. In: Strange (1995).

Best, Catherine T. A direct realist view of cross-language speech perception. In: Strange (1995).
Birdsong, David (editor). Second Language Acquisition and the Critical Period Hypothesis. Mahwah: Lawrence Erlbaum Assoc., 1999.

Birdsong, David and Michelle Molis. On the evidence of maturational constraints in second-language acquisition. Journal of Memory and Language, 44:235–249, 2001.

Bongaerts, Theo. Ultimate attainment in L2 pronunciation: The case of very advanced late L2 learners. In: Birdsong (1999).

Bongaerts, Theo. Introduction: Ultimate attainment and the critical period hypothesis for second language acquisition. IRAL – International Review of Applied Linguistics in Language Teaching, 43:259–267, 2005.

Brennan, Eileen M., Ellen B. Ryan, and William E. Dawson. Scaling of apparent accentedness by magnitude estimation and sensory modality matching. Journal of Psycholinguistic Research, 4(1):27–36, 1975.

Brière, Eugène J. An investigation of phonological interference. Language, 42(4):768–796, 1966.

Chomsky, Noam. The formal nature of language. In: Lenneberg (1967), chapter Appendix A.

Claßen, Kathrin, Grzegorz Dogil, Michael Jessen, Krzysztof Marasek, and Wolfgang Wokurek. Stimmqualität und Wortbetonung im Deutschen. Linguistische Berichte, 174:202–245, 1998.

Clark, John and Colin Yallop. An introduction to phonetics and phonology. Blackwell, second edition, 1995.

Flege, James Emil. A critical period for learning to pronounce foreign languages? Applied Linguistics, 8:162–177, 1987a.

Flege, James Emil. The instrumental study of L2 speech production: Some methodological considerations. Language Learning, 37(2):285–296, 1987b.

Flege, James Emil. Production and perception of a novel, second-language phonetic contrast. Journal of the Acoustical Society of America, 93(3):1589–1608, 1993.

Flege, James Emil. Second language speech learning: Theory, findings and problems. In: Strange (1995).

Flege, James Emil. Age of learning and second language speech. In: Birdsong (1999).

Flege, James Emil and Kathryn L. Fletcher.
Talker and listener effects on degree of perceived foreign accent. Journal of the Acoustical Society of America, 91(1):370–389, 1992.

Flege, James Emil, Elaina M. Frieda, and Takeshi Nozawa. Amount of native-language (L1) use affects the pronunciation of an L2. Journal of Phonetics, 25:169–186, 1997.

Flege, James Emil and James Hillenbrand. Limits on phonetic accuracy in foreign language speech production. Journal of the Acoustical Society of America, 76(3):706–721, 1984.

Flege, James Emil, M.J. Munro, and I. MacKay. Factors affecting strength of perceived foreign accent in a second language. Journal of the Acoustical Society of America, 97(5):3125–3134, 1995.

Flege, James Emil, Grace H. Yeni-Komshian, and Serena Liu. Age constraints on second-language acquisition. Journal of Memory and Language, 41:78–104, 1999.

Guion, Susan G., James Emil Flege, and Jonathan D. Loftin. The effect of L1 use on pronunciation in Quichua-Spanish bilinguals. Journal of Phonetics, 28:27–42, 2000.

Ioup, Georgette. Is there a structural foreign accent? A comparison of syntactic and phonological errors in second language acquisition. Language Learning, 34(2):1–17, 1984.

Iverson, Paul, Patricia K. Kuhl, Reiko Akahane-Yamada, Eugen Diesch, Yoh’ichi Tohkura, Andreas Kettermann, and Claudia Siebert. A perceptual interference account of acquisition difficulties for non-native phonemes. Cognition, 87:B47–B57, 2003.

Jilka, Matthias. The Contribution of Intonation to the Perception of Foreign Accent. Fakultät für Philosophie der Universität Stuttgart, 2000. Doctoral Dissertation.

Kohler, Klaus J. Einführung in die Phonetik des Deutschen. Grundlagen der Germanistik 20. Erich Schmidt Verlag, Berlin, 1977.

Kuhl, Patricia K. and Paul Iverson. Linguistic experience and the “Perceptual Magnet Effect”. In: Strange (1995).

Lenneberg, Eric H. Biological foundations of language. John Wiley & Sons, Inc., 1967.

Levi, Susannah V., Stephen J. Winters, and David B. Pisoni.
Speaker-independent factors affecting the perception of foreign accent in a second language. Journal of the Acoustical Society of America, 121(4):2327–2338, 2007.

Li, Wei. Dimensions of bilingualism. In: Wei Li (editor), The bilingualism reader. Routledge, 2005.

Lobanov, B. M. Classification of Russian vowels spoken by different speakers. Journal of the Acoustical Society of America, 49(2):606–608, 1971.

Long, Michael H. Maturational constraints on language development. Studies in Second Language Acquisition, 12(3):251–281, 1990.

Long, Mike. Problems with supposed counter-evidence to the critical period hypothesis. IRAL – International Review of Applied Linguistics in Language Teaching, 43:287–317, 2005.

Mack, Molly. Consonant and vowel perception and production: Early English-French bilinguals and English monolinguals. Perception & Psychophysics, 46(2):187–200, 1989.

Majewski, Wojciech and Harry Hollien. Formant frequency regions of Polish vowels. Journal of the Acoustical Society of America, 42(5):1031–1037, 1967.

Major, Roy Coleman. Foreign Accent: The ontogeny and phylogeny of second language phonology. Lawrence Erlbaum Associates, 2001.

Meisel, Jürgen M. Principles of universal grammar and strategies of language use: On some similarities and differences between first and second language acquisition. In: Lynn Eubank (editor), Point counterpoint: universal grammar in the second language. John Benjamins Publishing Company, 1991.

Mildner, Vesna and Damir Horga. Relations between second language proficiency and formant-defined vowel space. In: Proceedings of the XIVth International Congress of Phonetic Sciences (ICPhS99), pages 1455–1458. San Francisco, 1999.

Novoa, Loriana, Deborah Fein, and Loraine K. Obler. Talent in foreign languages: A case study. In: Loraine K. Obler and Deborah Fein (editors), The Exceptional Brain: The Neuropsychology of Talent and Special Abilities. The Guilford Press, 1988.

Ortmann, Wolf Dieter (editor).
Lernschwierigkeiten in der deutschen Aussprache. Goethe-Institut, München, 1976. Parts 1-3.

Piske, Thorsten, Ian R. A. MacKay, and James E. Flege. Factors affecting degree of foreign accent in an L2: a review. Journal of Phonetics, 29:191–215, 2001.

Ramers, Karl Heinz. Vokalquantität und -qualität im Deutschen. Linguistische Arbeiten 213. Niemeyer, Tübingen, 1988.

Scovel, Tom. Foreign accents, language acquisition and cerebral dominance. Language Learning, 19:245–253, 1969.

Seliger, Herbert W. and Robert M. Vago. The study of first language attrition: an overview. In: Herbert W. Seliger and Robert M. Vago (editors), First language attrition. Cambridge University Press, 1991.

Selinker, Larry. Interlanguage. IRAL – International Review of Applied Linguistics in Language Teaching, 10(3):209–231, 1972.

Sendlmeier, Walter F. and Julia Seebode. Formantkarten des deutschen Vokalsystems. 2007. URL http://www.kgw.tu-berlin.de/forschung/Formantkarten. [accessed 11th October 2007].

Southwood, M. Helen and James Emil Flege. Scaling foreign accent: direct magnitude estimation versus interval scaling. Clinical Linguistics & Phonetics, 13(5):335–349, 1999.

Strange, Winifred (editor). Speech Perception and Linguistic Experience: Theoretical and Methodological Issues. MD. York Press, 1995.

Tröster-Mutz, Stefan. Die Realisierung von Vokallängen: erlaubt ist, was Sp[a(:)]ß macht? SKY Journal of Linguistics, 17:249–265, 2004. URL http://www.ling.helsinki.fi/sky/julkaisut/SKY2004/Tr%F6ster-Mutz.pdf.

White, Lydia. Universal Grammar and second language acquisition. John Benjamins Publishing Company, 1989.

Wängler, Hans-Heinrich. Atlas deutscher Sprachlaute. Akademie-Verlag, Berlin, 4th edition, 1968.

Wode, Henning. Phonology in L2 acquisition. In: Sascha W. Felix (editor), Second language development: Trends and Issues, pages 123–136. Narr, Tübingen, 1980.

Woods, Anthony, Paul Fletcher, and Arthur Hughes. Statistics in language studies. Cambridge University Press, 1986.
Yamada, Reiko A. Age and acquisition of second language speech sounds: Perception of American English /ɹ/ and /l/ by native speakers of Japanese. In: Strange (1995).