Language Learning
ISSN 0023-8333
Think or Sink: Chinese Learners’
Acquisition of the English Voiceless
Interdental Fricative
D. Victoria Rau
Wheaton College
Hui-Huan Ann Chang
Providence University, Taiwan
Elaine E. Tarone
University of Minnesota
This study investigates the production of the English interdental fricative [θ] by Chinese
learners of English, using a variationist framework. Twenty-seven Chinese participants
were asked to evaluate the acceptability of four possible substitutes for the variable (th)
and to perform four oral production tasks. The results indicated that immediate phonetic
environment and speech style accounted for the accurate production of [θ]. Learners who
had good control over the pronunciation of [θ] used monitoring strategies, whereas those
who had lower production accuracy depended on phonetic salience strategies. Lexical
frequency slightly facilitated accurate production of [θ]. In addition, speakers from both
Taiwan and China rated [s] as the most acceptable substitute for (th), confirming that
they make up a single English-as-a-foreign-language speech community.
Keywords interlanguage variation; voiceless interdental fricative; Chinese English;
VARBRUL; lexical frequency; speech community; markedness; development over time
This study is the result of the project titled “Style, Proficiency, and Attitude in Acquisition of
Phonology by Chinese Learners of English” (NSC92-2411-H-126-002) granted by the National
Science Council in Taiwan to the first author (8/1/2003–7/31/2004). This work is also partially
supported by an NSC grant (41169F), which allowed the first author to spend 9 months as a Visiting
Scholar at the Center for Advanced Research on Language Acquisition (CARLA), University
of Minnesota from August 2003 to May 2004. The preliminary versions were presented at the
AAAL 2004, Portland, Oregon (4/30–5/2/2004), the 22nd International Conference on English
Teaching and Learning in the R.O.C., National Taiwan Normal University, Taipei (6/4–5/2005),
and the NWAV 36, University of Pennsylvania (10/11–14/2007) and an English Language Institute
(ELI) Brown Bag Talk (2/15/2008) while the first author was a visiting scholar at the University
of Michigan from September 2007 to May 2008 during her sabbatical leave from Providence
University.
Correspondence concerning this article should be addressed to D. Victoria Rau, 1313 E. Prairie
Ave., Wheaton, IL 60187. Internet: [email protected]
Language Learning 59:3, September 2009, pp. 581–621
© 2009 Language Learning Research Club, University of Michigan
The voiceless interdental fricative theta (th) in English is known to be acquired
late by English-speaking children (e.g., Clark, 2003; Ingram, 1989). It has a
range of attested variants in the pronunciation of both L1 and L2 speakers.
Among English L1 speakers, the vernacular variants that have been commonly
identified are [t] and [f] (e.g., Dubois & Horvath, 1999; Klopfenstein, 2002;
Wolfram & Schilling-Estes, 1998). In SLA, there may be a variation in the
segment selected to substitute for a target sound (or “differential substitution,”
according to Weinberger, 1994). Among English L2 speakers, the most commonly cited substitution variants for (th) are [t], [s], and [f]. Thai, Russian, and
Hungarian speakers are reported to replace [θ] with [t]; Japanese, Korean, German, and Egyptian Arabic L1 speakers tend to substitute [s] for the target sound
(e.g., Lee & Cho, 2002; Lombardi, 2003), whereas Hong Kong Cantonese L1
speakers prefer [f] (Peust, 1996).
Variation in interlanguage (IL) phonology is an important area of research
in SLA. Conventional analyses of IL phonology, such as Archibald’s (1993)
application of a parametric framework to the study of L2 acquisition of English
stress patterns, do not take variation into account. However, Bayley and Preston
(1996), Preston (1989, 2000, 2002), Tarone (1979, 1983, 2000, 2002, 2007),
and others have demonstrated the central importance of variation studies to the
study of IL and particularly to the phonology of IL. Indeed, Major (2001), in
reviewing work on IL phonology, stated flatly, “any model, theory, or purported
explanation that fails to account for variation is not accounting for the data,
period” (p. 69).
Learners’ pronunciation of English (th) was of early interest in research on
IL phonology. Gatbonton (1978) used the (th) variable to illustrate her gradual
diffusion model of variation; (th) was also the variable studied by Schmidt
(1987) when he showed that learners transfer patterns of native language (NL)
Arabic social variation into their Arabic-English IL. Both studies used a variationist model to demonstrate the complex impact of NL transfer on (th) patterns
in IL phonology. Schmidt explicitly concluded that the variationist approach
he used to analyze (th) in Arabic-English IL was superior to conventional
approaches in predicting and explaining patterns of SLA:
Besides predicting the occurrence and distribution of second language
errors in pronunciation in a more precise manner than conventional
analyses contrasting native and target languages as static systems, the
present investigation may have something to say about the relative
persistence of phonological interference in even advanced second
language learners. The patterning of the Arabic TH-variable has a number
of properties that allow it to be classified as a “stable sociolinguistic
variable” (Labov, 1970) . . . these substitutions are made well below the
level of conscious awareness. (Schmidt, 1987, p. 376)
Explanations for IL variation may focus on “internal” factors such as linguistic constraints (e.g., Hansen, 2001), transfer of L1 variational constraints
(e.g., Schmidt, 1987), the interaction between L1 transfer and universal developmental factors (e.g., Major, 2001), or extralinguistic (“external”) factors
(e.g., Dewaele, 2004). Preston (1989, 2000, 2002) and Fasold and Preston
(2007) have proposed a model that provides a balanced explanatory framework
to account for the influence of both internal and external factors on IL variation.
In this model, sociolinguistic variationists present a model of variable bilingual
competence that relates patterns of linguistic change to both sociocultural contextual forces (“external”) and linguistic contextual forces (“internal”). It is the
most comprehensive model to date, integrating sociolinguistic and psycholinguistic data to account for the way IL variation may impact second language
acquisition (Tarone, 2005, 2007, in press).
Evidence supportive of this model has been provided by the quantitative
variationist approach to SLA, using the VARBRUL program (Paolillo, 2002;
Tagliamonte, 2006; Young & Bayley, 1996).1 VARBRUL is a computer program
that carries out a multiple regression analysis of variables impacting language
use, successfully capturing complex interactions among linguistic, social, situational, psychological, developmental, and universal factors. It enables one to
provide a comprehensive account for the learner’s IL use and development.
A quantitative approach using VARBRUL has been successfully adopted
by many studies on IL variation among Chinese learners of English at various
levels of language. Phonological variables documented in these VARBRUL
analyses include (r) (Chen, 2001), (th) (Chang, 2004, 2009), (aw) (An, 2007),
and (ey) (Chang, 2008); morphosyntactic variables include the plural noun
marker (–s) (Young, 1988, 1989, 1991), past tense marking (Bayley, 1991, 1994,
1996; Chen 2002), and future tense marking (Tsai, 2007); and pragmatic and
discourse variables include articles (Chen, 1998) and codeswitching behavior
between Chinese and English (Huang, 2007).
The aim of the present study is to provide an accurate multivariate account
of variation patterns in the use of the interdental fricative by two different speech
communities of Chinese learners of English, and to document the impact of
this variation on SLA.
Speech Community
In exploring the influence of various linguistic (“internal”) factors on differential substitution patterns for [θ] among speakers of different first languages
(L1s), recent research has targeted the ranking of two pairs of factors, markedness
and faithfulness (Lombardi, 2003; Wester, Gilbers, & Lowie, 2007) and auditory salience and weight (Brannen, 2002), as plausible factors influencing
different L1 speakers’ tendency to select either [s] or [t] as substitutes for [θ].
However, these factors cannot explain cases such as European French and
Quebec French, where the usual substitute for [θ] is [s] in the former and
[t] in the latter (Brannen, 2002; Lombardi, 2003). In such cases we must consider the factor of speech
community, as it is members of a speech community who share any given
dialect or speech variety. As Preston (1989) has argued, second language (L2)
learners from the same L1 background or geographical area can be considered
to form a speech community when they share a norm with regard to a targeted
L2 variable. Clearly, speakers of European French have different norms for [θ]
substitution than speakers of Quebec French, and so they must be members of
different speech communities.
We aim to use a quantitative variationist methodology to demonstrate in
the present study that, unlike European and Quebecois French speakers, Chinese learners of English in both China and Taiwan make up a single speech
community because they share the same norm for preferred substitutions for
[θ].
Variation
One problem with conventional studies on the influence of L1 transfer on IL
phonology is that they tend to assume that each L1 group uses just one fixed
variant categorically to substitute for a given target variant. In fact, however, it
has been reported that Chinese speakers from different backgrounds substitute
different variants for a target English [θ]. Take spelling errors, for example.
Among Cantonese-speaking Chinese children growing up in Canada, (th) in
thick or teeth is predominantly misspelled as <s> or <z>, rather than <f>
(Wang & Geva, 2003). In Weinberger’s (1994) speech samples, it appears that
English L2 speakers from Hong Kong, where Cantonese is spoken predominantly, choose to say [f], as in free and fink,2 whereas those from Mainland
China and Taiwan use the variant [s], as in sree or sink (with [s] palatalized).
Peust (1996) reported the transfer variant for production of English [θ] to be [f]
by Hong Kong Chinese, [t] by Malaysia/Singapore Chinese, but [s] by Chinese
in Taiwan.
The present study will describe the entire envelope (i.e., whole array) of
variation of (th) in order to reveal that learners from the same L1 group in fact
demonstrate allophonic complexity in their speech patterns even when they
self-report consensus on a single preferred substitution.
Frequency
Many authors (e.g., N. Ellis, 2002; Gass & Mackey, 2002) have argued that
frequency plays a role in several areas of SLA, including interactional input
and output and speech processing. Previous research in SLA has predicted that
a high token frequency will strengthen psycholinguistic ties and facilitate accurate perception and production (e.g., Flege, Takagi, & Mann, 1996; Langman
& Bayley, 2002; Trofimovich, Gatbonton, & Segalowitz, 2007). Other research
has supported the notion that linguistic representation is mediated by the frequency with which certain linguistic structures occur in the language (e.g.,
Bybee, 2006; Bybee & Hopper, 2001; Munson, 2001).
In the present study, we examine the impact of lexical (token) frequency on
accurate production of (th).
Markedness
Previous studies have examined the role of markedness in SLA using different definitions. Eckman and Iverson (1993) argued that it was typological
markedness (Hawkins, 1987) rather than sonority distance per se which better explained L2 learners’ knowledge of English clusters in syllable onsets.
Cardoso and John (2006) investigated the effect of markedness and frequency
on the acquisition of C clusters by Brazilian Portuguese learners of English
and found it was markedness of sonority sequencing, not input frequency, that
determined the order of acquisition of C clusters in L2 speech. Hansen (2001)
found natural phonological processes to carry more weight than L1 transfer,
markedness, and sonority in accounting for the acquisition of English codas by
Chinese learners.
In the present study, we follow Eckman’s (1977) definition of typological markedness and assume environments themselves can be in markedness
relationships (Carlisle, 1994). We use this concept to explain why certain phenomena are acquired before others.
Development Over Time
Development over time refers to the trajectory of SLA. One of the built-in
strengths of Preston’s model is its capacity to predict how complex constraints
change across time. Change over time is an “external” variable in Preston’s
model. Although Hansen (2001) collected data on Chinese learners’ acquisition of English codas over the duration of 6 months, she found either very few
changes in production accuracy or a “puzzling” U-shaped curve of development
that was difficult to interpret. It is not easy to observe phonological changes in
young adult L2 learners with intermediate or advanced L2 proficiency levels.
Although VARBRUL analyses of IL variation in morphosyntax usually compare groups with different proficiency levels to infer the factors that determine
L2 learners’ development over time (e.g., Young, 1989), it may not be feasible
to do this for phonological development, as accent may be the last to change,
compared with other linguistic levels (Van Coetsem, 1988).
In the present study, we will resort to a measurement of accurate production of (th) to compare development of (th) over time among three groups of
learners.
Research Questions
The research questions of this study are the following:
1. What factors affect the accuracy of [θ] production by Chinese learners?
Can a constraint hierarchy be identified, and if so, which internal and
external factor groups impose the greatest effect on variation?
2. To what extent can a variationist model account for this variation? Does
lexical frequency play a role in the production of [θ]?
3. What development patterns can be inferred from the data? Are groups
with overall higher accurate production of [θ] constrained by the same or
different factors as groups with overall lower accurate production of [θ]?
4. What are the participants’ self-reported preferences for substitutions for
[θ]? Is the favored substitute [s], as reported by Weinberger (1994) for
English learners in China and Taiwan? Do the self-reported substitution
norms for [θ] suggest that learners of English from China and from Taiwan
constitute a single speech community? What is the learner’s repertoire of
variation in producing [θ]?
Method
The present study used VARBRUL3 to analyze variable patterns of production
of the voiceless interdental fricative (th) by Chinese learners of English from
Taiwan and from China. We aimed to uncover the combination of factors that
best accounts for accurate production of (th), and to determine whether lexical
frequency effects play a role in the process, and how internal and external
factors change as a learner’s overall accuracy increases. In addition, we asked
participants to report their own preferred substitutes for (th), and compared
these responses to their productions of (th).
Participants
The participants consist of 27 Chinese English speakers from two samples
(China group n = 11, Taiwan group n = 16). The China sample, mostly male
graduate students, had higher English proficiency with a more diverse disciplinary background than did the Taiwan sample, whose members were all undergraduate
English majors and predominantly female. The two samples were first studied
separately by the authors to secure detailed demographic background about
these participants and then combined to ensure a wider representation of Chinese learners of English, who tend to speak various regional dialects in addition
to Mandarin, the lingua franca.
The China Sample
The China sample consists of 11 Chinese foreign students4 (8 males and 3
females) from Mainland China, studying at the University of Minnesota (as
shown in Table 1). Nine were graduate students in engineering and science,
recruited with the help of the Center for Teaching and Learning Services at the
University of Minnesota, where the students had received instruction to improve
their English language and teaching skills as international teaching assistants.
Two additional participants, an undergraduate and a postdoctoral student, were
recruited as friends of friends. The participants’ ages ranged from 20 to 35, with
a mode of 26. Their length of residence (LOR) in the United States ranged from
3 months to 5 years, with a mode of 1.5 years. The average age of onset (AO)
for English learning was age 12. After they indicated an interest in participating
in the study, the participants were contacted by e-mail to obtain their informed
consent to be interviewed.
Table 1 Participants’ background information of the China sample
No.  Gender  Age  LOR         Origin, dialect                Education              AO
1    M       25   3 months    Sichuan, Mandarin              Grad Engineering       13
2    F       28   3.5 years   Shaanxi, Mandarin              Grad Engineering       12
3    M       26   1.5 years   Zhejiang, Mandarin             Grad Engineering       13
4    F       26   1.5 years   Jiangsu, Mandarin              Grad Science           10
5    F       33   8 months    Hebei, Mandarin                Postdoc Science        12
6    M       20   2 years     Jiangsu, Mandarin; Cantonese   Undergrad Engineering  11
7    M       35   4.5 years   Hebei, Mandarin                Grad Engineering       12
8    M       28   2.5 years   Henan, Mandarin                Grad Soc. science      13
9    M       26   1.5 years   Inner Mongolia, Mandarin       Grad Engineering       13
10   M       26   1 year      Hubei, Mandarin                Grad Science           12
11   M       32   5 years     Hunan, Mandarin                Grad Engineering       12
Note. LOR = length of residence in the United States; AO = age of onset for English learning.
The Taiwan Sample
The Taiwan sample (Chang, 2004) consists of 16 college undergraduate English
majors at Providence University (3 males, 13 females). See Table 2. They were
recruited as friends of friends to fill four categories from the first year to the
fourth year, with four students in each. They shared the same L1, Mandarin,
with various levels of proficiency in the Southern Min dialects. All participants
had begun learning English at the age of 11 or 12. Compared with the China
sample, the Taiwan sample had a lower level of English proficiency.
Data Elicitation
All demographic information, including age, gender, Chinese dialects spoken,5
LOR in an English-speaking country, field of study, AO of English learning,
and reported paper-based TOEFL score,6 was gathered in Part 1: Warm-up
interview (Appendix D). Speech samples were elicited in a sociolinguistic
interview7 containing the following four tasks: story reading (Appendix A),
story retelling (Appendix B),8 word list reading (Appendix C), and an interview
(protocol in Appendix D),9 which lasted 45 min on average. Part 3 of the
interview protocol, containing questions on topics such as recalling memorable
people, dreams, dangerous events, or childhood games, aims to elicit a more
casual style of speech than the formal interview. During this part, the interviewer
always tried to find topics that the participant would be very willing to talk about,
in addition to those identified in advance. If the participants found the questions
on dangerous events or dreams too personal, the interviewer would switch to
other topics, such as ghost stories, supernatural experiences, or hypothetical
questions, such as “if I could live my life all over again. . .” to elicit a casual
speech style.
Table 2 Participants’ background information of the Taiwan sample
No.  Gender  Age of onset for learning English  Dialect background       Education
1    F       11                                 Mandarin                 1st-year undergrad
2    F       12                                 Mandarin                 1st-year undergrad
3    F       11                                 Mandarin                 1st-year undergrad
4    F       11                                 Mandarin                 1st-year undergrad
5    F       11                                 Mandarin                 2nd-year undergrad
6    F       12                                 Mandarin                 2nd-year undergrad
7    F       12                                 Mandarin                 2nd-year undergrad
8    F       12                                 Mandarin                 2nd-year undergrad
9    M       11                                 Mandarin                 3rd-year undergrad
10   M       12                                 Southern Min; Mandarin   3rd-year undergrad
11   F       12                                 Mandarin                 3rd-year undergrad
12   M       12                                 Southern Min; Mandarin   3rd-year undergrad
13   F       11                                 Mandarin                 4th-year undergrad
14   F       11                                 Mandarin                 4th-year undergrad
15   F       12                                 Southern Min; Mandarin   4th-year undergrad
16   F       12                                 Mandarin                 4th-year undergrad
Participants’ attitudes regarding the most acceptable substitute for [θ] were
elicited in Part 2: Getting information (Appendix D). As shown in Appendix
D, the authors10 orally modeled four possible substitutes ([s], [t], [f], [ʃ]) for
English [θ], and participants were asked to evaluate each alternant for (th)
individually on a 7-point Likert scale before putting all four in an acceptability
rating and deciding on the most acceptable substitute (i.e., the one closest to
the standard [θ]11).
Coding
All of the interviews were tape-recorded and transcribed in English orthography.
The tokens for the variable (th) were coded according to the factors of the
dependent and independent variables for the GOLDVARB 2001 program.12
The dependent variable was given two levels: accurate production and
inaccurate production of [θ]. All variants other than [θ], such as [s], [ʃ], [t], [f],
[d], [z] or zero, were coded as inaccurate. The independent variables consist
of five internal and two external factor groups with their respective factors
(Appendix E). The five internal factor groups comprise (a) vowel following an
onset (th), (b) vowel preceding a coda (th), (c) consonant preceding a coda (th),
(d) lexical token frequency, and (e) phonetic features (regrouping of immediate
phonetic environment). The two external factor groups contain speech style
and development over time. Word list and story reading are classified as formal
style, whereas story-retelling and conversations in interviews are classified as
informal or spontaneous style. Development over time is discussed later.
To find the combination of factor groups that best account for the variation
of (th), we examined the preceding and following segments co-occurring in the
immediate environment of (th) and speech style.
To test for lexical frequency effects, we divided the 67 lexemes (see
Appendix F) with (th) that occurred in the informal style in our dataset into
three levels: over 400 times, between 400 and 100 times, and below 100 times.
This grouping was based on two principles: (a) there is a natural category
(>400) emerging in the internal corpus in which the lexeme “think” is the
only member and (b) the top six lexemes with a token frequency above 100
all constitute frequent words (2,199–7,829 tokens) in the MICASE corpora
(http://quod.lib.umich.edu/m/micase/) in academic spoken English, except for
the word “third” (due to the influence of the story-retelling on “The Three Little
Pigs”). Thus, we had external evidence from an independent source to support
our classification of word frequency into three levels.
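The three-level grouping described above can be sketched as a simple binning function. This is only an illustration: the level names ("high", "mid", "low") are hypothetical labels, and because the text does not say how counts of exactly 100 or 400 were treated, the sketch assumes both fall in the middle level.

```python
def frequency_level(token_count: int) -> str:
    """Assign a lexeme to one of three frequency levels by its token count.

    Thresholds follow the text (over 400, between 400 and 100, below 100).
    Boundary handling (exactly 100 or 400 -> middle level) is an assumption.
    """
    if token_count < 0:
        raise ValueError("token_count must be non-negative")
    if token_count > 400:
        return "high"   # in the study's corpus, only the lexeme "think"
    if token_count >= 100:
        return "mid"
    return "low"
```

For example, a lexeme with 450 tokens in the internal corpus would be placed in the highest level, and one with 12 tokens in the lowest.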
To test whether lexical frequency and phonetic features account for accurate
production of (th), we regrouped all the words according to the following
features: front vowel, thr clusters, rhotacized vowel, complex coda (with n, r, l,
or f), back vowel, and diphthong.
To test whether these constraints change as the learners’ production accuracy increases, we treated accurate production as an independent variable
(i.e., development over time) in one of the VARBRUL runs and classified the
individuals into three groups based on their (th) production accuracy to test
if the groups demonstrate the same or different variation patterns. It might
appear that transforming production accuracy into an independent variable creates circularity, but, in fact, this is not a problem because our goal was to
trace and compare the constraint hierarchies of the three groups. The other
independent variables consist of phonetic features and speech styles. The percentages of accurate production of the target variant [θ] were calculated for each
participant by dividing the number of accurately pronounced tokens of [θ] by
all the tokens with [θ] produced from their individual speech samples and
multiplying by 100. Different rankings of the constraints of phonetic features
and speech style would reflect different strategies adopted by the groups in
pronunciation of (th).
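The accuracy percentage described above, and the sorting of individuals into three accuracy groups, might be computed along the following lines. This is a sketch under stated assumptions: the cutoffs of 60% and 80% and the group labels are hypothetical placeholders, since the passage does not report the boundaries actually used.

```python
def accuracy_percent(accurate_tokens: int, total_tokens: int) -> float:
    """Percentage of (th) tokens realized as the target variant."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return accurate_tokens / total_tokens * 100


def accuracy_group(percent: float,
                   low_cutoff: float = 60.0,
                   high_cutoff: float = 80.0) -> str:
    """Place a participant in one of three accuracy groups.

    The cutoffs are illustrative assumptions, not values from the study.
    """
    if percent >= high_cutoff:
        return "high"
    if percent >= low_cutoff:
        return "mid"
    return "low"


# Whole-sample counts reported in Table 3: 3,194 accurate out of 4,386 tokens.
overall = accuracy_percent(3194, 4386)
```

Applied to the whole-sample counts, the function gives roughly 73%, which agrees with the total reported in Table 3.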
Finally, participants’ preferred substitutions for (th) were tallied from the
questionnaire and placed in an implicational scale to determine their rank order.
Similar rank orders of these substitute forms would support the conclusion that
Chinese learners of English in both China and Taiwan constitute a single speech
community.
All 4,386 tokens for the variable (th) were entered for VARBRUL analyses.
The coding for one of the participants was checked by two research assistants
to reach consensus before the rest of the data for the China sample were coded.
In addition, some questionable tokens of voiceless interdental fricative [θ] and
other variants for the Taiwan sample were analyzed with Praat 4.1.18 (Boersma
& Weenink, 2003) as a spot check for coding reliability.
What Was Not Coded
The word “enthuse” in the word list (Appendix C) was not coded because most
of the participants did not know how to pronounce it. The word “sixth” occurred
too infrequently and hence was not coded to avoid a knockout effect.13
Many studies exclude lexical exceptions (e.g., “and” is usually excluded
from studies of consonant cluster reduction) because it has been convincingly
demonstrated that lexical exceptions have different representations (Guy, 2007).
However, the most frequently occurring tokens, such as “think” or “with,” were
not excluded from our study because they did not undergo any sound reduction
in L2 and, hence, should not be considered lexical exceptions.
Neither did we limit the number of tokens of a single lexical item per
speaker, as have many studies of linguistic variation. We believe it was necessary
to get a good picture of the internally generated corpus for this study. It might
be feasible for a future study to consider this option of limiting the number of
tokens of a single lexical item per speaker.
Results and Discussion
Patterns of Phonological Variation of (th) Among Chinese Speakers
The following section provides results in response to the research question
about the best combination of factors that accounts for the accuracy of [θ]
production by the Chinese learners in the study.
Table 3 Significant factors that affect accurate production of (th)
Factor group                                      VARBRUL weight (Pi)  Tokens correct/total  Percentage
Vowel following an onset (th)
  Low front vowel                                 0.59                 59/71                 83%
  Mid/rhotacized vowel                            0.55                 213/280               76%
  High front vowel                                0.54                 1,179/1,568           75%
  Back mid round vowel                            0.53                 53/67                 79%
  Low back vowel                                  0.46                 57/74                 77%
  High mid vowel after consonant cluster thr      0.45                 93/120                78%
  Back vowel after consonant cluster thr          0.43                 93/129                72%
  High front vowel after consonant cluster thr    0.31                 179/330               54%
  Diphthong /au/                                  0.19                 3/10                  30%
  Range                                           0.40
Vowel preceding a coda (th)
  High front vowel                                0.58                 352/461               76%
  Back mid round vowel                            0.57                 55/70                 79%
  Mid front vowel                                 0.55                 86/109                79%
  Low front vowel                                 0.48                 80/109                73%
  High round vowels                               0.46                 148/207               71%
  Diphthong /au/                                  0.45                 79/112                71%
  Rhotacized vowel                                0.38                 169/263               64%
  Range                                           0.20
Consonant preceding a coda (th)
  /l/                                             0.61                 64/80                 80%
  /f/                                             0.57                 4/6                   67%
  /n/                                             0.52                 149/200               75%
  /r/                                             0.39                 81/122                66%
  Range                                           0.22
Speech style
  Word list                                       0.63                 1,020/1,246           82%
  Passage reading                                 0.54                 1,011/1,325           76%
  Informal speech style                           0.38                 1,163/1,815           64%
  Range                                           0.25
Total                                                                  3,194/4,386           73%
Input 0.717
Total chi-square = 32.2020
Chi-square/cell = 0.5279
Log likelihood = −2448.402
Maximum possible likelihood = −2431.676
Limited chi-square = 36.19 (df = 19, p = .01)
Table 3 displays the factor groups that significantly affected accurate production of (th) based on the result of a step-up/step-down analysis. In a
VARBRUL analysis, a factor with a probability weight above .50 favors the use
of the variant that we have selected as the application value (in this case, accurate pronunciation of (th)) relative to the input value (also called the corrected
mean). In other words, a weight above .50 promotes the accurate production of
(th), whereas a weight below .50 inhibits it. A weight of .50 has no influence
on the accurate production of (th). The farther above or below .50 a probability weight is, the greater its degree of positive or negative influence is on the
accuracy of (th).14 All of the factor weights are ranked from the highest to the
lowest. The range is the difference between the highest probability weight and
the lowest in each factor group. The higher the range is, the more important the
factor group is. The total chi-square has a value of 32.20 (df = 19, p < .01),
which indicates that the factors and factor groups are independent, an important
assumption in VARBRUL analysis. Our participants’ accurate production of
(th) was averaged at 72% (input probability = 0.717).
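To make the interpretation of factor weights concrete, the sketch below combines an input probability with factor weights using the logistic (log-odds) formulation commonly given for VARBRUL-type models, in which the log-odds of the application value equal the log-odds of the input plus the log-odds of each applicable factor weight. That this matches the exact computation performed by GOLDVARB is an assumption; the function names are illustrative.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def predicted_probability(input_prob: float, weights: list[float]) -> float:
    """Combine the input probability with one weight per applicable factor group.

    Assumes the standard logistic combination: a weight of .50 contributes
    zero log-odds, so it leaves the input probability unchanged.
    """
    log_odds = logit(input_prob) + sum(logit(w) for w in weights)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Onset (th) before a low front vowel (.59) in word-list style (.63),
# with the input probability of .717 reported in Table 3.
favoring = predicted_probability(0.717, [0.59, 0.63])

# Onset (th) before the diphthong (.19) in informal style (.38).
disfavoring = predicted_probability(0.717, [0.19, 0.38])
```

Under these assumptions, the first context yields a predicted accuracy well above the input probability and the second one well below it, which is the sense in which weights above and below .50 promote or inhibit accurate production.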
The significant factor groups selected by VARBRUL to predict the variation
patterns of (th) in Chinese English include the immediate phonetic environment
and speech style. In general, the factor groups facilitating accurate production
of [θ] demonstrate a constraint hierarchy, with the segments following the onset
(th) imposing the greatest influence (range = .40), followed by speech style
(range = .25), and segments preceding the coda (th) (with the ranges at .22 for
consonants and .20 for vowels, respectively).
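The ranking of factor groups by range can be reproduced from the weights in Table 3. The short sketch below computes each range as the highest weight minus the lowest and sorts the groups accordingly; the dictionary layout and function name are illustrative choices, and rounding to two decimals is used only to avoid floating-point noise.

```python
# Factor weights as reported in Table 3, grouped by factor group.
TABLE3_WEIGHTS = {
    "Vowel following an onset (th)": [0.59, 0.55, 0.54, 0.53, 0.46,
                                      0.45, 0.43, 0.31, 0.19],
    "Vowel preceding a coda (th)": [0.58, 0.57, 0.55, 0.48, 0.46, 0.45, 0.38],
    "Consonant preceding a coda (th)": [0.61, 0.57, 0.52, 0.39],
    "Speech style": [0.63, 0.54, 0.38],
}


def factor_range(weights: list[float]) -> float:
    """Range of a factor group: highest weight minus lowest weight."""
    return round(max(weights) - min(weights), 2)


ranges = {group: factor_range(ws) for group, ws in TABLE3_WEIGHTS.items()}

# Groups ordered from the largest range (strongest constraint) to the smallest.
hierarchy = sorted(ranges, key=ranges.get, reverse=True)

for group in hierarchy:
    print(f"{group}: range = {ranges[group]:.2f}")
```

The resulting order (onset environment, speech style, coda consonant, coda vowel) matches the constraint hierarchy described in the text.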
The vowels after the onset (th) that promote the accurate production of (th)
from the highest probability weights to the lowest include: low front vowel /æ/
(.59) (e.g., thank), mid or rhotacized vowel /ə, ɜ/ (.55) (e.g., third), high front
vowel /ɪ/ (.54) (e.g., think) and /i/ (e.g., wealthy), as well as round vowel /ɔ/
(.53) (e.g., thought). On the other hand, the environments that inhibit accurate
production of (th) include low back vowel /ʌ/ (.46) (e.g., thunder), diphthong
/aw/ (.19) (e.g., thousand), and thr cluster (.31, .45, .43) (e.g., three, threaten,
through, throw).
The vowels before the coda (th) that promote accurate production of (th)
include high front vowel /ɪ/ (.58) (e.g., with) and /i/ (e.g., teeth), back mid round
vowel /ɔ/ (.57) (e.g., moth), and mid front vowel /ε/ (.55) (e.g., breath). The
vowels that inhibit accurate production of (th) include low front vowel /æ/ (.48)
(e.g., math), high round vowel /u/ (.46) (e.g., youth, truth), diphthong /aw/ (.45)
(e.g., mouth), and mid/rhotacized vowel /ə/ (.38) (e.g., earth).
The segments preceding the (th) in the coda position, except for /r/ (.39)
as in north, all favor the accurate production of (th); these include /l/ (.61) in
wealth, /f/ (.57) in fifth, and /n, ŋ/ (.52) in tenth and strength.
Speech styles follow a pattern similar to that reported in the literature: word
list reading (.63) and, to a lesser extent, passage reading (.54) favor accurate
production of [θ], whereas informal speech style (.38), combining interview
and story-retelling styles, disfavors accurate production of [θ].
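The constraint hierarchy reported above can be recovered from the factor weights themselves, since the range of a factor group is its highest weight minus its lowest. The following is a minimal sketch that uses only the weights quoted in this section; the group labels and the helper name `factor_range` are illustrative and are not part of VARBRUL.

```python
# Factor weights as quoted in the text; the group labels are illustrative.
weights = {
    "following segment (onset)": [0.59, 0.55, 0.54, 0.53, 0.46, 0.19, 0.31, 0.45, 0.43],
    "speech style": [0.63, 0.54, 0.38],
    "preceding consonant (coda)": [0.39, 0.61, 0.57, 0.52],
    "preceding vowel (coda)": [0.58, 0.57, 0.55, 0.48, 0.46, 0.45, 0.38],
}


def factor_range(ws):
    """Range of a factor group: highest weight minus lowest weight."""
    return round(max(ws) - min(ws), 2)


ranges = {group: factor_range(ws) for group, ws in weights.items()}

# Order the factor groups from the strongest to the weakest constraint.
hierarchy = sorted(ranges, key=ranges.get, reverse=True)

for group in hierarchy:
    print(f"{group}: range = {ranges[group]:.2f}")
```

Sorting by range reproduces the ordering given in the text: following segment, then speech style, then the preceding consonant and vowel in the coda.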
These results suggest that accurate production of English [θ] by Chinese
learners is consistent with the markedness (i.e., implicational typology)
principle. The onset position is more facilitative than the coda, with the range of
factor weights of the former higher than that of the latter (.40 vs. .20 or .22).
The complex onset or coda with /r/ (i.e., #thr or rth#) inhibits the accurate
production of [θ]. A single nucleus vowel is more favorable than the diphthong
/aw/. Other more specific phonetic environments facilitative of accurate
pronunciation of (th), such as high and mid vowels or alveolar consonants, can be
explained by ease of articulation, because they are closer to the interdental position.
Learners’ sensitivity to stylistic shifting in our results accords with findings
in many previous sociolinguistic studies, as discussed in Tarone (2007, 2009).
The fact that story-retelling could be regrouped with interviews due to their
inhibiting factor weights indicates that story-retelling might be a suitable task
to replace interviews for eliciting “informal” speech in future research designs.
Frequencies of Accurate Production of (th)
This section presents answers to our research question asking how token frequencies affect the production of (th) and whether high token frequencies can
account for the use of standard variants for (th).
To test the influence of token frequencies and types of phonetic environments
on the accurate production of (th), all tokens in spontaneous speech
(story-retelling and interviews) were considered. To produce the most
parsimonious model (see Note 15), the factors of token frequency were regrouped
from three into two, with a word frequency of 400 (i.e., the lexeme think) in the whole
database as the threshold. The words occurring in the spontaneous speech style
were regrouped according to three types of phonetic environments: (a) front
vowel, (b) back vowel, rhotacized vowel, and complex coda (with n, r, l, or f),
and (c) thr clusters and diphthong.
The VARBRUL results, presented in Table 4, show that both phonetic features and token frequency had a significant impact on variation, with phonetic
features ranked as a higher constraint than token frequency (range = .20 vs.
.07). The lexeme “think” with its inflected forms (e.g., “thinks” and “thinking”) was the most frequent word, occurring over 400 times in the spontaneous
speech dataset and favoring the accurate production of (th), with a probability
weight of .55. Furthermore, the (th) sound with a preceding or following front
vowel (e.g., “think,” “something,” “with”) was the only factor that promoted
the accurate production of (th) (probability weight = .54). The input weight
(.64) or the corrected mean of accurate production of (th) (64%) in Table 4,
based on the 1,815 tokens of informal style, is naturally lower than that in
Table 3 (.72 or 72%), which includes the formal style.

Table 4 Token frequency and phonetic features on accurate production of (th)

Factor group                            Weight   Tokens correct/total   Percentage
Token frequency
  Word token frequency ≥ 400 times      0.55     351/484                72%
  Word token frequency < 400 times      0.48     812/1,331              61%
  Range                                 0.07
Phonetic features
  Front vowels                          0.54     788/1,142              69%
  Back vowels, complex codas,
    and rhotacized vowels               0.49     261/424                61%
  thr clusters and diphthongs           0.34     114/249                45%
  Range                                 0.20
Total                                            1,163/1,815            64%

Input = 0.644
Total chi-square = 0.0021
Chi-square/cell = 0.0005
Log likelihood = −1158.723
Limited chi-square = 7.82 (df = 3, p = .05)
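In the logistic model that VARBRUL implements, the input probability and the factor weights combine on the odds scale: the odds of the application value are the input odds multiplied by the odds of each applicable weight, so a weight of .50 leaves the prediction unchanged. The sketch below illustrates this with the Table 4 values; the function name `varbrul_probability` is hypothetical, and the two contexts are chosen only as examples.

```python
def varbrul_probability(input_prob, factor_weights):
    """Combine an input probability with factor weights on the odds scale."""
    odds = input_prob / (1 - input_prob)
    for w in factor_weights:
        odds *= w / (1 - w)
    return odds / (1 + odds)


# Table 4: input = 0.644; token frequency >= 400 (0.55); front vowels (0.54)
p_favoring = varbrul_probability(0.644, [0.55, 0.54])

# Table 4: token frequency < 400 (0.48); thr clusters and diphthongs (0.34)
p_disfavoring = varbrul_probability(0.644, [0.48, 0.34])

print(f"favoring context:    {p_favoring:.2f}")
print(f"disfavoring context: {p_disfavoring:.2f}")
```

Under these assumptions, a high-frequency word with a front vowel is predicted to be produced accurately in roughly seven cases out of ten, whereas a lower-frequency word with a thr cluster or diphthong falls below one in two.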
The line graph in Figure 1 is very revealing of the relationships just described. The front vowel type occurred most frequently in our data and also had
the highest production accuracy.
In summary, results indicate that both phonetic features and lexical frequency (token frequency) can be combined to account for accurate production
of (th) in spontaneous speech style. High token frequency containing the variable (th) favored the accurate production of (th). The most frequent word
“think,” along with other words containing front vowels, promoted the accurate production of (th). As predicted by research in SLA (Ellis, 2002; Flege
et al., 1996; Langman & Bayley, 2002; Trofimovich et al., 2007), high token
frequency strengthened psycholinguistic ties and facilitated accurate perception
and production. Because “think” was the only lexeme in the 400+ group (see
Note 16) and the lexical frequency effect had a very small range (.07), our
conclusion on lexical frequency effects should remain tentative.

Figure 1 Relationship between frequencies and accuracy of (th). [Line graph.
Vertical axis: total frequency (0–900). Horizontal axis: accurate rate of (th) by
phonetic feature for informal style (30%–80%). Plotted categories: front V,
back V, rhotacized V, consonant, thr cluster, diphthong.]
So far we have only considered the phonological variation pattern of (th)
that characterized the entire group of Chinese speakers of English. However,
a detailed analysis of the different constraints and accuracy rates shown in
different participant subgroups will be very illuminating in inferring strategy
modifications over time in the acquisition of (th).
Variation Patterns for (th) in Different Accuracy Groups
What developmental patterns can be inferred from our data? Are groups with
three levels of accurate production of [θ] constrained by the same or different
factors? The three groups of learners divided by the percentage of accurate
production of (th) (high, mid, and low) were compared with phonetic features
and speech styles as independent variables to determine whether the constraint
hierarchy for each group is the same or different.
Table 5 shows that for the low and mid groups with an average accuracy
of 54% and 78%, respectively, both phonetic features and speech styles were
significant factors affecting learners’ accurate production of (th), whereas for
the high group with an average accuracy of 91%, only speech style had a great
impact, with phonetic features no longer significant. The low and mid groups
were only sensitive to front vowels (with weights of .57 and .59, respectively),
but the high group also improved its production of (th) in many of the phonetic
environments that were considered difficult for the low and mid groups (i.e.,
back vowel, rhotacized vowel, and complex coda [with n, r, l, or f], with a
probability weight of .53). The two most challenging environments for the
production of (th) for the Chinese learners turned out to be diphthongs and
“thr” clusters, because even the high group had trouble with these environments,
with an inhibiting probability weight of .43.

Table 5 Accuracy of (th) production in Chinese-English interlanguage: VARBRUL weights

Low group (accuracy < 70%), n = 9
Factor group                             Weight   Tokens correct/total   %
Phonetic features
  Front vowel                            0.57     424/714                59%
  Back vowel, rhotacized vowel,
    complex coda (with n, r, l, or f)    0.48     236/451                52%
  Diphthong, thr clusters                0.35     91/238                 38%
  Range                                  0.22
Speech style
  Word list                              0.55     243/416                58%
  Passage reading                        0.54     251/439                57%
  Informal speech style                  0.42     257/548                47%
  Range                                  0.13
Total                                             751/1,403              54%
Input = 0.54
Total chi-square = 11.2743
Chi-square/cell = 1.2527
Log likelihood = −942.382
Limited chi-square = 13.28 (df = 4; p = .01)

Mid group (accuracy ≥ 70%, < 80%), n = 8
Factor group                             Weight   Tokens correct/total   %
Phonetic features
  Front vowel                            0.59     608/774                79%
  Back vowel, rhotacized vowel,
    complex coda (with n, r, l, or f)    0.41     293/404                73%
  Diphthong, thr clusters                0.35     134/204                66%
  Range                                  0.24
Speech style
  Word list                              0.74     332/371                89%
  Passage reading                        0.55     309/392                79%
  Informal speech style                  0.32     394/619                64%
  Range                                  0.42
Total                                             1,035/1,382            75%
Input = 0.78
Total chi-square = 5.9116
Chi-square/cell = 0.6568
Log likelihood = −713.869
Limited chi-square = 9.49 (df = 4; p = .05)

High group (accuracy ≥ 80%), n = 10
Factor group                             Weight   Tokens correct/total   %
Phonetic features
  Front vowel                            [0.51]   722/828                87%
  Back vowel, rhotacized vowel,
    complex coda (with n, r, l, or f)    [0.53]   464/513                90%
  Diphthong, thr clusters                [0.43]   222/260                85%
Speech style
  Word list                              0.77     445/459                97%
  Passage reading                        0.52     451/494                91%
  Informal speech style                  0.28     512/648                79%
  Range                                  0.49
Total                                             1,408/1,601            88%
Input = 0.91
Total chi-square = 0.0001
Chi-square/cell = 0.0000
Log likelihood = −541.631
Limited chi-square = 5.99 (df = 2; p = .05)
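The “limited chi-square” values reported beneath Tables 4 and 5 appear to be chi-square critical values for the stated degrees of freedom and significance level, against which the total chi-square of each model can be compared. The sketch below checks that reading under this assumption; `chi2_sf` is a hypothetical helper that uses the closed-form upper-tail probabilities for 2, 3, and 4 degrees of freedom, and the model labels are illustrative.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate (closed forms, df = 2, 3, 4)."""
    half = x / 2
    if df == 2:
        return math.exp(-half)
    if df == 3:
        return math.erfc(math.sqrt(half)) + math.sqrt(2 * x / math.pi) * math.exp(-half)
    if df == 4:
        return math.exp(-half) * (1 + half)
    raise ValueError("closed form implemented only for df = 2, 3, 4")


# (model, total chi-square, limited chi-square, df, p) as reported in Tables 4 and 5
models = [
    ("Table 4", 0.0021, 7.82, 3, 0.05),
    ("Table 5, low group", 11.2743, 13.28, 4, 0.01),
    ("Table 5, mid group", 5.9116, 9.49, 4, 0.05),
    ("Table 5, high group", 0.0001, 5.99, 2, 0.05),
]

for name, total, limit, df, p in models:
    tail = chi2_sf(limit, df)
    print(f"{name}: P(chi-square > {limit}) = {tail:.3f} (reported p = {p}); "
          f"total below limit: {total < limit}")
```

On this reading, every model's total chi-square falls below its limit, which would indicate an acceptable fit at the stated level.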
The constraint hierarchies operating in the three groups present different
patterns. The low group’s production of (th) was affected by phonetic features
more than speech style (range = .22 vs. .13), whereas the mid group responded
to speech style much more than phonetic features (range = .42 vs. .24). The
high group’s production of (th) was no longer affected by phonetic features at
all but was still greatly sensitive to speech style (range = .49).
VARBRUL analysis enabled us to detect fine-tuned differences among
high-, mid-, and low-accuracy groups in the factors influencing their accurate
production of (th). Based on the probability weights and constraint hierarchy, we
can see that learners who had a good control of the pronunciation of (th) (over
91% accuracy) mastered the more difficult phonetic environments and, hence,
increased their accuracy rate. The low and mid groups, on the other hand, relied
more on phonetic salience strategies in that vowel quality facilitated accurate
production. Although the low group was responsive to speech style, it was
not as important as phonetic salience in its influence on (th) accuracy. The
mid and high groups, however, were very sensitive to style shift, with a clear
distinction among the three styles: a closer approximation to native-speaker
competence.
So far, we have shown the variation patterns and strategies used by different
groups in accurate production of (th). Now, we will turn to the statements
about preferred substitutions for (th) that were made by learners in China and
Taiwan.
Repertoires of Variants for (th)
What are the participants’ self-reported preferences for the segment used to
substitute for [θ]? Is it [s], as reported by Weinberger (1994) for English
learners in China and Taiwan? Does the participants’ self-reported preference
for segment substitution for (th) match their actual production, and what is the
repertoire of variation in their actual production of [θ]? We will examine the
results from the China sample and the Taiwan sample separately.
Table 6 Ranking and rating of the substitutes for [θ] for the China group

Participant   LOR         Place of origin in China   [θ] variants in speech
#1            3 months    Sichuan                    [s, ]
#2            3.5 years   Shannxi                    [s]
#3            1.5 years   Zhejiang                   [s]
#11           5 years     Hunan                      [s]
#10           1 year      Hubei                      [s]
#7            2 years     Hebei                      [s]
#9            1.5 years   Inner Mongolia             [s]
#4            1.5 years   Jiangsu                    [s]
#6            2 years     Jiangsu                    [s]
#8            2.5 years   Henan                      [s]
#5            8 months    Hebei                      [s, t]

Most acceptable substitutes
(where 1 = perfectly acceptable &
7 = completely unacceptable)
Ratings of the four
s
5
s
2
s
1
s
1
s
6
s
4
s
2
s
2
s
4
4
t
1
possible substitute variants
t
7
7
t
6
7
t
6
6
t
5
6
t
7
7
f
6
6
f
6
7
f
t
7
7
f
t
6
6
s
f
6
6
s
f
2
1
f
7
f
7
f
7
f
7
f
7
t
7
t
7
7
7
t
7
2
The China Sample
Among the 1,962 tokens with (th) produced by the China group, only 619
tokens (32%) were coded as inaccurate. Nine out of the 11 China participants
ranked [s] as the most acceptable substitute for [θ]. All of the participants from
Mainland China, in their oral production, used [s] as the primary substitute for
[θ], including the two who, in their evaluation of acceptable substitutes,
reported that the most acceptable variants were [ ] and [t], as illustrated in
Table 6.
Participant #8, who reported that [ ] was the most acceptable substitute,
still used [s] as the substitute in the very few tokens that he missed in his careful
production of [θ]. Participant #5, who rated [t] as the most acceptable substitute,
did use [t] as the substitute at the beginning of her oral interview, when she was
asked to read “healthy” and “forth” in the passage reading task. However,
she then switched to [s] for the rest of the interview. Participant #1 used [ ] in
his speech for two tokens, “enthusiastic” and “think,” but the majority of his
substitutes were still [s].
All in all, the China sample had only three variants, [s], [t], and [ ], in their
inaccurate production of (th). Furthermore, 99% of the substitutes they made
for (th) (615 out of 619) were [s].
The Taiwan Sample
There were 573 out of 2,424 tokens (23.6%) marked as inaccurate productions
of (th) in the Taiwan group. As indicated in Table 7, there were seven inaccurate
variants produced by the Taiwan group. Participant #3 (S3) had a preference for
[z] in the word “with,” [d] in the word “three,” and [t] in the words “nothing”
and “think.” The [f] variant only occurred in onsets, such as “three.”

Table 7 Variants for English (th) among 16 Taiwanese EFL participants

Variants   Example     Source
[z]        [wɪz]       with       S3
[t]        [tɪn]       thing      S3
[f]        [fri]       three      S5
           [flu]       through    S5
[d]        [dri]       three      S7
           [dwi]       three      S3
[s]        [sɪn]       think      S2
           [nʌsɪŋ]     nothing    S8
           [sɪn]       thing      S2
           [wɪs]       with       S6
[/]        [wɪ/]       with       S10
[ø]        [bəde]      birthday   S12
[ʃ]        [ʃwi]       three      S4
As shown in Table 8, 13 out of the 16 Taiwan participants ranked [s] as their
most preferred substitute for [θ]. The other three ranked [s] as equally
acceptable as one of the other three variants. In their actual production, most
participants replaced [θ] with [s]. Eighty-six percent of their inaccurate
productions of (th) (491 out of 573) were [s]. Interestingly, we found that first-
and second-year undergraduate students in this group had a wider range of
variants to replace the target sound (th), including [s], [z], [ ], [f], [d], and [t].
On the other hand, only [s], [t], [d], [/], and ø (deletion) were used by third- and
fourth-year undergraduate students to replace [θ].

Table 8 Choice of acceptable substitutes of [θ] by the Taiwan group

Participant No.   [ ]   s   t   f
2                 7     3   7   7
14                7     3   7   7
6                 7     3   7   7
12                7     4   7   7
16                7     4   6   7
4                 7     4   5   7
15                7     2   3   7
8                 6     3   5   6
13                6     2   6   6
9                 6     4   6   6
11                7     3   7   6
7                 7     3   7   6
1                 6     5   7   6
3                 6     4   7   4
10                6     6   6   7
5                 5     5   7   6

1 = strongly acceptable; 2 = moderately acceptable; 3 = slightly acceptable; 4 = neutral;
5 = slightly unacceptable; 6 = moderately unacceptable; 7 = strongly unacceptable
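The proportions reported for the two samples follow directly from the token counts given in the text. As a quick arithmetic check, the sketch below recomputes the share of inaccurate tokens and the share of those realized as [s]; the dictionary layout and the helper name `percent` are illustrative.

```python
samples = {
    # sample: (total (th) tokens, inaccurate tokens, inaccurate tokens realized as [s])
    "China": (1962, 619, 615),
    "Taiwan": (2424, 573, 491),
}


def percent(part, whole):
    """Share of `part` in `whole`, expressed as a percentage."""
    return 100 * part / whole


summary = {}
for name, (total, inaccurate, s_tokens) in samples.items():
    summary[name] = (percent(inaccurate, total), percent(s_tokens, inaccurate))
    print(f"{name}: {summary[name][0]:.1f}% inaccurate; "
          f"{summary[name][1]:.1f}% of inaccurate tokens were [s]")
```

The recomputed values round to the figures given in the text: 32% and 99% for the China sample, and 23.6% and 86% for the Taiwan sample.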
Thus, the self-reported preferences for the most acceptable substitutes for
(th) have confirmed Weinberger’s (1994) observation that [s] is the preferred
substitute for (th) for Chinese learners of English in China and Taiwan. This
finding would suggest that the Mandarin Chinese speakers in China and Taiwan
constitute the same English as a foreign language speech community.
We observed that the Taiwan sample tended to produce a wider range of
variants than the China sample. One possible reason is that the China group had
a higher proficiency; as more advanced learners’ pronunciation approximates
the native norm, their inventory of variants is reduced to a more limited number
of choices, usually converging to [s]. Another possible reason for this finding,
however, is the quality of the input that the learners were getting in these two
learning contexts. The Taiwan sample was getting more input from other learners (whose output contains a range of errors), whereas the China sample was
possibly getting more consistent input on a single norm from native speakers.
This finding suggests that it matters whether L2 input is provided
by natives or nonnatives, or by experts as opposed to peers. It matters not just
in contexts in which learners’ goals entail following a native model and
approaching nativelike norms. As an anonymous reviewer pointed out, it is also
an important factor to consider with reference to English as a Lingua Franca (ELF).
In ELF, as the reviewer pointed out, it is important to enhance intelligibility
and improve the predictability of sound substitutions by reducing the number
of substitute variants. Thus, L2 input from peers may be problematic in ELF
contexts simply because it increases the number of substitute variants present
in the overall input envelope. This possibility deserves further consideration in
future research.
Implications
This study demonstrates the usefulness of a variationist analytical approach in
identifying the complex, but not always obvious, factors that affect the accuracy
of [θ] produced by Chinese learners from Taiwan and mainland China in
different linguistic contexts and at different developmental stages. VARBRUL
enabled us to do this in a more precise manner than conventional analyses could
have. The results point to the privilege of front vowels over other phonetic
environments in promoting the accurate production of (th) by Chinese speakers
of English. However, as their production accuracy improves, the learners tend
to be influenced more by formality of speech style (and so by use of monitoring
strategies) than phonetic salience. In addition, as the learner’s pronunciation
improved, the extent of allophonic complexity was reduced.
There is a slight lexical frequency effect on the accurate production of
(th) in spontaneous speech. The most frequent words containing (th) (i.e., the
lexeme “think”) tended to favor accurate production. This result lends some
support to psycholinguistic research that demonstrates that language processing
is sensitive to frequency of occurrence, although the nature of the research
design has limited generalization to the discussion of frequency.
This study also compared the stated preferences of Chinese learners of
English from China and Taiwan for (th) variants with the variants they actually
used in speaking. The responses suggest that learners in these groups constitute
a single speech community, in that they share norms for (th) substitution. For
both groups as a whole, [s] was the most acceptable substitute for (th) on both
measures, although certain individuals’ actual performance did not always agree
with their stated preferences. The implications of this finding are interesting,
in light of Labov’s (1966) point about speech communities: that it is speakers’
norms that make them members of a speech community rather than their speech
performance, due to such factors as their linguistic insecurity. Is the dichotomy
between our participants’ linguistic norms and their actual performance due to
linguistic insecurity or some other factor? We hope future studies will focus on
the complex dynamics of L2 learners’ norms for this and other phonological
variables as well.
This study, albeit preliminary, has provided an interesting direction for
future studies on IL phonology by showing that speech styles and phonetic
features help to account for variation patterns. The modification of learners’
strategies from phonetic salience to monitoring as their production accuracy increases
also has pedagogical implications for teachers of English as a second or foreign
language who are teaching pronunciation.
Revised version accepted 10 July 2008
Notes
1 The VARBRUL program has been developed as a tool for logistic regression to
help identify the probabilistic weights. Paolillo (2002) has provided a
comprehensive discussion of the statistical models and methods for analyzing
linguistic variation. Young and Bayley (1996) and Tagliamonte (2006) have
provided the most detailed step-by-step procedures for VARBRUL analysis and
interpretations.
2 Consult Weinberger’s speech accent archive (http://classweb.gmu.edu/accent) for
examples.
3 We acknowledge that SPSS has a wider application than VARBRUL and several
current studies on IL have chosen to use SPSS (particularly Geeslin, 2002, 2003;
Geeslin & Guijarro-Fuentes, 2006). However, the VARBRUL program was
selected for the current study for three major reasons: (a) Logistic regression is the
appropriate statistical test for our research design; (b) VARBRUL allows a
researcher to gradually narrow the analysis to those factor groups and factors that
can be said with a certain degree of probability to correlate with variation; and (c)
we would like to facilitate comparison with other studies on IL variation of
Chinese learners of English (e.g., Bayley, 1991; Young, 1989).
4 The original sample (Rau & Tarone, 2004) contained four participants from Hong
Kong and Macau. The four non-Mainland Chinese students were removed from
the data.
5 The dialectal backgrounds of the participants from Mainland China were not
tested as an independent variable in this study due to our preliminary observation
that the variable (th) did not seem to be affected by the dialectal backgrounds of
the Chinese participants. However, L1 seemed to have differential effects on their
production of some other segments and requires future investigation. For example,
two participants from Sichuan and Hubei respectively merged /l/ and /n/. One
participant from Zhejiang did not have /r/ in his L1 Chinese dialect. One
participant from Inner Mongolia substituted /v/ for /w/. All of these L1 features
were transferred to their English L2.
6 All participants met TOEFL requirements to be admitted to their respective degree
programs.
7 The sociolinguistic interview (Labov, 1966) aims at eliciting linguistic data in
different speech contexts, usually comprising an informal part (consisting of free
conversation) for eliciting vernacular or local use and a formal part (consisting of
a reading passage, word lists, and minimal pairs) to elicit various degrees of
formal or standard language use.
8 Pictures taken from Chang & Rau (2004). Used by permission.
9 Some questions were geared toward the Chinese graduate students, whereas others
were suitable for the Taiwanese undergraduate students. The researchers chose the
questions from the protocol to suit their groups during the interviews.
10 One of the anonymous reviewers questioned the validity of the nonnative-speaking
authors’ modeling of the four substitutes. With the increasing rise of World
Englishes, the question of whose norms are to be used and taught has often been
raised (e.g., Kachru & Nelson, 1996; McKay, 2002). We believe that it was to our
advantage to have researchers with Chinese L1 background to both model and
analyze the sounds because we know exactly how these substitutes are pronounced
by Chinese learners of English with lower levels of proficiency in the language.
11 None of the four substitutes should be considered fillers because (a) they exist in
Mandarin Chinese phonology and (b) they are being used as substitutes for [θ] by
Chinese English speakers in the Chinese diaspora, as our review of the research
literature establishes. Thus, any one of them can be potentially treated by a learner
as close to the standard or following the principle of least effort (Chomsky, 1995).
12 The program manual (Robinson, Lawrence, & Tagliamonte, 2001) can be
downloaded from the following Website: http://www.york.ac.uk/depts/lang/
webstuff/goldvarb/manual/manualOct2001.html
13 VARBRUL cannot run on a cell file with the knockout factor because it has either
0% or 100% of its tokens associated with the application value.
14 The VARBRUL probability weights are adjusted from unbalanced distribution of
the data, so they may not correspond to the percentages. This explains how a factor
with a weight below 0.50 can seemingly end up producing more correct tokens
than one with a weight above 0.50 (e.g., cf. 0.45 > 78% vs. 0.54 > 75% in
Table 3).
15 The goal of an analysis using the logistic regression as implemented in VARBRUL
(or any other logistic regression program) is to produce as parsimonious a model
as possible that still accounts for the data. It is not only to report on which factor
groups significantly constrain variation but also to combine factors within groups
where there is both a statistical and a linguistic justification for doing so. In
Table 4, front vowels slightly favor correct pronunciation, whereas the differences
between back vowels, complex codas, and rhotacized vowels are nonsignificant
(according to the log-likelihood test; see Young & Bayley, 1996). Furthermore, the
difference between thr clusters and diphthongs is also nonsignificant. Thus, six
factors in the Factor group of phonetic features were reduced to three factors. The
three factors in the Factor group of lexical frequency underwent the same
reduction to two factors (≥400 times vs. <400 times), following the same logic.
16 One of the anonymous reviewers questioned if “think” being the only lexeme in
the 400+ group weakened the impact of the high vowel factor. Our answer to it is
“No” because the result of the chi-square test, as shown in Table 4, indicates that
lexical frequency and phonetic features are independent from each other. In other
words, front vowel regardless of height favors accurate production of (th),
followed by the lexeme “think.”
17 The earthquake occurred on September 21, 1999. The quake’s major impact was in
Central Taiwan, from which many students at Providence University came. For
more information on this earthquake, consult http://www.coping.org/travels/
Taiwan/quake.htm.
References
An, W.-M. (2007). Down or don: A study of phonological variation of (aw) among
Taiwanese graduate students. Unpublished master’s thesis, Providence University,
Taiwan.
Archibald, J. (1993). Language learnability and L2 phonology: The acquisition of
metrical parameters. Dordrecht: Kluwer Academic.
Bayley, R. (1991). Variation theory and second language learning: Linguistic and
social constraints on interlanguage tense marking. Unpublished doctoral
dissertation, Stanford University, Stanford, CA.
Bayley, R. (1994). Interlanguage variation and the quantitative paradigm: Past tense
marking in Chinese-English. In E. E. Tarone, S. M. Gass, & A. D. Cohen (Eds.),
Research methodology in second-language acquisition (pp. 157–181). Hillsdale,
NJ: Lawrence Erlbaum.
Bayley, R. (1996). Competing constraints on variation in the speech of adult Chinese
learners of English. In R. Bayley & D. R. Preston, (Eds.), Second language
acquisition and linguistic variation (pp. 97–120). Amsterdam: Benjamins.
Bayley, R., & Preston, D. (Eds.). (1996). Second language acquisition and linguistic
variation. Amsterdam: Benjamins.
Boersma, P., & Weenink, D. (2003). Praat software: doing phonetics by computer.
Available from http://www.fon.hum.uva.nl/praat/
Brannen, K. (2002). The role of perception in differential substitution. Canadian
Journal of Linguistics, 47, 1–46.
Bybee, J. (2006). Frequency of use and the organization of language. Oxford: Oxford
University Press.
Bybee, J., & Hopper, P. (Eds.) (2001). Frequency and the emergence of linguistic
structure. Amsterdam: Benjamins.
Cardoso, W., & John, P. (2006). Markedness vs. frequency effects in second language
phonology: A variable perspective. Paper presented at NWAV 35, Ohio State
University, Columbus, OH, November 9–12, 2007.
Carlisle, R. (1994). Markedness and environment as internal constraints on the
variability of interlanguage phonology. In M. Yavas (Ed.), First and second
language phonology (pp. 223–249). San Diego, CA: Singular.
Chang, A. (2004). Phonological variation of (th) among EFL learners in Taiwan.
Unpublished master’s thesis, Providence University, Taiwan.
Chang, A. (2009). Phonological variation of (th) among EFL learners in Taiwan.
Proceedings of international conference of sociolinguistics and functional
linguistics. Yuan Ze University, Taiwan.
Chang, A., & Rau, D. V. (2004). Phonological variation of (th) among EFL learners in
Taiwan. Paper presented at the AAAL 2004, Portland, Oregon.
Chang, S.-L. (2008). Taiwanese EFL learners’ interlanguage variation of the English
vowel (ey). Unpublished master’s thesis, Providence University, Taiwan.
Chen, H.-C. (1998). Interlanguage variation of the use of English articles from
Chinese students’ journal writing. Unpublished master’s thesis, Providence
University, Taiwan.
Chen, P.-C. (2002). Interlanguage variation on English past tense marking in Chinese
students’ writing. Unpublished master’s thesis, Providence University, Taiwan.
Chen, S.-C. (2001). Interlanguage variation in the production of English (r) by
Taiwanese learners. Unpublished master’s thesis, Providence University, Taiwan.
Chomsky, N. (1995). The minimalist program. Cambridge, MA: MIT Press.
Clark, E. V. (2003). First language acquisition. Cambridge: Cambridge University
Press.
Dewaele, J. (2004). Retention or omission of the ne in advanced French interlanguage:
The variable effect of extralinguistic factors. Journal of Sociolinguistics, 8(3),
433–450.
Dubois, S., & Horvath, B. (1999). Let’s tink about dat: inter-dental fricative in Cajun
English. Language Variation and Change, 10, 245–261.
Eckman, F. (1977). Markedness and the contrastive analysis hypothesis. Language
Learning, 27, 315–330.
Eckman, F., & Iverson, G. (1993). Sonority and markedness among onset clusters in
the interlanguage of ESL learners. Second Language Research, 9(3), 234–252.
Ellis, N. (2002). Frequency effects in language processing. Studies in Second
Language Acquisition, 24, 143–188.
Fasold, R., & Preston, D. (2009). The psycholinguistic unity of inherent variability:
Old Occam whips out his razor. In R. Bayley & C. Lucas (Eds.), Sociolinguistic
variation: Theories, methods, and applications (pp. 45–69). Cambridge: Cambridge
University Press.
Flege, J. E., Takagi, N., & Mann, V. (1996). Lexical familiarity and English-language
experience affect Japanese adults’ perception of /r/ and /l/. Journal of the Acoustical
Society of America, 99, 1161–1173.
Gass, S. M., & Mackey, A. (2002). Frequency effects and second language acquisition.
Studies in Second Language Acquisition, 24, 249–260.
Gatbonton, E. (1978). Patterned phonetic variability in second language speech: A
gradual diffusion model. Canadian Modern Language Review, 34, 335–347.
Geeslin, K. (2002). The acquisition of Spanish copula choice and its relationship to
language change. Studies in Second Language Acquisition, 24, 419–450.
Geeslin, K. (2003). A comparison of copula choice: Native Spanish speakers and
advanced learners. Language Learning, 53(4), 703–764.
Geeslin, K., & Guijarro-Fuentes, P. (2006). Second language acquisition of variable
structures in Spanish by Portuguese speakers. Language Learning, 56(1), 53–107.
Guy, G. R. (2007). Lexical exceptions in variable phonology. University of
Pennsylvania Working Papers in Linguistics, 13(2), 109–119.
Hansen, J. (2001). Linguistic constraints on the acquisition of English syllable codas
by native speakers of Mandarin Chinese. Applied Linguistics, 22(3), 338–365.
Hawkins, R. (1987). The notion of “typological markedness” as a predictor of order of
difficulty in the L2 acquisition of relative clauses. Paper presented to the BAAL
seminar on the Place of Linguistics in Applied Linguistics, Essex.
Huang, J.-F. (2007). Language use by English immersion kindergarten children in
Taiwan. Unpublished master’s thesis, Providence University, Taiwan.
Ingram, D. (1989). First language acquisition: Method, description, and explanation.
Cambridge: Cambridge University Press.
Kachru, B. B., & Nelson, C. L. (1996). World Englishes. In S. L. McKay & N.
Hornberger (Eds.), Sociolinguistics and language teaching (pp. 71–102).
Cambridge: Cambridge University Press.
Klopfenstein, M. (2002). Sum: Theta-f Variation in Varieties of English. The Linguist
List 13.1959. Available from http://linguistlist.org/issues/13/13-1959.html
Labov, W. (1966). The social stratification of English in New York City. Washington,
DC: Center for Applied Linguistics.
Langman, J., & Bayley, R. (2002). The acquisition of verbal morphology by Chinese
learners of Hungarian. Language Variation and Change, 14, 55–77.
Lee, S., & Cho, M.-H. (2002). Sound replacement in the acquisition of English
consonant clusters: A constraint-based approach. Studies in Phonetics, Phonology
and Morphology, 8(2), 255–271.
Lombardi, L. (2003). Second language data and constraints on manner: Explaining
substitutions for the English interdentals. Second Language Research, 19(3),
225–250.
McKay, S. L. (2002). Teaching English as an international language. Oxford: Oxford
University Press.
Major, R. C. (2001). Foreign accent: The ontology and phylogeny of second language
phonology. Mahwah, NJ: Lawrence Erlbaum.
Munson, B. (2001). Phonological pattern frequency and speech production in adults
and children. Journal of Speech, Language, and Hearing Research, 44, 778–792.
Paolillo, J. (2002). Analyzing linguistic variation: Statistical models and methods.
Stanford, CA: CSLI Publications.
Peust, C. (1996). Sum: th-substitution. The Linguist List 7.1108. Available from
http://linguistlist.org/issues/7/7-1108.html
Preston, D. R. (1989). Sociolinguistics and second language acquisition. Oxford:
Blackwell.
Preston, D. R. (2000). Three kinds of sociolinguistics and SLA: A psycholinguistic
perspective. In B. Swierzbin, F. Morris, M. Anderson, C. Klee, & E. Tarone (Eds.),
Social and cognitive factors in second language acquisition: Selected proceedings
of the 1999 Second Language Research Forum (pp. 3–30). Somerville, MA:
Cascadilla Press.
Preston, D. R. (2002). A variationist perspective on SLA: Psycholinguistic concerns.
In R. Kaplan (Ed.), Oxford handbook of applied linguistics (pp. 141–159). Oxford:
Oxford University Press.
Rau, D. V. (2004). Style, proficiency, and attitude in acquisition of phonology by
Chinese learners of English. NSC report (NSC92-2411-H-126-002), Providence
University, Taiwan.
Rau, D. V., & Tarone, E. E. (2004). Chinese learners’ acquisition of the English
voiceless interdental fricative: A study of interlanguage variation. Paper presented
at the CARLA & ESL Forum, February 18, 2004, University of Minnesota.
Robinson, J., Lawrence, H., & Tagliamonte, S. (2001). GOLDVARB 2001 [computer
program]: A multivariate analysis application for Windows. York University.
http://www.york.ac.uk/depts/lang/webstuff/goldvarb/manual/manualOct2001.html
Schmidt, R. W. (1987). Sociolinguistic variation and language transfer in phonology.
In G. Ioup & S. H. Weinberger (Eds.), Interlanguage phonology: The acquisition of
a second language sound system (pp. 365–377). Cambridge, MA: Newbury House.
Tagliamonte, S. A. (2006). Analysing sociolinguistic variation. Cambridge:
Cambridge University Press.
Tarone, E. (1979). Interlanguage as chameleon. Language Learning, 29, 181–191.
Tarone, E. (1983). On the variability of interlanguage systems. Applied Linguistics, 4,
142–163.
Tarone, E. (2000). Still wrestling with “context” in interlanguage theory. Annual
Review of Applied Linguistics, 20, 182–198.
Tarone, E. (2002). Frequency effects, noticing, and creativity: Factors in a variationist
interlanguage framework. Studies in Second Language Acquisition, 24, 287–296.
Tarone, E. (2005). English for specific purposes and interlanguage pragmatics. In K.
N. Bardovi-Harlig & B. Hartford (Eds.), Interlanguage pragmatics: Exploring
institutional talk (pp. 157–176). Hillsdale, NJ: Lawrence Erlbaum.
Tarone, E. (2007). Sociolinguistic approaches to second language acquisition research,
1997–2007. Modern Language Journal, 91, 835–846.
Tarone, E. (in press). Social context and cognition in SLA: A variationist perspective.
In R. Batstone (Ed.), Sociocognitive perspectives on language use and language
learning. Oxford: Oxford University Press.
Trofimovich, P., Gatbonton, E., & Segalowitz, N. (2007). A dynamic look at L2
phonological learning: Seeking processing explanations for implicational
phenomena. Studies in Second Language Acquisition, 29, 407–448.
Tsai, Y.-C. (2007). A study of English future tense variation among Taiwanese
graduate students. Unpublished master’s thesis, Providence University, Taiwan.
Van Coetsem, F. (1988). Loan phonology and the two transfer types in language
contact. Publications in Language Sciences, 27. Dordrecht: Foris.
Wang, M., & Geva, E. (2003). Spelling acquisition of novel English phonemes in
Chinese children. Reading and Writing: An Interdisciplinary Journal, 16, 325–348.
Wester, F., Gilbers, D., & Lowie, W. (2007). Substitution of dental fricative in English
by Dutch L2 speakers. Language Sciences, 29, 477–491.
Weinberger, S. (1994). Theoretical foundations of second language phonology.
Unpublished doctoral dissertation, University of Washington.
Wolfram, W., & Schilling-Estes, N. (1998). American English: Dialects and variation.
Oxford: Blackwell.
Young, R. (1988). Variation and the interlanguage hypothesis. Studies in Second
Language Acquisition, 10, 281–302.
Young, R. (1989). Approaches to variation in interlanguage morphology: Plural
marking in the speech of Chinese learners of English. Unpublished doctoral
dissertation, University of Pennsylvania.
Young, R. (1991). Variation in interlanguage morphology. New York: Peter Lang.
Young, R., & Bayley, R. (1996). VARBRUL analysis for second language acquisition
research. In R. Bayley & D. R. Preston (Eds.), Second language acquisition and
linguistic variation (pp. 253–306). Amsterdam: Benjamins.
Appendix A
Story Reading
The Three Little Pigs
Once upon a time there were three little healthy and strong pigs living on
the farm with their mother. On their tenth birthday they left their home and
went forth to seek their wealth.
Before they left, their mother Ruth told them, “Whatever you do, do the
best you can because that’s the way to get along in the world.”
The first little pig went south and built his house out of straw because it
was the easiest thing to do.
The second little pig went north through a forest. He built his house out of
thick sticks he gathered from the forest.
The third little pig went east and built his house of bricks because he wanted
a house with more strength.
One night, a big bad, but thin wolf called Keith, with sharp teeth, and
who dearly loved to eat fat little piggies, came along and saw the first little pig’s
house of straw. He thundered, “Let me in, let me in, little pig or I’ll huff and I’ll
puff and I’ll blow your house down!” But the little pig said nothing because he
wasn’t home. He had gone to visit the second little pig in the Northern Forest.
So after blowing the house down he didn’t find anything to eat, not even a
single moth!
The next night Keith followed a path through the forest to the house of
sticks. Although he had no luck the day before, again he said, “Let me in, let
me in, little pig or I’ll huff and I’ll puff and I’ll blow your house down!” And
again nobody answered because the first and second little piggies had gone to
see the third little pig. Of course he blew down the house of sticks and, after
catching his breath, he searched through the rubble looking for a little pig. He
found no food, just some of the second pig’s playthings which he took home.
He yelled, “What on earth is going on?!”
The third night the wolf came to the brick house. Being tired of saying the
same things every night, he threatened, “I’ll blow your house down and eat
all three of you!” This time there was somebody there. To tell the truth, all
three little piggies were there, but they didn’t answer out of fear. Thanks to
the bricks, no matter how hard Keith huffed, he couldn’t blow down the house.
But this thin wolf was a sly one and after thinking over his dilemma for a
while he climbed up on the roof to look for another way into the brick house.
The three little pigs saw him go up to the roof and lit a roaring fire in the
fireplace and placed a large pot of water on it. When Keith finally found the
hole of the chimney he crawled down and landed right in the pot of boiling
water. That was the end of their trouble with the big bad, but thin, wolf.
The next day the little pigs invited their mother over. She said, “Youth, it
is just as I told you. The way to get along in the world is to do things as well
as you can.” Fortunately for the little pigs, one of them did, and now the other
two had learned that lesson. And they lived happily ever after!
Appendix B
Pictures for Story-Retelling
Appendix C
Word List
1. sunk dunk punk
2. thick tick sick
3. pink dink mink
4. sink think rink link sinking thinking
5. shank sank thank tank
6. ban tan san fan ran
7. tin sin shin fin thin
8. sang tang pang
9. thing sing ring
10. bad sad mad
11. bird third curd
12. sender lender render
13. sunder thunder
14. enthuse ensues
15. threat threaten threatening
16. three tree free Sri Lanka
17. true through
18. thought sought fought taught
19. earth errs
20. forth fort force
21. Norse north
22. nip nit Nick nil
23. strength strengths strengthen
24. wide ride lied
25. tenth tense tent
26. night light right
27. Welsh wealth welt wells
28. heal real seal peal
29. worth worse Wordsworth
30. rear seer Lear
31. breath bread
32. blue brew glue grew
33. Keith keys
34. bake wake cake lake rake sake
35. moss moth
36. mad map mat mass math
37. mouth mouse
38. bet bed beg beck
39. pat path pass
40. Ruth roof root loot
41. book look cook
42. south southern
43. rot lot
44. teeth tease
45. truth truce
46. with wit wish will
47. no youth no use
48. nothing nodding
49. petting setting
50. something everything anything
51. plaything placing
52. healthy healthier wealthy wealthier
53. hail rail tail sail wail
54. birthday birth date
55. late rate wait mate
Appendix D
Interview Protocol
1. Warm-up Conversation
(1) What is your name?
(2) When were you born? Where were you born? Where were you brought
up?
(3) What is your first language? How many other languages do you speak?
(4) How long have you lived here? Have you lived in any other places?
(5) Can you tell me about your parents? Where were they born? Your parents’
occupation? Your parents’ dialects?
(6) Do you have any brothers or sisters?
(7) How long have you been in the U.S.?
(8) What made you choose the University of Minnesota?
(9) What is your network of friends?
(10) When did you begin to learn English? What is your TOEFL score?
2. Getting Information
(1) Did you like school when you were a child? Why?
(2) Can you describe your school for me? Can you describe your field of
study for me?
(3) Many people say that children’s ability to calculate is getting worse. Do
you agree? Why?
(4) What is your preferred learning style? What is your preferred teaching
style?
(5) What year are you in now? What made you choose this school?
(6) What courses are you taking this semester? What are your favorite subjects? Why?
(7) What kind of jobs are you interested in after you finish studying here?
What are your plans for the future?
(8) What’s the most difficult part of studying English for you?
(9) What part of English pronunciation is most challenging for you?
(10) What’s the most difficult part of being a TA or RA at the University here?
(11) What kind of assistance have you been receiving to help you become an
effective TA or RA here?
(12) What do you do in your free time?
(13) Evaluate how acceptable it is to use the following sounds to replace [θ]
in English.
a. How acceptable do you feel it is to replace [θ] with [s] sound in a
word, such as sree, heals, and somesing for three, health, and something,
respectively?
1 – Perfectly Acceptable
2 – Moderately Acceptable
3 – Slightly Acceptable
4 – Neutral
5 – Slightly Unacceptable
6 – Moderately Unacceptable
7 – Completely Unacceptable
b. How acceptable do you feel it is to replace [θ] with [f] sound in a
word, such as free, healf, and somefing for three, health, and something,
respectively?
1 – Perfectly Acceptable
2 – Moderately Acceptable
3 – Slightly Acceptable
4 – Neutral
5 – Slightly Unacceptable
6 – Moderately Unacceptable
7 – Completely Unacceptable
c. How acceptable do you feel it is to replace [θ] with [t] sound in a
word, such as tree, healt, and someting for three, health, and something,
respectively?
1 – Perfectly Acceptable
2 – Moderately Acceptable
3 – Slightly Acceptable
4 – Neutral
5 – Slightly Unacceptable
6 – Moderately Unacceptable
7 – Completely Unacceptable
d. How acceptable do you feel it is to replace [θ] with [ʃ] sound in a word,
such as shree, healsh, and someshing for three, health, and something,
respectively?
1 – Perfectly Acceptable
2 – Moderately Acceptable
3 – Slightly Acceptable
4 – Neutral
5 – Slightly Unacceptable
6 – Moderately Unacceptable
7 – Completely Unacceptable
e. Please first rank the following five sounds, [s], [f], [t], [ʃ], and [θ], from
1 (most acceptable) to 5 (least acceptable).
[s] [f] [t] [ʃ] [θ] = 1
Then place the five sounds on the following chart in relation to one another,
indicating how acceptable you feel each pronunciation is.
<-------------------------------------------------------------->
1 Most Acceptable                              5 Least Acceptable
[θ]
3. Relating Information
(1) Were there any memorable teachers or events at any of your previous
schools?
(2) Was there ever a moment in your life in which it seemed that your life
was in serious danger or that you might be seriously injured? If yes, what
happened?
(3) Do you remember the 921 Earthquake¹⁷ in Taiwan? Where were you?
What happened to you?
(4) Can you remember and recall a nice dream or a nightmare you had?
(5) Can you remember and recall a childhood game?
(6) If you could be born again and relive your life, what would you do
differently, if anything?
Appendix E
Coding Protocol
Dependent Variable:
FG1: Production of English [θ]
1 = accurate production
0 = inaccurate production
Independent Variable:
FG2: Vowel following an onset (th)
i = high front vowel (e.g., think, theory)
a = low front vowel (e.g., thank)
o = back mid round vowel (e.g., thought, diphthong)
r = mid central rhotacized vowel (e.g., third), reduced vowel schwa
(e.g., strengthen, Catholic, mathematics)
b = low back vowel (e.g., thunder)
d = diphthong (e.g., thousand)
1 = high front vowel after consonant cluster thr (e.g., three)
2 = high mid vowel after consonant cluster thr (e.g., threaten)
3 = back vowel after consonant cluster thr (e.g., through, throw)
/ = not applicable
FG3: Vowel preceding a coda (th)
i = high front vowel (e.g., teeth)
e = mid front vowel (e.g., breath)
a = low front vowel (e.g., math)
u = high round vowel (e.g., youth, truth)
o = back mid round vowel (e.g., moth)
r = mid central rhotacized vowel (e.g., birthday)
d = diphthong (e.g., mouth)
/ = not applicable
FG4: Consonant preceding a coda (th)
l = /l/ (e.g., wealth)
f = /f/ (e.g., fifth)
n = /n/ (e.g., tenth)
r = /r/ (e.g., north)
FG5: Speech style
i = interview
w = word list
p = passage reading
r = story retelling
FG6: Lexical token frequency
H = word token frequency over 400 times
M = word token frequency above 100 and below 400 times
l = word token frequency below 100 times
FG7: Phonetic features (regrouping of immediate environment)
f = front vowel
t = thr clusters
r = rhotacized vowel
c = complex coda (with n, r, l, or f)
b = back vowel
d = diphthong
FG8: Development over time (measured by percentages of accurate production of th)
H = top 1/3
M = mid 1/3
L = low 1/3
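As an illustration only, the coding of a single (th) token for FG1, FG5, and FG6 might be sketched as follows. The class and function names are hypothetical and are not part of the original coding procedure. The treatment of counts of exactly 100 or 400 is an assumption, because the protocol does not state to which band they belong.

```python
from dataclasses import dataclass

# Speech-style codes (FG5) as listed in the coding protocol above.
SPEECH_STYLES = {
    "i": "interview",
    "w": "word list",
    "p": "passage reading",
    "r": "story retelling",
}


def frequency_code(micase_count: int) -> str:
    """Return the FG6 code for a MICASE token count.

    Assumption: counts of exactly 100 or 400 fall in the middle band,
    since the protocol does not say where they belong.
    """
    if micase_count > 400:
        return "H"
    if micase_count >= 100:
        return "M"
    return "l"  # lowercase, as printed in the protocol


@dataclass
class Token:
    """One (th) token; the field names are hypothetical."""
    word: str
    accurate: bool      # FG1: accurate vs. inaccurate production
    style: str          # FG5: one of the SPEECH_STYLES keys
    micase_count: int   # word frequency used for FG6

    def codes(self) -> dict:
        """Return the FG1, FG5, and FG6 codes for this token."""
        if self.style not in SPEECH_STYLES:
            raise ValueError(f"unknown speech style code: {self.style!r}")
        return {
            "FG1": "1" if self.accurate else "0",
            "FG5": self.style,
            "FG6": frequency_code(self.micase_count),
        }


# "third" has a MICASE frequency of 283 in Appendix F.
example = Token(word="third", accurate=True, style="w", micase_count=283)
print(example.codes())
```

The remaining factor groups (FG2 to FG4, FG7, and FG8) depend on phonetic judgments and on speaker-level accuracy, so they are left out of this sketch.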
Appendix F
Frequency of Words Containing (th) and Accuracy Rates by Informal Style
Order  Word             Total  Accurate N  Accurate %  Word frequency from MICASE
1      Think(s)/(ing)   484    351         73          7,544
2      Three            177    81          46          2,199
3      With             147    95          65          7,829
4      Something        123    56          46          3,193
5      Third            107    77          72          283
6      Thing(s)         102    75          74          5,407
7      Nothing          34     27          79          394
8      Birthday         33     15          45          19
9      Anything         32     27          84          895
10     Teeth            31     22          71          50
11     South            27     11          41          162
12     Strength(s)      26     15          58          75
13     Thin             25     21          84          48
14     Thick            25     20          80          40
15     Thought          24     17          71          1,033
16     North            24     16          67          211
17     Earthquake       24     7           29          2
18     Keith            34     25          74          5
19     Math             21     12          57          184
20     Everything       21     15          71          585
21     Tenth            19     11          58          18
22     Ruth             18     11          61          4
23     Both             18     10          56          849
24     Through          17     9           53          1,512
25     Tooth            16     12          75          11
26     Thank(s)         16     14          88          510
27     Wealth           15     9           60          41
28     Plaything(s)     12     8           67          0
29     Mathematics      12     9           75          58
30     Threat(en)/(ed)  10     7           70          9
31     Thousand         10     3           30          362
32     Youth            9      6           67          26
33     Thirty           9      4           44          328
34     Health           9      8           89          227
35     Months           8      6           75          118
36     Theory           7      5           71          314
37     Month            7      2           29          115
38     Path             6      3           50          68
39     Healthy          6      3           50          31
40     Fifth            6      4           67          55
41     Strengthen       5      4           80          10
42     Fourth           5      1           20          110
43     Breath           5      3           60          32
44     Worth            4      2           50          119
45     Truth            4      0           0           122
46     Thunder          4      2           50          3
47     Thirteen         4      3           75          76
48     Method           4      3           75          123
49     Earth            4      2           50          95
50     Throw            3      1           33          85
51     Threat           2      0           0           68
52     Thief            2      0           0           1
53     Theoretical      2      1           50          52
54     Mouth            2      2           100         56
55     Mathematical     2      2           100         30
56     Enthusiasm       2      1           50          7
57     Wealthy          1      1           100         30
58     Twentieth        1      1           100         50
59     Truthfulness     1      0           0           5
60     Thursday         1      0           0           140
61     Throughout       1      0           0           115
62     Therapy          1      1           100         29
63     Thatcher         1      0           0           1
64     Seventh          1      1           100         25
65     Moth             1      1           100         7
66     Fourteenth       1      1           100         14
67     Eighth           1      1           100         34
Total                   1,815  1,163       64
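The Accurate % column in Appendix F appears to be the number of accurate tokens divided by the total number of tokens for each word, expressed as a whole-number percentage. A minimal sketch of that arithmetic follows. The function name is hypothetical, and rounding half up is an assumption, since the rounding rule is not stated.

```python
def accuracy_percent(accurate_n: int, total: int) -> int:
    """Percentage of accurate tokens, rounded half up to a whole number.

    Integer arithmetic avoids floating-point rounding surprises:
    floor(100 * n / total + 0.5) == (200 * n + total) // (2 * total).
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return (200 * accurate_n + total) // (2 * total)


# (word, total tokens, accurate tokens) taken from Appendix F.
rows = [
    ("Think(s)/(ing)", 484, 351),
    ("Three", 177, 81),
    ("Earthquake", 24, 7),
    ("Total", 1815, 1163),
]

for word, total, accurate_n in rows:
    print(word, accuracy_percent(accurate_n, total))
```

For the four rows shown, the computed values are 73, 46, 29, and 64, which match the percentages printed in the appendix.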