Time on Task and Immersion - Canadian Parents for French | Ontario

Turnbull, M., Lapkin, S., Hart, D., & Swain, M. (1998). Time on task and immersion graduates’ French proficiency. In S. Lapkin
(ed.), French second language education in Canada: Empirical studies (pp. 31-55). Toronto: University of Toronto Press.
Time on Task and Immersion
Graduates' French Proficiency
MILES TURNBULL, SHARON LAPKIN,
DOUG HART, and MERRILL SWAIN
Since the first French immersion class started in St Lambert, Quebec, about 30 years ago,
thousands of non-francophone Canadians have graduated from either an early (EI), middle (MI)
or late (LI) immersion program.1 The total time spent studying French varies; most EI programs
result in approximately 6000 hours of French instruction by the end of Grade 8, whereas students
who complete MI and LI programs have had between 1200 and 2000 hours by the same grade level.
Many educators assumed a direct relationship between spending more time studying a second
language (L2) and higher proficiency in that language. This assumption makes a great deal of
common sense, and findings from a number of studies support it (Carroll, 1975; Morrison,
Bonyun, & Pawley, 1981; Morrison, Bonyun, Pawley, & Walsh, 1980; Stenett & Earl, 1984a,
1984b; Stern et al., 1976). Many of these studies in the late 1970s and early 1980s involved core
French programs in Canada, where students had short daily classes, resulting in a total of about
1400 hours of French instruction by the end of senior secondary school. Whereas a great deal of
attention has been given to examining the French proficiency, cultural attitudes, motivation, and
career aspirations of students who have graduated from an immersion program (Harley, 1994;
Hart & Lapkin, 1989; Hart, Lapkin, & Swain, 1989; Husum & Bryce, 1991; MacFarlane &
Wesche, 1995; Turnbull, 1990a, 1990b; Wesche, 1988, 1993; Wesche, Morrison, Pawley, &
Ready, 1986), few of these studies have examined the time/proficiency issue. Those that did
typically focused on gross differences among programs. Therefore, many of these studies have
shown that EI students' French proficiency is higher than that of graduates of late or middle
immersion programs, given that students in EI programs are exposed to many more hours of
French than those in MI or LI programs.
However, this has not happened in all studies for all test measures. Furthermore, the differences
between these groups have not been proportional to the differences in accumulated hours of
instruction in French. These findings support the theory that older students learn L2s more
efficiently (Harley, 1982; Krashen, Long, & Scarcella, 1979; Krashen, Scarcella, & Long, 1982;
Long, 1990) because they have already developed what Cummins (1983) called cognitive
academic linguistic proficiency (CALP) in their L1, which they can use in the L2 when they
begin learning it. Some of the studies cited have also revealed considerable variation in the
French proficiency of senior secondary students who started their immersion program at the
same time. Can this variation be attributed to differences in accumulated hours of instruction in
French, especially at the high school level where students report having taken anywhere from 1
to 36 courses?
In this chapter we will discuss the issue of total instructional time in French and its impact on
immersion graduates' proficiency in French. Our analyses examine a large merged database
derived from evaluation studies conducted in western (Calgary, Saskatoon, Kelowna), central
(Sudbury, Nipissing, Toronto, Peel), and eastern Canada (P.E.I., Newfoundland, Halifax). We
will address the following questions:
1. How do the French skills (receptive and productive) of immersion students compare
across programs with different total accumulated hours in French?
2. How do the French skills (receptive and productive) of students who start studying
French at the same age vary? Can this variation be attributed to differences in
accumulated hours of instruction in French? Can it be attributed to the way in which that
time is accumulated (intensity)?
Before describing the test instruments and the students who completed them, we discuss some
background on Canadian immersion programs. Then follows a brief review of relevant literature
related to (a) time and proficiency and (b) age and L2 acquisition.
Background on Immersion Programs
French immersion, an optional program designed for non-native speakers of French, exists in all
Canadian provinces and territories. It has grown steadily since its beginning in 1965 and
continues to expand; the national increase for 1994-5 was estimated at 1.3 per cent, with total
enrolment at 305,149 (Goldbloom, 1995). This growth stems in part from a federal government
policy that has offered grants (via provincial governments) to school boards that implement
immersion programs. The Canadian government views this incentive program as a way of
promoting individual bilingualism and, ultimately, national unity. In addition, we cannot
overlook the role of parents as an important factor in the growth of immersion. Many parents
have worked hard to initiate and support immersion programs because of their dissatisfaction
with the traditional core French programs. Moreover, many parents consider immersion
educationally enriching and potentially advantageous for their child's future career.
Whatever the program type, the immersion curriculum rests on the principle of offering a
variety of school subjects taught in the L2; French is therefore the medium and not the object of
instruction. In the first 2 years of most programs, the instruction occurs largely or totally in
French. Over the course of the program, the amount of English instruction increases and the
French instruction decreases accordingly. When the students reach high school, they typically
take only a few subjects in French each year, often in addition to a course in French language
arts, depending on what the school board offers and what the students choose. As already
indicated, by the end of Grade 8 a typical EI program results in over 6000 hours total
accumulated instruction in French. Students in MI and LI programs accumulate between 1200
and 2000 hours in French. In addition, these students typically take between 1000 and 1500
hours of high school courses taught in French.
Many researchers, especially at the Second Language Institute at the University of Ottawa
(MacFarlane & Wesche, 1995; Wesche, 1988, 1993) and at the Ontario Institute for Studies in
Education (Harley, 1994; Hart & Lapkin, 1989; Hart, Lapkin, & Swain, 1989), have conducted
studies of graduates from early and late immersion programs. Others have focused on immersion
graduates in other parts of the country (Husum & Bryce, 1991; Turnbull, 1990a). All of these
studies found that early immersion students had superior French skills on some but not all test
measures, as compared with late immersion students.
Time and Proficiency
Considerable evidence suggests that an increase in accumulated instructional time correlates with
higher levels of L2 achievement. Carroll (1975), under the auspices of the International
Association for the Evaluation of Educational Achievement, examined the role of a number of
factors in determining the level of achievement attained by students in eight non-French
speaking countries taking French as a second language (FSL) programs with short daily classes.
In his conclusions, Carroll described learning a language as a cumulative process where the
amount of instructional time was the principal factor determining the proficiency attained by the
students.
In the 1970s and early 1980s several studies in Canada compared the proficiency of students
in core French and immersion programs in the same grade (Morrison et al., 1981, 1980; Stern et
al., 1976). The results consistently indicated that the immersion students with significantly more
instructional hours than core French students obtained much higher scores on all test measures,
both receptive and productive. Furthermore, the Ottawa studies (Morrison et al., 1981, 1980)
compared 'extended' French programs with both core and immersion programs. As expected,
students in these programs had French proficiency superior to that of those in core French but
inferior to that of immersion students.
The studies cited (Morrison et al., 1981, 1980; Stern et al., 1976), as well as program
evaluations conducted by the board of education in London, Ontario (Stenett & Earl, 1984a,
1984b), examined the effect of successive increases in instructional time in core French
programs. Although the tests used were often difficult for the students, increases in instructional
time in French resulted in higher test scores.
These studies have had an important impact on programming policy for FSL programs in
Canada. For example, in Ontario, although they acknowledge the importance of quality teaching
and of the curriculum, educators consider the number of instructional hours in French the key
factor in the delivery of French programs (Ontario Ministry of Education, 1986), suggesting a
direct relationship between time and proficiency.
However, Swain (1981) and Cummins (1983) both questioned the principle of a direct
relationship between the amount of L2 instruction and L2 achievement. They maintained that the
time argument does not necessarily apply, at least linearly, when age is considered. Swain
referred to a study (Lapkin, Swain, Kamin, & Hanna, 1980) which compared LI students who
had accumulated 1400 hours of French starting at age 12 with EI students who had accumulated
over 4000 hours of French starting at age 5. The EI students outperformed the LI students on a
listening comprehension test. However, the LI students performed better than the EI students on
a reading comprehension test and similarly on a cloze test.
Other researchers have reported findings from comparisons between EI and LI students that
support this non-linear relationship between time and proficiency (Genesee, 1987; Harley, 1982,
1986; Hart & Lapkin, 1989; Hart et al., 1989; Swain & Lapkin, 1982; Turnbull, 1990a, 1990b;
Wesche et al., 1986). As Swain and Cummins argued, these findings suggest that older learners
more efficiently accomplish some aspects of L2 learning than younger learners do, at least in
school settings.
Age and L2 Acquisition
The discussion of Swain (1981) and Cummins (1983) raised the issue of the relationship between
age and L2 acquisition. Substantial evidence (see, e.g., Harley, 1982, 1986; Krashen et al.,
1979, 1982; Long, 1990) in the literature supports the following three generalizations offered by
Krashen et al. (1979):
1. Adults proceed through early stages of morphological and syntactical development faster
than children (where time or exposure are held constant).
2. Older children acquire a second language faster than younger children (again in early
stages of morphology and syntax, where time is held constant).
3. Child starters outperform adult starters in the long run.
Even though they did not address the issue of total accumulated instructional time and age,
one could argue that the studies of immersion graduates cited support Krashen et al.'s
generalizations. These studies have shown that EI students perform better than MI and LI
students in some but not all skill areas; in addition, differences in performance are not as large as
one would expect given that the EI students have experienced much more French than the MI
and LI students. Apparently, the older students learn more quickly and start to catch up to the
learners who started much younger.
Methods
Tests and Survey Instruments
We used the Senior French Proficiency Test Package for French Immersion, cooperatively
developed by the University of Ottawa's Second Language Institute and the Modern Language
Centre at the Ontario Institute for Studies in Education (Toronto) at all test sites. It includes tests
of four skill areas as follows.2
Listening
Two tests measure students' comprehension of spoken French. One consists of three excerpts
from radio broadcasts, involving content similar to what a student might hear in academic
situations. There are 14 multiple-choice items. Second, the students listen twice to a taped
excerpt of a radio broadcast aimed at a teen-aged listening audience. During the second
presentation the student must repeat the 14 sentences one by one. Three scores are derived, of
which two measure speaking skills. The first, a measure of listening comprehension, awards one
point for each sentence repeated so that the scorer can understand the overall or 'global semantic'
meaning.
Reading
The test presents three reading passages dealing with the exploration of space, bilingualism in
the United States, and the French language. Each student reads the selections at his or her own
speed and then answers several multiple-choice items following each reading, for a total of 19
items.
Writing
There are two writing tests. One, a cloze test, uses an extract from a journalistic essay on the
proliferation of opinion polls. Words from the original text are deleted at regular intervals; the
test asks the student to supply the missing word, one word for each blank space. Scoring for this
34-item test gives credit for any word acceptable within the context of the sentence. The second
test asks students to express an opinion in writing and justify it with supporting examples and
explanations. Scoring is on a 4-point scale.
Speaking
Of the two speaking tests, one uses the same sentence-repetition task whose first score was for
listening. A second, 'exact' score awards one point for every sentence repeated exactly. A third,
'linguistic features' measure awards a point for each of 13 syntactic features (e.g., impersonal
verbs or pronoun objects) repeated correctly, 5 discursive features (e.g., correct use of past
tenses), and 6 phonological features (e.g., liaison or dropping of the mute 'e' where obligatory).
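The scores derived from the sentence-repetition task can be pictured as a simple tally. The sketch below is illustrative only: the `SentenceRating` record, its field names, and the sample student are hypothetical, and the test package's actual scoring protocol is not reproduced here. Only the maxima (14 sentences; 13 syntactic, 5 discursive, and 6 phonological features) come from the description above.

```python
from dataclasses import dataclass

# Maxima as described in the text: 14 sentences and 24 scored features.
N_SENTENCES = 14
FEATURE_MAX = {"syntactic": 13, "discursive": 5, "phonological": 6}


@dataclass
class SentenceRating:
    """Hypothetical scorer judgment for one repeated sentence."""
    meaning_preserved: bool  # counts toward the global semantic score
    exact_repetition: bool   # counts toward the exact score


def sentence_scores(ratings):
    """Return (global semantic score, exact score), one point per sentence."""
    if len(ratings) != N_SENTENCES:
        raise ValueError("expected one rating per sentence")
    semantic = sum(r.meaning_preserved for r in ratings)
    exact = sum(r.exact_repetition for r in ratings)
    return semantic, exact


def feature_score(correct_counts):
    """Sum correctly repeated features, checking each count against its maximum."""
    for kind, count in correct_counts.items():
        if not 0 <= count <= FEATURE_MAX[kind]:
            raise ValueError(f"{kind} count out of range")
    return sum(correct_counts.values())


# Hypothetical student: meaning kept in 10 sentences, 4 of them repeated exactly.
ratings = [SentenceRating(i < 10, i < 4) for i in range(N_SENTENCES)]
semantic, exact = sentence_scores(ratings)
features = feature_score({"syntactic": 9, "discursive": 3, "phonological": 4})
print(semantic, exact, features)
```

The first score feeds the listening measure, while the exact score and the feature count are the two speaking measures.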
In the second test, the oral opinion measure, the test administrator asks each student to express
a personal opinion on a given topic of general interest and support it with two reasons if possible.
Task fulfilment is scored on a 4-point scale. There is also a questionnaire surveying the students'
social background, self-assessments of French, and plans and preferences for use of French after
secondary school. This chapter will report only parts of the questionnaire's social background
section that provide a profile of the participants.
The Merged Database
The merged database includes results of studies using the test instruments and questionnaire. We
tested most classes at the end of Grade 12. The composition of the entire merged database
appears in Table 2.1. However, the number of students completing certain test measures does not
always match 1160 because not all students attended every testing session. All available students
in participating classes received group tests and student surveys. We administered oral tests on
an individual withdrawal basis to a simple random sample of 6 to 8 students per class. In total,
the database contains findings for 48 classes, of which 21 are mostly EI students, 17 mostly LI
students and 10 mostly MI. As stated, EI students in our database had typically accumulated
about 6000 hours by the end of Grade 8, whereas LI students had accumulated roughly 1200 hours.
The MI students had started intensive French instruction in Grade 5 with 50 per cent of their
instruction in French. By the end of Grade 8, they had accumulated about 2000 hours of
instruction in French. The LI students generally began their program in Grade 7, with as much
as 80 per cent of instructional time in French in Grades 7 and 8.3

Table 2.1. Composition of merged database by school board

School board                              N
Sudbury (Ontario)                       157
Nipissing (Ontario)                     105
Toronto Public (Ontario)                259
Calgary (Alberta)                       132
PEI                                      89
Metro Toronto Separate (Ontario)        236
Peel (Ontario)                           69
Saskatoon (Saskatchewan)                 22
Halifax (N.S.)                           32
Kelowna (B.C.)                           29
Newfoundland                             30
Total                                  1160

Table 2.2. Cumulative hours of instruction in French by program

                                        Early        Middle       Late
                                        immersion    immersion    immersion
Elementary level (a)
  Starting grade for immersion          K or 1       5            7
  Hours in core French                  0            240          360
  Hours in immersion                    5300-6040    1800         720-1080
  Total (K-8)                           5300-6040    2040         1080-1440
Secondary level (b)
  Median number of credits (120 hours per credit)
    Grade 9                             4            3            3
    Grade 10                            3            2            3
    Grade 11                            3            2            3
    Grade 12                            2            1            3
    Total (c)                           12           8            12
Total hours in French                   6740-7480    3000         2520-2880

(a) Based on participating boards' program regulations.
(b) Based on participating students' self-reports.
(c) The total median score is based on the distribution of the students' total course count.
    It is not the sum of grade-specific medians.

For all groups in our database, the number of courses taken at the secondary level (Grades 9 to
12) varied significantly from one jurisdiction to another, with two main clusters of students
falling either into the 5- to 8-course range or the 12- to 14-course range. Table 2.2 summarizes
the total accumulated hours of French instruction and number of courses at the secondary level.
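The arithmetic behind the "Total hours in French" row of Table 2.2 can be checked with a short sketch. It assumes, as the table states, 120 hours per secondary credit and uses the median credit totals; the dictionary layout and the function name `total_hours` are illustrative choices, not part of the original study.

```python
HOURS_PER_CREDIT = 120  # stated in Table 2.2

# (low, high) elementary totals for K-8 and median secondary credits, from Table 2.2.
PROGRAMS = {
    "EI": {"elementary": (5300, 6040), "credits": 12},
    "MI": {"elementary": (2040, 2040), "credits": 8},
    "LI": {"elementary": (1080, 1440), "credits": 12},
}


def total_hours(program):
    """Return the (low, high) range of cumulative hours of French instruction."""
    low, high = PROGRAMS[program]["elementary"]
    secondary = PROGRAMS[program]["credits"] * HOURS_PER_CREDIT
    return low + secondary, high + secondary


for name in PROGRAMS:
    print(name, total_hours(name))
```

Because the secondary figures are medians of self-reported course counts, an individual student's total can differ considerably from these program-level ranges.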
Our unit of analysis throughout is the individual student. This means we assigned students to
categories, based first on their elementary school background (EI, MI, or LI) and then on the
number of courses in French they had taken in high school, according to their personal
experiences. In other words, regardless of the program they were in at the time of testing (at
Grade 12), we grouped students according to their personal program histories.
As noted, data for group tests include all available eligible students in a jurisdiction; numbers
are large in most categories. Speaking test data (including data for one measure of listening
comprehension based on the sentence-repetition task) include samples of students in each
jurisdiction. Numbers in some categories in our analyses are comparatively small; we have
included appropriate cautionary notes where reported findings rest on very small samples. We
use tests of significance throughout this chapter as a heuristic device to assess apparent
differences in test scores between subgroups of students.4
Analysis
To examine the research questions posed, we analysed the database in three different ways. First,
we used the entire sample to examine the test scores across program types. Second, we examined
test scores within programs for polar subsamples of the two clusters of students, one taking 5 to 8
courses in French at secondary school, and the other 12 to 14 courses. Third, we examined the
test scores of two different program types where the students had accumulated a comparable
number of hours of instruction in French, but in different ways. The following sections present a
more detailed description and the results from each analysis.
Results
Test Scores Across Program Types
First, we examined the database in order to investigate how the receptive (listening and reading)
and productive (speaking and writing) skills in French compared across programs in which
students had been exposed to different total accumulated instructional hours in French and in
which the starting ages of the students differed. In this instance, we compared the test scores on
the Senior French Proficiency Test Package by program type. Table 2.3 presents the results of
this analysis.
Table 2.3. Comparison of test scores by program type

                                              Early immersion     Middle immersion    Late immersion      Significance level (a)   Significant
                                                                                                          (two-tailed t test)      contrasts
Measure                                       Mean   SD    N      Mean   SD    N      Mean   SD    N      EI/MI  EI/LI  MI/LI      (Tukey < .05)
Listening comprehension
  Listening test total score (max=14)         9.70   2.29  523    9.33   2.15  184    9.73   2.44  170    ns     ns     ns
  Global semantic equivalence score
    (sentence repetition: max=14)             11.56  2.68  185    8.95   2.84  107    8.90   3.37  52     .000   .000   ns         EI>MI, EI>LI
Speaking (sentence repetition)
  Global exact score (max=14)                 4.17   2.78  185    2.22   2.51  107    2.50   2.95  52     .000   .000   ns         EI>MI, EI>LI
  Total count of scored features (max=24)     16.74  4.01  185    13.69  4.59  107    13.54  4.93  52     .000   .000   ns         EI>MI, EI>LI
  Count of syntactic features (max=13)        9.26   2.00  185    7.74   2.81  107    7.69   2.97  52     .001   .000   ns         EI>MI, EI>LI
  Count of liaisons (max=3)                   1.97   .96   185    1.57   .97   107    1.67   .98   52     .001   .050   ns         EI>MI
  Count of syncopes (max=3)                   2.28   .84   185    1.62   .93   107    1.73   1.05  52     .000   .001   ns         EI>MI, EI>LI
  Count of discursive features (max=5)        3.31   .98   185    2.77   .71   107    2.44   1.23  52     .000   .000   .118       EI>MI, EI>LI
Oral opinion measure (max=4)                  2.89   .62   185    2.87   .71   107    2.65   .71   52     ns     ns     ns
Reading comprehension
  Reading test total score (max=19)           11.11  3.25  521    10.62  2.84  183    11.18  3.17  174    ns     ns     ns
Cloze test (max=34)                           21.32  4.98  526    20.34  5.08  183    20.47  5.67  173    .027   ns     ns
Written opinion measure (max=4)               2.96   .95   519    2.83   .91   183    2.95   .93   172    ns     ns     ns

(a) The two-tailed t tests and Tukey analysis were completed only if the test scores between groups were
    statistically significant (p < .05) on an analysis of variance (ANOVA).

Significant differences among program groups occurred for 8 of 12 test measures, but not for
the listening test total score, the oral and written opinion measures, and the reading test total
score. For those measures where an analysis of variance (ANOVA, p < .05) indicated significant
differences across program groups, we conducted pairwise comparisons between groups. These
results (Table 2.3) show program comparisons (early versus middle immersion, early versus late,
middle versus late).
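The ANOVA screening step can be approximated from the summary statistics reported in Table 2.3. The sketch below is illustrative rather than a reconstruction of the authors' procedure: it assumes a standard one-way ANOVA, the helper name `anova_from_summary` is hypothetical, and because the published means and standard deviations are rounded, the resulting F ratio is only approximate.

```python
def anova_from_summary(groups):
    """One-way ANOVA from summary statistics.

    groups: list of (mean, sd, n) tuples, one per program.
    Returns (F ratio, df between, df within).
    """
    total_n = sum(n for _, _, n in groups)
    grand_mean = sum(mean * n for mean, _, n in groups) / total_n
    # Between-groups sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for mean, _, n in groups)
    # Within-groups sum of squares, recovered from each group's SD.
    ss_within = sum((n - 1) * sd ** 2 for _, sd, n in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Global semantic equivalence score, (mean, SD, N) for EI, MI, LI (Table 2.3).
global_semantic = [(11.56, 2.68, 185), (8.95, 2.84, 107), (8.90, 3.37, 52)]
f_ratio, df_b, df_w = anova_from_summary(global_semantic)
print(round(f_ratio, 1), df_b, df_w)
```

A large F ratio here is consistent with the table's report of significant differences on this measure; converting F to a p value would require an F distribution, which the standard library does not provide.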
When we compared the mean test scores of the EI students with those of the MI students
(two-tailed t tests), we found that the EI students significantly outperformed the MI students on eight
test measures (listening and speaking criteria on sentence-repetition task, as well as cloze test)
where the ANOVA had indicated significant differences across program groups. More
conservative post hoc multiple comparisons (Tukey, p < .05) confirmed this pattern for all test
measures except the cloze test.
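A pairwise comparison of this kind can be sketched from the summary statistics in Table 2.3. The example assumes a pooled-variance (equal variances) two-sample t test and approximates the two-tailed p value with the normal distribution, which is reasonable only because the degrees of freedom are large. The chapter does not state which t test variant was used, so the function names and the choice of test are assumptions, and the result need not match the published significance level exactly.

```python
from math import sqrt
from statistics import NormalDist


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with pooled variance; returns (t, df)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


def approx_two_tailed_p(t):
    """Normal approximation to the two-tailed p value (large df only)."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


# Cloze test, EI versus MI (means, SDs, and Ns from Table 2.3).
t, df = pooled_t(21.32, 4.98, 526, 20.34, 5.08, 183)
p = approx_two_tailed_p(t)
print(round(t, 2), df, round(p, 3))
```

The difference of about one point on a 34-item test reaches significance mainly because the samples are large, which helps explain why the more conservative Tukey comparison does not confirm it.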
In the case of comparisons between the mean test scores of the EI and LI students, t tests
revealed statistically significant differences on seven test measures (listening and speaking
criteria on the sentence-repetition task). Post hoc multiple comparisons (Tukey, p < .05) failed to
confirm differences between the EI and LI students in mean scores on the liaison count
(sentence-repetition task) as statistically significant. Thus, we found differences between EI and
LI students on all but one of the sentence-repetition measures, clearly indicating that EI students’
speaking skills are superior.
Moreover, we found no statistically significant differences between the EI students and either LI or MI
students on the listening total score, the oral and written opinion measures, and the reading test
total score.
The third comparison we made across programs involved all the MI and LI students. Here,
two-tailed t tests revealed no statistically significant differences between these groups on any of
the test measures.
In Table 2.4, we present percentiles of selected test scores by program type. The percentile
scores enable us to compare the distributions of student scores in different programs, rather than
simply the mean or average scores, yielding more detailed information about the comparative
performance in different programs. A percentile score marks the high point for a stated
proportion of students. Thus, as shown in Table 2.4, the 33.3 percentile score for EI students on
the cloze test is 19.0. This means that a third (33.3 per cent) of early immersion students obtained
scores of 19.0 or lower on the cloze test.
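The definition of a percentile score as the high point for a stated proportion of students can be made concrete with a small sketch. It uses the nearest-rank method on hypothetical cloze scores; the function name and the data are illustrative. The fractional values in Table 2.4 suggest that the original analysis interpolated between scores, so this is a simplification rather than the authors' procedure.

```python
from math import ceil


def percentile_score(scores, pct):
    """Lowest score at or below which at least pct per cent of students fall."""
    ordered = sorted(scores)
    # 1-based rank of the last student inside the bottom pct per cent.
    rank = max(1, ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical cloze scores for 12 students (not data from the study).
scores = [12, 15, 17, 19, 19, 21, 22, 24, 26, 27, 29, 31]

for pct in (10, 33.3, 50, 66.6, 90):
    print(pct, percentile_score(scores, pct))
```

With these made-up scores, a third of the students score 19 or lower, so 19 is the 33.3 percentile score.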
Table 2.4. Percentiles of selected test scores by program type

                                              Early immersion                  Middle immersion                 Late immersion
Measure                                       10    33.3  50    66.6  90       10    33.3  50    66.6  90       10    33.3  50    66.6  90
Listening comprehension
  Listening test total score (max=14)         6.4   9.0   10.0  11.0  12.6     7.0   8.6   9.0   10.0  12.0     6.1   9.0   10.0  11.0  13.0
Speaking (sentence repetition)
  Total count of scored features (max=24)     11.0  16.0  17.0  19.0  21.0     7.0   12.0  13.0  15.9  20.0     7.0   11.0  13.0  16.0  20.0
Reading comprehension
  Reading test total score (max=19)           6.2   10.0  --    13.0  15.0     7.0   10.0  11.0  12.0  14.0     7.0   10.0  11.0  13.0  15.0
Cloze test (max=34)                           14.0  19.0  --    24.0  27.0     13.0  18.0  21.0  23.0  26.6     12.0  19.0  21.0  23.9  27.0

Table 2.4 indicates that measures where we found no significant differences among mean
scores (here the listening and reading comprehension tests and the cloze test) also show similar
distributions of scores across programs. Thus, for example, Table 2.3 indicates that the cloze test
mean scores for all three program groups fall within a very narrow range, from 20.34 to 21.32.
Table 2.4 indicates the distributions. For instance, the cloze test score which defines the
10th percentile (i.e., the bottom 10 per cent of students) is 14.0 for EI students, 13.0 for MI
students and 12.0 for LI students. The 66.6 percentile score, defining the high point for two
thirds of the students, is 24.0 for EI students, 23.0 for MI, and 23.9 for LI. The pattern differs
for the total count of scored features, where EI students significantly outperformed both MI and
LI students.
Here the percentile scores of EI students exceed those of MI and LI students (similar to each
other) at least to the 66.6 percentile. The bottom 10 per cent of EI students on the total count of
scored features measure reaches a score of 11.0, compared with 7.0 for both MI and LI students.
Thus, some students in the bottom 10 per cent of the EI groups score substantially higher than
any students in the bottom 10 per cent of the MI or LI groups. Similarly, at the 66.6 percentile
(covering two thirds of students) the upper score is 19.0 for EI students, compared with 15.9 for
MI and 16.0 for LI.
The second part of our first research question asks to what degree the test score differences
across programs are proportional to total accumulated instructional hours in French. Table 2.2
shows the total number of accumulated hours of instruction in French at the end of secondary
school for each program. In general, the EI students had accumulated between 2.3 and 2.5 times
as many hours of instruction in French as either MI or LI students. Therefore, examining the test
scores of each group, one would logically expect higher scores from the EI students, in view of
the difference in total accumulated hours of instruction in French. Interestingly, the difference
occurs almost exclusively in speaking, rarely in the other skills.
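The lack of proportionality can be illustrated by setting the ratio of mean scores beside the ratio of instructional hours. The sketch below takes the hours multiple (2.3 to 2.5) from the paragraph above and the means from Table 2.3; the names `HOURS_RATIO`, `MEANS`, and `score_ratios` are illustrative. A ratio of means is a crude yardstick, since the tests have ceilings and scores are not scaled to hours, so this is a rough illustration rather than an analysis the authors report.

```python
HOURS_RATIO = (2.3, 2.5)  # EI hours relative to MI or LI, as stated in the text

# (EI, MI, LI) mean scores from Table 2.3.
MEANS = {
    "Total count of scored features": (16.74, 13.69, 13.54),
    "Reading test total score": (11.11, 10.62, 11.18),
    "Cloze test": (21.32, 20.34, 20.47),
}


def score_ratios(measure):
    """Return the EI/MI and EI/LI ratios of mean scores, rounded to two decimals."""
    ei, mi, li = MEANS[measure]
    return round(ei / mi, 2), round(ei / li, 2)


for measure in MEANS:
    print(measure, score_ratios(measure))
```

Even on the speaking measure, where EI students clearly lead, the ratio of means stays far below the ratio of hours, and on reading and the cloze test it is close to 1.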
These findings support Swain's (1981) and Cummins's (1983) suggestions that the relationship
between the amount of L2 instruction and L2 achievement is not direct or linear when the
students compared began learning the L2 at different ages. Our findings also support the theory
that older learners accomplish some aspects of L2 learning more efficiently, at least in school
settings.
Test Scores within Programs
Our second research question asked how the French skills (receptive and productive) of students
who start studying French at the same age vary. To explore this issue, we compared the effect of
different numbers of courses taught in French at the secondary level on the test scores of students
within the same program. We will also explain our method of investigation before presenting
findings.
Table 2.5 presents Pearson correlations of test scores with the number of secondary courses
taught in French for each program type. Almost all test measures and the number of secondary
courses correlate significantly for EI students (except syncopes5 and oral opinion). Similar, but
fewer, statistically significant correlations occur for the MI students. However, a statistically
significant correlation surfaces on only one test measure (liaisons on the sentence-repetition task)
for the LI students. This anomaly suggests that the LI student population may be
distinctive. We will return to this discussion when we examine the results of the LI comparisons.
Nevertheless, with the exception of the LI students, our data suggest a positive correlation
between the number of French courses at the secondary level and proficiency in French.
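For readers who want to see how such correlations and their significance markers arise, here is a minimal sketch. The Pearson formula and the conversion of r to a t statistic with n - 2 degrees of freedom are standard; the course counts and cloze scores are hypothetical, and the function names are illustrative rather than taken from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(var_x * var_y)


def t_from_r(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r * r))


# Hypothetical students: secondary courses taken in French and cloze scores.
courses = [5, 6, 8, 12, 13, 14]
cloze = [17, 20, 19, 23, 22, 25]
print(round(pearson_r(courses, cloze), 2))

# Why a small r can still be significant in a large sample (r = .11, N = 519)
# while a larger r in a small sample is only marginal (r = .28, N = 52).
print(round(t_from_r(0.11, 519), 2), round(t_from_r(0.28, 52), 2))
```

This shows why the modest EI correlations reach significance with several hundred students, whereas correlations of similar size among the 52 LI students in the speaking sample do not.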
Table 2.5. Pearson correlations of test scores with number of secondary courses in French by
program type

                                              Early immersion   Middle immersion  Late immersion
Measure                                       r        N        r        N        r        N
Listening comprehension
  Listening test total score (max=14)         .11*     519      .06      184      .05      168
  Global semantic equivalence score
    (sentence repetition: max=14)             .42***   184      .36***   107      .02      52
Speaking (sentence repetition)
  Global exact score (max=14)                 .27***   184      .33***   107      .09      52
  Total count of scored features (max=24)     .23***   184      .28**    107      .11      52
  Count of syntactic features (max=13)        .26***   184      .23**    107      .15      52
  Count of liaisons (max=3)                   .20**    184      .16      107      .28*     52
  Count of syncopes (max=3)                   .04      184      .29**    107      -.16     52
  Count of discursive features (max=5)        .16*     184      .17      107      -.01     52
Oral opinion measure (max=4)                  .09      184      .08      107      .21      52
Reading comprehension
  Reading test total score (max=19)           .09*     516      .02      183      .03      172
Cloze test (max=34)                           .17***   522      .27***   183      .11      171
Written opinion measure (max=4)               .12**    515      .18**    183      .13      170

* p < .05   ** p < .01   *** p < .001

To explore whether a substantial difference in the number of secondary French courses will
affect the proficiency skills of students within the same program, we extracted two polar groups
from each sample of students from the same program, one that had taken 5 to 8 French courses in
high school and one that had taken 12 to 14. The results of these comparisons appear in Tables
2.6, 2.7, and 2.8.
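The extraction of polar groups amounts to a simple filter on each student's count of secondary courses taken in French. The sketch below uses the course ranges given in the text (5 to 8 and 12 to 14); the record layout, the field name `courses`, and the sample students are hypothetical.

```python
def polar_groups(students, low=(5, 8), high=(12, 14)):
    """Split students into low- and high-course groups, dropping everyone else."""
    fewer = [s for s in students if low[0] <= s["courses"] <= low[1]]
    more = [s for s in students if high[0] <= s["courses"] <= high[1]]
    return fewer, more


# Hypothetical students from a single program (not data from the study).
students = [
    {"id": 1, "courses": 6},
    {"id": 2, "courses": 13},
    {"id": 3, "courses": 10},  # falls between the two clusters and is excluded
    {"id": 4, "courses": 8},
    {"id": 5, "courses": 3},   # below the lower cluster and is excluded
    {"id": 6, "courses": 12},
]

fewer, more = polar_groups(students)
print([s["id"] for s in fewer], [s["id"] for s in more])
```

Dropping the students between the two clusters sharpens the contrast in exposure, at the cost of smaller samples in some cells.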
In the case of EI students (Table 2.6), those who had completed more French courses at the
secondary level attained significantly higher test scores on all except three test
measures (the syncopes and discourse measures on the sentence-repetition task and the oral
opinion measure).
Table 2.6. Comparison of early immersion test scores by number of secondary courses in French

                                              EI ≤ 8              EI ≥ 12             Significance level
Measure                                       Mean   SD    N      Mean   SD    N      (two-tailed t test)
Listening comprehension
  Listening test total score (max=14)         9.30   2.48  109    9.92   2.26  301    .017
  Global semantic equivalence score
    (sentence repetition: max=14)             9.11   3.54  35     12.29  1.72  111    .001
Speaking (sentence repetition)
  Global exact score (max=14)                 2.91   2.70  35     4.77   2.65  111    .001
  Total count of scored features (max=24)     14.94  4.73  35     17.35  3.76  111    .001
  Count of syntactic features (max=13)        8.11   2.56  35     9.68   1.89  111    .001
  Count of liaisons (max=3)                   1.66   1.06  35     2.07   0.87  111    .026
  Count of syncopes (max=3)                   2.11   0.99  35     2.32   0.80  111    ns
  Count of discursive features (max=5)        3.06   1.14  35     3.42   0.89  111    ns
Oral opinion measure (max=4)                  2.77   0.60  35     2.90   0.65  111    ns
Reading comprehension
  Reading test total score (max=19)           10.50  3.25  107    11.55  3.13  295    .001
Cloze test (max=34)                           19.76  5.84  105    22.01  4.62  306    .001
Written opinion measure (max=4)               2.72   1.07  104    3.03   0.92  300    .001

The MI group with the greater number of French courses (Table 2.7) had significantly
better scores on most measures of the sentence-repetition task (global semantic
equivalence, global exact score, total count of scored features, syntactic features, and syncopes).
In addition, they did better on the cloze test. However, the subsample of MI students who had
done 12 to 14 French courses is quite small (n = 15-28, depending on the test measure);
consequently, we treat these results as exploratory.
Interestingly, with more exposure to French, EI and MI students had better speaking skills,
especially when we considered more 'form-related' criteria (Tables 2.6 and 2.7). Furthermore,
greater exposure to French at the secondary level resulted in better scores on the cloze test,
considered by some researchers as a global measure of overall L2 proficiency (Swain, Lapkin, &
Barik, 1976).
Table 2.7. Comparison of middle immersion test scores by number of secondary courses in French

                                              MI ≤ 8              MI ≥ 12             Significance level
Measure                                       Mean   SD    N      Mean   SD    N      (two-tailed t test)
Listening comprehension
  Listening test total score (max=14)         9.17   2.07  107    9.50   2.53  28     ns
  Global semantic equivalence score
    (sentence repetition: max=14)             8.17   2.99  63     11.07  1.94  15     .001
Speaking (sentence repetition)
  Global exact score (max=14)                 1.89   2.53  63     3.80   3.30  15     .016
  Total count of scored features (max=24)     12.65  4.89  63     15.67  4.32  15     .027
  Count of syntactic features (max=13)        7.06   3.00  63     8.60   2.32  15     .040
  Count of liaisons (max=3)                   1.52   1.03  63     1.87   0.92  15     ns
  Count of syncopes (max=3)                   1.38   0.89  63     2.07   1.03  15     .001
  Count of discursive features (max=5)        2.68   1.26  63     3.13   1.06  15     ns
Oral opinion measure (max=4)                  2.79   0.74  63     2.80   0.41  15     ns
Reading comprehension
  Reading test total score (max=19)           10.31  2.80  107    10.46  2.44  26     ns
Cloze test (max=34)                           18.87  5.01  106    22.69  3.73  26     .001
Written opinion measure (max=4)               2.66   0.90  106    3.00   0.80  26     ns

Among the LI students (Table 2.8), we found statistically significant differences in mean test
scores between the lower and higher French-course groups on only two measures: the written
opinion measure and the liaison count on the sentence-repetition task. This is not surprising, given
the results for the Pearson correlations (Table 2.5) between test scores and number of secondary
French courses for all the LI students. Although we must interpret these results carefully, given
the small sample of LI students who had done eight secondary French courses or fewer, our
findings suggest that factors other than the number of courses taken in French affect LI
students' proficiency more than they affect that of EI or MI students. One possible explanation might be the
selectivity of the LI population. Generally, these students probably chose for themselves whether
they wanted to enter immersion. Consequently, weak or poorly-motivated students are likely less
numerous in this sample. Moreover, some evidence in the literature suggests that LI has a higher
attrition rate than other programs (Halsall & Clarke, 1992). One could argue, therefore, that
students still in LI at Grade 12 are the strongest and most motivated of an already select group;
as a result, additional hours would not so dramatically affect these 'good language learners.' This
conjecture needs verifying in future research.
Table 2.8 Comparison of late immersion test scores by number of secondary courses in French

                                                   LI ≤ 8               LI ≥ 12          Significance level
Measure                                       Mean     SD     N     Mean     SD     N    (two-tailed t test)
Listening comprehension
  Listening test total score (max=14)         9.58   1.84    24     9.84   2.61   103    ns
  Global semantic equivalence score           9.73   2.80    11     8.73   3.87    26    ns
    (sentence repetition: max=14)
Speaking (sentence repetition)
  Global exact score (max=14)                 2.27   3.07    11     2.65   3.08    26    ns
  Total count of scored features (max=24)    13.09   5.47    11    14.00   5.26    26    ns
  Count of syntactic features (max=13)        7.09   3.39    11     8.08   3.19    26    ns
  Count of liaisons (max=3)                   1.27   1.10    11     1.96   0.92    26    .05
  Count of syncopes (max=3)                   2.00   1.10    11     1.54   1.07    26    ns
  Count of discursive features (max=5)        2.73   1.01    11     2.42   1.39    26    ns
Oral opinion measure (max=4)                  2.64   0.81    11     2.81   0.63    26    ns
Reading comprehension
  Reading test total score (max=19)          10.70   2.51    23    11.40   3.15   104    ns
  Cloze test (max=34)                        19.88   5.85    24    21.04   5.28   101    ns
Written opinion measure (max=4)               2.63   1.01    24     3.08   0.86   101    .05
Comparisons across Programs with Comparable Accumulated Hours of French Instruction
Finally, to explore issues of distribution rather than quantity of instructional time in French, we
compared students in MI and LI who had taken approximately the same number of hours in
French. Table 2.9 presents comparisons of test scores for MI students who had 8 or fewer
secondary French courses with scores for LI students who had 12 or more courses. We found
significantly better scores for the LI students on the global semantic equivalence score and
liaison count in the sentence-repetition task, on the reading and cloze tests, and on the written
opinion measure. These findings may relate to an interplay between intensity and 'recency'. The
LI students who took more secondary courses in French may have benefitted from this relatively
'intense' instruction in the 3 or 4 years before testing.6 Moreover, students who take 12 courses
do so in a consistent pattern for the most part: 3 courses per year. Students taking a median
number of 8 courses (Table 2.2) usually take a decreasing number over the 4 years of secondary
school (3 courses in Grade 9, decreasing to one in Grade 12). By the time we tested them at the
end of Grade 12, their exposure to substantial instructional time in French was more remote (less
recent).

Table 2.9 Comparison of middle and late immersion students with comparable accumulated hours
of French instruction

                                                   MI ≤ 8               LI ≥ 12          Significance level
Measure                                       Mean     SD     N     Mean     SD     N    (two-tailed t test)
Listening comprehension
  Listening test total score (max=14)         9.17   2.07   107     9.84   2.61   103    ns
  Global semantic equivalence score           8.17   2.99    63     8.73   3.87    26    .038
    (sentence repetition: max=14)
Speaking (sentence repetition)
  Global exact score (max=14)                 1.89   2.53    63     2.65   3.08    26    ns
  Total count of scored features (max=24)    12.65   4.89    63    14.00   5.26    26    ns
  Count of syntactic features (max=13)        7.06   3.00    63     8.08   3.19    26    ns
  Count of liaisons (max=3)                   1.52   1.13    63     1.96   0.92    26    .050
  Count of syncopes (max=3)                   1.38   0.89    63     1.54   1.07    26    ns
  Count of discursive features (max=5)        2.68   1.26    63     2.42   1.39    26    ns
Oral opinion measure (max=4)                  2.79   0.74    63     2.81   0.63    26    ns
Reading comprehension
  Reading test total score (max=19)          10.31   2.80   107    11.40   3.15   104    .009
  Cloze test (max=34)                        18.87   5.01   106    21.04   5.28   101    .003
Written opinion measure (max=4)               2.66   0.90   106     3.08   0.86   101    .001
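A reader can roughly check the significance levels in Table 2.9 from the published means, standard deviations, and Ns alone. The sketch below is illustrative only: the chapter does not say whether a pooled-variance or unequal-variance t test was used, so Welch's unequal-variance form is an assumption here, and the p value uses a large-sample normal approximation rather than the exact t distribution. The function names are hypothetical. The example takes the reading test total score row (MI with 8 or fewer courses against LI with 12 or more).

```python
import math


def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and degrees of freedom from summary statistics.

    Assumption: unequal-variance (Welch) form; the source does not state
    which variant of the two-tailed t test the authors used.
    """
    var1 = sd1 ** 2 / n1  # squared standard error of the group 1 mean
    var2 = sd2 ** 2 / n2  # squared standard error of the group 2 mean
    t = (mean2 - mean1) / math.sqrt(var1 + var2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (var1 + var2) ** 2 / (var1 ** 2 / (n1 - 1) + var2 ** 2 / (n2 - 1))
    return t, df


def two_tailed_p_normal(t):
    """Two-tailed p value treating t as standard normal (large samples only)."""
    return math.erfc(abs(t) / math.sqrt(2))


# Reading test total score, Table 2.9: MI <= 8 courses vs LI >= 12 courses
t, df = welch_t_from_summary(10.31, 2.80, 107, 11.40, 3.15, 104)
p = two_tailed_p_normal(t)
print(f"t = {t:.2f}, df = {df:.0f}, approximate p = {p:.3f}")
```

For this row the statistic comes out at about 2.65 on roughly 205 degrees of freedom, which is in line with the reported level of .009. With the much smaller oral subsamples (N = 15 to 63) the normal approximation is too optimistic, and an exact t distribution would be needed.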
Discussion
Our analysis comparing the French proficiency of immersion students in different programs has
revealed that EI students outperformed students from MI and LI programs on selected measures
of listening and speaking ability. The EI students did not do better on a multiple-choice test of
listening comprehension nor on any measures of French literacy. Clearly, an early start in the
immersion program has a beneficial impact predominantly on speaking skills.
When we examined the effect of different numbers of secondary courses taught in French on
students' test scores within the same program, we found that EI students with 12 or more
secondary French courses obtained higher test scores on most measures than those who had
completed 8 or fewer courses. Similar results were obtained for MI students; however, the group
that had completed more French secondary courses showed significantly better scores on fewer
measures. The same comparison of the LI students in our database produced an anomaly; higher
numbers of French courses in high school did not noticeably affect their test scores, with the
exception of the written opinion measure and the liaison count on the sentence-repetition task.
One could then argue that the LI students represent a more homogeneous ability group (despite
differences in their exposure to French at the secondary level) than either EI or MI students,
because they (and their parents) make a more informed decision about immersion's
appropriateness for them. In other words, they probably self-select more than either EI or MI
students. In addition, weaker and less motivated LI students may tend to leave the program
before Grade 12; as a result, the sample may become even more select than that of the two other
programs.
Last, we compared students from two different programs who had accumulated comparable
hours of French instruction: LI students who had completed 12 or more secondary French
courses with MI students who had completed 8 or fewer courses. The comparisons revealed that
LI students outperformed the MI students on some test measures: the reading and cloze tests, the
global semantic equivalence and liaison count on the sentence-repetition task, and the written
opinion measure. We can best explain these findings by an interaction of intensity and recency,
whereby the LI students had advantages in testing because they had more recently received more
intense instruction in French. Furthermore, the LI students in this comparison had done more
courses in French in the 3 years of senior secondary school closest to testing time. Again, one
could also argue that LI students who complete secondary school in immersion may be the
strongest and most motivated of an already select group.
In terms of program delivery, it is likely that some will interpret these findings as supporting
the implementation of late immersion programs rather than early immersion programs. We
caution strongly, however, against making such a simplistic interpretation; numerous factors not
investigated in this study (e.g., self-selection processes involved in late immersion) are at play in
the start-up and delivery of any immersion program. Furthermore, one must keep in mind that
the results are limited to the tests we used, which did not assess all aspects of language
proficiency, for example, sociolinguistic and discourse competence. Nevertheless, the results do
give strong support to providing increased instructional time in French in whichever program
one wishes to implement.
References
Carroll, J.B. (1975). The teaching of French as a foreign language in eight countries. New York:
Wiley.
Cummins, J. (1983). Language proficiency, biliteracy and French immersion. Canadian Journal
of Education 8(2), 117-38.
Gardner, R.C., Lalonde, R.N., Moorcraft, R., & Evers, F.T. (1985). Second language attrition:
The role of motivation and use (Research Bulletin No. 638). London, Ontario: University
of Western Ontario, Department of Psychology.
Genesee, F. (1987). Learning through two languages. Cambridge, MA: Newbury House.
Goldbloom, V. (1995). Commissioner of Official Languages: Annual Report, 1994. Ottawa:
Minister of Supply and Services Canada.
Halsall, N., & Clarke, L. (1992). Secondary school French immersion study. Ottawa: Carleton
Board of Education.
Harley, B. (1982). Age-related differences in the acquisition of the French verb system by
anglophone students in French immersion programs. PhD thesis. University of Toronto.
— (1986). Age in second language acquisition. Clevedon, UK: Multilingual Matters.
— (1994). After immersion: Maintaining the momentum. Journal of Multilingual and
Multicultural Development 25(2/3), 229-44.
Hart, D., & Lapkin, S. (1989). French immersion at the secondary/post-secondary interface.
(Final Report to the Ontario Ministry of Education.) Toronto: OISE, Modern Language
Centre.
— & Swain, M. (1989). Evaluation of continuing bilingual and late immersion programs at the
secondary level. (Final report to the Calgary Board of Education.) Toronto: OISE, Modern
Language Centre.
Husum, R., & Bryce, R. (1991). A survey of graduates from a Saskatchewan French immersion
high school. Canadian Modern Language Review 48, 135-43.
Krashen, S., Long, M., & Scarcella, R.C. (1979). Age, rate and eventual attainment in second
language acquisition. TESOL Quarterly 13, 573-82.
Krashen, S., Scarcella, R.C., & Long, M. (Eds.). (1982). Child-adult differences in second
language acquisition. Rowley, MA: Newbury House.
Lapkin, S., Swain, M., Kamin, J., & Hanna, G. (1980). Report on the 1979 evaluation of the Peel
County late French immersion program. Grades 8, 10, 11 and 12. Toronto: OISE, Modern
Language Centre.
— (1983). Late immersion in perspective: The Peel study. Canadian Modern Language Review
39(2), 182-206.
Long, M. (1990). Maturational constraints on language development. Studies in Second
Language Acquisition 11, 251-85.
MacFarlane, A., & Wesche, M. (1995). Immersion outcomes: Beyond language proficiency.
Canadian Modern Language Review 52(2), 250-74.
Morrison, F., Bonyun, R., & Pawley, C. (1981). Longitudinal and cross-sectional studies of
French proficiency in Ottawa and Carleton schools. (Eighth Annual Report submitted to
the Ontario Ministry of Education.) Ottawa: Ottawa Board of Education Research Centre.
— & Walsh, M. (1980). French proficiency and general progress: Students in elementary core
French programs, 1978-1980, and in immersion and bilingual programs, grades 8, 10 and
12, 1980. (Seventh Annual Report submitted to the Ontario Ministry of Education.)
Ottawa: Ottawa Board of Education Research Centre.
Ontario Ministry of Education. (1986). French as a second language: Curriculum guideline,
Ontario academic courses. Toronto: Queen's Printer for Ontario.
Smythe, P.C., Jutras, G.C., Bramwell, J.R., & Gardner, R.C. (1973). Second language retention
over varying intervals. Modern Language Journal 57(8), 400-5.
Stenett, R.G., & Earl, L.M. (1984a). Elementary French core program evaluation: Final report.
London, Ontario: Board of Education for the City of London.
— (1984b). Elementary French core program evaluation: Summary of findings 1978 to 1983.
London, Ontario: Board of Education for the City of London.
Stern, H.H., Swain, M., McLean, L.D., Friedman, R.J., Harley, B., & Lapkin, S. (1976).
Approaches to teaching French. Toronto: Ontario Ministry of Education.
Swain, M. (1981). Time and timing in bilingual education. Language Learning 31(1), 1-13.
— & Barik, H.C. (1976). The cloze test as a measure of second language proficiency. Working
Papers on Bilingualism 11, 32-42.
— & Lapkin, S. (1982). Evaluating bilingual education: A Canadian case study. Clevedon,
Avon: Multilingual Matters.
Turnbull, M. (1990a). Les diplômés des programmes d’immersion. MA thesis, McMaster
University, Hamilton, Ontario.
— (1990b). PEI study of French immersion students at the secondary/post-secondary interface.
(Report to the University of Prince Edward Island.) Charlottetown, P.E.I.
Wesche, M. (1988). Les diplômés de l’immersion: Implications dans le domaine de
l'enseignement du français. In C. Besnard & C. Elkabas (Eds.), L'Université de demain:
Courants actuels et apports de la didactique des langues à l'enseignement du français
langue seconde (pp. 215-29). Toronto: Canadian Scholars Press.
— (1993). French immersion graduates at university and beyond: What difference has it made?
In J. Alatis (Ed.), Georgetown Roundtable on Languages and Linguistics (pp. 208-40).
Washington, DC: Georgetown University Press.
— Morrison, F., Pawley, C., & Ready, D. (1986). Post-secondary follow-up of former immersion
students in the Ottawa area: A pilot study. Ottawa: University of Ottawa, Second Language
Institute.
— (1990). French immersion: Post-secondary consequences for individuals and universities.
Canadian Modern Language Review 46(3), 430-51.
Winch, R.E., & Campbell, D.T. (1969). Proof? No. Evidence? Yes. The significance of tests of
significance. American Sociologist 4(2), 140-3.
Notes
1
EI starts in either kindergarten or Grade 1, depending on the province and school board. MI
starts either in Grade 4 or 5. LI starts in Grade 6 or 7.
2
Description taken from Hart, Lapkin, and Swain (1989), pp. 4-5.
3
The Toronto Board 'late extended' LI program is anomalous. It too begins at Grade 7, but with
a maximum of only 40 per cent of
instructional time in French in Grades 7 and 8. The lowest figures in column 3 of Table 2.2
reflect this program.
4
Strictly speaking, these tests of significance are appropriate only for data from random
samples of students - that is, the oral proficiency test scores. The tests of significance formally
assess the magnitude of obtained differences against the size of differences that might occur with
a given probability (e.g., .10, .05, .01) through the 'luck of the draw' in selecting samples from
full populations. Where our data came from populations (discounting absentees and nonrespondents), no differences resulting from sampling error can appear. Obtained differences are
fully real, uncontaminated by effects of sampling. There are, however, arguments for employing
tests of significance for population data:
According to conventional reasoning, one is inquiring as to the probability that the
observed difference between subsample means may have occurred ... where the true difference
- or population value - is zero ...
But we elect to phrase the question differently: If we assume the set to be homogeneous,
what is the probability that dividing the set into two subsets on the basis of a variable of
classification that makes no real difference would give a difference between subsample means
as great as that observed?
... it is a plausible rival explanation of a difference that it is of the magnitude that would
appear frequently by chance ... (Winch & Campbell, 1969, pp.142-3)
The term 'explanation' may be too strong. However, the concept of comparing differences
between subpopulations against 'the magnitude that would appear frequently by chance' appears
particularly appropriate in studies such as ours, where results rest variously on
samples/subsamples or populations/subpopulations.
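The question Winch and Campbell pose in note 4 (how often would an arbitrary split of a homogeneous set produce a difference in means as large as the one observed?) is the logic of a permutation test. The sketch below illustrates that reasoning only; it is not the procedure used in this study, which reports t tests. The function name and the two short score lists are hypothetical, invented for the example.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10000, seed=0):
    """Share of random re-splits whose absolute mean difference is at least
    as large as the observed one (two-sided permutation p value)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # re-divide the set with no regard to group labels
        mean_a = sum(pooled[:n_a]) / n_a
        mean_b = sum(pooled[n_a:]) / (len(pooled) - n_a)
        if abs(mean_a - mean_b) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero
    return (hits + 1) / (n_permutations + 1)


# Hypothetical scores, NOT data from the study
lower_course_group = [7, 8, 9, 6, 8, 7, 9, 8]
higher_course_group = [11, 12, 10, 13, 12, 11, 12, 10]
p_value = permutation_p_value(lower_course_group, higher_course_group)
print(f"permutation p = {p_value:.4f}")
```

A small value indicates that a classification 'that makes no real difference' would rarely produce a gap this large, which is the rival explanation the quoted passage sets out to rule out.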
5
Syncope = the dropping of a mute 'e' where obligatory (e.g., Je le donne, le médecin).
6
The language retention/attrition literature, which examines the attrition of French second
language skills over time, has discussed a possible 'recency effect' (e.g., Gardner, Lalonde,
Moorcraft, & Evers, 1985; Smythe, Jutras, Bramwell, & Gardner, 1973). These studies have
shown that less use over an extended period (6 months or more) resulted in significant, but not
large, attrition of some L2 skills.