Reading and Writing: An Interdisciplinary Journal 14: 61–89, 2001.
© 2001 Kluwer Academic Publishers. Printed in the Netherlands.
Development of decoding, reading comprehension, vocabulary
and spelling during the elementary school years
COR AARNOUTSE1, JAN VAN LEEUWE2, MARINUS VOETEN3 &
HAN OUD4
1 Department of Educational Sciences, University of Nijmegen, Nijmegen, The Netherlands;
2 Statistical Consultancy Group, University of Nijmegen, Nijmegen, The Netherlands;
3 Department of Educational Sciences, University of Nijmegen, Nijmegen, The Netherlands;
4 Department of Special Education, University of Nijmegen, Nijmegen, The Netherlands
Abstract. The goal of this study was (1) to investigate the development of decoding (efficiency), reading comprehension, vocabulary and spelling during the elementary school years
and (2) to determine the differences between poor, average and good performers with regard
to the development of these skills. Twice each year two standardized tests for each skill were
administered. For two successive periods, one of the tests for each skill was the same. To
describe the development in terms of a latent variable evolving across grades, the structured-means version of the structural equation model was used. The growth was expressed in terms
of effect size. With respect to the first question, clear seasonal effects were found for reading
comprehension, vocabulary and spelling, while the seasonal effect for decoding efficiency
was restricted to the early grades. Progress tended to be greater from fall to spring than from
spring to fall. For decoding efficiency, and to a lesser degree for vocabulary and spelling,
growth showed a declining trend across grades. For reading comprehension, the progress in
grade 2 was lower than the progress in grade 3, but progress was declining across higher
grades. With respect to the second question, it appeared that initially low performers on reading comprehension, vocabulary and spelling tended to show a greater progress, especially in
periods where the largest amount of instruction was given. Although it was found that the low,
medium and high ability groups remain in the same order, as far as their means are concerned,
these findings do not confirm the existence of a Matthew effect for reading comprehension,
vocabulary and spelling. For decoding efficiency no clear differential effect could be found:
the gap between the poor and good performers did not widen over time for this skill.
Keywords: Decoding efficiency, Developmental curves, Matthew effect, Reading comprehension, Spelling, Vocabulary
Introduction
This article presents the results of a longitudinal study to investigate the
educational development of elementary school students with regard to some
important aspects of reading and language arts, namely decoding (efficiency),
reading comprehension, vocabulary and spelling. The purpose of this study
was to determine the average development of these skills in six years of Dutch
elementary education. The findings can then provide the basis for assessment
of the individual developmental trajectories of pupils in Dutch elementary
schools. In addition, average developmental courses for these skills were
examined for pupils identified as poor, average or good performers in first
grade.
In the present section, the concepts of decoding (efficiency), reading
comprehension, vocabulary and spelling will be described. Thereafter, the
results of some important longitudinal studies concerned with the development of decoding (efficiency), reading comprehension, vocabulary and
spelling in elementary school will be considered. In closing, the question of
what longitudinal research has to say about the development of the different
skills in low-, average- and high-achieving pupils will be discussed.
Decoding is the ability to transform printed letter strings into a phonetic
code (Perfetti 1985). The most important step towards the identification of
printed words is to utilize the alphabetic principle, which means to be able
to represent a letter or a combination of letters by their phonemes (Stanovich
1986). Application of the alphabetic principle depends in part on sensitivity to
phonemes as units of speech. In first grade, students learn to parse the printed
word into graphemes and subsequently assign the phonemes to the different
graphemes. After that the students have to blend these phonemes into words.
In the next grades, students learn to recognize words or groups of words as
fast as possible (Perfetti 1985). Decoding can be measured by the accuracy
of pronouncing increasingly difficult words or pseudowords or by the rate
at which increasingly difficult words or pseudowords are pronounced correctly. In the
latter case we speak of decoding efficiency. Reading comprehension requires
understanding the meanings of words, sentences and texts. At different levels
(i.e., the lexical, syntactic, semantic, and pragmatic levels), students try to
understand the written message of a writer. The simple view of reading
(Carver 1993; Gough & Tunmer 1986; Hoover & Gough 1990) claims that
reading comprehension depends on two components, viz., decoding and
linguistic comprehension. This theory states that these components are necessary for reading success but neither one is sufficient by itself. According
to this theory, there are developmental changes in the nature of the relationships between the components themselves, and between the components
and the criterion variable of reading comprehension. In the early grades, the
components of decoding and linguistic comprehension are, at most, weakly
related. From the middle grades on, linguistic comprehension contributes
more substantially to reading comprehension than decoding. According to
Sticht & James (1984), decoding is well developed by grade 3 whereas
vocabulary and comprehension continue to develop for many years to come.
Vocabulary refers to the knowledge of lexical meanings of words and the
concepts connected to these meanings. Differences in the size of vocabulary
have an effect on word recognition skills as well as reading comprehension
(Aarnoutse & Van Leeuwe 1988; Beck & McKeown 1991). In spelling,
the spoken language is converted into graphic symbols. It is known that
orthographic processing skills explain a considerable amount of variance in
reading ability (Cunningham & Stanovich 1993). In the first grades a strong
relationship exists between decoding and spelling.
In studying the development of language skills, it is important to know
whether specific skills are present and thus empirically distinguishable at
various points in time. It may be the case, for instance, that certain skills are
clearly distinguishable at later stages in development but do not differentiate
themselves at earlier stages. One must also be certain that the same skill is
being measured throughout the entire developmental period and that the skill
is thus assessed using the same or clearly equivalent measures. Finally, it
is important to know how stable the individual differences in the various
skills are. One can then ask about the extent to which later development is
predicted by earlier development. Several longitudinal studies on the development of decoding (efficiency), reading comprehension, vocabulary and
spelling during elementary school years have examined these questions (e.g.,
Aarnoutse & Van Leeuwe 1988; Boland 1991, 1993; Butler, Marsh, Sheppard
& Sheppard 1985; De Visser 1989; Malmquist 1969; Mommers 1987; Röhr
1978; Taube 1988; Van Dongen 1984).
Malmquist (1969) carried out a longitudinal study in grades 1, 2 and 3 (N
= 230) in Swedish elementary schools and found the predictive validity of
the reading and school readiness tests not to be high: the amount of variance
explained at the end of grade 1 for decoding efficiency, reading comprehension and spelling was 26%, 39% and 31%, respectively. Decoding tests,
administered in the middle of grade 1 predicted performance at the end of
grade 1 reasonably well: the amount of variance explained for decoding efficiency, reading comprehension and spelling was now 52%, 59% and 42%,
respectively. The tests for decoding efficiency, reading comprehension and
spelling at the end of grade 1 explained the variance in these variables at
the end of grade 2 with 69%, 55% and 49%, respectively, and the variance
in the variables at the end of grade 3 with 60%, 25% and 42%, respectively. The tests for decoding efficiency, reading comprehension and spelling
administered at the end of grade 2 explained 85%, 30% and 62% of the
variance in performance at the end of grade 3. This research shows that the
skills of decoding efficiency, reading comprehension and spelling can already
be distinguished at the end of grade 1. It also shows the tests of decoding
efficiency, reading comprehension and spelling administered in the previous
school year to be the best predictors of performance on these tests during the
next year.
Röhr (1978) examined the predictability of the reading and spelling
achievement of children (N = 180) at the ends of grades 1 and 2 in German
elementary schools. Using variables relating to the social and psychological
environments of these students when they were in kindergarten and variables
relating to their linguistic and cognitive abilities, the investigator found that
a total of 64.5% of the variance in the reading and spelling skills of the
children at the end of grade 1 could be predicted. At the end of grade 2,
64% and 63% of the variance in the reading and spelling skills of the students
could be predicted by the information on their reading and spelling skills
in grade 1. Unlike Malmquist, Röhr did not distinguish between decoding
and reading comprehension. Like Malmquist, however, he found aspects of
language (reading etc.) in a particular school year to be best predicted by
measures of the same aspects in the previous year.
In their longitudinal study involving Australian children from kindergarten, grades 1, 2, 3 and 6 (N = 320), Butler, Marsh, Sheppard & Sheppard
(1985) found the reading scores obtained at a particular point in elementary
school to be most directly and strongly related to the reading scores obtained
immediately prior to that point in time. The researchers conclude “that the
acquisition of reading skills for students in this study followed a smooth,
stable developmental pattern in which the acquisition of skills at any particular point in time depends on the mastery of prior skills” (p. 357). They
also found that the gap between the more and less able group increased over
time (the so-called fan spread). Once again, however, Butler et al. did not
distinguish between decoding and reading comprehension.
Taube (1988) followed 500 students in Swedish elementary schools for
8 years and found 80% of the variance in their reading skills in grade 6 to
be explained by their reading skills in grades 2 and 3. Similarly, De Visser
(1989) followed 300 Dutch students from grades 1 through 6 and found their
achievement in grade 6 to be both directly and indirectly influenced by their
achievement in the first years of elementary school.
Van Dongen (1984) and Mommers (1987) studied the development of
decoding efficiency, reading comprehension and spelling skills across a
period of three years in two samples of 12 randomly selected elementary
schools from the Netherlands (N = 225 and 236). The results indicated that:
(1) specific versus general prerequisites should be clearly distinguished in
predicting reading and spelling achievement, (2) decoding efficiency, reading
comprehension and spelling achievements constitute clearly distinguishable
skills after an eight month period of instruction, and (3) decoding efficiency, reading comprehension and spelling achievement are best predicted
by measures of the same skills at an earlier age. In this longitudinal study,
the distinct character of the decoding efficiency, reading comprehension and
spelling skills after eight months of formal instruction could be observed
more clearly than in cross-sectional, correlational research.
In a follow up on Mommers’s study, Boland (1991, 1993) examined the
decoding efficiency, reading comprehension and spelling achievement of the
same students in grade 6 (N = 310) and those studied by Mommers in grades
1, 2 and 3. The results of this study indicated that it is possible to distinguish decoding efficiency, reading comprehension and spelling in grade 6 as
separate variables. Reading comprehension strongly correlated with vocabulary and measures of verbal intelligence. The interrelations between the three
variables (decoding efficiency, reading comprehension and spelling) appeared
to be dominated by the effects between the variables of the same sort. “Most
decoding variance can be explained by a measure of the same variable at
an earlier time point; the same holds for reading comprehension and – to a
slightly less pronounced degree – for the spelling ability” (Boland 1991: 212).
Aarnoutse & Van Leeuwe (1988) examined the relative effects of decoding
efficiency, vocabulary and spatial intelligence on reading comprehension
using the scores for these variables in grades 3 and 6 from the longitudinal
research by Mommers and by Boland. Vocabulary measured in grades 3 and
6 appeared to be the most important predictor of reading comprehension
measured in grade 6. Spatial intelligence and decoding efficiency, measured
in grades 3 and 6 followed as predictors, with decoding efficiency making the
smallest contribution.
On the basis of these longitudinal results, one can conclude that decoding
efficiency, reading comprehension and spelling constitute separate factors
or constructs by the end of grade 1. The scores for each of these skills are
highly predictable by the scores for the same skill at an earlier point in time.
Prediction is best when the time interval is short. That is, measurement at
an immediately prior point in time appears to contain all of the information
needed to predict later performance and thereby makes the pupil’s previous
history more or less irrelevant.
In several cross-sectional studies conducted over the last 20 years,
researchers have examined the question of whether and to what extent differences exist between groups of students with regard to decoding (efficiency)
(Just & Carpenter 1987; Perfetti 1985; Stanovich 1991), reading comprehension (Daneman 1991; Garner 1987; Paris, Wasik & Turner 1991; Pearson
& Fielding 1991; Vauras, Kinnunen & Kuusela 1994), vocabulary (Beck
& McKeown 1991; Nagy & Herman 1987) and spelling (Ehri 1986, 1991;
Venezky & Massaro 1987). In most of the studies, the differences between
good and poor learners are considered. In only a few studies, however,
has the development of different groups of students with regard to the
above-mentioned skills been considered for more than one grade or age.
Lesgold & Resnick (1982) followed the reading progress of a group of
students from the beginning of first grade through third grade. Based on
second and third-grade comprehension scores the children were grouped into
three levels. It appeared that the lowest ability group began in first grade with
low skills in certain tests of basic skills required for reading, viz., vocabulary
and phoneme-grapheme knowledge. This group failed to improve in these
skills over time, falling farther behind their successful peers. The problems in
comprehension of this group did not show up until the second grade.
Aarnoutse, Mommers, Smits & Van Leeuwe (1986) examined the development of decoding efficiency, reading comprehension and spelling abilities
of different groups of students from the moment they entered the second
grade through fourth grade. On the basis of the scores on a decoding efficiency test administered at the beginning of second grade, four groups of
decoders could be distinguished. On the basis of the reading comprehension
and spelling tests, four groups of reading comprehenders and spellers could
similarly be distinguished. The results showed the different groups composed
at the beginning of grade 2 to maintain their relative position over the years. In
other words, the composition of the four groups of decoders, reading comprehenders and spellers remained largely the same in grades 2, 3 and 4 although
their average developmental curves were, with the exception of the decoding
skill, not parallel.
Juel, Griffith & Gough (1986) and Juel (1988) followed the decoding and
spelling development of 54 lower socio-economic students from grades 1
through 4. The results showed the chances of a poor reader (decoder) in first
grade ranking among the poor readers in fourth grade to be 88%; the chances
of a poor reader in second grade moving up to the level of an average reader
in fourth grade were 13%. The spelling measurements were not as stable over
time as the decoding measurements, however.
Shaywitz et al. (1995) followed 396 students from kindergarten through
grade 6 in order to examine the Matthew effect (Stanovich 1986), which
claims that the gap between good and poor readers increases as time goes on
(the so-called fan-spread). Stanovich (1986) has argued that those students
who read best initially improve their reading skills at a faster rate than
students who do not read as well, because of increased exposure to more
written language. In other words, over time, better readers get better (rich-get-richer), and poorer readers become relatively poorer (poor-get-poorer). This
means that the development of individual variation in reading can be characterised by a stable rank ordering of individuals and an increase of differences
among students (Bast 1995). Shaywitz et al., however, did not find evidence
of a Matthew effect for decoding skills. (They found a Matthew effect for
IQ, though this effect was relatively small.) Their findings suggest, in fact,
that “those children who were initially poor readers in the early school years
remain poor readers relative to other children in the sample. Thus, though the
rate of reading achievement is higher in those children in the lowest quartiles,
their reading achievement, as an absolute level, remains much lower than that
of children who began at a more advanced level of reading achievement. This
finding suggests that shortly after school entry, children’s reading achievement changes very little relative to that of their peers” (Shaywitz et al. 1995:
903).
To uncover the influence and causes of the Matthew effect in reading, Bast
(1995) carried out a longitudinal study in the first three years of the Dutch
elementary school (N = 235). In line with the Matthew model, Bast found that
initially poor decoders (poor in decoding efficiency) remained poor decoders
during those grades and that the performance gap with good decoders became
larger in the course of development. In contrast to the claims of the Matthew
model, however, Bast could not find evidence for increasing differences in
reading comprehension, vocabulary and attitudes towards reading. An increase
in interindividual variance was found only for leisure-time reading
activities. Another, more methodological finding of this study was that the
development of individual differences in decoding efficiency and reading
comprehension could adequately be described by a simplex growth model
(see Method).
As mentioned before, the present study is focused on the development of
decoding efficiency, reading comprehension, vocabulary and spelling during
the elementary school years. The main research questions were the following:
(1) How do the skills of decoding efficiency, reading comprehension, vocabulary and spelling develop over a period of six years of elementary education?
Is the average development of these skills a question of increasing growth
(progress) or does the growth in these skills appear to decrease after some
school years? (2) What are the differences between poor, average and good
performers with regard to the development of decoding efficiency, reading
comprehension, vocabulary and spelling? Do the differences between the
poor and good performers appear to increase during the elementary school
period? Can, in other words, a Matthew effect be detected?
On the basis of the research mentioned above, it was expected that the
average development in the areas of decoding efficiency, reading comprehension, vocabulary and spelling would show a rather stable pattern of growth
during the elementary school years. Another expectation was that the interindividual differences observed in the development of the poor, average and
good performers for these skills would not increase over time. That is, the relative
differences between the three groups of students were expected to remain the
same or to decrease throughout the six elementary school grades.
Method
Longitudinal design. Three cohorts of students were distinguished:
− students attending grade 1 in the school year 1991–1992 (cohort a),
− students attending grade 2 in the school year 1991–1992 (cohort b), and
− students attending grade 3 in the school year 1991–1992 (cohort c).
The cohorts were tested following the scheme presented below:
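Cohorts tested, by school year and grade ("-" = no cohort in that grade):

                Grade
School year     1    2    3    4    5    6
1991–1992       a    b    c    -    -    -
1992–1993       -    a    b    c    -    -
1993–1994       -    -    a    b    c    -
1994–1995       -    -    -    a    b    c
1995–1996       -    -    -    -    a    b
1996–1997       -    -    -    -    -    a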
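The scheme follows from a simple rule: a cohort that starts in a given grade in 1991–1992 moves up one grade per school year until it has completed grade 6. The following Python sketch regenerates the scheme from that rule. The function and variable names are illustrative assumptions and are not part of the original study.

```python
# Illustrative sketch: rebuild the cohort-by-grade testing scheme.
# Starting grades come from the cohort definitions in the text;
# all identifiers below are hypothetical.
START_YEAR = 1991
START_GRADE = {"a": 1, "b": 2, "c": 3}
LAST_GRADE = 6


def testing_scheme():
    """Map each school-year label to a {grade: cohort} dictionary."""
    scheme = {}
    for offset in range(LAST_GRADE):  # six school years: 1991-1992 .. 1996-1997
        year = START_YEAR + offset
        label = f"{year}-{year + 1}"
        row = {}
        for cohort, first_grade in START_GRADE.items():
            grade = first_grade + offset
            if grade <= LAST_GRADE:  # beyond grade 6 the cohort has left the design
                row[grade] = cohort
        scheme[label] = row
    return scheme


if __name__ == "__main__":
    for label, row in testing_scheme().items():
        cells = " ".join(row.get(g, "-") for g in range(1, LAST_GRADE + 1))
        print(label, cells)
```

Each grade from 2 onwards is reached by more than one cohort in different school years, which is what allows the cohorts to serve as each other's replicates.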
The students in the cohorts were tested twice each year in the months
of October and April. During each measurement period, two standardized
norm-referenced tests were administered for decoding efficiency, reading
comprehension, vocabulary and spelling. For two successive measurement
periods, at least one of the tests for each skill was the same, which made it
possible to construct a common scale for each skill and thereby determine
the developmental curves for the four competencies. Using multiple cohorts
had several advantages. First, the data obtained on one cohort could be used
to adapt the tests for subsequent cohorts. Second, the cohorts could function
as each other’s replicates. Third, the students who repeated a grade could be
studied in the next cohort.
Sample. A stratified random sample of 39 schools was taken from the
population of Dutch elementary schools. The stratification variables were
the degree of urbanisation (municipalities with more or less than 100,000
inhabitants) and the composition of the school population or ‘school weight’
(low, medium or high). The school weight is used by the Dutch government
for school funding and is a combination of ethnic origin, SES, and level
of education of the parents. Seven schools dropped out after the first year
because of the amount of work involved in the administration of the tests,
the low achievement of their students, and the late reporting of results to the
school. Thereafter, the three cohorts consisted of about 900 students each,
with 49% boys and 51% girls on average.
Instruments
The students’ achievement in reading and language arts constituted the
variables of interest in the present study. Reading achievement involved
decoding efficiency and reading comprehension; language achievement
involved vocabulary and spelling.
Decoding efficiency. During each measurement period, two forms of the same
test for decoding efficiency were administered: form A and B of the One
Minute Test (Een Minuut Test) from Brus & Voeten (1973). The One Minute
Test measures the word-decoding ability of students in grades 1 through 6.
The child’s task is to read aloud as many words as possible from a card in
one minute (there are four columns with 29 unrelated words in each). The
words on this card decrease in frequency of usage; the test measures the
rate of pronouncing increasingly difficult words correctly. The child’s score
is the number of words read in one minute, minus the number of words read
incorrectly. The test is a combination of a measure of rate and an accuracy
measure: it can be viewed as a test measuring decoding efficiency. A very
small number of students in grade 6 could read the 116 words correctly
in one minute. The two parallel forms of the test were administered on all
occasions. The distributions of the test scores were found to be acceptable
for all of the grades. The test/retest correlations and the correlations for the
two forms of the test were also found to be above 0.85 for all grades.
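The scoring rule described above can be written out as a short Python sketch. The function and argument names are hypothetical; only the rule itself (words read in one minute minus words read incorrectly) and the card size (four columns of 29 words) are taken from the text.

```python
# Illustrative sketch of the One Minute Test scoring rule described above.
# Function and argument names are hypothetical.
MAX_WORDS = 4 * 29  # four columns of 29 words = 116 words on the card


def decoding_efficiency_score(words_read: int, words_incorrect: int) -> int:
    """Number of words read in one minute minus the number read incorrectly."""
    if not 0 <= words_incorrect <= words_read <= MAX_WORDS:
        raise ValueError("counts must satisfy 0 <= incorrect <= read <= 116")
    return words_read - words_incorrect


print(decoding_efficiency_score(60, 3))  # 57
```

Because errors are subtracted from the number of words attempted, the score rewards speed only to the extent that the words are also read accurately.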
Reading comprehension. Two tests for reading comprehension were
administered at each measurement point: the Reading Comprehension Tests
(Begrijpend Leestest 3, 4, 5, 6, 7, 8) developed by Aarnoutse (1996a) and
one of the Reading Comprehension Tests (Lees en Begrijp 1 1979; Lees en
Begrijp 2 1980a; Begrijpend Lezen M3, E3, E4, M5, E5 1981) developed by
Cito. The Reading Comprehension Tests from Aarnoutse and from Cito are
designed to measure general reading comprehension of students in grades 1
through 6. All tests are measures of accuracy: there were no time limits for
the students to finish these tests. The students read short expository and/or
narrative texts and then answered multiple choice questions concerned with
the texts. The questions can pertain to the word, sentence or text levels. The
passages of the tests vary across the grades in difficulty from easy to difficult.
The administration of one test lasted about one hour. The distributions of
the scores for the reading comprehension tests were found to be acceptable
with the exception of some tests. Most of these exceptions showed a ceiling effect
during the second administration. The Cronbach alpha coefficients for the tests
from Aarnoutse and from Cito were all found to be 0.83 or higher and 0.79
or higher, respectively.
Vocabulary. During each measurement period, one or two of the Vocabulary
Tests (Woordenschattest 3, 4, 5, 6, 7, 8) developed by Aarnoutse (1996c)
were administered. In grades 3, 4, 5 and 6, one of the Vocabulary Tests
(Woordenschattest) developed by Stijnen (1975) was also administered;
in grade 4, the Synonyms Test (Synoniementest) developed by Aarnoutse
(1987) was also administered. The Vocabulary Tests and the Synonyms
Test from Aarnoutse and the tests from Stijnen are designed to measure the
ability of students to comprehend the meaning of a word within the context
of a single sentence. All tests are measures of accuracy: there were no time
limits for the students to finish these tests. The student’s task is to choose
from four alternatives the word with the same or almost the same meaning
as the word underlined in the target sentence. The vocabulary tests contain
words that, across grades, decrease in frequency of usage. The administration
of one test took about half an hour. The distributions of the scores for the
vocabulary tests were found to be fairly acceptable with the exception of
some tests. These tests showed a ceiling effect during the second administration.
The Cronbach alpha coefficients for the tests from Aarnoutse and from
Stijnen were all found to be 0.80 or higher and 0.89 or higher, respectively.
Spelling. During each measurement period, one or two of the Spelling Tests
(Spelling tests) developed by Aarnoutse (1996b) were administered. In grades
1 and 2, the Spelling Tests (Woorddictee 1 1980b; Woorddictee 2 1980c)
developed by Cito were also administered. The Spelling Tests from Aarnoutse
are designed to measure the ability of students in grades 1 through 6 to spell
words correctly. The student’s task is to spell the words presented within
a single sentence or series of sentences. The tests for grades 1 and 2 are
composed of only nouns; the tests in the higher grades contain verbs as well.
The distributions of the scores for these tests were found to be acceptable with
the exception of two tests. One of these tests showed a ceiling effect during
both administrations. The Cronbach alpha coefficients for the tests were all
found to be 0.83 or higher. The Spelling Tests from Cito are designed to
measure the ability of students in grades 1 and 2 to spell words. The student’s
task is to spell the words presented in a sentence. The two tests measure
the spelling of nouns. The distributions of the scores were not found to be
acceptable as the tests showed a strong ceiling effect. The Cronbach alpha
coefficients for the tests were found to be 0.80 or higher. All spelling tests
contain words that, across grades, decrease in frequency of usage. The tests
are measures of accuracy; all students had ample time to finish the tests. The
administration of one test took about 40 minutes.
All of the tests mentioned above meet the requirements of reliability and
construct and predictive validity (Aarnoutse, Van Leeuwe, Voeten, Van Kan
& Oud 1996). A factor analysis with all these tests as variables revealed
three factors, which could be interpreted as decoding efficiency, reading
comprehension/vocabulary, and spelling.
Procedure
All tests were administered by teachers after a short training. Only the One
Minute Test required individual administration. As already noted, the same
test was used on all occasions to measure decoding efficiency. For the other
skills, different tests were used depending on the grade level. The teachers
sent the test forms back to the research staff, and the results of the tests were
subsequently reported to the school. The research staff met regularly (twice a
year) with the teachers to discuss the progress of the project.
For the present study, only the data from the students of cohort a (grades
1 through 6) will be considered. This involves a total of 11 measurement
periods or two points a year with the exception of grade 1 (12; 21, 22; 31, 32;
41, 42; 51, 52; 61, 62; the first number designates the grade and the second
number the period: 1 for fall and 2 for spring). It was impossible to administer
tests in the first measurement period in grade 1 (October). At that time most of
the Dutch students can neither read nor spell. In the second period of grade 1
(April) most students can decode and spell words of one or two syllables. For
each skill two tests were administered during each of the eleven measurement
periods (from spring in grade 1 to spring in grade 6). To get a stable estimate
of the underlying trait, which enables the investigation of the longitudinal
development of the skills, a four step procedure was devised and applied to
each of the four skills, separately.
Step 1: Student selection
One of the problems in longitudinal studies is the drop-out of subjects.
Subjects left the longitudinal sample for several reasons: some of the students
moved to other schools and/or places; other students had to repeat a grade or
entered a special education program; and seven of the schools dropped out
after the first year. Only those students with a valid score on at least one of
the two tests for each of the eleven measurement periods were included in the
longitudinal sample. From the 1218 students who entered the project in 1991
in grade one, 515 students remained for decoding efficiency, 568 for reading
comprehension, 580 for vocabulary and 520 for spelling. It should be noted
that this drop-out may have influenced the representativeness of the sample;
all inferences are therefore made with regard to students who remained in the
same school for a period of six years without repeating a grade.
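The inclusion rule can be sketched in Python as follows. The data layout (a dictionary mapping each period code to a pair of test scores, with None for a missing score) and all identifiers are assumptions made for illustration; the study does not describe how the data were stored.

```python
# Illustrative sketch of the inclusion rule; the data layout is an assumption.
# scores[period] holds the (test 1, test 2) pair for one skill;
# None marks a missing score.
PERIODS = ["12", "21", "22", "31", "32", "41", "42", "51", "52", "61", "62"]


def in_longitudinal_sample(scores: dict) -> bool:
    """True if at least one of the two tests has a valid score in every period."""
    for period in PERIODS:
        first, second = scores.get(period, (None, None))
        if first is None and second is None:
            return False
    return True


complete = {p: (10, None) for p in PERIODS}   # one valid test per period
gap = {**complete, "41": (None, None)}        # both tests missing in fall of grade 4
print(in_longitudinal_sample(complete), in_longitudinal_sample(gap))  # True False
```

Under this rule, the retained samples reported above amount to roughly 42% to 48% of the 1218 students who entered the project, depending on the skill.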
Step 2: Handling missing data
The remaining data obviously contained some missing scores. For this reason,
the covariance matrices, to be analyzed in the next step, were estimated using
the full information maximum likelihood method (Anderson 1957).
Step 3: Estimating the underlying latent trait
The two test scores for a skill in each measurement period were assumed
to be two congeneric measures of that skill. This implies that the two tests
measure the same construct, but that their factor loadings may differ. For each
skill the latent variable at any measurement period was assumed to depend
directly on its immediate predecessor only, i.e. the latent variables for each
skill were supposed to form a so-called simplex structure (Jöreskog 1970)
over time. Structural equation modelling was performed separately for each
of the four skills using version 3.6 of the computer program AMOS (Arbuckle
1996). The structured-means version of the structural equation model was
used (Jöreskog & Sörbom 1993, chap. 10). The means, variances and covariances for the 21 or 22 variables observed for each skill constituted the input
to the program. The relevant parameters were estimated using the maximum
likelihood method. The models were almost identical for each skill. With
respect to decoding efficiency, scores of test versions A and B of the One
Minute Test were available at all eleven measurement periods; these versions
are common to all measurement periods. For the other three skills a relay-like procedure was used in which one test was administered at two successive
measurement periods and then substituted by the next test according to the
following scheme for the eleven measurement periods: AB, BC, CD, DE, EF,
FG, GH, HI, IJ, JK, KL. So test A was administered in period 12, test B in
periods 12 and 21, test C in periods 21 and 22, test D in periods 22 and 31,
etc.
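The relay scheme can be reproduced with a few lines of code. The sketch below only restates the scheme given in the text; the function names and the dictionary layout are assumptions made for illustration.

```python
import string

# Eleven measurement periods: first digit = grade, second = 1 (fall) or 2 (spring).
PERIODS = ["12", "21", "22", "31", "32", "41", "42", "51", "52", "61", "62"]


def relay_schedule(periods):
    """Pair each period with two tests so that the second test of one
    period is the first test of the next (AB, BC, CD, ...)."""
    tests = string.ascii_uppercase
    return {p: tests[i] + tests[i + 1] for i, p in enumerate(periods)}


def periods_for_test(schedule, test):
    """List the periods in which a given test was administered."""
    return [p for p, pair in schedule.items() if test in pair]


schedule = relay_schedule(PERIODS)
print(schedule["12"], schedule["62"])      # AB KL
print(periods_for_test(schedule, "D"))     # ['22', '31']
```

Every test except the first (A) and the last (L) appears in exactly two successive periods, which is what links the scale of adjacent periods.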
To make scores on the latent variables for a skill comparable over time,
two types of restrictions were imposed. First, for the same tests administered
in two successive periods, both the intercepts and the factor loadings were
constrained to be equal. Second, the scale of the latent variables had to be
fixed. This may be done arbitrarily. It was decided to set mean and variance
of the first latent variable equal to the mean and variance of the very first
observed variable (test A) in the first measurement period (12). By this type
of analysis observed scores on individual tests are ‘compressed’ to scores on
latent variables which are assumed to be linearly related to each other. The
influence of a high or a low score by chance is greatly removed, since it is
part of the error, accounted for in the model. For this reason, regression to the
mean may be regarded as negligible.
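The model described in Step 3 can be summarized in equations. The text gives no formulas, so the symbols below are assumptions chosen for illustration: $y_{jt}$ is the score on test $j$ in period $t$, $\eta_t$ the latent skill in period $t$ (with $t = 1$ for period 12), $\tau_j$ and $\lambda_j$ the intercept and factor loading of test $j$, and $\bar{y}_{A,1}$ and $s^2_{A,1}$ the observed mean and variance of test A in the first period.

```latex
\begin{align}
  % Measurement part: two congeneric tests per period, loadings may differ
  y_{jt} &= \tau_j + \lambda_j \eta_t + \varepsilon_{jt} \\
  % Structural part (simplex): each latent variable depends on its predecessor only
  \eta_t &= \alpha_t + \beta_t \eta_{t-1} + \zeta_t, \qquad t = 2, \dots, 11 \\
  % Scale of the first latent variable, fixed to test A in period 12
  \mathrm{E}(\eta_1) &= \bar{y}_{A,1}, \qquad \mathrm{Var}(\eta_1) = s^2_{A,1}
\end{align}
```

The first restriction is expressed by $\tau_j$ and $\lambda_j$ carrying no period subscript: a test administered in two successive periods keeps the same intercept and loading in both.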
Step 4: Making progress comparable: Effect size
Though the foregoing steps result in a common metric to describe skill development over time, they do not guarantee that dispersions remain the same over
time. It is likely that standard deviations vary across measurement periods.
To make the assessment of progress from one measurement period to the
next comparable across periods and skills a standardized average gain score
was computed. The differences between means of the latent variables at
two consecutive time points were divided by the standard deviation of the
latent variable at the first of the two time points. This measure is called an
‘effect size’, because it is analogous to the effect size index used in group
comparisons (Cohen 1969). The measure indicates how large mean gain on a
skill is relative to the magnitude of individual differences at the first of two
time points. The results of this study will primarily be presented in terms of
progress comparisons expressed by this measure.
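The effect size of Step 4 is a simple computation once the latent means and standard deviations are available. The sketch below follows the definition in the text; the function name, the `step` parameter and the numbers are made up for illustration and are not results from the study.

```python
def effect_sizes(means, sds, step=1):
    """Standardized mean gain between time points `step` apart:
    (mean at the later point - mean at the earlier point) / SD at the earlier point."""
    return [
        (means[t + step] - means[t]) / sds[t]
        for t in range(len(means) - step)
    ]


# Made-up latent means and standard deviations for four consecutive periods.
means = [20.0, 26.0, 29.0, 35.0]
sds = [8.0, 10.0, 12.0, 12.0]

per_period = effect_sizes(means, sds)
print(per_period)   # [0.75, 0.3, 0.5]
```

With `step=2` the same function gives the gain across two periods, still scaled by the standard deviation at the earlier point.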
Results
The development of the four skills
The development of decoding efficiency, reading comprehension, vocabulary
and spelling was analysed according to the four-step procedure described
in the foregoing section. For completeness, the global fit statistics for the
structural equation analyses of step 3 are presented in Table 1, where GFI
denotes the ‘goodness of fit index’ (Jöreskog & Sörbom 1984), CFI denotes
the ‘comparative fit index’ (Bentler 1990) and RMSEA denotes the ‘root
mean square error of approximation’ (Browne & Cudeck 1993).
Table 1. Global fit statistics for the four skills
                         χ2        df    p      GFI    CFI    RMSEA  N
Decoding efficiency      625.969   219   0.000  0.900  0.993  0.060  515
Reading comprehension    515.409   180   0.000  0.922  0.993  0.057  568
Vocabulary               928.323   200   0.000  0.844  0.986  0.079  580
Spelling                 540.745   180   0.000  0.913  0.992  0.062  520
Figure 1. Effect sizes for the four skills by measurement period: (a) decoding efficiency; (b) reading comprehension; (c) vocabulary; (d) spelling.
While the chi-square tests yielded significant deviations from the model
for each of the four skills, the remaining indicators show a satisfactory fit.
GFI and CFI approximate 1, while RMSEA is smaller than 0.08, which indicates a reasonable fit according to Browne and Cudeck (1993). Attempts were
undertaken to improve the fit by freeing covariances among residuals. Such
alterations did not improve the fit of the model substantially, however, and
no systematic patterns within these covariances among residuals could be
discovered.
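As a rough check on Table 1, the RMSEA values can be recomputed from χ2, df and N. The sketch uses one common point-estimate formula, sqrt(max(χ2 − df, 0) / (df (N − 1))). That the program used exactly this form is an assumption, although the results agree with the table to three decimals.

```python
import math


def rmsea(chi2, df, n):
    """Point estimate of the root mean square error of approximation."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


# (chi-square, degrees of freedom, N) as reported in Table 1.
TABLE_1 = {
    "decoding efficiency": (625.969, 219, 515),
    "reading comprehension": (515.409, 180, 568),
    "vocabulary": (928.323, 200, 580),
    "spelling": (540.745, 180, 520),
}

for skill, (chi2, df, n) in TABLE_1.items():
    print(f"{skill}: RMSEA = {rmsea(chi2, df, n):.3f}")
```

All four values fall below the 0.08 bound mentioned in the text, with vocabulary closest to it.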
The effect sizes across measurement periods for each of the four skills are
presented in Figure 1.
Figure 1 shows that, for all four skills, progress reaches its maximum in grade
2 (i.e. between fall, period 21, and spring, period 22, of grade 2). In this
figure two main trends are immediately clear. First, the effect sizes in the
period from fall to spring exceed the effect sizes in the period from spring to
fall, with some minor exceptions. The amount of time spent on instruction in
reading and language arts, which is greatest in the period from fall to spring
in the Dutch school system, apparently influences the progress of students in
each of the four skills. Second, the effect sizes are decreasing across grades.
This decline is greatest for decoding efficiency, starting from the end of grade
3 (period 31 → 32). Note that the effect sizes for reading comprehension
and vocabulary show a similar pattern. To inspect the tendency of decreasing
effect sizes across grades, effect sizes were also determined for each grade,
i.e. from spring to spring (12 → 22, 22 → 32, etc.). These effect sizes per
grade are presented in Figure 2.
Figure 2 shows that progress in decoding efficiency decreases strongly by
grade, especially after grade 2. Progress in reading comprehension is highest
in grade 3. Apparently, in grades 1 and 2 most attention is given to the technical aspect of reading. When decoding efficiency has reached a certain level
in grades 2 and 3 (cf. Sticht & James 1984), the attention shifts to reading
comprehension. The degree of progress in reading comprehension decreases
in grades 4 and 5, while an upward trend is apparent from grade 5 to 6. This
growth may be caused by the final examination of the elementary school
students in February in grade 6. For vocabulary and spelling, the degree of
progress is intermediate and decreases by grade. An exception is vocabulary
in grade 5. It is possible that the increase in vocabulary in this grade is caused
by the more prominent role social studies takes in the fifth grade of the Dutch
elementary school curriculum. In social studies students learn a lot of new
words and concepts.
Figure 2. Effect sizes for the four skills by grade.
Developmental differences between low, medium and high performers
In order to examine the development of students who performed differently
in grade 1, the students were subdivided into three subgroups. This was done
for each skill on the basis of their scores in the spring of grade 1, when these
students can already read words with one or two syllables. Low, medium and
high performers were distinguished according to a percentile distribution of
17%, 66%, and 17%, respectively, reflecting the cut-off points of −1, and +1
SD in a normal distribution. As two tests were involved for each skill in grade
1, the distribution of the mean z-score for the two tests served as the criterion.
By using the mean of two test scores the risk of regression to the mean in
comparing extreme groups is (greatly) diminished. Given that the two tests
have discrete values, the subdivision into 17%, 66% and 17% could not be
realized perfectly. The percentages were therefore approximated as closely
as possible; the number of pupils included in each analysis is presented in
Table 2.
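The subdivision into subgroups can be sketched as follows. This is an illustration under stated assumptions, not the procedure as implemented: z-scores are computed with the population standard deviation, the cut-offs are applied by rank, and tied students are split in input order. The function names are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize a list of scores (population SD, an assumption)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def split_groups(test1, test2, low_share=0.17, high_share=0.17):
    """Label each student 'low', 'medium' or 'high' by the mean of the
    z-scores on two tests, using rank-based cut-offs."""
    mean_z = [(a + b) / 2 for a, b in zip(z_scores(test1), z_scores(test2))]
    order = sorted(range(len(mean_z)), key=lambda i: mean_z[i])
    n = len(order)
    n_low = round(n * low_share)
    n_high = round(n * high_share)
    labels = ["medium"] * n
    for i in order[:n_low]:          # lowest mean z-scores
        labels[i] = "low"
    for i in order[n - n_high:]:     # highest mean z-scores
        labels[i] = "high"
    return labels


# Made-up scores for 100 students on two tests.
test1 = list(range(100))
test2 = [2 * x + 5 for x in range(100)]
labels = split_groups(test1, test2)
print(labels.count("low"), labels.count("medium"), labels.count("high"))  # 17 66 17
```

With real test scores, which take discrete values, ties at the cut-off make an exact 17/66/17 split impossible, as the text notes.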
For each of the four skills and each of the three subgroups, steps 2 to 4 of
the procedure were performed along the lines described for the total sample
in the procedure section. In the structural equation model of step 3, however,
the parameters in the main part of the model were fixed to the values attained
for the total group in the foregoing part. The parameters to be estimated were
restricted to the intercepts of the latent variables and the variances of the error
terms. In such a manner, the development in the means for the subgroups of
low, medium and high performers in grade 1 could be identified. To properly
compare the progress, effect sizes were based on the standard deviations for
the total group. These effect sizes, for each of the four skills and for the
subgroups of performers in grade 1, are presented in Figure 3.
Table 2. Numbers of students in subgroups
                         Low   Medium   High
Decoding efficiency       94    335      86
Reading comprehension     94    381      93
Vocabulary                91    387     102
Spelling                  93    340      87
Figure 3 shows again a seasonal effect for decoding efficiency in the first
grades, and a clear seasonal effect for reading comprehension, vocabulary
and spelling in all grades. For decoding efficiency, Figure 3 does not show a
clear differentiation between the three subgroups in the higher grades, even
though the group of high performers has the lowest effect sizes in most
grades. This last finding does not confirm the existence of a Matthew effect
for decoding. If a Matthew effect for decoding existed, the effect sizes
for the group of good performers would exceed the effect sizes for the group
of low performers in all grades. For reading comprehension, Figure 3 shows
a different pattern. First, low performers mostly show higher effect sizes than
the medium performers, which for the most part exceed the effect sizes of the
higher performers. This result does not support the existence of a Matthew
effect for reading comprehension. Second, the highest effect size for the low
performers is found in period 21 → 22, indicating that they are trying to catch
up. Third, the highest effect sizes for the medium and high subgroups are
also found in period 21 → 22. So for the total group of students (see Figure
1), the greatest amount of progress is found in grade 2, while the initially
low performers are already catching up from grade 1 onward. It is also striking that
the progress of the initially low performers exceeds the progress of the other
two groups in the periods from fall to spring. Apparently, the low performers
benefit most from instruction. For vocabulary and spelling similar patterns of
effect sizes are found (see Figure 3). The low performers mostly show higher
effect sizes than the medium and high performers. Exceptions are found for
vocabulary in grade 2 (period 22 → 31) and in grade 4 (period 42 → 51). In
general, the effect sizes are highest for the initially low performers and lowest
for the initially high performers, which does not confirm the Matthew effect
for both vocabulary and spelling.
In Figure 4 the average development for each of the subgroups of low,
medium and high performers on each of the four skills is presented.
The development curves for the three subgroups are distinct but nevertheless follow rather similar patterns with decreasing differences across grades for
reading comprehension, vocabulary and spelling. The mean of the group of
initially low performers remains lower than the mean of the group of initially
medium performers, which in turn, remains lower than the mean of the
group of initially high performers. The distinction between the three groups
is largest for decoding efficiency and smallest for spelling. Figure 4 shows
again that the gap between the low and high performers does not increase
over time. The decrease is smallest for decoding. The difference between the
means of the groups of high and low performers was 31.88 in grade 1 and
24.31 in grade 6. For the other three skills the decrease in difference between
groups of high and low performers was much greater. Since, according to
Shaywitz et al. (1995) “a Matthew effect would predict that the gap between
good and poor readers should increase as time goes on” (p. 897), the above
results do not confirm the existence of a Matthew effect for reading comprehension, vocabulary and spelling. For decoding, a Matthew effect is at least
not supported. As far as Figure 4 is concerned, these conclusions are based
on comparisons of differences over time between good and poor performers,
i.e. comparisons of vertical gaps over time. The comparison of the points in
time at which the groups reach the same level, i.e. comparison of horizontal gaps
in Figure 4, reveals findings of another nature. It is interesting to see that only
by grade 3 (32) do the poor decoders attain the level at which the good decoders
started in grade 1. So there is, on average, a difference in decoding efficiency
of two years between these two groups. In grade 6 (61) the poor decoders
attain a decoding efficiency level that the good decoders achieve in grade 3
(31). The same phenomenon can be noticed for reading comprehension: the
poor reading comprehenders attain in grade 3 (32) the level at which the good
reading comprehenders started in grade 1. In grade 6 (62) the poor reading
comprehenders achieve the level that the high performers attain in grade 4
(42). The same phenomenon can be found for vocabulary and spelling (in the
first grades).
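The two kinds of comparison can be made concrete with a short sketch. The gap values 31.88 and 24.31 are those reported above for decoding efficiency; the function names and the series of means used for the horizontal comparison are made up for illustration.

```python
def gap_change(gap_start, gap_end):
    """Absolute and relative change in the high-minus-low difference in means
    (a vertical gap compared at two points in time)."""
    absolute = gap_end - gap_start
    relative = absolute / gap_start
    return absolute, relative


def first_period_reaching(periods, means, level):
    """First period in which a group's mean reaches a given level
    (a horizontal comparison); None if the level is never reached."""
    for period, m in zip(periods, means):
        if m >= level:
            return period
    return None


# Vertical gap for decoding efficiency, grade 1 versus grade 6 (values from the text).
absolute, relative = gap_change(31.88, 24.31)
print(f"gap change: {absolute:.2f} ({relative:.1%})")   # gap change: -7.57 (-23.7%)

# Horizontal gap with made-up means for a low-performing group.
periods = ["12", "21", "22", "31", "32"]
low_means = [10.0, 14.0, 19.0, 24.0, 30.0]
print(first_period_reaching(periods, low_means, level=29.0))   # 32
```

A shrinking vertical gap, as in the first call, is the opposite of what a Matthew effect would predict.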
Figure 3. Effect sizes for low, medium and high performers: (a) decoding efficiency; (b) reading comprehension; (c) vocabulary; (d) spelling.
Figure 4. Development in means for low, medium and high performers in decoding efficiency, reading comprehension, vocabulary and spelling: (a) decoding efficiency; (b) reading comprehension; (c) vocabulary; (d) spelling.
Discussion
The present study focussed on two questions: (1) How do decoding efficiency,
reading comprehension, vocabulary and spelling develop during six years of
elementary school?, and (2) What are the differences between poor, average
and good performers with respect to the development of these skills?
With respect to the first question, clear seasonal effects were found for
reading comprehension, vocabulary and spelling, while the seasonal effect
for decoding efficiency was restricted to the early grades (cf. Brus & Voeten
1973; Voeten 1991). Apparently, the amount of instruction in reading and
language arts influences the rate of progress in these skills within these grades.
For decoding efficiency, and to a lesser degree for vocabulary and spelling,
growth shows a declining trend across grades. For reading comprehension,
the progress in grade 2 is lower than the progress in grade 3, while the rate of
progress is declining across higher grades. Apparently, most students make
much progress in decoding efficiency in the lower grades (1 and 2), after which the greatest progress shifts to reading comprehension. With increasing
age the rate of progress on vocabulary and spelling is declining, except for
vocabulary in grade 5. Perhaps, this latter revived progress in vocabulary
is caused by the more prominent role social studies takes in this grade.
With respect to the second question, it appeared that initially low performers
on reading comprehension, vocabulary and spelling tended to show higher
progress, especially in periods where the largest amount of instruction at
school is given. Although it was found that the low, medium and high groups
on average remain in the same order, these findings do not support the existence of a Matthew effect for reading comprehension, vocabulary and spelling.
For decoding efficiency no clear differential effect could be found: groups
of initially low, medium and high performers did not differ systematically
in mean development. In other words, the gap between the poor and good
performers did not widen over time for this skill, so a Matthew effect for
decoding efficiency is not supported.
Generally, the results of this study support the expectations. With respect
to the first expectation, the results show the development of the four
skills to be basically steady and to slow gradually over time. These findings
are for the most part in line with those of Malmquist (1969), Butler et al.
(1985), Mommers (1987) and Boland (1991), as reported in the introduction.
The results also show growth to be greatest in the lower grades; particularly
grade 2 for decoding efficiency and grade 3 for reading comprehension. With
respect to the second expectation, it appeared that students with initially
poorer ability in the four skills showed greater improvement over time relative
to students with initially better ability in these areas. Shaywitz et al. (1995)
found the same result with regard to reading. It appeared also that the developmental curves of the three groups followed rather similar patterns with
decreasing differences across grades for reading comprehension, vocabulary
and spelling. This result is for the most part in line with the findings of
Aarnoutse et al. (1986), Juel (1988), Shaywitz et al. (1995) and Bast (1995).
The results of this study have important implications for both theory and
educational practice. With respect to theory it appeared that one of the most
important hypotheses of the Matthew model of Stanovich (1986), namely
the increasing performance differences between pupils over time, was not
supported. Like Shaywitz et al. (1995) and for the most part Bast (1995), an
increase in the achievement difference between poor and good performers
was not found in this study. The question is how to explain the fact that Bast
(1995) did find an increase of variability for decoding efficiency, while in the
present study such a Matthew effect was not found. A possible explanation
can be that Bast (1995) excluded two groups from his sampling procedure
(the upper 25% and the lowest 10% of the distribution) “in order to arrive
at a sample of students that could be tracked the first three grades” (p. 17).
With respect to practical implications the results show growth to be greatest
in the lower grades, particularly grade 2 for decoding efficiency and grade 3
for reading comprehension. This finding suggests that intervention programs
should be implemented in the first three grades of the primary school or
in kindergarten. The effects of programs like Reading Recovery (Center et
al. 1995; Clay 1991) and Success for All (Slavin 1995; Slavin et al. 1995)
provide support for this finding. Another practical implication is that the
average developmental curves provide an important basis for student assessment, monitoring and prediction. Such a system can involve not only the
measurement of the progress of students over longer periods of time but
also the development of guidelines or programs for remediation along with a
computerised registration system (Jansen 1997; Oud & Jansen 1996).
A restriction on the conclusions relates to the students included in the
study. Only those students with a valid score on at least one of the two tests
for each of the eleven measurement periods were admitted. This meant that
students repeating a grade or entering a special-education program were not
involved in the study. Given the three cohorts in our study, it will nevertheless
be possible to analyse the development of the repeaters from cohorts 2 and 3
as well.
Several important questions can be formulated for further research, such as:
How do decoding efficiency, reading comprehension, vocabulary and spelling
relate across the six primary grades? Which relationships become stronger or
weaker over time? Another important question to be addressed is: How can
the learning progress of individual pupils be estimated over time? (Voeten
1990, 1991). Other questions are: How do decoding efficiency, reading
comprehension, vocabulary and spelling develop when pupils appear to be
(a) poor in decoding efficiency and poor in reading comprehension, (b)
poor in decoding efficiency and good in reading comprehension, (c) good
in decoding efficiency and poor in reading comprehension, and (d) good in
decoding efficiency and good in reading comprehension? What minimum
levels of decoding efficiency, vocabulary and other cognitive and affective
skills and attitudes are necessary to achieve a minimum level of reading
comprehension?
References
Aarnoutse, C.A.J. (1987). Synoniementest [Synonyms test]. Nijmegen: Berkhout.
Aarnoutse, C.A.J. (1996a). Begrijpend leestests [Reading comprehension tests]. Nijmegen:
Berkhout.
Aarnoutse, C.A.J. (1996b). Spellingtests [Spelling tests]. Nijmegen: Berkhout.
Aarnoutse, C.A.J. (1996c). Woordenschattests [Vocabulary tests]. Nijmegen: Berkhout.
Aarnoutse, C.A.J., Mommers, M.J.C., Smits, B.W.G.M. & Van Leeuwe, J.F.J. (1986). De
ontwikkeling en samenhang van technisch lezen, begrijpend lezen en spellen [Development and relation between decoding, reading comprehension and spelling], Pedagogische
Studiën [Pedagogical Studies] 63: 97–110.
Aarnoutse, C.A.J. & Van Leeuwe, J.F.J. (1988). Het belang van technisch lezen, woordenschat
en ruimtelijke intelligentie voor begrijpend lezen [Importance of decoding, vocabulary
and spatial intelligence for reading comprehension], Pedagogische Studiën [Pedagogical
Studies] 65: 49–59.
Aarnoutse, C., Van Leeuwe, J., Voeten, R., Van Kan, N. & Oud, J. (1996). Longitudinaal
onderzoek schoolvorderingen in het basisonderwijs [Longitudinal study of schoolachievements in the elementary school]. Nijmegen: University of Nijmegen.
Anderson, T.W. (1957). Maximum likelihood estimates for a multivariate normal distribution
when some observations are missing, Journal of the American Statistical Association 52:
200–203.
Arbuckle, J.L. (1996). Amos user’s guide version 3.6. Chicago: SmallWaters Corporation.
Bast, J.W. (1995). The development of individual differences in reading ability (doctoral
dissertation). Amsterdam: Paedologisch Institut.
Beck, I. & McKeown, M. (1991). Conditions of vocabulary acquisition. In: R. Barr, M.L.
Kamil, P.B. Mosenthal & P.D. Pearson (eds.), Handbook of reading research, volume 2
(pp. 789–814). New York: Longman.
Bentler, P. (1990). Comparative fit indexes in structural models, Psychological Bulletin 107:
238–246.
Boland, T. (1991). Lezen op termijn [Reading in the long term] (doctoral dissertation).
Nijmegen: University of Nijmegen.
Boland, T. (1993). The importance of being literate: Reading development in primary school
and its consequences for the school career in secondary education, European Journal of
Psychology of Education 8: 289–305.
Browne, M.W. & Cudeck, R. (1993). Alternative ways of assessing model fit. In: K.A. Bollen
& J.S. Long (eds.), Testing structural equation models (pp. 136–162). Newbury Park, CA:
Sage.
Brus, B. Th. & Voeten, M.J.M. (1973). Een Minuut Test [One minute test]. Nijmegen:
Berkhout.
Butler, S.R., Marsh, H.W., Sheppard, M.J. & Sheppard, J.L. (1985). Seven year longitudinal
study of the early prediction of reading achievement, Journal of Educational Psychology
77: 349–361.
Carver, R.P. (1993). Merging the simple view of reading with rauding theory, Journal of
Reading Behavior 25: 439–455.
Center, Y., Wheldall, K., Freeman, L., Outhred, L. & McNaught, M. (1995). An evaluation of
reading recovery, Reading Research Quarterly 30: 240–263.
Cito (1979). Lees en begrijp 1 [Read and comprehend 1]. Arnhem: Cito.
Cito (1980a). Lees en begrijp 2 [Read and comprehend 2]. Arnhem: Cito.
Cito (1980b). Woorddictee 1 [Spelling test 1]. Arnhem: Cito.
Cito (1980c). Woorddictee 2 [Spelling test 2]. Arnhem: Cito.
Cito (1981). Begrijpend lezen 3, 4 en 5 [Reading comprehension 3, 4 and 5]. Arnhem: Cito.
Clay, M.M. (1991). Becoming literate. Auckland: Heinemann.
Cohen, J. (1969). Statistical power analysis for the behavioral sciences. New York: Academic
Press.
Cunningham, A.E. & Stanovich, K.E. (1993). Children’s literacy environments and early word
recognition subskills, Reading and Writing: An Interdisciplinary Journal 5: 193–204.
Daneman, M. (1991). Individual differences in reading skills. In: R. Barr, M.L. Kamil, P.B.
Mosenthal & P.D. Pearson (eds.), Handbook of reading research, volume 2 (pp. 512–538).
New York: Longman.
De Visser, L. (1989). Leerlingkenmerken en schoolprestaties [Student characteristics and
school achievements]. In: J.H. Slavenburg & T.A. Peters (eds.), Het project Onderwijs en Sociaal Milieu: een eindbalans [The project education and social environment].
Rotterdam: Rotterdamse Schooladviesdienst.
Ehri, L.C. (1986). Sources of difficulty in learning to spell and read. In: M.L. Wolraich &
D. Routh (eds.), Advances in developmental and behavioral pediatrics (pp. 121–195).
Greenwich, CT: Jai Press.
Ehri, L.C. (1991). Development of the ability to read words. In: R. Barr, M.L. Kamil, P.B.
Mosenthal & P.D. Pearson (eds.), Handbook of reading research, volume 2 (pp. 383–417).
New York: Longman.
Garner, R. (1987). Metacognition and reading comprehension. Norwood, NJ: Ablex.
Gough, P.B. & Tunmer, W.E. (1986). Decoding, reading, and reading disability, Remedial and
Special Education 7: 6–10.
Hoover, W.A. & Gough, P.B. (1990). The simple view of reading, Reading and Writing: An
Interdisciplinary Journal 2: 127–160.
Jansen, R.A.R.G. (1997). Constructing monitoring systems in the behavioral sciences. The
SEM state space approach (doctoral dissertation). Nijmegen: University Press Nijmegen.
Jöreskog, K.G. (1970). Estimation and testing of simplex models, British Journal of Mathematical and Statistical Psychology 23: 121–145.
Jöreskog, K.G. & Sörbom, D. (1984). LISREL VI user’s guide (3rd edn.). Mooresville, IN:
Scientific Software.
Jöreskog, K.G. & Sörbom, D. (1993). New features in LISREL 8. Chicago: Scientific Software
International.
Juel, C. (1988). Learning to read and to write: A longitudinal study of 54 children from first
through fourth grades, Journal of Educational Psychology 80: 437–447.
Juel, C., Griffith, P.L. & Gough, Ph.B. (1986). Acquisition of literacy: A longitudinal study of
children in first and second grade, Journal of Educational Psychology 78: 243–255.
Just, M.A. & Carpenter, P.A. (1987). The Psychology of reading and language comprehension.
Newton, MA: Allyn and Bacon.
Lesgold, A.M. & Resnick, L.B. (1982). How reading disabilities develop: Perspectives from
a longitudinal study. In: J.P. Das, R. Mulcahey & A. Wall (eds.), Theory and research in
learning disabilities (pp. 155–187). New York: Plenum.
Malmquist, E. (1969). Lässvarigheter pa grundskolans lagstadium: Experimentelle studier
[A longitudinal study of reading skills in the elementary schools: Experimental studies].
Linköping: University of Linköping.
Mommers, M.J.C. (1987). An investigation into the relation between word recognition skills,
reading comprehension and spelling skills in the first two years of primary school, Journal
of Research in Reading 10: 122–143.
Nagy, W.E. & Herman, P.A. (1987). Breadth and depth of vocabulary knowledge: Implications
for acquisition and instruction. In: M.G. McKeown & M.E. Curtis (eds.), The nature of
vocabulary acquisition (pp. 19–35). Hillsdale, NJ: Erlbaum.
Oud, J.H.L. & Jansen, R.A.R.G. (1996). Nonstationary longitudinal LISREL model estimation from incomplete panel data using EM and the Kalman smoother. In: U. Engel & J.
Reinecke (eds.), Analysis of change: Advanced techniques in panel data analysis (pp. 135–
159). New York: de Gruyter.
Paris, S.G., Wasik, B.A. & Turner, J.C. (1991). The development of strategic readers. In: R.
Barr, M.L. Kamil, P.B. Mosenthal & P.D. Pearson (eds.), Handbook of reading research,
volume 2 (pp. 609–640). New York: Longman.
Pearson, P.D. & Fielding, L. (1991). Comprehension instruction. In: R. Barr, M.L. Kamil, P.B.
Mosenthal & P.D. Pearson (eds.), Handbook of reading research, volume 2 (pp. 815–860).
New York: Longman.
Perfetti, C.A. (1985). Reading ability. New York: Oxford University Press.
Röhr, H. (1978). Voraussetzungen zum Erlernen des Lesens und Rechtschreibens [Prerequisites for learning to read and to spell] (Dissertation). Münster: University of Münster.
Shaywitz, B.A., Holford, T.R., Holahan, J.M., Fletcher, J.M., Stuebing, K.K., Francis, D.J.
& Shaywitz, S.E. (1995). A Matthew effect for IQ but not for reading: Results from a
longitudinal study, Reading Research Quarterly 30: 894–906.
Slavin, R.E. (1995). Cooperative learning. Boston: Allyn and Bacon.
Slavin, R.E., Madden, N.A., Dolan, L.J., Wasik, B.A., Ross, S., Smith, L. & Dianda, M.
(1995). Success for all; a summary of research. Paper presented at the annual conference
of the American Educational Research Association, San Francisco, CA.
Stanovich, K.E. (1986). Matthew effects in reading: Some consequences of individual
differences in the acquisition of literacy, Reading Research Quarterly 21: 360–407.
Stanovich, K.E. (1991). Discrepancy definitions of reading disability: Has intelligence led us
astray?, Reading Research Quarterly 26: 7–29.
Sticht, T.G. & James, J.H. (1984). Listening and reading. In: P.D. Pearson (ed.), Handbook of
reading research, volume 1 (pp. 293–318). New York: Longman.
Stijnen, P.J.J. (1975). Woordenschattest [Vocabulary test]. Nijmegen: Berkhout.
Taube, K. (1988). Reading acquisition and self-concept. Umea: University of Umea.
Van Dongen, D. (1984). Leesmoeilijkheden. Naar diagnostiserend onderwijzen bij het leren
lezen [Reading difficulties. Toward diagnostic teaching in reading] (doctoral dissertation).
Tilburg: Zwijsen.
Vauras, M., Kinnunen, R. & Kuusela, L. (1994). Development of text-processing skills in
high-, average-, and low-achieving primary school children, Journal of Reading Behavior
26: 361–389.
Venezky, R.L. & Massaro, D.W. (1987). Orthographic structure and spelling-sound regularity
in reading English words. In: A. Allport, D. Mackay, W. Prinz & Scheerer (eds.), Language
perception and production (pp. 159–179). London: Academic Press.
Voeten, M.J.M. (1990). Longitudinaal onderzoek van de leesontwikkeling [Longitudinal
research of reading development]. In: C. Aarnoutse & M. Voeten (eds.), Gaat en onderwijst. Liber amicorum voor Dr. M.J.C. Mommers [Go and teach. Liber amicorum for Dr.
M.J.C. Mommers]. Tilburg: Zwijsen.
Voeten, M.J.M. (1991). Beschrijving van de individuele ontwikkeling van leervorderingen
[Description of the individual development of school achievement]. In: J. Hoogstraten &
W.J. van der Linden (eds.), Methodologie [Methodology]. Amsterdam: Stichting Centrum
voor Onderwijsonderzoek.
Address for correspondence: Cor Aarnoutse, PH.D., University of Nijmegen, Department of
Educational Sciences, Montessorilaan 3, 6525 HR Nijmegen, The Netherlands
Phone: +31 243612081; Fax: + 31 243615978; E-mail: [email protected]