Reading and Writing: An Interdisciplinary Journal 14: 61–89, 2001. © 2001 Kluwer Academic Publishers. Printed in the Netherlands.

Development of decoding, reading comprehension, vocabulary and spelling during the elementary school years

COR AARNOUTSE1, JAN VAN LEEUWE2, MARINUS VOETEN3 & HAN OUD4
1 Department of Educational Sciences, University of Nijmegen, Nijmegen, The Netherlands; 2 Statistical Consultancy Group, University of Nijmegen, Nijmegen, The Netherlands; 3 Department of Educational Sciences, University of Nijmegen, Nijmegen, The Netherlands; 4 Department of Special Education, University of Nijmegen, Nijmegen, The Netherlands

Abstract. The goal of this study was (1) to investigate the development of decoding (efficiency), reading comprehension, vocabulary and spelling during the elementary school years and (2) to determine the differences between poor, average and good performers with regard to the development of these skills. Twice each year two standardized tests for each skill were administered. For two successive periods, one of the tests for each skill was the same. To describe the development in terms of a latent variable evolving across grades, the structured-means version of the structural equation model was used. The growth was expressed in terms of effect size. With respect to the first question, clear seasonal effects were found for reading comprehension, vocabulary and spelling, while the seasonal effect for decoding efficiency was restricted to the early grades. Progress tended to be greater from fall to spring than from spring to fall. For decoding efficiency, and to a lesser degree for vocabulary and spelling, growth showed a declining trend across grades. For reading comprehension, the progress in grade 2 was lower than the progress in grade 3, but progress was declining across higher grades.
With respect to the second question, it appeared that initially low performers on reading comprehension, vocabulary and spelling tended to show greater progress, especially in periods where the largest amount of instruction was given. Although it was found that the low, medium and high ability groups remain in the same order, as far as their means are concerned, these findings do not confirm the existence of a Matthew effect for reading comprehension, vocabulary and spelling. For decoding efficiency no clear differential effect could be found: the gap between the poor and good performers did not widen over time for this skill.

Keywords: Decoding efficiency, Developmental curves, Matthew effect, Reading comprehension, Spelling, Vocabulary

Introduction

This article presents the results of a longitudinal study to investigate the educational development of elementary school students with regard to some important aspects of reading and language arts, namely decoding (efficiency), reading comprehension, vocabulary and spelling. The purpose of this study was to determine the average development of these skills in six years of Dutch elementary education. The findings can then provide the basis for assessment of the individual developmental trajectories of pupils in Dutch elementary schools. In addition, average developmental courses for these skills were examined for pupils identified as poor, average or good performers in first grade.

In the present section, the concepts of decoding (efficiency), reading comprehension, vocabulary and spelling will be described. Thereafter, the results of some important longitudinal studies concerned with the development of decoding (efficiency), reading comprehension, vocabulary and spelling in elementary school will be considered.
In closing, the question of what longitudinal research has to say about the development of the different skills in low-, average- and high-achieving pupils will be discussed.

Decoding is the ability to transform printed letter strings into a phonetic code (Perfetti 1985). The most important step towards the identification of printed words is to utilize the alphabetic principle, which means to be able to represent a letter or a combination of letters by their phonemes (Stanovich 1986). Application of the alphabetic principle depends in part on sensitivity to phonemes as units of speech. In first grade, students learn to parse the printed word into graphemes and subsequently assign the phonemes to the different graphemes. After that the students have to blend these phonemes into words. In the next grades, students learn to recognize words or groups of words as fast as possible (Perfetti 1985). Decoding can be measured by the accuracy of pronouncing increasingly difficult words or pseudowords or by the rate at which increasingly difficult words or pseudowords are pronounced correctly. In the latter case we speak of decoding efficiency.

Reading comprehension requires understanding the meanings of words, sentences and texts. At different levels (i.e., the lexical, syntactic, semantic, and pragmatic levels), students try to understand the written message of a writer. The simple view of reading (Carver 1993; Gough & Tunmer 1986; Hoover & Gough 1990) claims that reading comprehension depends on two components, viz., decoding and linguistic comprehension. This theory states that these components are necessary for reading success but neither one is sufficient by itself. According to this theory, there are developmental changes in the nature of the relationships between the components themselves, and between the components and the criterion variable of reading comprehension. In the early grades, the components of decoding and linguistic comprehension are, at most, weakly related.
From the middle grades on, linguistic comprehension contributes more substantially to reading comprehension than decoding. According to Sticht & James (1984), decoding is well developed by grade 3 whereas vocabulary and comprehension continue to develop for many years to come.

Vocabulary refers to the knowledge of lexical meanings of words and the concepts connected to these meanings. Differences in the size of vocabulary have an effect on word recognition skills as well as reading comprehension (Aarnoutse & Van Leeuwe 1988; Beck & McKeown 1991).

In spelling, the spoken language is converted into graphic symbols. It is known that orthographic processing skills explain a considerable amount of variance in reading ability (Cunningham & Stanovich 1993). In the first grades a strong relationship exists between decoding and spelling.

In studying the development of language skills, it is important to know whether specific skills are present and thus empirically distinguishable at various points in time. It may be the case, for instance, that certain skills are clearly distinguishable at later stages in development but do not differentiate themselves at earlier stages. One must also be certain that the same skill is being measured throughout the entire developmental period and that the skill is thus assessed using the same or clearly equivalent measures. Finally, it is important to know how stable the individual differences in the various skills are. One can then ask about the extent to which later development is predicted by earlier development.

Several longitudinal studies on the development of decoding (efficiency), reading comprehension, vocabulary and spelling during elementary school years have examined these questions (e.g., Aarnoutse & Van Leeuwe 1988; Boland 1991, 1993; Butler, Marsh, Sheppard & Sheppard 1985; De Visser 1989; Malmquist 1969; Mommers 1987; Röhr 1978; Taube 1988; Van Dongen 1984).
Malmquist (1969) carried out a longitudinal study in grades 1, 2 and 3 (N = 230) in Swedish elementary schools and found the predictive validity of the reading and school readiness tests not to be high: the amount of variance explained at the end of grade 1 for decoding efficiency, reading comprehension and spelling was 26%, 39% and 31%, respectively. Decoding tests administered in the middle of grade 1 predicted performance at the end of grade 1 reasonably well: the amount of variance explained for decoding efficiency, reading comprehension and spelling was now 52%, 59% and 42%, respectively. The tests for decoding efficiency, reading comprehension and spelling at the end of grade 1 explained 69%, 55% and 49%, respectively, of the variance in these variables at the end of grade 2, and 60%, 25% and 42%, respectively, of the variance in the variables at the end of grade 3. The tests for decoding efficiency, reading comprehension and spelling administered at the end of grade 2 explained 85%, 30% and 62% of the variance in performance at the end of grade 3. This research shows that the skills of decoding efficiency, reading comprehension and spelling can already be distinguished at the end of grade 1. It also shows the tests of decoding efficiency, reading comprehension and spelling administered in the previous school year to be the best predictors of performance on these tests during the next year.

Röhr (1978) examined the predictability of the reading and spelling achievement of children (N = 180) at the ends of grades 1 and 2 in German elementary schools. Using variables relating to the social and psychological environments of these students when they were in kindergarten and variables relating to their linguistic and cognitive abilities, the investigator found that a total of 64.5% of the variance in the reading and spelling skills of the children at the end of grade 1 could be predicted.
At the end of grade 2, 64% and 63% of the variance in the reading and spelling skills of the students could be predicted by the information on their reading and spelling skills in grade 1. Unlike Malmquist, Röhr did not distinguish between decoding and reading comprehension. Like Malmquist, however, he found aspects of language (reading etc.) in a particular school year to be best predicted by measures of the same aspects in the previous year. In their longitudinal study involving Australian children from kindergarten, grades 1, 2, 3 and 6 (N = 320), Butler, Marsh, Sheppard & Sheppard (1985) found the reading scores obtained at a particular point in elementary school to be most directly and strongly related to the reading scores obtained immediately prior to that point in time. The researchers conclude “that the acquisition of reading skills for students in this study followed a smooth, stable developmental pattern in which the acquisition of skills at any particular point in time depends on the mastery of prior skills” (p. 357). They also found that the gap between the more and less able group increased over time (the so-called fan spread). Once again, however, Butler et al. did not distinguish between decoding and reading comprehension. Taube (1988) followed 500 students in Swedish elementary schools for 8 years and found 80% of the variance in their reading skills in grade 6 to be explained by their reading skills in grades 2 and 3. Similarly, De Visser (1989) followed 300 Dutch students from grades 1 through 6 and found their achievement in grade 6 to be both directly and indirectly influenced by their achievement in the first years of elementary school. Van Dongen (1984) and Mommers (1987) studied the development of decoding efficiency, reading comprehension and spelling skills across a period of three years in two samples of 12 randomly selected elementary schools from the Netherlands (N = 225 and 236). 
The results indicated that: (1) specific versus general prerequisites should be clearly distinguished in predicting reading and spelling achievement, (2) decoding efficiency, reading comprehension and spelling achievements constitute clearly distinguishable skills after an eight-month period of instruction, and (3) decoding efficiency, reading comprehension and spelling achievement are best predicted by measures of the same skills at an earlier age. In this longitudinal study, the distinct character of the decoding efficiency, reading comprehension and spelling skills after eight months of formal instruction could be observed more clearly than in cross-sectional, correlational research.

In a follow-up on Mommers’s study, Boland (1991, 1993) examined the decoding efficiency, reading comprehension and spelling achievement of the same students in grade 6 (N = 310) and those studied by Mommers in grades 1, 2 and 3. The results of this study indicated that it is possible to distinguish decoding efficiency, reading comprehension and spelling in grade 6 as separate variables. Reading comprehension strongly correlated with vocabulary and measures of verbal intelligence. The interrelations between the three variables (decoding efficiency, reading comprehension and spelling) appeared to be dominated by the effects between the variables of the same sort. “Most decoding variance can be explained by a measure of the same variable at an earlier time point; the same holds for reading comprehension and – to a slightly less pronounced degree – for the spelling ability” (Boland 1991: 212).

Aarnoutse & Van Leeuwe (1988) examined the relative effects of decoding efficiency, vocabulary and spatial intelligence on reading comprehension using the scores for these variables in grades 3 and 6 from the longitudinal research by Mommers and by Boland.
Vocabulary measured in grades 3 and 6 appeared to be the most important predictor of reading comprehension measured in grade 6. Spatial intelligence and decoding efficiency, measured in grades 3 and 6, followed as predictors, with decoding efficiency making the smallest contribution.

On the basis of these longitudinal results, one can conclude that decoding efficiency, reading comprehension and spelling constitute separate factors or constructs by the end of grade 1. The scores for each of these skills are highly predictable by the scores for the same skill at an earlier point in time. Prediction is best when the time interval is short. That is, measurement at an immediately prior point in time appears to contain all of the information needed to predict later performance and thereby makes the pupil’s previous history more or less irrelevant.

In several cross-sectional studies conducted over the last 20 years, researchers have examined the question of whether and to what extent differences exist between groups of students with regard to decoding (efficiency) (Just & Carpenter 1987; Perfetti 1985; Stanovich 1991), reading comprehension (Daneman 1991; Garner 1987; Paris, Wasik & Turner 1991; Pearson & Fielding 1991; Vauras, Kinnunen & Kuusela 1994), vocabulary (Beck & McKeown 1991; Nagy & Herman 1987) and spelling (Ehri 1986, 1991; Venezky & Massaro 1987). In most of the studies, the differences between good and poor learners are considered. In only a few studies, however, has the development of different groups of students with regard to the above-mentioned skills been considered for more than one grade or age.

Lesgold & Resnick (1982) followed the reading progress of a group of students from the beginning of first grade through third grade. Based on second and third-grade comprehension scores the children were grouped into three levels.
It appeared that the lowest ability group began in first grade with low skills in certain tests of basic skills required for reading, viz., vocabulary and phoneme-grapheme knowledge. This group failed to improve in these skills over time, falling farther behind their successful peers. The problems in comprehension of this group did not show up until the second grade. Aarnoutse, Mommers, Smits & Van Leeuwe (1986) examined the development of decoding efficiency, reading comprehension and spelling abilities of different groups of students from the moment they entered the second grade through fourth grade. On the basis of the scores on a decoding efficiency test administered at the beginning of second grade, four groups of decoders could be distinguished. On the basis of the reading comprehension and spelling tests, four groups of reading comprehenders and spellers could similarly be distinguished. The results showed the different groups composed at the beginning of grade 2 to maintain their relative position over the years. In other words, the composition of the four groups of decoders, reading comprehenders and spellers remained largely the same in grades 2, 3 and 4 although their average developmental curves were, with the exception of the decoding skill, not parallel. Juel, Griffith & Gough (1986) and Juel (1988) followed the decoding and spelling development of 54 lower socio-economic students from grades 1 through 4. The results showed the chances of a poor reader (decoder) in first grade ranking among the poor readers in fourth grade to be 88%; the chances of a poor reader in second grade moving up to the level of an average reader in fourth grade was 13%. The spelling measurements were not as stable over time as the decoding measurements, however. Shaywitz et al. 
(1995) followed 396 students from kindergarten through grade 6 in order to examine the Matthew effect (Stanovich 1986), which claims that the gap between good and poor readers increases as time goes on (the so-called fan-spread). Stanovich (1986) has argued that those students who read best initially improve their reading skills at a faster rate than students who do not read as well, because of increased exposure to more written language. In other words, over time, better readers get better (rich-get-richer), and poorer readers become relatively poorer (poor-get-poorer). This means that the development of individual variation in reading can be characterised by a stable rank ordering of individuals and an increase of differences among students (Bast 1995). Shaywitz et al., however, did not find evidence of a Matthew effect for decoding skills. (They found a Matthew effect for IQ, though this effect was relatively small.) Their findings suggest, in fact, that “those children who were initially poor readers in the early school years remain poor readers relative to other children in the sample. Thus, though the rate of reading achievement is higher in those children in the lowest quartiles, their reading achievement, as an absolute level, remains much lower than that of children who began at a more advanced level of reading achievement. This finding suggests that shortly after school entry, children’s reading achievement changes very little relative to that of their peers” (Shaywitz et al. 1995: 903).

To uncover the influence and causes of the Matthew effect in reading, Bast (1995) carried out a longitudinal study in the first three years of the Dutch elementary school (N = 235). In line with the Matthew model, Bast found that initially poor decoders (poor in decoding efficiency) remained poor decoders during those grades and that the performance gap with good decoders became larger in the course of development.
In contrast to the claims of the Matthew model, however, Bast could not find evidence for increasing differences in reading comprehension, vocabulary and attitudes towards reading. An increase of interindividual variance was found only for leisure-time reading activities. Another, more methodological finding of this study was that the development of individual differences in decoding efficiency and reading comprehension could adequately be described by a simplex growth model (see Method).

As mentioned before, the present study is focused on the development of decoding efficiency, reading comprehension, vocabulary and spelling during the elementary school years. The main research questions were the following: (1) How do the skills of decoding efficiency, reading comprehension, vocabulary and spelling develop over a period of six years of elementary education? Is the average development of these skills a question of increasing growth (progress) or does the growth in these skills appear to decrease after some school years? (2) What are the differences between poor, average and good performers with regard to the development of decoding efficiency, reading comprehension, vocabulary and spelling? Do the differences between the poor and good performers appear to increase during the elementary school period? In other words, can a Matthew effect be detected?

On the basis of the research mentioned above, it was expected that the average development in the areas of decoding efficiency, reading comprehension, vocabulary and spelling shows a rather stable pattern of growth during the elementary school years. Another expectation was that the interindividual differences observed in the development of the poor, average and good performers for these skills do not increase over time. That is, the relative differences between the three groups of students were expected to remain the same or to decrease throughout the six elementary school grades.

Method

Longitudinal design. Three cohorts of students were distinguished:
− students attending grade 1 in the school year 1991–1992 (cohort a),
− students attending grade 2 in the school year 1991–1992 (cohort b), and
− students attending grade 3 in the school year 1991–1992 (cohort c).
The cohorts were tested following the scheme presented below (cells show the cohort tested in each grade):

School year   Grade 1   Grade 2   Grade 3   Grade 4   Grade 5   Grade 6
1991–1992     a         b         c
1992–1993               a         b         c
1993–1994                         a         b         c
1994–1995                                   a         b         c
1995–1996                                             a         b
1996–1997                                                       a

The students in the cohorts were tested twice each year in the months of October and April. During each measurement period, two standardized norm-referenced tests were administered for decoding efficiency, reading comprehension, vocabulary and spelling. For two successive measurement periods, at least one of the tests for each skill was the same, which made it possible to construct a common scale for each skill and thereby determine the developmental curves for the four competencies. Using multiple cohorts had several advantages. First, the data obtained on one cohort could be used to adapt the tests for subsequent cohorts. Second, the cohorts could function as each other’s replicates. Third, the students who repeated a grade could be studied in the next cohort.

Sample. A stratified random sample of 39 schools was taken from the population of Dutch elementary schools. The stratification variables were the degree of urbanisation (municipalities with more than or fewer than 100,000 inhabitants) and the composition of the school population or ‘school weight’ (low, medium or high). The school weight is used by the Dutch government for school funding and is a combination of ethnic origin, SES, and level of education of the parents.
Seven schools dropped out after the first year because of the amount of work involved in the administration of the tests, the low achievement of their students, and the late reporting of results to the school. Thereafter, the three cohorts consisted of about 900 students each, with 49% boys and 51% girls on average.

Instruments

The achievements of the students in reading and language arts were the variables of interest in the present study. Reading achievement involved decoding efficiency and reading comprehension; language achievement involved vocabulary and spelling.

Decoding efficiency. During each measurement period, two forms of the same test for decoding efficiency were administered: forms A and B of the One Minute Test (Een Minuut Test) from Brus & Voeten (1973). The One Minute Test measures the word-decoding ability of students in grades 1 through 6. The child’s task is to read aloud as many words as possible from a card in one minute (there are four columns with 29 unrelated words in each). The words on this card decrease in frequency of usage; the test measures the rate of pronouncing increasingly difficult words correctly. The child’s score is the number of words read in one minute, minus the number of words read incorrectly. The test is a combination of a measure of rate and an accuracy measure: it can be viewed as a test measuring decoding efficiency. A very small number of students in grade 6 could read the 116 words correctly in one minute. The two parallel forms of the test were administered on all occasions. The distributions of the test scores were found to be acceptable for all of the grades. The test/retest correlations and the correlations for the two forms of the test were also found to be above 0.85 for all grades.

Reading comprehension.
Two tests for reading comprehension were administered at each measurement point: the Reading Comprehension Tests (Begrijpend Leestest 3, 4, 5, 6, 7, 8) developed by Aarnoutse (1996a) and one of the Reading Comprehension Tests (Lees en Begrijp 1 1979; Lees en Begrijp 2 1980a; Begrijpend Lezen M3, E3, E4, M5, E5 1981) developed by Cito. The Reading Comprehension Tests from Aarnoutse and from Cito are designed to measure general reading comprehension of students in grades 1 through 6. All tests are measures of accuracy: there were no time limits for the students to finish these tests. The students read short expository and/or narrative texts and then answered multiple choice questions concerned with the texts. The questions can pertain to the word, sentence or text levels. The passages of the tests vary across the grades in difficulty from easy to difficult. The administration of one test lasted about one hour. The distributions of the scores for the reading comprehension tests were found to be acceptable with the exception of some tests. Most of these tests showed a ceiling effect during second administration. The Cronbach alpha coefficients for the tests from Aarnoutse and from Cito were all found to be 0.83 or higher and 0.79 or higher, respectively.

Vocabulary. During each measurement period, one or two of the Vocabulary Tests (Woordenschattest 3, 4, 5, 6, 7, 8) developed by Aarnoutse (1996c) were administered. In grades 3, 4, 5 and 6, one of the Vocabulary Tests (Woordenschattest) developed by Stijnen (1975) was also administered; in grade 4, the Synonyms Test (Synoniementest) developed by Aarnoutse (1987) was also administered. The Vocabulary Tests and the Synonyms Test from Aarnoutse and the tests from Stijnen are designed to measure the ability of students to comprehend the meaning of a word within the context of a single sentence.
All tests are measures of accuracy: there were no time limits for the students to finish these tests. The student’s task is to choose from four alternatives the word with the same or almost the same meaning as the word underlined in the target sentence. The vocabulary tests contain words that, across grades, decrease in frequency of usage. The administration of one test took about half an hour. The distributions of the scores for the vocabulary tests were found to be fairly acceptable with the exception of some tests. These tests showed a ceiling effect during second administration. The Cronbach alpha coefficients for the tests from Aarnoutse and from Stijnen were all found to be 0.80 or higher and 0.89 or higher, respectively.

Spelling. During each measurement period, one or two of the Spelling Tests (Spelling tests) developed by Aarnoutse (1996b) were administered. In grades 1 and 2, the Spelling Tests (Woorddictee 1 1980b; Woorddictee 2 1980c) developed by Cito were also administered. The Spelling Tests from Aarnoutse are designed to measure the ability of students in grades 1 through 6 to spell words correctly. The student’s task is to spell the words presented within a single sentence or series of sentences. The tests for grades 1 and 2 are composed of only nouns; the tests in the higher grades contain verbs as well. The distributions of the scores for these tests were found to be acceptable with the exception of two tests. One of these tests showed a ceiling effect during both administrations. The Cronbach alpha coefficients for the tests were all found to be 0.83 or higher. The Spelling Tests from Cito are designed to measure the ability of students in grades 1 and 2 to spell words. The student’s task is to spell the words presented in a sentence. The two tests measure the spelling of nouns. The distributions of the scores were not found to be acceptable as the tests showed a strong ceiling effect.
The Cronbach alpha coefficients for the tests were found to be 0.80 or higher. All spelling tests contain words that, across grades, decrease in frequency of usage. The tests are measures of accuracy; all students had ample time to finish the tests. The administration of one test took about 40 minutes.

All tests mentioned above meet the requirements of reliability and construct and predictive validity (Aarnoutse, Van Leeuwe, Voeten, Van Kan & Oud 1996). A factor analysis with all these tests as variables revealed three factors, which could be interpreted as decoding efficiency, reading comprehension/vocabulary, and spelling.

Procedure

All tests were administered by teachers after a short training. Only the One Minute Test required individual administration. As already noted, the same test was used on all occasions to measure decoding efficiency. For the other skills, different tests were used depending on the grade level. The teachers sent the test forms back to the research staff, and the results of the tests were subsequently reported to the school. The research staff met regularly (twice a year) with the teachers to discuss the progress of the project.

For the present study, only the data from the students of cohort a (grades 1 through 6) will be considered. This involves a total of 11 measurement periods or two points a year with the exception of grade 1 (12; 21, 22; 31, 32; 41, 42; 51, 52; 61, 62; the first number designates the grade and the second number the period: 1 for fall and 2 for spring). It was impossible to administer tests in the first measurement period in grade 1 (October). At that time most of the Dutch students can neither read nor spell. In the second period of grade 1 (April) most students can decode and spell words of one or two syllables. For each skill two tests were administered during each of the eleven measurement periods (from spring in grade 1 to spring in grade 6).
To get a stable estimate of the underlying trait, which enables the investigation of the longitudinal development of the skills, a four-step procedure was devised and applied to each of the four skills separately.

Step 1: Student selection
One of the problems in longitudinal studies is the drop-out of subjects. Subjects left the longitudinal sample for several reasons: some of the students moved to other schools and/or places; other students had to repeat a grade or entered a special education program; and seven of the schools dropped out after the first year. Only those students with a valid score on at least one of the two tests for each of the eleven measurement periods were included in the longitudinal sample. From the 1218 students who entered the project in 1991 in grade one, 515 students remained for decoding efficiency, 568 for reading comprehension, 580 for vocabulary and 520 for spelling. It should be noted that this drop-out may have influenced the representativeness of the sample; all inferences are therefore made with regard to students who remained in the same school for a period of six years without repeating a grade.

Step 2: Handling missing data
The remaining data obviously contained some missing scores. For this reason, the covariance matrices, to be analyzed in the next step, were estimated using the full information maximum likelihood method (Anderson 1957).

Step 3: Estimating the underlying latent trait
The two test scores for a skill in each measurement period were assumed to be two congeneric measures of that skill. This implies that the two tests measure the same construct, but that their factor loadings may differ. For each skill the latent variable at any measurement period was assumed to depend directly on its immediate predecessor only, i.e. the latent variables for each skill were supposed to form a so-called simplex structure (Jöreskog 1970) over time.
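The simplex assumption can be illustrated with a small numerical sketch. It is not the model estimated in this study: the class `Transition`, the functions `latent_moments` and `latent_covariance`, and all numbers are hypothetical and chosen only for illustration. The sketch assumes a first-order chain in which each latent score equals an intercept plus a slope times the preceding latent score plus an independent disturbance, and shows the means, variances and covariances that such a chain implies.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class Transition:
    """One link of a simplex chain: eta_t = intercept + slope * eta_prev + disturbance.

    All values used below are hypothetical, not estimates from the study.
    """
    intercept: float
    slope: float
    disturbance_var: float  # variance not explained by the predecessor


def latent_moments(mean0, var0, transitions):
    """Return the implied (mean, variance) of the latent variable at each period."""
    moments = [(mean0, var0)]
    for t in transitions:
        prev_mean, prev_var = moments[-1]
        mean = t.intercept + t.slope * prev_mean
        var = t.slope ** 2 * prev_var + t.disturbance_var
        moments.append((mean, var))
    return moments


def latent_covariance(var0, transitions, i, j):
    """Implied covariance between the latent variables at periods i and j.

    Under a simplex, the earlier variance is carried forward only through
    the slopes of the intermediate links.
    """
    if i > j:
        i, j = j, i
    var_i = latent_moments(0.0, var0, transitions)[i][1]
    return var_i * prod(t.slope for t in transitions[i:j])


# Hypothetical three-period chain (two transitions).
example = [Transition(9.0, 0.5, 0.75), Transition(2.0, 1.0, 1.0)]
for period, (mean, var) in enumerate(latent_moments(10.0, 4.0, example)):
    print(period, mean, var)
```

Because the covariance between two periods passes only through the intermediate slopes, the most recent measurement carries all the information the chain has about later periods, which is the property the simplex structure expresses.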
Structural equation modelling was performed separately for each of the four skills using version 3.6 of the computer program AMOS (Arbuckle 1996). The structured-means version of the structural equation model was used (Jöreskog & Sörbom 1993, chap. 10). The means, variances and covariances for the 21 or 22 variables observed for each skill constituted the input to the program. The relevant parameters were estimated using the maximum likelihood method. The models were almost identical for each skill. With respect to decoding efficiency, scores on test versions A and B of the One Minute Test were available at all eleven measurement periods; these versions are common to all measurement periods. For the other three skills a relay-like procedure was used, in which one test was administered at two successive measurement periods and then replaced by the next test according to the following scheme for the eleven measurement periods: AB, BC, CD, DE, EF, FG, GH, HI, IJ, JK, KL. So test A was administered in period 12, test B in periods 12 and 21, test C in periods 21 and 22, test D in periods 22 and 31, etc. To make scores on the latent variables for a skill comparable over time, two types of restrictions were imposed. First, for the same test administered in two successive periods, both the intercepts and the factor loadings were constrained to be equal. Second, the scale of the latent variables had to be fixed. This may be done arbitrarily. It was decided to set the mean and variance of the first latent variable equal to the mean and variance of the very first observed variable (test A) in the first measurement period (12). By this type of analysis, observed scores on individual tests are ‘compressed’ to scores on latent variables which are assumed to be linearly related to each other. The influence of a high or a low score by chance is largely removed, since it is part of the error accounted for in the model.
For this reason, regression to the mean may be regarded as negligible.

Step 4: Making progress comparable: Effect size

Though the foregoing steps result in a common metric to describe skill development over time, they do not guarantee that dispersions remain the same over time. It is likely that standard deviations vary across measurement periods. To make the assessment of progress from one measurement period to the next comparable across periods and skills, a standardized average gain score was computed. The difference between the means of the latent variables at two consecutive time points was divided by the standard deviation of the latent variable at the first of the two time points. This measure is called an ‘effect size’, because it is analogous to the effect size index used in group comparisons (Cohen 1969). The measure indicates how large the mean gain on a skill is relative to the magnitude of individual differences at the first of the two time points. The results of this study will primarily be presented in terms of progress comparisons expressed by this measure.

Results

The development of the four skills

The development of decoding efficiency, reading comprehension, vocabulary and spelling was analysed according to the four-step procedure described in the foregoing section. For completeness, the global fit statistics for the structural equation analyses of step 3 are presented in Table 1, where GFI denotes the ‘goodness of fit index’ (Jöreskog & Sörbom 1984), CFI denotes the ‘comparative fit index’ (Bentler 1990) and RMSEA denotes the ‘root mean square error of approximation’ (Browne & Cudeck 1993).

Table 1. Global fit statistics for the four skills

                         χ2        df    p       GFI     CFI     RMSEA   N
Decoding efficiency      625.969   219   0.000   0.900   0.993   0.060   515
Reading comprehension    515.409   180   0.000   0.922   0.993   0.057   568
Vocabulary               928.323   200   0.000   0.844   0.986   0.079   580
Spelling                 540.745   180   0.000   0.913   0.992   0.062   520

Figure 1. Effect sizes for the four skills by measurement period: (a) decoding efficiency; (b) reading comprehension; (c) vocabulary; (d) spelling.

While the chi-square tests yielded significant deviations from the model for each of the four skills, the remaining indicators show a satisfactory fit. GFI and CFI approximate 1, while RMSEA is smaller than 0.08, which indicates a reasonable fit according to Browne and Cudeck (1993). Attempts were undertaken to improve the fit by freeing covariances among residuals. Such alterations did not improve the fit of the model substantially, however, and no systematic patterns within these covariances among residuals could be discovered.

The effect sizes across measurement periods for each of the four skills are presented in Figure 1. Figure 1 shows that, for all four skills, progress reaches its maximum in grade 2 (i.e. between fall, period 21, and spring, period 22, of grade 2). In this figure two main trends are immediately clear. First, the effect sizes in the period from fall to spring exceed the effect sizes in the period from spring to fall, with some minor exceptions. The amount of time spent on instruction in reading and language arts, which is greatest in the period from fall to spring in the Dutch school system, apparently influences the progress of students in each of the four skills. Second, the effect sizes decrease across grades. This decline is greatest for decoding efficiency, starting from the end of grade 3 (period 31 → 32). Note that the effect sizes for reading comprehension and vocabulary show a similar pattern. To inspect the tendency of decreasing effect sizes across grades, effect sizes were also determined for each grade, i.e. from spring to spring (12 → 22, 22 → 32, etc.). These effect sizes per grade are presented in Figure 2.
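As a check on Table 1, the RMSEA values can be recomputed from the chi-square, degrees of freedom and sample size reported in the same table. The sketch below assumes the common point-estimate formula with N − 1 in the denominator; the article does not state which variant AMOS 3.6 applied, and the function name `rmsea` is hypothetical.

```python
import math

# Values copied from Table 1: (chi-square, degrees of freedom, sample size N).
TABLE_1 = {
    "decoding efficiency": (625.969, 219, 515),
    "reading comprehension": (515.409, 180, 568),
    "vocabulary": (928.323, 200, 580),
    "spelling": (540.745, 180, 520),
}


def rmsea(chi_square: float, df: int, n: int) -> float:
    """RMSEA point estimate; the N - 1 divisor is an assumption, not from the article."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


for skill, (chi_square, df, n) in TABLE_1.items():
    print(f"{skill}: RMSEA = {rmsea(chi_square, df, n):.3f}")
```

Under this assumption the recomputed values agree with the RMSEA column of Table 1 to three decimals, including the value for vocabulary, which lies just below the 0.08 bound mentioned in the text.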
Figure 2 shows that progress in decoding efficiency decreases strongly by grade, especially after grade 2. Progress in reading comprehension is highest in grade 3. Apparently, in grades 1 and 2 the greatest attention is on the technical aspect of reading. When decoding efficiency has reached a certain level in grades 2 and 3 (cf. Sticht & James 1984), the attention shifts to reading comprehension. The degree of progress in reading comprehension decreases in grades 4 and 5, while an upward trend is apparent from grade 5 to 6. This growth may be caused by the final examination of the elementary school students in February in grade 6. For vocabulary and spelling, the degree of progress is intermediate and decreases by grade. An exception is vocabulary in grade 5. It is possible that the increase in vocabulary in this grade is caused by the more prominent role social studies takes in the fifth grade of the Dutch elementary school curriculum. In social studies students learn a lot of new words and concepts.

Figure 2. Effect sizes for the four skills by grade.

Developmental differences between low, medium and high performers

In order to examine the development of students who performed differently in grade 1, the students were subdivided into three subgroups. This was done for each skill on the basis of their scores in the spring of grade 1, when these students can already read words with one or two syllables. Low, medium and high performers were distinguished according to a percentile distribution of 17%, 66%, and 17%, respectively, reflecting the cut-off points of −1 and +1 SD in a normal distribution. As two tests were involved for each skill in grade 1, the distribution of the mean z-score for the two tests served as the criterion. By using the mean of two test scores the risk of regression to the mean in comparing extreme groups is (greatly) diminished.
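The subgroup assignment described above can be illustrated with a short sketch. Everything in it is an assumption made for illustration: the function names are hypothetical, the scores are invented rather than taken from the study, population standard deviations are used for the z-scores, and ties at the cut-offs are simply broken by sort order.

```python
import statistics


def z_scores(scores):
    """Standardize raw scores to mean 0 and SD 1 (population SD, an assumption)."""
    mean = statistics.fmean(scores)
    sd = statistics.pstdev(scores)
    return [(s - mean) / sd for s in scores]


def mean_z(test_a, test_b):
    """Criterion per student: the mean of the z-scores on the two grade 1 tests."""
    return [(a + b) / 2 for a, b in zip(z_scores(test_a), z_scores(test_b))]


def split_groups(criterion, tail_share=0.17):
    """Label the lowest and highest tail_share of students; the rest are medium."""
    n = len(criterion)
    n_tail = round(tail_share * n)
    order = sorted(range(n), key=lambda i: criterion[i])
    labels = ["medium"] * n
    for i in order[:n_tail]:
        labels[i] = "low"
    for i in order[n - n_tail:]:
        labels[i] = "high"
    return labels


# Hypothetical raw scores for 100 students on two tests (not data from the study).
test_a = [10 + i for i in range(100)]
test_b = [5 + 2 * i for i in range(100)]
labels = split_groups(mean_z(test_a, test_b))
print(labels.count("low"), labels.count("medium"), labels.count("high"))
```

Averaging two standardized scores before ranking is what reduces the influence of a chance high or low score on a single test.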
Given that the two tests have discrete values, the subdivision into 17%, 66% and 17% could not be realized perfectly. The percentages were therefore approximated as closely as possible; the number of pupils included in each analysis is presented in Table 2.

Table 2. Numbers of students in subgroups

                         Low   Medium   High
Decoding efficiency      94    335      86
Reading comprehension    94    381      93
Vocabulary               91    387      102
Spelling                 93    340      87

For each of the four skills and each of the three subgroups, steps 2 to 4 of the procedure were performed along the lines described for the total sample in the procedure section. In the structural equation model of step 3, however, the parameters in the main part of the model were fixed to the values attained for the total group in the foregoing part. The parameters to be estimated were restricted to the intercepts of the latent variables and the variances of the error terms. In such a manner, the development in the means for the subgroups of low, medium and high performers in grade 1 could be identified. To compare the progress properly, effect sizes were based on the standard deviations for the total group. These effect sizes, for each of the four skills and for the subgroups of performers in grade 1, are presented in Figure 3.

Figure 3 shows again a seasonal effect for decoding efficiency in the first grades, and a clear seasonal effect for reading comprehension, vocabulary and spelling in all grades. For decoding efficiency, Figure 3 does not show a clear differentiation between the three subgroups in the higher grades, even though the group of high performers has the lowest effect sizes in most grades. This last finding does not confirm the existence of a Matthew effect for decoding. If a Matthew effect for decoding existed, the effect sizes for the group of good performers would exceed the effect sizes for the group of low performers in all grades.
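The subgroup effect sizes and the criterion for a Matthew effect stated above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical and all means and standard deviations are invented, not values from the study. Each gain is divided by the total-group standard deviation at the earlier of the two periods, as described in the text.

```python
def effect_sizes(means, total_sds):
    """Gain between consecutive periods divided by the total-group SD at the earlier period."""
    return [
        (means[t + 1] - means[t]) / total_sds[t]
        for t in range(len(means) - 1)
    ]


def matthew_pattern(high_es, low_es):
    """True only if the high group gains more than the low group in every period."""
    return all(h > l for h, l in zip(high_es, low_es))


# Hypothetical latent means for four periods (not data from the study).
low_means = [20.0, 30.0, 38.0, 44.0]
high_means = [50.0, 58.0, 64.0, 68.0]
total_sds = [10.0, 10.0, 8.0, 8.0]  # total-group SD per period, also hypothetical

low_es = effect_sizes(low_means, total_sds)
high_es = effect_sizes(high_means, total_sds)
print(low_es, high_es, matthew_pattern(high_es, low_es))
```

In these invented numbers the low group gains more in every period, so the pattern expected under a Matthew effect is absent, while the low group's mean still stays below the high group's mean throughout.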
For reading comprehension, Figure 3 shows a different pattern. First, low performers mostly show higher effect sizes than the medium performers, whose effect sizes in turn mostly exceed those of the high performers. This result does not support the existence of a Matthew effect for reading comprehension. Second, the highest effect size for the low performers is found in period 21 → 22, indicating that they are trying to catch up. Third, the highest effect sizes for the medium and high subgroups are also found in period 21 → 22. So for the total group of students (see Figure 1), the greatest amount of progress is found in grade 2, while the initially low performers are already catching up from grade 1 onward. It is also striking that the progress of the initially low performers exceeds the progress of the other two groups in the periods from fall to spring. Apparently, the low performers benefit most from instruction.

For vocabulary and spelling similar patterns of effect sizes are found (see Figure 3). The low performers mostly show higher effect sizes than the medium and high performers. Exceptions are found for vocabulary in grade 2 (period 22 → 31) and in grade 4 (period 42 → 51). In general, the effect sizes are highest for the initially low performers and lowest for the initially high performers, which does not confirm a Matthew effect for either vocabulary or spelling.

In Figure 4 the average development for each of the subgroups of low, medium and high performers on each of the four skills is presented. The development curves for the three subgroups are distinct but nevertheless follow rather similar patterns, with decreasing differences across grades for reading comprehension, vocabulary and spelling. The mean of the group of initially low performers remains lower than the mean of the group of initially medium performers, which, in turn, remains lower than the mean of the group of initially high performers.
The distinction between the three groups is largest for decoding efficiency and smallest for spelling. Figure 4 shows again that the gap between the low and high performers does not increase over time. The decrease is smallest for decoding. The difference between the means of the groups of high and low performers was 31.88 in grade 1 and 24.31 in grade 6. For the other three skills the decrease in the difference between the groups of high and low performers was much greater. Since, according to Shaywitz et al. (1995), “a Matthew effect would predict that the gap between good and poor readers should increase as time goes on” (p. 897), the above results do not confirm the existence of a Matthew effect for reading comprehension, vocabulary and spelling. For decoding, a Matthew effect is at least not supported.

As far as Figure 4 is concerned, the conclusions are based on comparisons of differences in time between good and poor performers, i.e. comparisons of vertical gaps over time. The comparison of differences between groups performing equally well, i.e. the comparison of horizontal gaps in Figure 4, reveals findings of another nature. It is interesting to see that only by grade 3 do the poor decoders (32) attain the level at which the good decoders started in grade 1. So there is on average a difference in decoding efficiency of two years between these two groups. In grade 6 (61) the poor decoders attain a decoding efficiency level that the good decoders achieve in grade 3 (31). The same phenomenon can be noticed for reading comprehension: in grade 3 (32) the poor reading comprehenders attain the level at which the good reading comprehenders started in grade 1. In grade 6 (62) the poor reading comprehenders achieve the level that the high performers attain in grade 4 (42). The same phenomenon can be found for vocabulary and spelling (in the first grades).
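For decoding efficiency, the change in the vertical gap can be worked out from the two differences reported above. The notation Δ for the difference between the means of the high and low groups is introduced here for illustration only.

```latex
\Delta_{\text{grade 1}} - \Delta_{\text{grade 6}} = 31.88 - 24.31 = 7.57,
\qquad
\frac{7.57}{31.88} \approx 0.24
```

On this reading, the gap between good and poor decoders narrowed by roughly a quarter of its initial size over the six years, which runs against the widening gap that a Matthew effect would predict.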
Figure 3. Effect sizes for low, medium and high performers: (a) decoding efficiency; (b) reading comprehension; (c) vocabulary; (d) spelling.

Figure 4. Development in means for low, medium and high performers in decoding efficiency, reading comprehension, vocabulary and spelling: (a) decoding efficiency; (b) reading comprehension; (c) vocabulary; (d) spelling.

Discussion

The present study focussed on two questions: (1) How do decoding efficiency, reading comprehension, vocabulary and spelling develop during six years of elementary school? and (2) What are the differences between poor, average and good performers with respect to the development of these skills?

With respect to the first question, clear seasonal effects were found for reading comprehension, vocabulary and spelling, while the seasonal effect for decoding efficiency was restricted to the early grades (cf. Brus & Voeten 1973; Voeten 1991). Apparently, the amount of instruction in reading and language arts influences the rate of progress in these skills within these grades. For decoding efficiency, and to a lesser degree for vocabulary and spelling, growth shows a declining trend across grades. For reading comprehension, the progress in grade 2 is lower than the progress in grade 3, while the rate of progress declines across the higher grades. Apparently, most students make much progress in decoding efficiency in the lower grades (1 and 2), after which the greatest progress shifts to reading comprehension.
With increasing age the rate of progress on vocabulary and spelling declines, except for vocabulary in grade 5. Perhaps this renewed progress in vocabulary is caused by the more prominent role social studies takes in this grade.

With respect to the second question, it appeared that initially low performers on reading comprehension, vocabulary and spelling tended to show greater progress, especially in periods where the largest amount of instruction at school is given. Although it was found that the low, medium and high groups on average remain in the same order, these findings do not support the existence of a Matthew effect for reading comprehension, vocabulary and spelling. For decoding efficiency no clear differential effect could be found: groups of initially low, medium and high performers did not differ systematically in mean development. In other words, the gap between the poor and good performers did not widen over time for this skill, so a Matthew effect for decoding efficiency is not supported.

Generally, the results of this study support the expectations. With respect to the first expectation, the results show the development of the four skills to be basically steady and to slow gradually over time. These findings are for the most part in line with those of Malmquist (1969), Butler et al. (1985), Mommers (1987) and Boland (1991), as reported in the introduction. The results also show growth to be greatest in the lower grades, particularly grade 2 for decoding efficiency and grade 3 for reading comprehension. With respect to the second expectation, it appeared that students with initially poorer ability in the four skills showed greater improvement over time relative to students with initially better ability in these areas. Shaywitz et al. (1995) found the same result with regard to reading.
It also appeared that the developmental curves of the three groups followed rather similar patterns, with decreasing differences across grades for reading comprehension, vocabulary and spelling. This result is for the most part in line with the findings of Aarnoutse et al. (1986), Juel (1988), Shaywitz et al. (1995) and Bast (1995).

The results of this study have important implications for both theory and educational practice. With respect to theory, it appeared that one of the most important hypotheses of the Matthew model of Stanovich (1986), namely the increasing performance differences between pupils over time, was not supported. As in Shaywitz et al. (1995) and for the most part Bast (1995), an increase in the achievement difference between poor and good performers was not found in this study. The question is how to explain the fact that Bast (1995) did find an increase of variability for decoding efficiency, while in the present study such a Matthew effect was not found. A possible explanation may be that Bast (1995) excluded two groups from his sampling procedure (the upper 25% and the lowest 10% of the distribution) “in order to arrive at a sample of students that could be tracked the first three grades” (p. 17).

With respect to practical implications, the results show growth to be greatest in the lower grades, particularly grade 2 for decoding efficiency and grade 3 for reading comprehension. This finding suggests that intervention programs should be implemented in the first three grades of the primary school or in kindergarten. The effects of programs like Reading Recovery (Center et al. 1995; Clay 1991) and Success for All (Slavin 1995; Slavin et al. 1995) provide support for this finding. Another practical implication is that the average developmental curves provide an important basis for student assessment, monitoring and prediction.
Such a system can involve not only the measurement of the progress of students over longer periods of time but also the development of guidelines or programs for remediation along with a computerised registration system (Jansen 1997; Oud & Jansen 1996).

A restriction on the conclusions relates to the students included in the study. Only those students with a valid score on at least one of the two tests for each of the eleven measurement periods were admitted. This meant that students repeating a grade or entering a special-education program were not involved in the study. Given the three cohorts in our study, it will nevertheless be possible to analyse the development of the repeaters from cohorts 2 and 3 as well.

Several important questions can be formulated for further research, such as: How do decoding efficiency, reading comprehension, vocabulary and spelling relate across the six primary grades? Which relationships become stronger or weaker over time? Another important question to be addressed is: How can the learning progress of individual pupils be estimated over time? (Voeten 1990, 1991). Other questions are: How do decoding efficiency, reading comprehension, vocabulary and spelling develop when pupils appear to be (a) poor in decoding efficiency and poor in reading comprehension, (b) poor in decoding efficiency and good in reading comprehension, (c) good in decoding efficiency and poor in reading comprehension, and (d) good in decoding efficiency and good in reading comprehension? Which minimum levels of decoding efficiency, vocabulary and other cognitive and affective skills and attitudes are necessary to achieve a minimum level of reading comprehension?

References

Aarnoutse, C.A.J. (1987). Synoniementest [Synonyms test]. Nijmegen: Berkhout.
Aarnoutse, C.A.J. (1996a). Begrijpend leestests [Reading comprehension tests]. Nijmegen: Berkhout.
Aarnoutse, C.A.J. (1996b). Spellingtests [Spelling tests]. Nijmegen: Berkhout.
Aarnoutse, C.A.J. (1996c). Woordenschattests [Vocabulary tests]. Nijmegen: Berkhout.
Aarnoutse, C.A.J., Mommers, M.J.C., Smits, B.W.G.M. & Van Leeuwe, J.F.J. (1986). De ontwikkeling en samenhang van technisch lezen, begrijpend lezen en spellen [Development and relation between decoding, reading comprehension and spelling], Pedagogische Studiën [Pedagogical Studies] 63: 97–110.
Aarnoutse, C.A.J. & Van Leeuwe, J.F.J. (1988). Het belang van technisch lezen, woordenschat en ruimtelijke intelligentie voor begrijpend lezen [Importance of decoding, vocabulary and spatial intelligence for reading comprehension], Pedagogische Studiën [Pedagogical Studies] 65: 49–59.
Aarnoutse, C., Van Leeuwe, J., Voeten, R., Van Kan, N. & Oud, J. (1996). Longitudinaal onderzoek schoolvorderingen in het basisonderwijs [Longitudinal study of school achievements in the elementary school]. Nijmegen: University of Nijmegen.
Anderson, T.W. (1957). Maximum likelihood estimates for a multivariate normal distribution when some observations are missing, Journal of the American Statistical Association 52: 200–203.
Arbuckle, J.L. (1996). Amos user’s guide version 3.6. Chicago: SmallWaters Corporation.
Bast, J.W. (1995). The development of individual differences in reading ability (doctoral dissertation). Amsterdam: Paedologisch Instituut.
Beck, I. & McKeown, M. (1991). Conditions of vocabulary acquisition. In: R. Barr, M.L. Kamil, P.B. Mosenthal & P.D. Pearson (eds.), Handbook of reading research, volume 2 (pp. 789–814). New York: Longman.
Bentler, P. (1990). Comparative fit indexes in structural models, Psychological Bulletin 107: 238–246.
Boland, T. (1991). Lezen op termijn [Reading in the long term] (doctoral dissertation). Nijmegen: University of Nijmegen.
Boland, T. (1993). The importance of being literate: Reading development in primary school and its consequences for the school career in secondary education, European Journal of Psychology of Education 8: 289–305.
Browne, M.W. & Cudeck, R. (1993). Alternative ways of assessing model fit. In: K.A. Bollen & J.S. Long (eds.), Testing structural equation models (pp. 136–162). Newbury Park, CA: Sage.
Brus, B.Th. & Voeten, M.J.M. (1973). Een Minuut Test [One minute test]. Nijmegen: Berkhout.
Butler, S.R., Marsh, H.W., Sheppard, M.J. & Sheppard, J.L. (1985). Seven year longitudinal study of the early prediction of reading achievement, Journal of Educational Psychology 77: 349–361.
Carver, R.P. (1993). Merging the simple view of reading with rauding theory, Journal of Reading Behavior 25: 439–455.
Center, Y., Wheldall, K., Freeman, L., Outhred, L. & McNaught, M. (1995). An evaluation of reading recovery, Reading Research Quarterly 30: 240–263.
Cito (1979). Lees en begrijp 1 [Read and comprehend 1]. Arnhem: Cito.
Cito (1980a). Lees en begrijp 2 [Read and comprehend 2]. Arnhem: Cito.
Cito (1980b). Woorddictee 1 [Spelling test 1]. Arnhem: Cito.
Cito (1980c). Woorddictee 2 [Spelling test 2]. Arnhem: Cito.
Cito (1981). Begrijpend lezen 3, 4 en 5 [Reading comprehension 3, 4 and 5]. Arnhem: Cito.
Clay, M.M. (1991). Becoming literate. Auckland: Heinemann.
Cohen, J. (1969). Statistical power analysis for the behavioral sciences. New York: Academic Press.
Cunningham, A.E. & Stanovich, K.E. (1993). Children’s literacy environments and early word recognition subskills, Reading and Writing: An Interdisciplinary Journal 5: 193–204.
Daneman, M. (1991). Individual differences in reading skills. In: R. Barr, M.L. Kamil, P.B. Mosenthal & P.D. Pearson (eds.), Handbook of reading research, volume 2 (pp. 512–538). New York: Longman.
De Visser, L. (1989). Leerlingkenmerken en schoolprestaties [Student characteristics and school achievements]. In: J.H. Slavenburg & T.A. Peters (eds.), Het project Onderwijs en Sociaal Milieu: een eindbalans [The project education and social environment]. Rotterdam: Rotterdamse Schooladviesdienst.
Ehri, L.C. (1986). Sources of difficulty in learning to spell and read. In: M.L. Wolraich & D. Routh (eds.), Advances in developmental and behavioral pediatrics (pp. 121–195). Greenwich, CT: Jai Press.
Ehri, L.C. (1991). Development of the ability to read words. In: R. Barr, M.L. Kamil, P.B. Mosenthal & P.D. Pearson (eds.), Handbook of reading research, volume 2 (pp. 383–417). New York: Longman.
Garner, R. (1987). Metacognition and reading comprehension. Norwood, NJ: Ablex.
Gough, P.B. & Tunmer, W.E. (1986). Decoding, reading, and reading disability, Remedial and Special Education 7: 6–10.
Hoover, W.A. & Gough, P.B. (1990). The simple view of reading, Reading and Writing: An Interdisciplinary Journal 2: 127–160.
Jansen, R.A.R.G. (1997). Constructing monitoring systems in the behavioral sciences. The SEM state space approach (doctoral dissertation). Nijmegen: University Press Nijmegen.
Jöreskog, K.G. (1970). Estimation and testing of simplex models, British Journal of Mathematical and Statistical Psychology 23: 121–145.
Jöreskog, K.G. & Sörbom, D. (1984). LISREL VI user’s guide (3rd edn.). Mooresville, IN: Scientific Software.
Jöreskog, K.G. & Sörbom, D. (1993). New features in LISREL 8. Chicago: Scientific Software International.
Juel, C. (1988). Learning to read and to write: A longitudinal study of 54 children from first through fourth grades, Journal of Educational Psychology 80: 437–447.
Juel, C., Griffith, P.L. & Gough, Ph.B. (1986). Acquisition of literacy: A longitudinal study of children in first and second grade, Journal of Educational Psychology 78: 243–255.
Just, M.A. & Carpenter, P.A. (1987). The psychology of reading and language comprehension. Newton, MA: Allyn and Bacon.
Lesgold, A.M. & Resnick, L.B. (1982). How reading disabilities develop: Perspectives from a longitudinal study. In: J.P. Das, R. Mulcahey & A. Wall (eds.), Theory and research in learning disabilities (pp. 155–187). New York: Plenum.
Malmquist, E. (1969). Lässvårigheter på grundskolans lågstadium: Experimentelle studier [A longitudinal study of reading skills in the elementary schools: Experimental studies]. Linköping: University of Linköping.
Mommers, M.J.C. (1987). An investigation into the relation between word recognition skills, reading comprehension and spelling skills in the first two years of primary school, Journal of Research in Reading 10: 122–143.
Nagy, W.E. & Herman, P.A. (1987). Depth and breadth of vocabulary knowledge: Implications of acquisition and instruction. In: M.G. McKeown & M.E. Curtis (eds.), The nature of vocabulary acquisition (pp. 19–35). Hillsdale, NJ: Erlbaum.
Oud, J.H.L. & Jansen, R.A.R.G. (1996). Nonstationary longitudinal LISREL model estimation from incomplete panel data using EM and the Kalman smoother. In: U. Engel & J. Reinecke (eds.), Analysis of change: Advanced techniques in panel data analysis (pp. 135–159). New York: de Gruyter.
Paris, S.G., Wasik, B.A. & Turner, J.C. (1991). The development of strategic readers. In: R. Barr, M.L. Kamil, P.B. Mosenthal & P.D. Pearson (eds.), Handbook of reading research, volume 2 (pp. 609–640). New York: Longman.
Pearson, P.D. & Fielding, L. (1991). Comprehension instruction. In: R. Barr, M.L. Kamil, P.B. Mosenthal & P.D. Pearson (eds.), Handbook of reading research, volume 2 (pp. 815–860). New York: Longman.
Perfetti, C.A. (1985). Reading ability. New York: Oxford University Press.
Röhr, H. (1978). Voraussetzungen zum Erlernen des Lesens und Rechtschreibens [Prerequisites for learning to read and to spell] (Dissertation). Münster: University of Münster.
Shaywitz, B.A., Holford, T.R., Holahan, J.M., Fletcher, J.M., Stuebing, K.K., Francis, D.J. & Shaywitz, S.E. (1995). A Matthew effect for IQ but not for reading: Results from a longitudinal study, Reading Research Quarterly 30: 894–906.
Slavin, R.E. (1995). Cooperative learning. Boston: Allyn and Bacon.
Slavin, R.E., Madden, N.A., Dolan, L.J., Wasik, B.A., Ross, S., Smith, L. & Dianda, M. (1995). Success for all; a summary of research. Paper presented at the annual conference of the American Educational Research Association, San Francisco, CA.
Stanovich, K.E. (1986). Matthew effects in reading: Some consequences of individual differences in the acquisition of literacy, Reading Research Quarterly 21: 360–407.
Stanovich, K.E. (1991). Discrepancy definitions of reading disability: Has intelligence led us astray?, Reading Research Quarterly 26: 7–29.
Sticht, T.G. & James, J.H. (1984). Listening and reading. In: P.D. Pearson (ed.), Handbook of reading research, volume 1 (pp. 293–318). New York: Longman.
Stijnen, P.J.J. (1975). Woordenschattest [Vocabulary test]. Nijmegen: Berkhout.
Taube, K. (1988). Reading acquisition and self-concept. Umeå: University of Umeå.
Van Dongen, D. (1984). Leesmoeilijkheden. Naar diagnostiserend onderwijzen bij het leren lezen [Reading difficulties. Toward diagnostic teaching in reading] (doctoral dissertation). Tilburg: Zwijsen.
Vauras, M., Kinnunen, R. & Kuusela, L. (1994). Development of text-processing skills in high-, average-, and low-achieving primary school children, Journal of Reading Behavior 26: 361–389.
Venezky, R.L. & Massaro, D.W. (1987). Orthographic structure and spelling-sound regularity in reading English words. In: A. Allport, D. Mackay, W. Prinz & Scheerer (eds.), Language perception and production (pp. 159–179). London: Academic Press.
Voeten, M.J.M. (1990). Longitudinaal onderzoek van de leesontwikkeling [Longitudinal research of reading development]. In: C. Aarnoutse & M. Voeten (eds.), Gaat en onderwijst. Liber amicorum voor Dr. M.J.C. Mommers [Go and teach. Liber amicorum for Dr. M.J.C. Mommers]. Tilburg: Zwijsen.
Voeten, M.J.M. (1991). Beschrijving van de individuele ontwikkeling van leervorderingen [Description of the individual development of school achievement]. In: J. Hoogstraten & W.J. van der Linden (eds.), Methodologie [Methodology]. Amsterdam: Stichting Centrum voor Onderwijsonderzoek.

Address for correspondence: Cor Aarnoutse, Ph.D., University of Nijmegen, Department of Educational Sciences, Montessorilaan 3, 6525 HR Nijmegen, The Netherlands
Phone: +31 243612081; Fax: +31 243615978; E-mail: [email protected]