Evaluation of the Waterford Early Reading Program
2001-2002 Implementation and Student Achievement

Julie Slayton, J.D., Ph.D.
Lorena Llosa, M.A.

Los Angeles Unified School District
Program Evaluation and Research Branch
Planning, Assessment and Research Division

Publication No. 144
December 4, 2002

ACKNOWLEDGEMENT

We would like to offer special thanks to Emily Hansen and Xiaoxia Ai, who collaborated wholeheartedly with us on this project.

TABLE OF CONTENTS

EXECUTIVE SUMMARY ..... ii
RESEARCH QUESTIONS ..... ii
FINDINGS ..... ii
NEXT STEPS ..... v
INTRODUCTION ..... 1
BACKGROUND ..... 1
CONCEPTUAL FRAMEWORK AND LITERATURE REVIEW ..... 3
RESEARCH QUESTIONS ..... 10
METHODS ..... 10
SAMPLE SELECTION ..... 10
DATA COLLECTION METHODOLOGY ..... 14
DATA ANALYSIS ..... 19
FINDINGS ..... 22
RESEARCH QUESTION 1: HOW IS THE WATERFORD PROGRAM BEING IMPLEMENTED IN TREATMENT CLASSROOMS? ..... 23
RESEARCH QUESTION 2: TO WHAT EXTENT IS THE WATERFORD COURSEWARE BEING USED? ..... 27
RESEARCH QUESTION 3: TO WHAT EXTENT ARE STUDENTS ENGAGED WHILE USING THE WATERFORD COURSEWARE? ..... 31
RESEARCH QUESTION 4: WHAT IS THE QUALITY OF PEDAGOGY DURING READING/LANGUAGE ARTS TIME IN TREATMENT AND CONTROL CLASSROOMS? ..... 41
RESEARCH QUESTION 5: WHAT IS THE RELATIONSHIP BETWEEN IMPLEMENTATION OF THE WATERFORD COURSEWARE, THE PRIMARY READING PROGRAM AND STUDENT ACHIEVEMENT? ..... 70
CONCLUSIONS AND RECOMMENDATIONS ..... 79
REFERENCES ..... 85
APPENDIX A ..... 87
APPENDIX B ..... 101
APPENDIX C ..... 143
APPENDIX D ..... 150
APPENDIX E ..... 157
APPENDIX F ..... 159

EXECUTIVE SUMMARY

This is the third in a series of documents that contain findings from the districtwide evaluation of the Waterford Early Reading Program in kindergarten and first grade. This report focuses on the overall implementation of the program during the 2001-2002 school year and its relationship to student achievement. Years 2 through 4 of the evaluation will continue to focus on implementation and program effectiveness.

Research Questions

The research questions for this report include:

1. How is the Waterford program being implemented?
2. To what extent is the Waterford courseware being used?
3. To what extent are students engaged while using the Waterford courseware?
4. What is the quality of pedagogy during reading/language arts time in our sample classrooms?
5. What is the relationship between implementation of the Waterford courseware, the primary reading program, and student achievement?

Findings

Overall, implementation of the Waterford program was inconsistent and low. Teachers do not use the Waterford program in a way that maximizes the courseware's usefulness. In the majority of classrooms, the computers are positioned in a way that allows for a greater number of distractions. Additionally, many teachers' knowledge of the program features is incomplete and insufficient to meet their students' needs. Teachers are not using the information provided by the courseware to make decisions about instruction. Similarly, the Waterford program is not being used for the amount of time recommended.
Usage data show that, on average, students used the program approximately a third of the time that would be expected had they used it every day for the recommended amount of time. Also, at the classroom level, we determined usage to be at a medium level.

We also examined students' level of engagement while they were using the Waterford courseware and during Open Court instructional time. Approximately 75% of the students were either engaged or experienced only minor distractions while using the Waterford courseware. The level of engagement varied by classroom: in some classrooms, none of the students were distracted or disengaged, while in others the majority of students were off-task. English language learners were more likely to spend time off-task than English Only students. We found a similar pattern during Open Court instructional time. Overall, engagement in classrooms was at a medium level.

With respect to the quality of pedagogy during Open Court instruction, we found that teachers in our sample classrooms provided pedagogy of medium quality or lower.

In the context of the above-described low level of implementation, we examined the relationship between the Waterford program, the primary reading program, and student achievement. In kindergarten, time spent on the Waterford program had no impact or a negative impact on gains. On Letter Identification, students who spent more time using the courseware had smaller gains than students who spent less time, except in classrooms with higher quality Open Court pedagogy; in those classrooms, more time spent using the courseware translated into greater gains. On Word Identification and Word Attack, time spent using the courseware on average had no effect on gains. However, in the presence of higher quality pedagogy, the effect of time spent was negative. These findings are not surprising given the overall low level of implementation in kindergarten classrooms.
In the same context of low implementation, we found that in first grade the use of the courseware had limited effects on the outcomes. On the Antonyms and SAT/9 Word Reading tests, classroom level of engagement during Waterford courseware usage had a positive effect. Also, on the Passage Comprehension test and the SAT/9 Reading tests, ELL students performed better than EO students in classrooms with a high level of engagement with the Waterford courseware. The effect of time spent on the courseware on most of the SAT/9 tests varied from classroom to classroom: in some classrooms, more time using the courseware translated into higher scores, whereas in others more time using the courseware resulted in lower scores. These findings are not surprising given the overall low level of implementation in first grade classrooms.

Recommendations

Given the findings from the first year of implementation of the Waterford Early Reading Program, a variety of steps should be taken to improve the quality of instruction:

1. Central, local district, and school administrators should take steps to ensure that the program is fully implemented at the classroom level.
2. Professional development should be provided to teachers so that they understand how to manage the Waterford courseware and integrate it into their daily reading/language arts lessons.
3. Teachers should be required to know the content of the Waterford courseware so that they are aware of the specific skills covered in the different activities and of how and when to make changes to their students' individual programs, so that the courseware truly works to meet the individual needs of each student. Program knowledge will also allow them to provide sufficient language support to their limited English proficient students.
4. Literacy coaches should be knowledgeable about the content and appropriate use of the program in order to provide support to teachers.
5. Professional development provided to teachers by local districts and school site administrators should focus on an increased understanding, in the context of the Open Court curriculum, of the strategies and skills students need in order to become proficient readers. Professional development should also focus on teaching teachers how to model for, prompt, and support students as they struggle to become readers.

Next Steps

As we move into the second year of the study, we plan to incorporate valuable information gained from the first year. Below are a number of modifications and additions that we will include in the analysis of the second year data.

1. Reduce the number of WRMT-R tests administered to kindergarten students to Visual Auditory Learning, Letter Identification, Word Identification, and Word Attack.
2. Investigate the possibility of using the Open Court assessments in order to gain an additional outcome measure.
3. Modify the Waterford observation protocol in order to capture the different types of activities presented by the courseware and the extent to which students engage in them.
4. Expand the three-point scale used to determine Waterford courseware usage at the classroom level in order to create a more nuanced scale.
5. Develop a scale to evaluate the quality of implementation of the Waterford program at a classroom level.
6. Expand the definitions for quality of Open Court pedagogy for the first grade curriculum to include a more comprehensive analysis of the "beginning to read" related activities found on the Level Two courseware.

INTRODUCTION

This is the third in a series of documents containing the findings from the districtwide evaluation of the Waterford Early Reading Program in kindergarten and first grade. The first report, completed in May 2002, focused on the initial implementation of the Waterford program.
The second report, completed in September 2002, provided information about the impact of the program on student achievement. The final report focuses on the overall implementation of the program and its relationship to student achievement. Years 2 through 4 of the evaluation will continue to focus on implementation of the program and program effectiveness.

The document is organized in several parts. The introductory section presents the background of the evaluation; a literature review of research on the use of educational technology, as well as previous evaluations of the Waterford program; and the research questions. Second, we describe the methodology employed to examine the implementation and effectiveness of the Waterford courseware, including a discussion of the sample selection, the data collection methodology, and the data analysis. Next, we present the findings regarding the implementation of the program and the relationship between implementation and effectiveness of the Waterford courseware for Year 1. The findings are organized to specifically address the research questions presented earlier in the document. Finally, we present the conclusions and recommendations of the study and next steps as the study moves into its second year.

Background

According to Snow, Burns, and Griffin (1998), as many as 40% of children in the U.S. experience significant problems becoming competent readers. In fact, more than two-thirds of fourth graders fail to read at levels considered to be proficient (National Center for Education Statistics, 2001),[1] and in areas of high poverty it is not uncommon to find 70%-80% of a school's student body reading below the 30th percentile (Snow et al., 1998).

[1] "Students performing at the Proficient level should be able to demonstrate an overall understanding of the text, providing inferential as well as literal information.
When reading text appropriate to fourth grade, they should be able to extend the ideas in the text by making inferences, drawing conclusions, and making connections to their own experiences. The connection between the text and what the student infers should be clear." (p. 2) The goal identified by NAEP is that all students be reading at or above the Proficient level.

In recent years in the Los Angeles Unified School District (LAUSD), the average performance of 2nd and 3rd grade students in over 300 elementary schools fell below the 50th percentile. In an attempt to address this concern, the district adopted the District Reading Plan in 1999, requiring that all schools in which students were reading below the 50th percentile on the Spring 1999 grade 2 or 3 SAT/9 reading tests adopt one of three structured reading programs – Open Court, Success for All, or Language for Learning/Reading Mastery.[2] Consistent with practices advocated by researchers, these three programs were expected to provide daily classroom instruction that 1) was thoughtfully designed and implemented; 2) provided explicit and systematic instruction that would build basic word reading skills; and 3) was balanced and integrated with rich and meaningful experiences in reading and writing within a literature-rich environment (Mathes, Torgesen, & Allor, 2001).

In 2001, in an effort to provide further support to the district's students who were most at risk of experiencing reading failure, the district adopted the Waterford Early Reading Program, a computer-based literacy program, for students in kindergarten and first grade. Use of the Waterford program was mandated in kindergarten and first grade classrooms in 244 schools. These schools fell into two groups. The majority of the target schools had a 50% or greater enrollment of English language learner (ELL) students in the first grade and had reading scores below the 45th percentile on the first grade SAT/9 reading test.
There were 211 schools in this group. The second group was comprised of schools that were below 50% enrollment of ELL students in the first grade and had first grade SAT/9 reading scores below the 45th percentile. There were 33 schools identified in this group. At the time the Waterford program was adopted, it was expected that 2,235 classrooms would receive the courseware.[3] The Waterford program was to be used during each day's reading lesson by teachers who were already implementing Open Court or Success for All. Students were expected to spend 15 minutes daily in kindergarten and 30 minutes daily in first grade on the computer. Implementation of the Waterford program began in the 2001-2002 school year.

[2] The District Reading Plan was enacted in 1999. The first year of full implementation was 2000-01. For more information regarding the District Reading Plan, see D. Oliver (2002), District Reading Plan Evaluation Year One: 2000-2001.

[3] In addition, 9 schools in the district opted to adopt the Waterford program in a variety of forms. Some adopted the program throughout kindergarten and first grade. Others put Waterford in kindergarten only. Still others put the program in one or more classrooms within each grade level.

Conceptual Framework and Literature Review

Technology in the Classroom

The concept of a more flexible, student-centered approach to instruction that is more attentive to the intellectual content of academic subjects can be traced from John Dewey's work at the turn of the twentieth century through the curricular reforms of the 1950s (Cohen, 1987). This perspective reflects a belief that school instruction can be exciting, intellectually challenging, and attuned to children's ways of thinking. Computer-assisted instruction (CAI) is closely associated with this set of beliefs. CAI made its first foray into education in the 1960s but was overshadowed by the success of Sesame Street and its progeny.
Cohen notes that the 1980s brought a new focus on the use of computers as an instructional tool. A review of research on CAI in the classroom reveals several themes in terms of effectiveness, implementation, and challenges.

Mioduser, Tur-Kaspa, and Leitner (2000) examined CAI in the context of literacy instruction. They compared CAI with more traditional forms of literacy instruction, such as teacher-led instruction using textbooks, and found that CAI had positive effects on students' learning. This study found that children who received computer-based intervention significantly improved their literacy skills – namely phonological awareness, word recognition, and letter naming skills – as compared to counterparts receiving non-computer-based intervention or no intervention at all. In addition, students receiving computer instruction have been shown to have not only better problem-solving and recall skills but also a deeper understanding of learning in general (Lamon, Chan, Scardamalia, Burtis, & Brett, 1993).

Computers in the classroom can also be motivating tools that build students' self-confidence and pride in their work. Because children feel that they are in control of their learning while on the computer, and because the computer provides immediate and individual feedback in a multi-modal environment, children are more motivated to learn (Mioduser et al., 2000; US Department of Education, 2001). While children see pencil-and-paper tasks as "work," they see time spent on the computer as "play," and they prefer the latter to the former. Thus, they spend more time engaging in learning activities that integrate many aspects of literacy, including reading, writing, speaking, listening, and thinking (Kamil & Lane, 1998).
Because editing work and correcting mistakes is easier with word-processing programs than in handwritten work, older students gain confidence in their work and a willingness to share and collaborate with their peers (Kamil & Lane, 1998; Owston & Wideman, 1996). Technology also allows students to exhibit what they have learned in ways not measured by conventional classroom activities. Teachers find that this mastery of technological skills leads to a greater sense of self-esteem and empowerment in students (US Department of Education, 2001). In fact, Mathes et al. (2001) note that the motivational aspects of CAI for low-performing readers are well documented in the literature.

On the other hand, Mathes, Torgesen, and Allor (2001) found that CAI in phonological instruction did not improve first grade student performance beyond that achieved with peer-assisted literacy strategies. Similarly, Angrist and Lavy (1999) found that the use of CAI in 4th and 8th grade classrooms did not appear to have educational benefits that translated into higher test scores. Moreover, according to Angrist and Lavy (1999), the evidence for the effectiveness of CAI is both limited and mixed. Few empirical studies meet a rigorous methodological standard; Angrist and Lavy (1999) argue that many of these studies are qualitative, "gathering impressions from participants in demonstration projects, or quantitative but with no real comparison group" (p. 2).

Another limitation of CAI is that it has traditionally fit into the existing American educational structure outlined by Cohen (1987): it provides a one-way imparting of knowledge to an individual student, and its software uses brief, isolated exercises to develop and assess discrete skills.
However, while these "drill and practice" exercises may be efficient in developing basic skills, they have done little to reform American education in terms of promoting the development of higher order thinking skills (US Department of Education, 1993; US Department of Education, 2001).

While the practice of utilizing technology in the classroom is not a new one, researchers point out that, historically, it has not been integrated well or to its fullest potential (Hasselbring & Tulbert, 1991; US Department of Education, 1993; US Department of Education, 2000). During the latter decades of the twentieth century, school districts invested significant amounts of funding to install computers in schools, but these expenditures have been inadequate: computers have not been purchased in sufficient quantities to transform educational practice, and the technology is usually outdated, rendering it useless relatively quickly (Hasselbring & Tulbert, 1991). In addition, the software typically has no mechanism with which to collect longitudinal data on individual students' growth in skills. When records are kept by the computers, teachers have a means of analyzing those records and tailoring instruction to each student's individual needs. In the absence of these data, teachers cannot make instructional decisions based on their students' growth, or lack thereof. If the software does not log a student's history of use, and instead treats children as first-time users each time they run the program, then the impact of the software is automatically diminished (Hasselbring & Tulbert, 1991). As Kosakowski (1998) points out, "technology cannot exist in a vacuum, but must become part of the whole educational environment" (p. 56) if it is to be used effectively.

Mioduser et al. (2000) investigated whether there was "added learning value" to computer-based materials when used as an intervention with an existing curriculum.
In comparing three groups of children at risk for reading disabilities—one group receiving no intervention, one receiving a paper-based intervention, and one receiving a computer-based intervention—they found that the greatest gains were achieved by those students receiving both the existing curriculum and the computer-based intervention.

The North Central Regional Educational Laboratory (1999) argues that the successful integration of technology into the American educational system will require recognition on the part of all stakeholders that it is not the technology alone that will promote student achievement. It is, rather, the use of the technology as it relates to the larger educational goals established by the district or school. In addition, the integration of technology into all aspects of teaching and learning takes time to establish, and the school day must be adjusted to allow this integration to occur.

If the integration of computers is going to be successful, it is critical that teachers be properly trained and involved in the process. Hasselbring and Tulbert (1991) assert that in the past, districts have focused on student and administrator technology needs but have failed to give teachers direct access to and power over these tools to improve their own teaching and productivity. Early conceptualizations of technology implementation grew from a desire to find "teacher-proof" instruction, a notion that is now recognized as unrealistic and misguided (US Department of Education, 2000). In order for students to benefit from technology, teachers need ongoing, in-depth professional development that instructs them in the mechanics of the technology as well as how to use it meaningfully in an instructional setting. This goal cannot be achieved through a "one-time workshop" (North Central Regional Educational Laboratory, 1999, p. 6), but through in-depth and sustained training (Kinzer & Leu, 1997; North Central Regional Educational Laboratory, 1999).
Educators must be made aware of the mechanisms by which technology use leads to student learning (Hasselbring & Tulbert, 1991). Effective use of technology in the classroom is not merely predicated on training in the use of that technology; it is also predicated upon teachers' knowledge and understanding of literacy skill acquisition (Kinzer & Leu, 1997). When teachers are trained in both areas, they will be equipped to evaluate software and its usefulness in literacy instruction (Kinzer & Leu, 1997). They will then be able to use technology "in appropriate ways to deliver powerful instruction" (Hasselbring & Tulbert, 1991, p. 36).

In light of this previous research, it is apparent that evaluating the impact of CAI technology in a classroom setting is a complex process, and several key factors must be taken into account. It is difficult to measure the impact of technology on student achievement because changes in technology use are often part of larger systemic or network changes. Thus, it is difficult to employ typical social scientific research methods to infer a simple, causal relationship between technology use and student achievement. Researchers must attempt to capture the constellation of factors and influences working congruently with technology use in order to draw that connection (North Central Regional Educational Laboratory, 1999). This research provides a larger context for this evaluation of the Waterford Early Reading Program.

The Waterford Early Reading Program

The Waterford Early Reading Program is a computer-based program developed on the premise that intervention in preschool or kindergarten is the key to reading success. In the primary grades, the program recommends that each child be given approximately 15-30 minutes a day to work individually at a computer station.
The program's courseware is personalized for each child's learning pace and reading level, and the program gives teachers up-to-date information on their students' progress and needs as they move through the various reading levels. In addition, children have a series of books and videotapes to take home, giving them further exposure to the material and giving parents the opportunity to read with their children.

The implementation and effectiveness of the Waterford Early Reading Program (Waterford) has been evaluated in several school districts nationwide. Evaluations in Prince George's County (MD), New London (CT), Whittier City and Hacienda La Puente (CA), Hollister (CA), El Centrito (CA), Orem (UT), Decatur (IL), Santa Clara (CA), and Norwalk (CT) found generally positive effects of the Waterford Early Reading Program in those districts. These studies found that elementary school students using the program outperformed their control counterparts or a historical control group. Unfortunately, many of these studies contained design flaws, including the lack of a baseline measure, lack of a control group, no control for pre-existing differences, small sample sizes, and/or exclusion of the information necessary to compute effect sizes. Also, several studies failed to control for the level of implementation of the courseware or of any other curriculum being used in conjunction with Waterford. In light of these limitations, much of the information from these evaluations should be interpreted with caution.

Much of the more recent research on the courseware focuses on teacher, administrator, student, and/or parent perceptions of the program's efficacy (Cope & Cummings, 2002; Electronic Education, 2002; Fairfax County Public Schools, 2001; Klopfer & Zilgne, 2000; Kortz, 2002; Murray-Ward, 2000; Santa Clara Unified School District, 2000), and generally finds that participants were enthusiastic and positive about the program.
Klopfer and Zilgne (2000) assert that "proper implementation," completely functioning hardware and software, as well as staff buy-in and enthusiasm, are critical factors in the success of any program. Studies such as these suggest that positive perceptions and support for the program lay the groundwork for maximizing its success. Cope and Cummings (2002) provided extensive and thorough evidence of this sort in their evaluation of the courseware in the Madisonville (TX) Consolidated Independent School District. While teachers and administrators report that they like the program, that students are benefiting from it, and that it is being fully implemented, this information is anecdotal, and test scores were not available at the time of publication. In addition, this study involved prekindergarten students, and perceptions of effectiveness focused on reading readiness skills as well as interest in books and reading, rather than on the mechanics of reading or on comprehension, which are skills expected of older children.

Researchers at Rutgers University and Kean College in New Jersey conducted a study of the Waterford program in Newark, NJ. Children in Waterford classes were compared to control students using a pretest (September/October) assessment and a posttest (June) assessment. The time interval was appropriate for assessing growth, and the authors report significant differences in mean percentile gains in favor of the students who used the Waterford courseware. In addition, these researchers assessed the children on more than one measure: the Waterford Reading Inventory and the TERA-2 (Test of Early Reading Ability – 2nd Edition). This use of multiple assessments strengthens the claim that the courseware had a significant impact on children's reading scores.
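Effect sizes of the kind reported for the Newark comparison are conventionally expressed as standardized mean differences. As a general sketch only (this assumes the familiar Cohen's d formulation with a pooled standard deviation; the symbols here are generic and are not the Newark authors' notation or necessarily their exact computation):

```latex
d = \frac{\bar{X}_{T} - \bar{X}_{C}}{s_{p}},
\qquad
s_{p} = \sqrt{\frac{(n_{T}-1)\,s_{T}^{2} + (n_{C}-1)\,s_{C}^{2}}{n_{T} + n_{C} - 2}}
```

where the subscripts T and C denote the treatment and control groups, respectively. Under Cohen's (1988) widely used benchmarks, d of roughly 0.2 is considered small, 0.5 medium, and 0.8 large.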
It was possible to compute effect sizes based on the information provided by the authors, and these effects could be considered medium to large by conventional standards (.45-.60). Thus, it could be concluded with reasonable confidence that the Waterford program was having a significant impact on children's literacy development.

Using a nationally normed reading assessment, Jadali and Wright (2001) administered both a pretest and a posttest to students using the courseware as well as to a comparable group of students not using the courseware. Results pointed to superior gains by students receiving Waterford instruction compared to those receiving another program. Longitudinal studies of these children would lend insight into the sustained effects of the program.

Hecht (2000) analyzed the effectiveness of Waterford by first controlling for pre-existing differences in pretest performance and expressive vocabulary ability between treatment and control groups in a sample of economically disadvantaged and academically at-risk kindergartners in Ohio. The study found that students receiving Waterford instruction had significantly higher gains on independent measures of letter-word identification, spelling, elision, segmenting, and sound-matching than their control counterparts. No differences were found on letter name and sound knowledge tasks, nor on the print concepts tasks. These results suggest that while teachers in both the control and treatment classrooms did a comparable job of teaching their students basic alphabetic principles, exposure to the Waterford courseware helped to develop the phonological awareness skills of the treatment students.

In 2001, Electronic Education, the Waterford Institute, and the Los Angeles Unified School District collaborated on an evaluation of the district's second grade Intensive Academic Support (IAS) Program. This evaluation centered on an analysis of the effectiveness of the Waterford Early Reading Program within these IAS classrooms.
Researchers found the use of the Waterford Early Reading Program to be correlated with increases in reading skill gains. Interpretation of the results of this study is limited, however, because the study had a small sample size, did not adequately reflect the district's large number of year-round calendar schools, lacked an adequate control group and an independent pre- and posttest, and did not control for pre-existing student differences.

Another Electronic Education study of Waterford usage in LAUSD revealed that students were using the program for the recommended amount of time (15 minutes per day for kindergartners and 30 minutes per day for first graders) and that, according to teacher surveys, the Waterford program seemed to have a positive effect on children's mastery of literacy concepts. Besides the program-generated data and the survey data, the only assessment used in this study, as well as in another study in Buffalo (Cañedo, Smolen, and Pollard, 1998), was the test developed by the Waterford Institute, and it is unclear whether these results can be generalized to tasks unrelated to the Waterford program. In addition, the teacher survey response rate in the Electronic Education study was very low, and a selection bias may have existed.

The work presented in this study has been informed by the research presented above related to the use of technology in classrooms in general and the effectiveness of the Waterford courseware more specifically. The current study attempts to expand upon this research by including a series of factors not addressed in earlier research. In addition to examining the quantity of computer usage as it relates to student achievement, we will use classroom observation data to account for the quality of the time spent in terms of student behavior and engagement while using the program.
As the Waterford courseware is being used as a support to the district's existing literacy programs, we will include the quality of implementation of these programs in our analyses so as to more accurately assess the unique impact of the Waterford courseware on student achievement. While truly random assignment to treatment and control classrooms is impossible, the control group used matches the treatment group in terms of ethnicity, Title I status, and English language proficiency (see Appendix B). In order to get the most complete picture of impact on student achievement, we will include multiple measures of reading ability, including a districtwide, group-administered test (Stanford/9), an individually administered, norm-referenced test (Woodcock Reading Mastery Tests – Revised), and a computerized assessment (WCART).4 The sample size is large enough to detect effects and also withstand the inevitable attrition that occurs due to student mobility in and out of the district's schools. Finally, teacher and administrator perceptions of the program and its effects will be included to offer insight into the day-to-day issues surrounding implementation of the courseware, as well as teacher knowledge and understanding of the program.

4 As will be discussed in greater detail below, we were unable to use the WCART as an outcome measure for the 2001-2002 evaluation.

Research Questions

The research questions for this report include:
1. How is the Waterford program being implemented?
2. To what extent is the Waterford courseware being used?
3. To what extent are students engaged while using the Waterford courseware?
4. What is the quality of pedagogy during reading/language arts time in sample classrooms?
5. What is the relationship between implementation of the Waterford courseware, the primary reading program, and student achievement?
METHODS

This section presents the methodology employed to evaluate the implementation and effectiveness of the Waterford Early Reading Program during the 2001-2002 school year. It contains a discussion of the sample selection, data collection methodology, and data analysis procedures.

Sample Selection

Sample. At the outset of the study, a total of 200 classrooms distributed equally across all calendar types and tracks were selected for participation in the Waterford Early Reading Program Evaluation using a sampling procedure designed to allow for the use of a treatment and control group.5 Of principal concern were 1) the ability to match classrooms in the treatment group (those receiving the Waterford program) with classrooms that, but for the Waterford program, were similar in composition, and 2) the ability to determine whether differences in program effectiveness existed across calendar types and tracks.6

We used a multi-step procedure to develop the study sample. The control group classrooms were sampled first because we anticipated that control classrooms would be more difficult to identify than treatment classrooms. The difficulty stemmed from the fact that these classrooms had to be located in schools that, overall, did not qualify to receive the Waterford Early Reading Program, while still matching the general characteristics of the treatment classrooms.

To ensure that the control classrooms matched the expected characteristics of the treatment classrooms, a list of criteria was established: 1) racial/ethnic composition; 2) percentage free/reduced lunch; 3) school-level average reading score;7 4) percentage of English language learners (ELLs); 5) primary reading program; 6) local district; 7) calendar; and 8) track. Data from 2000-2001 first grade classrooms were used to identify possible control group schools across each calendar and track (Track A, Track B, Track C, Track D, LEARN) with classrooms matching the above criteria. Where multiple schools contained classrooms matching the desired criteria, the school was randomly selected. Where necessary, schools with classrooms matching the criteria were purposively sampled. Ten classrooms per track or calendar were selected for the control group. Table 1 shows the distribution of control classrooms included in the sample by track.

Table 1
Number of Control Group Classrooms by Track

Control Group   Track A  Track B  Track C  Track D  LEARN  Total
Kindergarten      10       10       10       10      10     50
First Grade       10       10       10       10      10     50
Grand Total                                                100

Once the possible control group classrooms were identified, we randomly sampled schools from the 244 schools adopting the Waterford program pursuant to the district's implementation plan. We then matched treatment classrooms to control classrooms. In the majority of cases, control group classrooms were matched with treatment group classrooms on a one-to-one basis.8 Table 2 shows the distribution of treatment classrooms included in the sample by track.

Table 2
Number of Treatment Group Classrooms by Track

Treatment Group  Track A  Track B  Track C  Track D  LEARN  Total
Kindergarten       10       10       10        9      10     49
First Grade        10       10       10       10      10     50
Grand Total                                                  99

The sample deviates from the school population characteristics in predictable ways. It is entirely comprised of classrooms averaging below the 45th percentile in reading, with large African American and/or Latino populations and/or large English language learner populations, and high numbers of students receiving free or reduced-price meals. Schools from each of the 11 local districts are represented in this sample. Again, the distribution of classrooms deviates from the distribution within the population in expected ways. Given that a disproportionate number of classrooms meeting the desired criteria are located in Districts B, E, F, G, H, and J, the number of classrooms located in these districts is notably higher than for the population as a whole.

5 Calendar means both whether the school is on a LEARN or year-round calendar and on which track the students are located. In other words, we sought an even distribution of students across all tracks. We focused on tracks A, B, and C in Three Track schools and specifically on track D in Four Track schools.

6 It is important to note that all but 26 of the district's 172 year-round elementary schools (or 85%) fell below the 45th percentile in reading. On the other hand, only 89 of the 255 LEARN calendar schools in the district (or 35%) qualified to receive the Waterford Program. Additionally, in Comparison of student outcomes in multi-track year-round and single-track traditional school calendars, White and Cantrell (2001) found that significant differences in achievement exist for students who attend school on B-Track at three-track schools. According to the authors, "in every school type [elementary, middle, and senior], the performance of B Track students is substantially lower than other tracks in both reading and mathematics." Based on this report and data collected in the Standards-Based Promotion Evaluation, which demonstrate that students in multi-track schools, and especially those on B-Track and C-Track, are subject to significant disadvantages as a result of their placement on those particular tracks, this evaluation will focus on the achievement level of students within tracks and calendar types.

7 Kindergarten students are not tested on the SAT/9. Similarly, first grade students are not tested on the SAT/9 until the end of first grade. Therefore, we used school-level average SAT/9 reading scores to identify control schools and the remaining criteria to match classrooms.

8 For Track D we found it necessary to match one treatment kindergarten against students in two control classrooms because of the small number of schools with Track D classrooms that could be considered for matching purposes.
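The matching step described above, filtering candidate schools on the established criteria and sampling randomly when more than one school qualifies, can be sketched as follows. The field names, score tolerance, and school records are illustrative stand-ins, not the district's actual data or matching rules:

```python
# Hedged sketch of the control-school matching logic: filter candidates on
# matching criteria, then choose randomly when several qualify. All records
# and the score tolerance are invented for illustration.
import random

def find_control_school(treatment, candidates, score_tolerance=5.0):
    """Return a candidate school matching a treatment school's characteristics."""
    matches = [
        s for s in candidates
        if s["track"] == treatment["track"]
        and s["reading_program"] == treatment["reading_program"]
        and abs(s["avg_reading_score"] - treatment["avg_reading_score"]) <= score_tolerance
    ]
    return random.choice(matches) if matches else None

treatment = {"track": "B", "reading_program": "Open Court", "avg_reading_score": 32.0}
candidates = [
    {"name": "School 1", "track": "B", "reading_program": "Open Court", "avg_reading_score": 30.0},
    {"name": "School 2", "track": "A", "reading_program": "Open Court", "avg_reading_score": 31.0},
    {"name": "School 3", "track": "B", "reading_program": "Success for All", "avg_reading_score": 33.0},
]
match = find_control_school(treatment, candidates)
print(match["name"])  # School 1 (the only candidate matching all criteria)
```

The fallback when no candidate matches corresponds to the purposive sampling the report describes: a human picks the closest available school rather than an exact match.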
Student Sample. For the first year of the study, a random sample of five children from each control classroom and ten children from each matched treatment classroom was selected (see Table 3).

Table 3
Sample Size for Control Classrooms and Matched Treatment Classrooms

           Kindergarten            First Grade
           Matched                 Matched
           Treatment   Control     Treatment   Control
Track A       100         50          100         50
Track B       100         50          100         50
Track C       100         50          100         50
Track D       100         50          100         50
LEARN         100         50          100         50
Total         500        250          500        250

Finally, an additional 5 children per non-matched treatment classroom were selected (see Table 6). This additional non-matched treatment sample was needed to re-establish the representativeness of the entire treatment group in the district. As a result, this added an additional 250 children per grade level to the treatment group, creating a total sample size of 2,000 students. To the extent possible, we will examine students' SAT/9 scores as they move on to subsequent grade levels.9

9 In the second year of the study we plan to sample first grade classrooms that have the largest number of students who were in our kindergarten sample. Because of the dispersion from kindergarten to first grade, we anticipate losing a large percentage of our sample students. Therefore, for the second year of the study, we will replace the students who leave the sample. We will also examine the SAT/9 scores of the first year first grade students as they move on to subsequent grades.

Data Collection Methodology

Data collection consisted of both quantitative and qualitative activities. Quantitative data arose from four sources: the Woodcock Reading Mastery Test—Revised Form G; Stanford 9 (SAT/9) data for first grade students; Waterford usage data; and WCART data where available. Qualitative data were collected through classroom observation and interview protocols.
Quantitative Data Collection

A pretest of 2,000 kindergarten and first grade students was conducted within the first four weeks of school for each track or calendar type. The Woodcock Reading Mastery Test—Revised Form G was individually administered to each student. A posttest was given within the last four weeks of the 2001-2002 school year for each track and calendar type.

The Woodcock Reading Mastery Tests—Revised (WRMT-R) is a battery of tests that measures several aspects of reading ability for kindergartners through adults.10 The WRMT-R is a nationally standardized test that has a uniform scale across grades and has no known ceiling effects when used with school-aged children. Figure 1 shows the structure of the WRMT-R.

Figure 1. The Structure of the WRMT-R Form G

WRMT-R Form G
  Readiness Cluster: Visual Auditory Learning; Letter Identification
  Basic Skills Cluster: Word Identification; Word Attack
  Reading Comprehension Cluster: Word Comprehension (Antonyms, Synonyms, Analogies, drawing on General Reading, Science-Mathematics, Social Studies, and Humanities vocabulary); Passage Comprehension
  Total Reading Cluster: Basic Skills plus Reading Comprehension

Form G of the WRMT-R consists of six tests grouped into three clusters. The Readiness Cluster is composed of a test of Visual Auditory Learning and a Letter Identification test. The Basic Skills Cluster is composed of the Word Identification and Word Attack tests. Finally, the Comprehension Cluster consists of a test of Word Comprehension and a test of Passage Comprehension. The Word Comprehension test consists of three subtests: Antonyms, Synonyms, and Analogies.

10 Other tests which were considered included: the Woodcock-McGrew-Werder Mini Battery of Achievement; Test of Academic Skills - Reading, Arithmetic, Spelling, and Listening; Test of Academic Performance; Iowa Tests of Basic Skills, Forms K, L; California Achievement Tests 5th Edition; Peabody Picture Vocabulary Test; Gray Oral Reading Tests – Revised; and the Waterford Comprehensive Reading Inventory.
According to Woodcock (1998), the Basic Skills Cluster and the Comprehension Cluster form the Total Reading Cluster. Reading Readiness is not included in the Total Reading Cluster. A brief description of each of the tests and subtests follows.

• The Visual Auditory Learning test is described by Woodcock (1998) as a "miniature 'learning-to-read' task." Students learn a vocabulary of unfamiliar symbols (rebuses) that represent familiar words and then "read" those rebuses to form phrases and sentences.

• The Letter Identification test measures a student's ability to identify letters presented in uppercase and lowercase. It also requires students to identify letters in formats that may be unfamiliar to them, such as roman, italic, bold, serif, and sans serif types, as well as cursive and other special types.

• The Word Identification test requires the student to produce a natural reading of isolated words on sight (within approximately 5 seconds). The words are ordered by frequency of occurrence in written English.

• The Word Attack test requires students to read either nonsense words or words with very low frequency of occurrence in English. This test measures the ability to apply knowledge of phonics to decode unfamiliar words.

• The Word Comprehension test is comprised of three subtests that measure students' reading vocabulary at different levels of cognitive processing: antonyms, synonyms, and analogies. The Antonyms subtest requires students to read a word and then provide a word that means the opposite. The Synonyms subtest requires students to read a word and then respond with another word that is similar in meaning. The Analogies subtest, the most cognitively demanding of the Word Comprehension test, requires students to read a pair of words, understand the relationship between them, then read the first word of a second pair and provide another word to complete the analogy using the same relationship.
• The last test, Passage Comprehension, uses a modified cloze procedure. Students are required to read a short passage, one to three sentences, and supply a key word missing from the passage. The first third of the items (the easiest third) consists of one-sentence items accompanied by a related picture. Picture-text items make it possible to measure passage comprehension skills at the lower grade and age levels.

In addition to the Woodcock scores, we obtained the Stanford 9 (SAT/9) results for the first grade students in our sample. We examined the following tests:

• Reading. Reading is composed of three subtests: Word Reading, Word Study Skills, and Reading Comprehension. The Word Reading subtest focuses on word recognition development. It requires students to identify three printed words that are associated with a given picture. The Word Study Skills subtest measures early reading skills, such as the ability to recognize within words the structural elements required for decoding (compound words, inflectional endings, contractions), and the ability to relate consonant and vowel sounds to their most common spellings. The Reading Comprehension subtest measures the ability to comprehend connected discourse. It employs three different formats and includes literature-based reading selections that appeal to students of varying backgrounds, experiential levels, and interests. Word Reading, Word Study Skills, and Reading Comprehension combined provide a Total Reading score.

• Spelling. The Spelling test measures students' ability to apply their knowledge of both the phonetic relationships and the structural properties of words in order to spell words.

• Language. The Language test measures students' ability to identify correct punctuation, capitalization, and usage in simple sentences, as well as their understanding of effectively written sentences and paragraphs.
As a third measure, we obtained from the Waterford Institute the Waterford Computer Adaptive Reading Test (WCART) scores for all students for whom they had data, and from these we extracted the scores for the students in our sample.

• The WCART is an interactive test for kindergarten, first grade, and second grade students, which is administered entirely on the computer. It assesses pre-literacy and early literacy skills including print concepts, initial phoneme isolation, rhyming, letter names, letter sounds, decoding, sight words, vocabulary, reading passages, and grammar.

Finally, we received the Waterford Institute's usage data file. This file includes the number of minutes each student spent on the computer per month; the total number of minutes spent on the computer throughout the year; the level of the program at the end of the year; and the last lesson completed.

Qualitative Data Collection

Classroom Observations. The principal method utilized to collect data regarding the implementation of the Waterford program was observational. Data collection was carried out in the classroom by graduate students, retired elementary and middle school teachers and administrators, or other data collectors trained in observational methods. A classroom visit usually lasted from 3-5 hours during the reading/language arts instructional period. In the event that students used the courseware outside of the reading/language arts instructional period, our observers remained in the classroom to observe the students using the courseware. We visited each classroom for two days during the fall and spring semesters. The activities and instruments utilized in the observation included:

• Classroom Environment Map and Checklist
• Fieldnotes
• Activity Checklists
• Activity Focus Checklist
• Reflective Notes

a. Classroom environment map and checklist: The purpose of the environmental map was to provide a sense of desk or seating arrangement; learning centers; Waterford Early Reading Program computers; the availability of language arts materials and resources; the presence or absence of Open Court related materials; instructional aids created by the teacher and/or students (word walls, teacher-made instructional charts); and whether or not children's work was exhibited. After completing the map, classroom observers scanned the room and noted the seating arrangement, instructional resources, and student activities.

b. Fieldnotes: Fieldnotes are defined as a written narrative describing in concrete terms and great detail the duration and nature of the activities and interactions observed. Classroom observers became the "eyes and ears" of the project, and their notes described the overall context in which reading/language arts instruction took place. Fieldnote writing occurred before, during, and after the completion of the other observation instruments.

c. Activity Checklist: The purpose of this instrument was to document quantitatively the number of activities during which the teacher engaged in a range of instructional strategies and techniques related to student and teacher interaction, the use of particular resources within the classroom, and the types of instructional activities that were used by the teacher over the course of the three days of observation. This instrument was used once for each activity.

d. Reflective Notes: At the end of the second day of observation, observers completed a set of reflective notes. These notes provided observers with the opportunity to record any information or data gathered during their observation that does not belong on any of the other observation protocols.
It was also a place for observers to document their experiences, biases, likes, and dislikes of a classroom observation experience. It allowed observers to intentionally place any subjective comments they had regarding their observation so that they could avoid expressing these comments within the context of the objective fieldnotes taken during the observations.

Interviews

a. Teacher Interviews: In addition to the three-day observations, classroom observers also conducted an interview with each teacher observed. The interview usually occurred after observations were completed.

b. Principal Interviews: Interviews were conducted with 15 elementary school principals regarding their perceptions. The interview usually occurred after observations were completed at the school.

Curricular Materials

a. Waterford Early Reading Program Materials: Manuals and materials for Level One and Level Two were collected in order to examine the content of the curriculum and its relationship to the content covered by the primary reading program in use in the treatment classrooms.

b. Open Court Teacher's Manuals for Open Court 2000 and Collection for Young Scholars: Teacher's manuals for Open Court Level K and Level 1 were collected in order to examine the content of the curriculum and its relationship to the content covered by the Waterford courseware for Level One and Level Two.

Data Analysis

Qualitative data reduction and analysis were conducted with each type of data collected. We reviewed classroom observation fieldnotes and developed a coding scheme for the observational data. We also coded teacher interview transcripts and documents collected from teachers reflecting their instructional practice during the reading/language arts time and the usage of the Waterford courseware. In addition, we used classroom observation data to determine individual student level of engagement while using the courseware and during Open Court instruction.
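The statistical analyses of these data relate student-level outcomes to classroom-level constructs, a two-level structure of students nested in classrooms. The sketch below illustrates that structure on synthetic data; all numbers and variable names (pretest, usage) are invented, and a full hierarchical linear model would estimate both levels jointly rather than separately as here:

```python
# Hedged sketch of the two-level logic behind hierarchical (multilevel)
# analysis: student outcomes (level 1) nested in classrooms (level 2).
# The levels are estimated separately here purely for illustration.
import numpy as np

rng = np.random.default_rng(1)
n_class, n_student = 50, 10
usage = rng.uniform(5.0, 30.0, n_class)         # classroom-level predictor
class_effect = rng.normal(0.0, 4.0, n_class)    # random classroom intercepts

pretest = rng.normal(100.0, 15.0, (n_class, n_student))
posttest = (20.0 + 0.8 * pretest                      # student-level slope
            + (0.5 * usage + class_effect)[:, None]   # classroom-level slope
            + rng.normal(0.0, 8.0, (n_class, n_student)))

# Level 1: within-classroom slope of posttest on pretest (group-mean centered)
x = pretest - pretest.mean(axis=1, keepdims=True)
y = posttest - posttest.mean(axis=1, keepdims=True)
beta_within = (x * y).sum() / (x * x).sum()

# Level 2: regress classroom mean outcomes on the classroom-level predictor
beta_between = np.polyfit(usage, posttest.mean(axis=1), 1)[0]

print(round(beta_within, 2), round(beta_between, 2))
```

Separating the levels this way shows why single-level regression is inadequate here: students in the same classroom share a common classroom effect, so their scores are not independent observations.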
We also examined the Waterford usage data in order to determine the extent to which each individual student used the Waterford courseware during the school year. Classroom-level constructs were created to represent 1) the classroom level of Waterford courseware usage; 2) the classroom level of student engagement during Waterford courseware usage; and 3) the quality of teacher pedagogy in relation to Open Court instruction. Once these constructs were defined, we coded the data and then conducted a variety of statistical analyses, including Hierarchical Linear Modeling (HLM), to determine the extent to which there might be a relationship among these student and classroom level variables and student achievement.

Relationship between the Reading Programs and the Waterford Program

In order to determine the effectiveness of the Waterford courseware as a support to a primary reading program, it is necessary to understand each program and the relationship between them. Thus, the following section first provides an overview of the primary reading programs in use in the treatment and control classrooms. Second, this section describes the alignment between Open Court, the predominant reading program in use in the district, and Waterford courseware Levels One and Two.

Primary Reading Programs. The primary reading programs used in the treatment and control kindergarten and first grade classrooms during the 2000-2001 school year included Open Court (Open Court 2000 and Collection for Young Scholars), Success for All, Into English, and Cuenta Mundos.11 The vast majority of classrooms were using Open Court (163), with 13 classrooms using Success for All and 4 classrooms using other reading programs.12 The distribution of reading programs is presented in Table 4.

11 In the 2002-2003 school year, Collection for Young Scholars was replaced by Open Court 2002, and Success for All classrooms adopted Open Court 2002 anthologies for their reading comprehension component.
12 Four of these classrooms were taught primarily in Spanish and are considered Waiver to Basic.

Table 4
Reading Program Used by Grade and Condition13

                    Kindergarten            First Grade
                 Treatment  Control      Treatment  Control
Open Court          47         35           43         38
Success for All      5          0            8          0
Other                1          0            1          2

Pursuant to the district's literacy plan, students spend 90 minutes in kindergarten and 150 minutes in first grade on reading/language arts on a daily basis. For those students in classrooms where the Open Court curriculum is being used, this time is divided into three discrete areas of focus: Letters and Sounds (kindergarten) or Preparing to Read (first grade); Reading and Responding; and Integrating the Curriculum. In kindergarten, Success for All is a theme-based curriculum that incorporates the entire learning time and interweaves throughout all subject areas (Early Learning). Starting in first grade, the Roots program is implemented via the daily 90-minute reading block. During this time, students are grouped according to instructional reading level, not their chronological age or grade level. As a result, students often rotate daily to a different teacher's classroom for the homogeneous, cross-grade Success for All lesson. Students are reassessed and regrouped every eight weeks to maintain the homogeneous groupings.

Alignment Between Waterford and Open Court. In classrooms using the Waterford courseware (treatment classrooms), the Waterford Institute recommends that Level One courseware be used for 15 minutes a day and Level Two for 30 minutes a day. According to Waterford materials, "Level One courseware prepares students for beginning reading instruction by teaching each of the following: print concepts, phonological awareness, and letter recognition" (Getting Started). While on the computer, students engage in a range of activities on a daily basis.
These activities fall into three broad categories: Daily Activities, Main Lessons, and Play and Practice. Daily Activities focus on phonological awareness. Main Lessons focus on readiness activities: the first 26 Main Lessons teach reading readiness concepts, and the second 26 lessons have students learn to look more closely at text (Getting Started). Play and Practice provides students with an opportunity to review Level One skills in an exploratory environment. A range of activities covering the same concepts—print awareness, phonological awareness, and letter recognition—are presented within the Sounds and Letters component of Open Court. Open Court Level K provides activities within the following broad categories: Letter Recognition; Letter Names and Letter Shapes; How the Alphabet Works; Reading and Writing Workbook activities; Sounds, Letters, and Language; and Phonics.

The "Level Two courseware teaches beginning reading, which includes letter sounds, word recognition, and beginning reading comprehending" (Getting Started). Students engage in a range of activities progressing through Word Recognition, Automaticity, Reading Strategies, Reading Comprehension, Writing Practice, Unit Review, and Play and Practice. Again, Open Court Level 1 spends time on each of these categories of activities within the Preparing to Read, Reading and Responding, and Integrating the Curriculum components of the program.

13 The number of classrooms within each group does not match the original sample. Some classrooms originally sampled as control had the Waterford program when we began our observations. These classrooms were thus treated as treatment classrooms. Other control classrooms acquired the Waterford program at some point during the school year after our fall observations but before our spring observations. These classrooms are considered "late treatment classrooms" and were not included in the analyses.
(For further analysis of the alignment between the two programs for kindergarten and first grade, please see Appendix A.) As reflected in Appendix A, the subjects covered by Open Court extend beyond those presented within Levels One and Two of the Waterford courseware. This is not surprising given that the Waterford courseware was adopted as a support to the pre-existing reading program and not as a replacement. In the absence of a comparison group, the direct alignment of the Waterford content and the Open Court content would make it impossible to determine whether gains in student achievement were attributable to the Waterford courseware, the Open Court curriculum, or an interaction between the two programs. Thus, we focused our analysis of curriculum implementation on those areas we judged to be in alignment between the Waterford courseware and the Open Court curriculum. In this way, we felt we would have the best chance of determining whether gains were the result of the additional presence of the Waterford courseware in treatment classrooms.

FINDINGS

Research Question 1: How is the Waterford program being implemented in treatment classrooms?

In this section we discuss different aspects of the Waterford program implementation in treatment classrooms. Implementation refers to placement of the program computers, technical issues, and teachers' usage of the program features.

Placement of Computers

Waterford recommends that the computers be placed in a center "so students can't see one another's computer screens" (Getting Started). According to the Getting Started manual, "students are less distracted this way" (p. 13). In the event that it is not possible for the computers to be placed so that the screens are not immediately next to each other, Waterford recommends the use of short dividers to separate the computers. As can be seen in Table 5, the vast majority of classrooms in both kindergarten and first grade did not follow these recommendations.
For kindergarten, 89% of classrooms had at least two of the three computers set up immediately next to each other without dividers. For first grade, the numbers are similar: 79% of classrooms had at least two of the three computers directly next to each other without dividers.

Table 5
Placement of Computers

Placement of computers    Kindergarten   First Grade
Three in a row                70%            52%
Two facing one                19%            27%
Three in a kidney table        8%             9%
Other                          3%            12%

It is probable that the computer arrangements significantly contributed to the level of disengagement observed in some of the classrooms. As described below, most of the disengagement observed involved students looking at each other's screens and engaging in conversations with the student(s) working on the computers next to them. It is also possible that if the computer screens were not facing the rest of the class, the amount of classroom disruption experienced by students eager to see what their classmates were doing on the computer might have been reduced as well.

Technical Issues

According to the teacher interview data, 24 (53%) kindergarten teachers and 20 (44%) first grade teachers experienced at least one computer breakdown.14 The most common problems included computers freezing, missing or broken headphones, poor or no sound coming out of headphones, hardware problems, broken monitors, problems with the mouse, and computers not loading the program.

Teacher Usage of Program Features

Program Level Placement. The Waterford program provides teachers with a Placement Screening designed to help teachers decide if their students should begin the program at Level One, Level Two, or Level Three. While this is not a formal assessment, the teacher fills out the assessment by answering questions for each student on an individual basis. It is not meant to be used as a placement tool for an entire class.
Yet, when teachers were asked how they determine the start level for their students, 31 (69%) kindergarten and 28 (61%) first grade teachers said that all their students started at the same level. They did not refer to using any form of assessment in order to determine whether the level they were using was appropriate for their students. On the other hand, program materials recommend that kindergarten students start on Level One and first grade students begin on Level Two (Getting Started). Three kindergarten and one first grade teacher indicated that they believed that the computer determined the level at which students start the program. Of those teachers who did use an assessment, 4 kindergarten and 7 first grade teachers used the WCART. Three kindergarten and 4 first grade teachers used other assessments, and 7 first grade teachers used their own judgment to assess their students.

When asked how they determine when to advance students to the next level, 62% of kindergarten and 33% of first grade teachers stated that the computer automatically advances students to the next level. According to Getting Started, this statement is accurate only if the teacher has selected this option in the Waterford School Manager. If this option has not been selected (and it is not the default setting), students will see Play and Practice activities every day once they have completed the assigned level.

14 There were 45 kindergarten and 46 first grade treatment teacher interviews.

Supplementary Materials. The Waterford courseware comes with a range of supplementary materials. These include Traditional Tales, Readables, Power Word Readers and Review Readers, videos, and tapes and CDs of songs. Teachers were not directed to use these materials during the first year of program implementation. In fact, teachers were directed to use them only if the opportunities presented themselves but to focus primarily on having their students use the courseware on a daily basis.
Interview data revealed that 8 (18%) kindergarten teachers and 16 (35%) first grade teachers did make use of the Waterford supplementary materials during reading/language arts on a daily basis. An additional 9 (21%) kindergarten and 8 (18%) first grade teachers used them at least 1-2 times per week. The supplementary materials used most often included decodable books, cassettes, Traditional Tales, and videos.

Time Reports. The Waterford courseware allows teachers to print out a class summary report or individual student reports through the Waterford School Manager in order to check on students' progress and to determine which books to send home for additional practice. Interview data revealed that 36 (80%) kindergarten and 31 (67%) first grade teachers did look at students' time reports. While 8 (29%) kindergarten and 4 (13%) first grade teachers did so on a weekly basis, the majority of the teachers who looked at students' time reports did so once a month (46% of kindergarten and 61% of first grade teachers). Kindergarten teachers stated that they used the time reports to see how much their students had progressed and to see how often and how much time each student spent on the computer. First grade teachers indicated that they used the time reports to monitor student reading performance and to adjust the time children spent on the computer.

During their Waterford courseware training, teachers received information regarding uploading usage data to the Waterford Institute via the Internet. They were asked to upload student data on a monthly basis.
Copies of reports reflecting how well the program is being used, what percent has been completed by students, how individual students contribute to the class average, and which students need extra attention would then be sent back to each teacher, with an additional copy being mailed to the teacher's principal, the district, and the Electronic Education trainer.15 Interview data revealed that a total of 56% of kindergarten and 61% of first grade teachers had their student data uploaded to the Waterford Institute. Eighteen (42%) kindergarten and 14 (32%) first grade teachers uploaded their students' time reports themselves. Six (14%) kindergarten and 13 (30%) first grade teachers had someone else do it for them (another teacher, staff member, literacy coach, etc.). Thirty-seven percent of kindergarten and 36% of first grade teachers did not upload their data at all. Reasons given for not uploading data included not having Internet access, not knowing how, and not having the time. One teacher admitted she did not know she had to upload the time reports, and another confessed to being "lazy" and thus not uploading the usage data.

Reviewing Student Writing and Listening to Recorded Readings. In Levels One and Two, teachers can monitor students' progress by checking that students have done the writing connected to the courseware lessons. Eighteen (43%) kindergarten teachers and 17 (40%) first grade teachers said that they checked whether students had done the writing. Among kindergarten teachers who did not, the most common reason provided was that students did not do writing in Level One. Seven (16%) first grade teachers said that students did not do writing exercises on the computer. This suggests that teachers' knowledge of the program is limited; otherwise, they would have been aware that students are supposed to do limited writing on paper in Levels One and Two. First grade teachers can also listen to students' recorded readings on the Level Two courseware.
Nineteen (41%) first grade teachers indicated that they listened to students' recorded readings. Six of them did so to monitor student reading performance, and 6 of them did it to determine individual student needs and incorporate that information into instruction. Among those who did not listen to recorded readings, 8 (35%) did not know about recorded readings. Again, this points either to teachers' lack of knowledge of the courseware or to the possibility that their students were on Level One for the entire year. Others felt they needed more time and training to listen to them.

As reflected in the above data, teachers are not fully implementing the Waterford program. In the majority of classrooms, computers are not placed in optimal locations to discourage distractions. Also, most teachers' knowledge of the program features is incomplete and insufficient to meet the needs of their students. Specifically, many teachers were unaware that they could set the level of the courseware for their students or that they had the ability to determine when to advance a student from one level to the next. Not all teachers check the time reports, and even fewer check the writing or listen to their students' recorded readings. Overall, many teachers are not using the Waterford courseware fully. More importantly, the picture that is created when these data are brought together is that information related to student progress provided by the courseware is not used to make decisions about instruction.

This may be related to teacher training. While 40 (89%) of the kindergarten teachers and all but one of the 46 first grade teachers interviewed were trained to implement the courseware, half of the kindergarten teachers and the majority of first grade teachers (68%) felt that they did not receive enough training to implement the program properly.

15 This information was provided in a Waterford Institute "Sample Letter" dated September 12, 2000 as part of the Waterford Teacher Training provided by Electronic Education.
The more common complaints about the training included the following: the time provided for training was not enough; the only topic covered was the computer itself; the training assumed computer literacy; and hands-on time during training was necessary to become familiar with the content.

Research Question 2: To what extent is the Waterford courseware being used?

In order to determine the extent to which the Waterford courseware was used, we examined two sets of data: usage data generated by the Waterford Institute and data from our classroom observations. The usage data file provided by the Waterford Institute contains usage data for 36,075 kindergarten students and 39,439 first grade students in the district. This file includes the number of minutes each student spent on the computer per month; the total number of minutes spent on the computer throughout the year; the level of the program at the end of the year; and the last lesson completed.

As shown in Table 6, on average, kindergarten students spent 890 minutes and first graders spent 1277 minutes on Waterford. However, large standard deviations indicate that the number of minutes varied widely across students. These numbers suggest that the Waterford courseware was not used as recommended by the Waterford Institute.16 While it would not be realistic to expect that students use the computers every single day of instruction, based on the usage data, the average number of minutes the Waterford courseware was being used was less than one third of the ideal.

16 Waterford materials recommend that the program be used for 15 minutes daily in kindergarten and 30 minutes daily in first grade. Ideally, a kindergarten student in a 3-track school would spend 2445 minutes a year on Waterford (15 minutes X 163 days of instruction) and a kindergartener in a single-track or a 4-track school would spend 2700 minutes a year on Waterford (15 minutes X 180 days of instruction). Similarly, a first grade student in a 3-track school would spend 4890 minutes on Waterford (30 minutes X 163 days of instruction) and a first grade student in a single-track or 4-track school would spend 5400 minutes on Waterford (30 minutes X 180 days of instruction).

In order to examine any possible connections between usage data and achievement, we looked at the usage data for the students in our sample. We were able to match usage data to 707 of the 867 treatment students in our sample. On average, the students in our sample spent more time on the computer compared to the larger district sample. Table 6 summarizes the usage data for the district and for the students in our sample.

Table 6
Total Number of Minutes Using Waterford

       Sample                        District
       Kindergarten   First Grade    Kindergarten   First Grade
       (n = 362)      (n = 345)      (n = 36,075)   (n = 39,439)
Mean   1112           1463           890            1277
SD     506            925            509            884

The usage file also provided information regarding the program level on which students were working at the end of the year (see Table 7). As expected, the majority of kindergarteners were on Level One and the majority of first graders were on Level Two at the end of the school year. When comparing students in our sample to the district population, we found that a greater proportion of kindergarten students in our sample were working on Level Two (20.2%) compared to the district kindergarten population (10.8%). Also, a smaller proportion of first graders in our sample were working on Level One (16.5%) compared to the district population (28.3%). Unfortunately, we were not able to look at achievement in relation to program level because the data only reflect the level students were working on at the end of the school year. It is possible, for example, that a first grade student spent most of the year in Level One and advanced to Level Two towards the end of the year. This information would not be reflected in the usage data.
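The arithmetic behind footnote 16 and the "less than one third of the ideal" comparison can be checked with a short sketch. This is illustrative only: the function name is ours, and all figures come from footnote 16 and the district averages reported above.

```python
# Hedged sketch checking the ideal-usage arithmetic in footnote 16.
# Minutes per day and instructional days are taken from the report;
# the function and variable names are illustrative, not from the study.

def ideal_minutes(minutes_per_day, instructional_days):
    """Ideal annual Waterford minutes for one student."""
    return minutes_per_day * instructional_days

# Kindergarten: 15 minutes daily
k_3track = ideal_minutes(15, 163)    # 2445 minutes (3-track calendar)
k_single = ideal_minutes(15, 180)    # 2700 minutes (single- or 4-track)

# First grade: 30 minutes daily
g1_3track = ideal_minutes(30, 163)   # 4890 minutes
g1_single = ideal_minutes(30, 180)   # 5400 minutes

# Observed district averages relative to the single-track ideal:
k_ratio = 890 / k_single     # roughly one third of the ideal
g1_ratio = 1277 / g1_single  # well under one third of the ideal
```

Against the single-track ideal, the kindergarten district average works out to about 33% and the first grade average to about 24%, consistent with the report's characterization.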
Table 7
Program Level at the End of the School Year

              Sample                        District
              Kindergarten   First Grade    Kindergarten   First Grade
              (n = 362)      (n = 345)      (n = 36,075)   (n = 39,439)
Level One     79.6%          16.5%          89.0%          28.3%
Level Two     20.2%          73.0%          10.8%          61.6%
Level Three   0.3%           10.4%          0.2%           10.1%

The usage data had an additional limitation. The usage reflected for a student may or may not represent the actual amount of time the student spent using the courseware. This possibility arises because a student can use the courseware on behalf of another student. In other words, although the program indicates that it is John's turn, Ken goes to the computer and takes John's session. We found at least 20 documented instances in which one student's turn was taken by another student. While teachers are supposed to "mark a student absent" so that the "student will not be called to the computer" when he is not present for the day (Level One, Folder 35, Waterford School Manager), many of the teachers do not do this and send the wrong student to replace a student who is absent.

In addition to examining the usage data, we also relied on observation data to determine the extent to which the Waterford courseware was used. During the four days of observation, we captured both classroom-level and student-level data on the use of the courseware. At the individual student level, we recorded the number of minutes each student in the class used the computer. These data are not as comprehensive as the usage data captured by the computer, since they reflect usage on four days for those students who used the computers on those days. However, we did find a significant correlation (r = .606, p < .01) between the number of minutes a student used the computer over the four days of observation and the total number of minutes for the year in the usage file. In order to control for a teacher variable, we also examined the level of usage of the Waterford courseware at the classroom level.
We determined the level of classroom usage by the number of days the courseware was used over four days of observation, as well as the number of students in the class who used the courseware each day. Table 8 shows the distribution of the treatment classrooms according to level of implementation.

Table 8
Number of Classrooms per Level of Usage in Kindergarten and First Grade

Level of usage   Kindergarten (n = 53)   First Grade (n = 53)
High             16 (30%)                3 (6%)
Medium           28 (53%)                32 (60%)
Low              9 (17%)                 18 (34%)

We identified 3 high usage classrooms in first grade and 16 high usage classrooms in kindergarten. In a high usage classroom, all of the students in the class used the courseware on all 4 days, allowing for absences (at least 17 students in a classroom of 20). The difference in usage between the grades is most likely due to the different amounts of time students spend on the computer in each grade (15 minutes vs. 30 minutes).17 Table 9 shows the criteria used to identify the medium and low usage classrooms.

Table 9
Criteria to Determine Level of Usage at the Classroom Level

                       4 days    3 days    2 days    1 day
17 or more students    high      medium    medium    low
11-16 students         medium    medium    low       low
10 or fewer students   low       low       low       low

17 Each classroom had three computers dedicated to the program. On average, each classroom had 20 students. In kindergarten, students were expected to spend 15 minutes a day using the program. If 3 students use the computer at 1 time, 7 rotations of 15 minutes each are needed to accommodate all 20 students. Therefore, a total of 105 minutes during the day needs to be allotted to the use of the Waterford courseware. Similarly, first grade students were expected to spend 30 minutes a day using the program. If 3 students use the computer at 1 time, 7 rotations of 30 minutes each are needed to accommodate all 20 students. Therefore, a total of 210 minutes during the day is needed to accommodate all 20 students.

The majority of the classrooms in each grade fell under the "medium" usage category. These findings are consistent with teacher interview responses. Most of the kindergarten (87%) and most of the first grade teachers (89%) said they used the Waterford computers five days a week. Ninety-one percent of the kindergarten teachers and 54% of the first grade teachers stated that they allotted at least the recommended number of minutes per student. The majority of kindergarten teachers (78%) indicated that every student used the computer every day. In first grade, however, 67% of the teachers indicated that not every student used the computer on a daily basis. This difference between kindergarten and first grade teachers is reflected in the ratings for Waterford usage shown in Table 8. The main reasons identified by teachers for not having every student use the computer each day included shortened days, not wanting to interrupt the literacy program lesson, special events, and lack of time.

Research Question 3: To what extent are students engaged while using the Waterford courseware?

One of the most frequently mentioned advantages of the Waterford courseware is that it engages students by providing them with activities "that employ visual, auditory, and tactile learning approaches" (Getting Started). The program also meets students' learning needs at an individual level (Getting Started). Similarly, during their interviews kindergarten and first grade teachers stated that the advantages of the Waterford courseware included the visual elements (44% of first grade teachers) and the lively and fun nature of the program (39% of first grade teachers). Among kindergarten teachers, 22% pointed to the interactive nature of the program and 32% said it provided a different way for their students to learn. Thus, we examined the extent to which the courseware successfully engaged the students during their time using it.
As reflected in Tables 10 and 11, between 73% and 80% of the students were observed as being engaged or experiencing only minor distractions on any given day. Conversely, between 20% and 27% were distracted or off-task on any given day.

Table 10
Kindergarten Students' Level of Engagement

                      Fall (n = 843)                     Spring (n = 1011)
Level of engagement   Day 1 (n = 741)  Day 2 (n = 682)   Day 1 (n = 820)  Day 2 (n = 789)
Full engagement       341  46.0%       332  48.7%        362  44.1%       365  46.3%
Minor distractions    224  30.2%       200  29.3%        236  28.8%       226  28.6%
Distractions          139  18.8%       117  17.2%        184  22.4%       156  19.8%
Disengagement         37   5.0%        33   4.8%         38   4.6%        42   5.3%

Table 11
First Grade Students' Level of Engagement

                      Fall (n = 748)                     Spring (n = 870)
Level of engagement   Day 1 (n = 605)  Day 2 (n = 447)   Day 1 (n = 721)  Day 2 (n = 590)
Full engagement       287  47.4%       214  47.9%        379  52.6%       287  48.6%
Minor distractions    187  30.9%       132  29.5%        198  27.5%       167  28.3%
Distractions          113  18.7%       83   18.6%        110  15.3%       114  19.3%
Disengagement         18   3.0%        18   4.0%         34   4.8%        22   3.7%

For every day of observation, approximately half of all students observed were fully engaged while using the Waterford program.18 For example, one fully engaged kindergarten student spent her time on the courseware singing out loud, choosing to stay on the computer during recess, and tracing the letters as they appeared on the screen:

Reading the story. Prints out word whiz certificate. Sings song "Hey diddle." Her mom came to do a special art project. "Would you like to join us or continue working on the computer?" (Returns after 11 min.). Chooses to stay inside and be at computer during recess. Traces D with her finger on the mouse pad.

Another kindergarten student engages with the courseware. He sings, traces on the screen, writes, repeats after the computer, etc.:

Student is engaged, repeating after computer. Working quietly. Still engaged, quiet. Repeating after computer. Typing. Singing along with computer. Tracing letter on screen. Gets paper and pencil, writing. Tracing letters on screen. Writing. Working quietly. Gets up to open door. Working, repeating after computer. Typing (at first pressing any keys, and then backspacing and doing it correctly). Repeating after computer.

18 If no information was provided by an observer regarding a student's level of engagement, or the information was not clear as to the level of engagement of a particular student, full engagement was assumed. (A very small number of observations lacked comments regarding engagement.)

A first grade student uses the microphone to record her reading. She appears to follow the program very closely:

Maria puts earphones on. Very engaged. Works diligently. Takes the microphone and speaks into it. Prints something out. There's a chicken with eggs on the screen. She puts the printout next to her, continues to work. Prints again. Letter F, D, R on the screen. Follows the "X" with her finger on the screen. She's making letter sounds. Is very engaged!

These students remained engaged for the entire 15- or 30-minute period on the Waterford courseware.

Other students were also highly engaged but experienced minor distractions while using the Waterford courseware. Approximately 30% of the students we observed experienced minor distractions on any given day of observation. Students experiencing minor distractions followed the program but spent short periods of time looking around the room, looking at neighbors' screens, or talking to other students. One type of minor distraction occurred when students seated around the student on the computer engaged in activity that attracted the student's attention:

The student comes in and greets both Brian and John and then gets to work. He glances at what the two boys are doing. He becomes distracted by Brian and John's loud complaining. But he then gets back to work. Continues to work quietly. He writes out the letter "R." He continues to work quietly.
Another type of minor distraction occurred when the student attempted to engage a neighbor, as is demonstrated in the following:

The student starts working; he is not distracted. Points at screen and gets excited. Tries to get Jocelyn to look at his screen but when she doesn't he continues working. Looks at Jocelyn's screen a few times but otherwise on task most of the time. Gets louder as time goes on.

The type of activity on the screen created a third type of minor distraction. There were instances where the activity was not sufficiently engaging to hold the student's attention fully but was still enough to hold the student's attention for the majority of the time:

She's excited about program and calls out periodically, e.g. "It's a mummy. It's daddy's mustache." Very engrossed. When N for Newt comes on, she says earnestly, "N." Then she becomes bored and sighs when the N lingers on her screen for a while. She plays with the mouse pad.

While the majority of students were either fully engaged or experienced infrequent and short-lived distractions, approximately 20% of the students observed were distracted while working on the Waterford program. One type of distraction observed occurred when a student would be distracted by his neighbor. For example:

The student talks to Mel. He moves his chair closer to Mel. Looks over and participates in Mel's lessons instead of his own. Finally singing alphabet song but still talking to Mel. Takes off headphones. Mel tells him to stop it, he wants to learn. He is on task. Puts headphones back on. He says out loud that he wants to go to sleep. He takes off headphones. Puts headphones on again. Tells Robert to look again at him opening disk. He tells [the observer] there are three goats on the computer screen. He clicks on Robert's mouse, exchanges his headphone. He is singing a nonsense song.

Another type of distraction was created by activities underway in the larger classroom.
For example:

Maria begins working, looks at Alex's screen and around room. Martin tells her to look at his screen, she does. Gets out of seat and walks to table to watch students read. Teacher tells her to go back. She doesn't. Grabs paper and returns, writes letters, but sits on floor using chair as desk. Sits in chair talking to Lenny. Writes on floor again, singing songs to Lenny. Gets out of chair and walks around classroom.

Finally, other students experienced general distractions. An example of a student in this category is provided in the following observation:

The student sits down and starts working. He is singing to ABC song. He is bouncing in seat, singing and looking at class, pounding on desk. He clicks on the mouse but still bouncing and looking around, not engaged. He goes to printer, stands there for a few minutes talking to Don. He is banging on the mouse, goes to printer. Still at printer waiting for something to print. He goes back to seat; he's writing letters out pounding on mouse also. He's at the printer again. Writing letter "o," standing at desk. The computer paused. He is back on task, engaged.

A very small percentage of students working on Waterford were minimally engaged or completely off-task. These students spent the majority of their time on the courseware engaged in a variety of activities, all of which were unrelated to the courseware. For example, some students were unable to stay in their seats:

He sits down and complains he can't hear. Teacher adjusts volume for him. He starts working but after two minutes turns around, stares into space. Goes back to work for a minute. He is sitting back in his chair, clicks mouse a few times but then sits back again. He points at Jose's computer. He is still not working besides an occasional click at the mouse. He goes to the bathroom for 15 minutes. After they go to the bathroom he sits down, puts headphones over his mouth. Puts them back on but does no work. He clicks absentmindedly at mouse.
Another type of behavior observed was a student spending the entire session talking to students either in the classroom or on a neighboring computer. For example, one student spent her time as follows:

She is talking to Joy. Fighting with Joy. Made up. Talking to Joy about what's on her screen. Headphones off. Not watching. Recess. Participating in lesson. Not looking at computer. Watching Joy's screen. Talking to Joy re: her screen. Talking but not about lesson. Waiting on computer screen.

Teachers' perceptions are consistent with our finding that the majority of the students were engaged while working on the Waterford program. Eighty percent of kindergarten teachers and 74% of first grade teachers believed that Waterford holds students' attention for the whole period. At the same time, however, 27 (60%) kindergarten teachers and 34 (76%) first grade teachers stated that they were regularly interrupted by students on the computer. Observation data show that students who were distracted or off-task were often disruptive to the class and the teacher. In some cases, even students who were engaged could be disruptive, as illustrated in this example:

Jacinta watches photos of city and country. Distracted by Nikki's singing. The City Mouse and the Country Mouse, long story. She pays close attention to screen the whole time. Now follow-up activities for story. She says "Games!" Word puzzle, parrot picture underneath. Keeps saying "Awesome" as picture is revealed. Distracts Nikki and another student. Likes the games and gets excited. Sings aloud. Distracts Alan and another girl. "ed" on screen. Jacinta is singing to a song—loud. Teacher says to her "Shhh." Jacinta is very into songs. Aide and Teacher tell her "Shhh." "Whale, wh, wh," song. Sings aloud. Jacinta ignores Nikki—sings and sings.

We also noticed that some students who were engaged or experienced minor distractions did not always utilize the program fully.
Many of them used a "pick and choose" approach to the courseware and engaged in a variety of activities. Some students skipped certain parts of the program, usually the reading. For example:

The student sings along to ABC song letter by letter. He sits sideways even though he is focused on program. There is a song about Janice but he doesn't sing along. He progresses smoothly. He doesn't read along with the picture book even though the words flash. He just bypasses the entire book. When gumdrops come on he says, "this is like the same one I did!" He traces on the screen. When 'Stop' comes on he gets impatient and says, "I hope it will be over!"

In another instance:

Sarah flips through story quickly without looking at the sentences. When it comes to singing, Sarah sings. Enjoys interactive lessons like "nouns" activity. Sees her name "Sarah" on her story. Tells Karina to look. Back to work. Continues to select letters to spell a word.

Another student watched the program, and appeared engaged, but did not trace the letter as it appeared on the screen:

The student is on task. On task, absorbed. Not tracing 'T,' impatiently clicking 'Next.' Tells AM Teacher, "I can tie shoes." (AM Teacher is tying another student's shoe.) She is on task, but watches the class some and comments on what they're doing.

In another example, the student read, but did not record when prompted to do so by the courseware:

Selena is reading Lizzy Bee aloud. She is reading Fuzzy Lizz. It is time for recorded reading but she is not saying anything.

For a small portion of the students, time spent in front of the computer was time spent disengaged from any instructional activities. Another small percentage of students experienced enough of a distraction to divert their attention from a significant portion of their Waterford lesson. And for a few, engagement meant engaging only during the activities that most appealed to them.
But overall, the majority of students were engaged or experienced inconsequential distractions while using the program courseware. In fact, when asked, teachers felt that the Waterford courseware did a better job of holding the attention of those students who had the most difficulty staying engaged. Eighty-eight percent of kindergarten teachers and 93% of first grade teachers stated that the Waterford program engaged students who normally have difficulty staying engaged.

Level of Engagement During Open Court Reading Instruction

In order to determine the extent to which student engagement during the use of the Waterford program compares to engagement during primary reading program instruction, we captured the names of students who were consistently off-task during Open Court instruction.19 Forty-one (10%) kindergarten and 71 (16%) first grade students in our sample were off-task during Open Court instruction. Table 12 shows the number and percentage of students who were disengaged during each program and during both programs. In kindergarten, approximately 31 (7%) students were disengaged during Open Court instruction but not while using the courseware, and in first grade, approximately 50 (11%) students in our sample were disengaged during Open Court activities but not while using the courseware.

19 It is possible, however, that the number of students captured by the observers underestimates the number of students who were disengaged during whole-group instruction. It was more difficult to pay attention to all students on the rug at the same time, or to associate a student's face with his or her name, than when the student was sitting in front of the computer for an extended period of time.

Table 12
Number of Students Disengaged During Each Program20

Program      Kindergarten (n = 421)   First Grade (n = 446)
Waterford    84 (20%)                 77 (17%)
Open Court   41 (10%)                 71 (16%)
Both         10 (2%)                  21 (5%)

20 For the percentage of students disengaged on Waterford, we averaged students' level of engagement over the four days of observation. Students who averaged 2.5 or below were determined to be disengaged. For the percentage of students disengaged during Open Court instruction, we included any student identified by the observer as being consistently disengaged during the observation in either fall or spring.

Overall, student level of engagement during usage of the Waterford courseware was comparable to student level of engagement during Open Court instruction. Some students, however, were more engaged during their time spent on the Waterford program and others were more engaged during Open Court activities.

English Language Learners and Engagement

Some of the classroom observations revealed examples of students who were using the Waterford program but who were clearly not understanding the instructions on the screen. In some cases, they sought out assistance from fellow classmates, the teacher, or the teacher's aide. In other cases, the students appeared to become disengaged when they could not understand what to do. As a result of these data, we decided to investigate the relationship between student level of engagement and English language proficiency. We found that, on average, the proportion of ELL students distracted or off-task was greater than the proportion of EO students who were distracted or off-task. Similarly, within the ELL population, the proportion of ELD 1-2 students who were distracted or off-task was greater than the proportion of ELD 3-4 students who were distracted or off-task (see Tables 13 and 14).
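The disengagement rule described in footnote 20 (a 1-4 engagement rating averaged over the four observation days, with averages at or below 2.5 counted as disengaged) can be sketched as follows. The function name and the sample ratings are illustrative, not actual study data.

```python
# Hedged sketch of the disengagement rule in footnote 20. The scale is
# taken from the report (4 = fully engaged, 3 = minor distractions,
# 2 = distracted, 1 = off-task); names and sample ratings are invented.

def is_disengaged(daily_ratings, cutoff=2.5):
    """A student counts as disengaged if the mean rating over the
    observed days is at or below the cutoff."""
    return sum(daily_ratings) / len(daily_ratings) <= cutoff

# A student rated "distracted" on most days falls at or below the cutoff:
print(is_disengaged([2, 2, 3, 2]))  # True  (mean 2.25)

# Mostly minor distractions keeps a student above the cutoff:
print(is_disengaged([3, 4, 3, 3]))  # False (mean 3.25)
```

Note that under this rule a student alternating between full engagement and being off-task (e.g., ratings 4, 1, 4, 1, mean 2.5) would also be classified as disengaged, since the cutoff is inclusive.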
Table 13
Kindergarten Students' Level of Engagement During Waterford by English Language Proficiency

Group                 Distracted or off-task21
ALL (n = 400)         21.0%
EO (n = 86)           14.0%
ELL (n = 267)         22.8%
ELD 3-4 (n = 27)      14.8%
ELD 1-2 (n = 240)     23.8%

Table 14
First Grade Students' Level of Engagement During Waterford by English Language Proficiency

Group                 Distracted or off-task
ALL (n = 390)         19.7%
EO (n = 97)           14.4%
ELL (n = 245)         22.0%
ELD 3-4 (n = 50)      16.0%
ELD 1-2 (n = 196)     23.5%

The same relationship between English language proficiency and level of engagement was evident during Open Court time. Of those students who were identified as disengaged, the majority were ELL students, and within the ELL students, the majority were ELD 1-2 students (see Table 15).

21 A student was determined to be distracted or off-task if her average over the four days of observation was less than or equal to 2.5, based on the following scale: 4 = fully engaged, 3 = minor distractions, 2 = distracted, 1 = off-task.

Table 15
Students Disengaged During Open Court Time by English Language Proficiency

Group        Kindergarten (41 students)    First Grade (71 students)
EO           17.9%                         27.1%
ELL          71.8%                         62.9%
ELD 3-4      4.9%                          6.8%
ELD 1-2      58.5%                         93.2%

These percentages suggest that students with lower English language proficiency have a more difficult time staying engaged. As shown below, this disparity between students of varying levels of English proficiency was reflected in their gains on the WRMT-R and SAT/9 achievement scores.

Classroom Level of Engagement

As previously mentioned, of all the students exposed to the Waterford courseware on any given day, approximately one quarter were distracted or off-task. However, this does not mean that a quarter of the students in each class were distracted or disengaged. In some classrooms, none of the students were distracted or disengaged; in other classrooms, the majority of the students were off-task.
In order to better understand level of engagement at the classroom level, we looked at the proportion of students engaged and off-task in each class. In a classroom with a "high" level of engagement, all of the students who used the Waterford program were either engaged or experienced only minor distractions. In a classroom with "medium" engagement, more than half, but not all, of the students using the program were engaged or experienced only minor distractions. In a classroom with "low" engagement, fewer than half of the students using the Waterford program were engaged or experienced only minor distractions—that is, more than half of the students were distracted or completely off-task. Table 16 shows the number of classrooms with high, medium, and low levels of engagement for each day of observation and for each grade.

Table 16
Classroom Level of Engagement by Grade and Day of Observation

Kindergarten (53 classrooms)
          Fall Day 1    Fall Day 2    Spring Day 1    Spring Day 2
          (n = 47)      (n = 45)      (n = 53)        (n = 51)
High      6 (13%)       7 (16%)       10 (19%)        6 (12%)
Medium    33 (70%)      34 (76%)      37 (70%)        39 (76%)
Low       8 (17%)       4 (9%)        6 (11%)         6 (12%)

First Grade (53 classrooms)
          Fall Day 1    Fall Day 2    Spring Day 1    Spring Day 2
          (n = 46)      (n = 39)      (n = 48)        (n = 44)
High      15 (33%)      12 (31%)      13 (27%)        11 (25%)
Medium    28 (61%)      21 (54%)      32 (67%)        28 (64%)
Low       3 (6%)        6 (15%)       3 (6%)          5 (11%)

In kindergarten, the level of engagement was high in 12% to 19% of the classrooms, whereas in first grade, the level of engagement was high in 25% to 33% of the classrooms. It appears that, in general, kindergarten students had a more difficult time staying engaged than first grade students. With some exceptions, the classroom level of engagement in both grades was medium, with more than half of the students on-task.

Research Question 4
What is the quality of pedagogy during reading/language arts time in treatment and control classrooms?
We examined the quality of pedagogy used by both kindergarten and first grade teachers using Open Court within the treatment and control groups in order to: 1) determine whether teacher pedagogy differed in classrooms where the Waterford courseware was present and where it was absent; and 2) disentangle the effect of teacher pedagogy from that of the Waterford courseware in treatment classrooms.22

Kindergarten

As mentioned above, the focus of the Level One Waterford courseware is on preparing students for beginning reading instruction by teaching print concepts, phonological awareness, and letter recognition. Thus, for our analysis of teacher pedagogy, we focused on the areas of Open Court in which these same activities are presented. The Sounds and Letters component of the Open Court curriculum covers letter recognition, letter names, letter shapes, how the alphabet works, and phonemic awareness. We created a five-point scale, ranging from high to low, to reflect the quality of pedagogy used by each teacher in our sample during time spent on activities from the Sounds and Letters component.

As Table 17 reflects, more of our treatment classrooms fell into the medium range in both fall (51%) and spring (40%) than into any other single category. Only 19% of classrooms in the fall and 15% of classrooms in the spring were found to have medium-high to high quality pedagogy. The same was true for the control group. In the fall, 51% of control classrooms were rated as reflecting medium quality pedagogy, and in the spring, 54% of the classrooms fell into this category. On the other hand, only 20% of control classrooms in both the fall and the spring were rated as reflecting either medium-high or high quality pedagogy.

22 We excluded classrooms using Success for All or Cuenta Mundos from this analysis because of the small number of classrooms using these programs and because the programs do not lend themselves as easily to this type of analysis.
Table 17
Quality of Open Court Pedagogy in Kindergarten Classrooms

Quality of OC    Treatment (47 classrooms)    Control (35 classrooms)
Pedagogy         Fall         Spring          Fall         Spring
High             3 (6%)       3 (6%)          2 (6%)       2 (6%)
Medium-high      6 (13%)      4 (9%)          5 (14%)      5 (14%)
Medium           24 (51%)     19 (40%)        18 (51%)     19 (54%)
Medium-low       8 (17%)      10 (21%)        5 (14%)      6 (17%)
Low              6 (13%)      11 (23%)        5 (13%)      3 (9%)

In a high quality pedagogy classroom, teachers were observed teaching the skills presented within the teacher's manual with a high degree of fidelity to Open Court on both days of the observation. High fidelity requires both that the teacher cover a high proportion of the activities set forth in the teacher's manual and that she do so using the same techniques, and with the same quality, as presented in the manual.23 For example, one Sounds and Letters lesson includes the reading of a song, "Bluebird, Bluebird"; an oral blending portion focusing on initial consonant replacement; a segmentation portion focusing on initial consonant sounds and using the Leo the Lion puppet; an introduction of the sound Jj; listening for the initial /j/ sound; playing a game, I'm Thinking of Something That Starts with ____; linking the sound to the letter through a word pair exercise; and writing Jj. In one of the few classrooms where high quality pedagogy was observed, the teacher engaged the students in listening to the song "Bluebird, Bluebird," conducted a segmentation lesson using Leo the Lion, and progressed through the remaining activities identified above. More importantly, she did so with a high degree of fidelity to the actual content of the lessons as they were presented in the teacher's manual.

23 For this analysis, full implementation of the Open Court curriculum was considered to be the equivalent of high quality instruction. A teacher could receive a high rating even if she did not use the specific materials associated with Open Court, as long as the teacher conducted the same types of activities using the same quality of instructional delivery. Unfortunately, there were no instances where a teacher did not use Open Court materials but did use the same instructional strategies with a high degree of instructional quality.
For example, the teacher's manual instructs the teacher to introduce the Jj sound as follows:

§ Display the Jj Alphabet Card and say the sound of the letter, /j/. Show the picture for the /j/ sound and teach the short poem for /j/:
Jenny and Jackson like to have fun.
They play jacks, jump rope, and juggle in the sun.
Each time they jump, their feet hit the ground.
/j/ /j/ /j/ /j/ /j/ is the jumping-rope sound.
§ Repeat the poem, emphasizing the initial /j/.

The teacher conducted this portion of the lesson as follows:

T: J is the magic letter. We are going to learn the poem for the letter J. All right. Her name, her name is Jenny. I have a friend named Jenny!
Ss: Jennifer!
The teacher demonstrates juggling.
T: I tried to juggle at home with oranges. But you know what, they got squashed.
The teacher introduces the /j/ sound.
T: Softly /j/ . . . /j/ . . . /j/ . . . Let's see if I get this sound right.
The teacher reads the poem to the students and then turns on a tape of a song that says "the J sound goes /j/ /j/ /j/."

Next, the lesson directs the teacher to have the students listen for the initial /j/ sound:

§ Hold up and name each of these Picture Cards: jam, jar, judge, jeans, juice, and jellyfish. Ask the children to listen for the /j/ sound at the beginning of the words.
§ Give each child a Jj Letter Card.
§ Have the children hold up their Jj cards each time they hear the /j/ sound at the beginning of the words. Try these words: green, jail, jeans, Gail, Jake, jam, Jim, gas, Jill

During her lesson, this teacher conducted this portion of the lesson as follows:

She now introduces the Jj card pictures.
She holds up cards with the following words: jar, judge, juice, jellyfish, and Jello. The teacher describes each word and gives a Spanish translation. The teacher now passes out Jj letter cards.
T: Is it a fan?
Ss: No.
T: Do you put it in your nose? I'm gonna say some words and if they begin with the /j/ sound, you lift it.
T: Green.
Some students lift their cards.
T: Did it say jreen?
The teacher goes through additional words, repeating the same pattern.

In the next section, the teacher was directed to play the "I'm Thinking of Something That Starts with ____" game. According to the teacher's manual, the teacher should:

§ Play the I'm Thinking of Something That Starts with ____ game, using words that begin with /j/. Choose objects that are outside of the room but give the children some clues to what you are thinking of. You might try the following objects and clues:
Something you drink in the morning (juice)
Something you put on your toast (jam or jelly)
The sound bells make (jingle)
§ If you have children in your class whose names begin with /j/, you might want to use their names in the game.

In this classroom, the teacher conducted the activity as follows:

T: Now for my magic bag. I'm thinking of something that I drink in the morning.
S: Orange juice.
T: Yes, it can be apple juice, grapefruit juice, or prune juice because one day I'll be old and need prune juice.
The teacher now asks students to identify j words she pulls from her bag. She takes out a box of Jello and spells out Jello in a humorous way.
Ss: Gelatina! [gelatin]
T: I'm going to put the container like this so you can see the letter j in Jello.
T: Jelly beans and the company that makes them are called Jolly Ranchers. Jolly means happy. He is a happy man.
The teacher takes out a jump rope.
T: This is a jump rope. I trained with the best. The rope needs to jump. I think I know what to do. But you have to help. Every time the rope touches the ground you have to go /j/.
The teacher starts jumping rope. Students say jah.
T: Not jah. It's /j/, softly. I'm gonna do it one more time, but my heart . . . I think that's about it. Go back to your usual spots on the rug.
The teacher approaches the students individually. Each student says j.
T: I could have had a cardiac arrest today, but it would have been worth it. You learned it!

Finally, the teacher's manual directs the teacher to have the students participate in a linking the sound to the letter activity. For this activity, the teacher is directed to write the words jacket and packet on the chalkboard and ask, "Which word says jacket?" The teacher is then to have a child come to the chalkboard and point to the correct word, and to continue as follows:

§ When the child points to jacket, say "Right! The /j/ sound begins jacket."
§ Then point to packet and ask the children what they think it says. Throughout the activity, always say the word with initial /j/. Then ask children what they think the other words say.
§ Try with these word pairs: join coin, Jill pill, Jake cake, jam ham, jingle mingle, June tune

Again, the teacher followed the teacher's manual with a high degree of fidelity:

She writes the words jacket and packet on the board.
T: I want someone to come up to the board and circle the word that says jacket.
A student does so.
T: If this word says jacket, the other word says packet.
The teacher then moves to the next portion of the activity.
T: This is the name of a girl. Her name is Jill. It's with a . . . I can't say, but you know how to spell her name.
Ss: Jill.
T: His name is Jake. If his name is Jake, this word is . . .
Ss: Cake.
T: [Gives students the word jingle and asks them for a rhyme.] Do you know what mingle means? If the vice principal goes to a party and she has to meet and talk to people, she mingles.
In both the treatment and control classrooms, very few teachers were observed engaging in activities with the degree of fidelity required to meet these criteria. Overall, only 3 treatment and 2 control teachers demonstrated practice consistent with high quality pedagogy.

In a medium-high classroom, the teacher was either observed following less of the lesson presented in the teacher's manual on both days of observation, doing so with significant fidelity but less than that found in a high quality pedagogy classroom, or she presented a high quality lesson on only one of the two days of observation, with less fidelity on the second day. An example of a medium-high quality classroom is depicted below:

T: Who do we have to wake up for oral blending?
Ss: Leo!
The teacher asks the students to stand up so they can "shake their sillies out." The students sing and jump to a recorded tape. The tape ends.
T: Do it [sing] one more time by yourself.
Students follow the teacher's directions.
T: All right, oral blending. What sound does the letter P make?
Ss: /P/ /p/ /p/ popcorn.
Leo the Lion puppet says, "Good afternoon boys and girls. It's already afternoon."
The teacher begins the oral blending activity and defines some of the words for the students:
Leo: Damp – Cuando algo esta mojado es [when something is wet, it is] damp. Ramp, lap – what do I have on my lap?
S: The book.
Leo: Thank you, Gerardo.

This exercise is presented in the teacher's manual as follows:

§ Blend word parts for words ending in the /p/ sound, using techniques from earlier lessons.
Puppet: antelo
Teacher: /p/ What's the word?
Everyone: antelope
Continue with the following words:
slo . . . /p/, ho . . . /p/, mo . . . /p/, ti . . . /p/, ri . . . /p/, lam . . . /p/, dam . . . /p/, ram . . . /p/, sou . . . /p/, hoo . . . /p/, soa . . . /p/, microsco . . . /p/

Next, the teacher skips the reviewing activity provided in the teacher's manual and moves to the listening activity.
The teacher's manual suggests that the teacher give each child one Pp, one Mm, and one Dd Letter Card and have the children put the Dd card aside. The teacher is to explain that she will say a word and that the children should repeat it; on her signal, they should hold up the Pp Letter Card if the word ends with /p/ or the Mm card if the word ends with /m/. The manual then provides the words the teacher should use and asks the teacher to repeat the activity for words ending with /d/ and /p/. The teacher conducted this activity as follows:

The teacher hands out letters M and P on flashcards.
T: Put the cards down. You are looking for the ending sounds, M or P. Here's another one. Broom – broom. M for Muzzy the monkey. Can you say the word cap?
Students answer.
T: Oh good, John knew the answer right away. Let's say another word . . . cap.
Ss: M, Muzzy the monkey.
T: Say the word. Do you hear an M?
The students then change to P.
T: Can you just pass up the Ms for Muzzy the monkey? Now pass up the Ps. If you pass them up it's faster for me. If I go in between the rows to pick each up, it's not as fast.

For this exercise, the teacher did not use many of the words presented in the teacher's manual and substituted some of them with words that were not in the manual. The teacher engages in the same type of instruction for the subsequent activity: she again skips part of the activities presented in the manual and presents a similar activity, but not with the highest level of fidelity to the material in the manual. She repeats this pattern over the course of the two days of observation. There were 9 treatment and 8 control teachers who provided medium-high quality pedagogy for at least one of the two rounds of observation.

Again, as mentioned above, medium quality instruction was the most prevalent type of instruction found in both the treatment and control classrooms.
Medium quality pedagogy is defined as following portions of the Sounds and Letters section with only partial implementation of the activities over the course of the two-day observation. Thus, if a teacher engaged in some combination of the Sounds and Letters activities on both days of the observation with proficiency but excluded important aspects of the lesson, the teacher would receive a rating of medium. An example of medium quality pedagogy is presented below. Here, the teacher's manual presents a range of activities beginning with a sounds and letters activity, moving to a rhyming lesson, then to an oral blending activity, then to the review of sounds, and finally to linking the sound to the letter. In a medium quality classroom, the teacher will engage in some but not all of these activities and will do so with an average amount of fidelity to the lesson directions. For example, medium fidelity might look as follows:

T: Ok. Look at this picture Kevin. Turn the page. Look at this picture. What does this look like . . . Shawn?
S: A bowl.
T: And Mark L?
S: A spoon.
T: And what do you do with this?
Ss: You cook.
T: When do you wear these? (The teacher points to page 19.)
S: When you're making a cake.
S: When you're washing dishes.
T: They're called oven mitts, for when you bake. Hmm. What are these?
Ss: Cakes.
T: What's your favorite kind?
S: Banana with ice cream.
T: Mark R., what are these?
S: Candles and flowers.
T: Yes, good. To do what with them?
S: Put on your birthday cake.
S: Can you smell them?
T: Probably not. Ok, someone who hasn't had a turn. Leslie, do you know what these are?
S: Whistles.
T: And when do you use them?
S: For your party to make noise.
The teacher continues to ask questions in this pattern. She then assigns groups to tables and/or centers and collects the packets.
In this lesson, the teacher was directed to proceed as follows:

§ Assemble and distribute First-Step Story 6, which depicts the events of a birthday party. Have the children browse through the book page by page. When everyone has been through the book, ask them what they think the story is about.
§ After they have identified that the general theme of the book is a birthday party, take a closer look at each page. Ask the children to look at the first page – how does this scene compare to the scene on the last page?
§ Ask individual children to describe what they think is happening or what is pictured on each page. Print their ideas on chart paper. Allow as many children as possible to contribute. Encourage children to use these ideas to think about a story they would like to write.
§ If you have children in your class from different cultures and countries, have them talk about birthdays and the customs for celebrating in their native countries.

The differences between the lesson presented by the teacher and the one outlined in the teacher's manual are obvious and important. The teacher did not ask the students to browse the story or identify a theme. The only component clearly reflected was the teacher having the students work their way through the story page by page, identifying what is pictured on each page. The teacher did not follow any of the second half of the activity: he did not print any of the students' ideas on chart paper or ask them to think about a story they would like to write. Finally, he did not have the students talk about birthdays and customs from their countries. Overall, he partially implemented this piece of the lesson, and for the rest of the observation he did not engage in the rhyming lesson or the oral blending activity. Similarly, he only partially implemented the sound reviewing activities and the linking the sound to the letter activities. His practice on the second day of observation mirrored that of the first.
This pattern of practice ran throughout the classrooms reflecting medium quality pedagogy.

In the fall, 17% of treatment and 14% of control teachers provided instruction of a medium-low quality. Similarly, in the spring, 21% of treatment and 17% of control teachers demonstrated medium-low quality pedagogy. In these classrooms, teachers engaged in very few of the activities presented in the teacher's manual on one or both days of observation, and/or did so with very low fidelity to the recommendations presented in the lessons, or had such a small proportion of their students engaged in the activity that the quality of the activity was irrelevant. For example, in one medium-low classroom, the teacher was supposed to cover activities beginning with sounds and letters, moving to oral blending, then to sounds in words, next to letter names, then to letter shapes, and finally to reading a pre-decodable book. In this classroom, the teacher skipped both the sounds and letters and the oral blending activities and began with the sounds in words activity. More importantly, during her implementation of this lesson, a large portion of the class was off-task and not paying attention to the teacher, nor was the teacher paying attention to their behavior. The activity looked as follows:

T: What's this sound, "Bb," and then we put it together with this vowel "Ee" and we have "bee."
The observer notes that the teacher is "basically talking to herself. Very few kids are paying attention."
T: Ok, ready. Next word.
The teacher thumbs through the cards.
T: Ok, now say this sound, word "not."
Jesus is rocking back and forth saying, "teacher, teacher." The teacher ignores him the first five times he says it; then she responds, "Jesus, you need to switch with Alfred." Jesus does not listen.
T: Switch now.
To another boy the teacher says, "you don't bite things, nothing goes in your mouth that is not food."
The teacher continues her lesson.
She has a girl in pink come up to the front to show her "Ll" sound card. Meanwhile, most of the kids are talking to each other, calling out "teacher," rocking back and forth on the rug, lying down, etc. The teacher does nothing except say, "you need to pay attention and sit up straight." The kids do not follow her directions.

This excerpt reflects the remainder of her lesson implementation and the type of instructional quality that makes up the medium-low rating.

In order for a teacher to receive a rating of "low," she must have done very little or none of the activities presented in the Sounds and Letters section. Here, a teacher may have used outside materials not present in Open Court or may have moved directly into reading-related activities without spending any time on the Sounds and Letters section on either or both of the two days of observation. Thirteen percent of both the treatment and control teachers in the fall were considered to exhibit low quality instruction, and 23% of treatment and 9% of control teachers provided no more than low quality pedagogy to their students in the spring.

Distribution of Time Spent on the Open Court Sounds and Letters Component

In addition to examining the quality of pedagogy used by kindergarten teachers, we also examined the amount of time teachers spent on the subject matter contained within the Sounds and Letters component of Open Court. We did this in order to examine the possibility that time spent using the Waterford courseware was being offset by time spent on these same activities in Open Court. The teacher's manuals recommend that teachers spend between 30 and 50 minutes on the activities contained within this section, including Sounds, Letters, and Language; Phonemic Awareness; Letter Recognition; How the Alphabet Works; and Phonics.
Teachers are to spend approximately 10 minutes on activities in the Sounds, Letters, and Language section,24 which include the reading of a poem or the singing of a song. The next section is Phonemic Awareness; here it is recommended that teachers spend approximately 15 minutes on a range of activities. At the beginning of the school year, this section includes listening for sounds and feeling the rhythm; as the year progresses, these activities are replaced by oral blending and eventually segmentation activities. At the beginning of the year, the next section students spend time on is Letter Recognition, where teachers are directed to spend approximately 15 minutes on letter names and letter shapes. As teachers move into Book B, How the Alphabet Works is added, and students spend approximately 10 minutes on this section, a figure that later increases to between 15 and 20 minutes. These activities generally include introducing a letter and sound, reviewing a letter and sound, listening for a sound, linking sounds to letters, the Reading and Writing Workbook activities, and reading pre-decodable or decodable books. Before the end of Book B, the Letter Recognition section is dropped from the lesson. In the last lesson in Book E, the Sounds and Letters section shifts one final time: by the end of the year, it is comprised of Sounds, Letters, and Language (10 minutes) and Phonics (20 minutes).

Of the 47 treatment and 35 control kindergarten teachers using Open Court, all spent some portion of the fall and spring observations engaged in activities from the Sounds and Letters component. The amount of time spent varied across sections and days. Table 18 shows the number of teachers who spent less than 30 minutes, between 30 and 50 minutes, and more than 50 minutes a day on this section.
Only 8 (17%) treatment teachers in the fall and 10 (21%) treatment teachers in the spring spent less than 30 minutes on the Sounds and Letters section on two consecutive days of observation, and no treatment teacher spent less than 30 minutes on these activities across all four days observed in fall and spring. For control teachers, only 3 (8%) in the fall and 1 (3%) in the spring spent less than 30 minutes on these activities on the two days of observation. Similarly, no control teacher spent less than 30 minutes on these activities for both the two fall and two spring days observed.

24 Open Court 2000 Level K contains five books – Book A, Book B, Book C, Book D, and Book E. Teachers follow a pacing plan based on their calendar. The pacing plan sets forth how much time teachers should spend on each unit in each book for the school year.

Table 18
Time Spent on Sounds and Letters Component

Time Spent on Both Days    Treatment                Control
of Observation             Fall        Spring       Fall        Spring
Less than 30 min           8 (17%)     10 (21%)     3 (8%)      1 (3%)
Between 30-50 min          5 (11%)     10 (21%)     2 (6%)      7 (20%)
More than 50 min           8 (17%)     2 (4%)       10 (29%)    7 (20%)
Combination                26 (55%)    25 (53%)     20 (57%)    20 (57%)

The district policy for use of the Waterford courseware is that it should be used during the reading/language arts portion of the day. At the same time, the policy also established that the courseware should not be used during the Sounds and Letters component of Open Court. In order for all students to use the courseware, approximately 105 minutes of the kindergarten day need to be dedicated to computer use, yet reading/language arts in kindergarten should last only 90 minutes. Of that time, as much as 50 minutes will be dedicated to the Sounds and Letters component on any given day.
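A rough calculation, using only the figures above, makes the scheduling conflict explicit. This is an illustrative sketch; the variable names are ours, and the figures are the district's, as cited above.

```python
# Illustrative arithmetic for the scheduling conflict described above.
courseware_minutes_needed = 105   # time required for every student to have a turn
reading_block_minutes = 90        # length of the kindergarten reading/language arts block
sounds_and_letters_minutes = 50   # upper bound for the Sounds and Letters component

# Time left in the reading block once Sounds and Letters is excluded:
minutes_available = reading_block_minutes - sounds_and_letters_minutes
print(minutes_available)                               # 40

# Shortfall if the courseware may run only outside Sounds and Letters:
print(courseware_minutes_needed - minutes_available)   # 65
```

Even on days when Sounds and Letters ran at its 50-minute maximum, at most 40 minutes of the reading block remained for the roughly 105 minutes of computer rotation required, a shortfall of about 65 minutes.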
If the courseware is to be used primarily during the reading/language arts time, either some portion of the Sounds and Letters section will be spent with students on the courseware, or not every student will have a turn at the computer every day. The data from treatment teachers reveal that both of these things occurred. There were classes in which students spent time on the courseware during the Sounds and Letters portion of the lesson, and there were classes in which not every student had an opportunity to use the courseware because the teacher did not begin using it until after she had completed the Sounds and Letters activities for the day. In fact, interviews of our treatment teachers revealed that only 25% of treatment teachers had changed their Open Court instruction as a result of the use of the Waterford courseware. More importantly, only 1 teacher (9%) stated that she paused the courseware during the reading and writing portion of Open Court. Other teachers indicated they repeated lessons twice so that students using the computers did not miss any of the Open Court instruction time. These behaviors were clearly inconsistent with district expectations of how the two programs would be used in concert with each other.

First Grade

The Waterford Level Two courseware teaches beginning reading, which includes letter sounds, word recognition, and beginning reading comprehension. Thus, for our analysis of teacher pedagogy, we focused on the areas of Open Court in which many of these same activities are presented. Here we focused on some of the phonics activities in which blending was practiced and high-frequency words were reviewed (reading a decodable book),25 and those in which reading comprehension strategies and skills were introduced and practiced (reading the Big Books or Anthology). Again, a five-point scale was constructed to reflect high to low quality pedagogy. Table 19 provides a breakdown of treatment and control classrooms by fall and spring.
As depicted in the table, there were no classrooms in which the teachers in either the treatment or the control group demonstrated high quality pedagogy. In fact, teacher pedagogy was consistently rated as medium to medium-low within both groups.

Table 19
Quality of Pedagogy in Open Court First Grade Classrooms

Quality of OC    Treatment (43 classrooms)    Control (38 classrooms)
Pedagogy         Fall         Spring          Fall         Spring
High             0 (0%)       0 (0%)          0 (0%)       0 (0%)
Medium-high      1 (2%)       2 (5%)          1 (3%)       4 (11%)
Medium           9 (21%)      17 (40%)        5 (13%)      17 (45%)
Medium-low       18 (42%)     16 (37%)        25 (66%)     14 (37%)
Low              15 (35%)     8 (19%)         7 (18%)      3 (8%)

For a teacher to have received a rating of high, she would have had to have her students fully engaged in activities involving the use of 1) the decodable books when called for by the teacher's manual, and 2) selections from the Big Book or stories from the Anthology, on both days of observation. In practice, this means that a teacher would have to use a decodable book as outlined by the teacher's manual, no more and no less. For example, for the decodable story Steve's Secret, the teacher's manual provides the following information:

25 According to the Open Court Teacher's Manual, the decodable books in Levels K-3 "are designed to help students review and reinforce their expanding knowledge of sound/spelling correspondences. . . . Each story supports instruction in new phonic elements and incorporates elements and words that have been learned earlier. . . . At Level 1, Decodable Books help children build fluency as they apply their growing knowledge of phonics."

High-Frequency Words
There are no high-frequency words introduced in this book. The high-frequency words reviewed are had, a, it, in, what, is, an, said, for, my, not, do, ask, the, with, that, and here.

Reading Recommendations
§ Call on a volunteer to read the title.
§ Call on different children to read each page of the story aloud.
Feel free to stop and discuss with the class anything of interest on a page; then have another child reread the page before going on. Help the children blend any words they have difficulty with or remind them of the pronunciation of high-frequency words.
§ Invite the class to reread the story for a third time, choosing new readers for each page.

Responding
§ Ask the children whether they had any questions about what they just read, such as problems with recognizing high-frequency words, blending, or understanding vocabulary. Allow the class to try to clear up the problem areas.
§ To determine whether the children are focusing on the words in the story rather than simply on the pictures, have them answer questions such as the following by pointing to a word. Call on two or three children to say the word. Then call on the children to point to any word they choose and say it.
What is the boy's name in the story? (Steve)
What did he have? (a secret)
Where did Steve hide his secret? (in his pocket)
§ Ask the children to discuss what the story was about and what they liked about it. Keep this discussion brief, but prompt the children by asking "What else?" if necessary.
§ After they read, have the children tell in their own words what happened in the story.

Moreover, the teacher would have had to fully implement the Reading and Responding activity for that day's lesson on both days of observation. The Reading and Responding section focuses on modeling comprehension strategies while reading Big Books and Anthology selections. The section also exposes students to comprehension skills that help them organize information and develop a deeper understanding of the author's meaning. As already indicated, none of the teachers in either the treatment or control group provided pedagogy that reflected this level of quality. Medium-high quality pedagogy was observed in 1 treatment and 1 control classroom in the fall observation round.
Similarly, 2 treatment and 4 control classrooms reflected medium-high quality pedagogy during the spring observations. For a teacher to receive a rating of medium-high, the teacher had to 1) have her students substantially implement the decodable lesson where appropriate, and 2) either have students engage in Big Book selection or Anthology reading with substantial implementation on both days of observation, or fully implement the Big Book selection or Anthology reading on one day. Substantial implementation of the Big Book selection or the Anthology reading means that the teacher engages in some of the pre-reading activities, such as building background, previewing and preparing, and vocabulary, and models at least some of the strategies required by the lesson or presents the comprehension skills provided by the teacher's manual.

For each reading selection, the teacher's manual provides both comprehension strategies and comprehension skills26 to be either modeled by the teacher or enacted by the students during each day's Reading and Responding time. These suggestions are located within the teacher's manual next to the story, and teachers are guided through their use by the daily lesson plan. For example, for the story Strange Bumps the teacher is directed to model the following comprehension strategies over the course of the reading: summarizing, predicting, and making connections. For comprehension skills, the teacher is to discuss the concept of drawing conclusions.

26 According to Open Court, good readers use a variety of strategies to help them make sense of the text and get the most out of what they read. Trained to use a variety of comprehension strategies, children dramatically improve their learning performance. (Emphasis added.) Comprehension strategies include: setting reading goals, summarizing, asking questions, predicting, making connections, and visualizing. The goal of instruction in reading comprehension skills (emphasis added), on the other hand, is to make students aware of the logic behind the structure of a written piece. If the reader is able to discern the logic of the structure, he or she will be more able to tell whether the writer's logic is in fact logical and to gain an understanding of both the facts and the intent of what he or she is reading. By keeping the organization of a piece in mind and considering the author's purpose for writing, the reader can go beyond the actual words on the page and make inferences or draw conclusions based on what was read. These are the skills that strong, mature readers utilize to get a complete picture not only of what the writer is saying, but of what the writer is trying to say. Comprehension skills include: point of view, sequencing, main idea and detail, compare and contrast, cause and effect, classify and categorize, and author's purpose.

Each of these skills and strategies is to be infused into the reading lesson based on directions set forth in the teacher's manual. In theory, the teacher will turn to the manual, identify the strategies and skills to focus on for the day's lesson, and follow the manual's guidance on how to shape the interaction. Thus, on the second day of the lesson the teacher would know to focus on drawing conclusions on pages 153, 155, 157, and 161 of the Anthology. When the teacher turns to page 153, she will see a comprehension skill box explaining drawing conclusions. There she can read the following to the students:

Explain to the children that writers don't always give readers all the information about a character or a story event. Readers must then draw conclusions or make guesses to fill in the missing pieces. To draw a conclusion, they take small pieces of information about a character or story event and use the information to make a decision about that character or event.
§ Ask the children to point to story and picture clues on pages 152-153 that give information about Owl's feelings. List the clues on the chalkboard. Ask the children to read the list and to draw a conclusion about the way Owl felt at this point in the story. (He is frightened of the two bumps in his bed.)

In one classroom where the teacher provided a medium-high quality of pedagogy, she was observed conducting the lesson as follows:

The teacher asks if students are ready to read, sends them back to their desks, and tells them to turn to the table of contents in their textbooks. She tells the students to open their books to page 152, Strange Bumps. The teacher reads the title, and then calls on different students to read. After the first few students read, the teacher goes to the board and writes "Drawing Conclusions."
T: Sometimes the author doesn't tell us exactly how a character feels, so we have to guess, we have to draw a conclusion. How do you think the owl feels from what we've read and from looking at the pictures?
The teacher calls on a few volunteers.
Ss: Sad, scared, afraid, shy.

The teacher then went on to page 157 of the teacher's manual:

To show the children how to use clues to draw conclusions, have them review the story up to page 157. While they already know that the bumps are Owl's own feet, what clues does the author provide to help them reach this conclusion before Owl does?

She conducted this portion of the lesson as follows:

The teacher writes these responses on the board. The teacher continues calling on students to read. After a couple more pages, she stops and calls on a couple of students to sum up what has happened so far. She then says:
T: So the bumps are the owl's feet, right? But did the author tell us that?
S: No.
T: No, we had to look at the story and all the different clues and draw that conclusion.
The teacher continues having students (who have not read) take turns reading until they are done reading the story.
Once the story was complete, the teacher moved on to page 165, the Theme Connections. The activity is set forth in the teacher's manual as follows:

Talk About It
In the story, Owl tries to find out what the strange bumps are. He should have figured it out. Here are some things to talk about:
§ What did Owl do to try to find out what the bumps were?
§ Why was Owl afraid?
§ Have you ever been afraid of something at bedtime? What did you do?
§ Why was this story funny?
Look at the Concept/Question Board and answer any questions that you can. Do you have any new questions about being afraid? Write them on the Board. Maybe the next reading will help answer your questions.

In practice, this teacher engaged in the activity as follows:

The teacher asks the students questions about the story:
T: What was the owl so afraid of? Have you ever been afraid of something? What did you do? What would you have done if you were the owl?
She calls on volunteers and students who are not raising their hands:
T: Does anybody have anything to add to our concept/question board?
She calls on a couple of volunteers.
S: Why was the owl so afraid?
T: Good! And which side would this go on?
Ss: Questions!
The teacher writes the question on the concept/question board.
S: People are afraid of things that they don't know.
The teacher asks the students what side it goes on, and then writes it on the concept side.

While this teacher did not implement this lesson fully, she did so with substantial fidelity to the teacher's manual. She conducted her reading time in the same manner on the subsequent day of observation. Thus, based on the observations, it appears that her students were consistently exposed to a medium-high quality pedagogy that would allow them to become good readers.

Twenty-one percent of the treatment teachers in the fall and 40% of the treatment teachers in the spring provided pedagogy that was considered medium quality.
Similarly, 13% of control teachers in the fall and 45% of control teachers in the spring fell into this same category. Medium quality pedagogy was characterized by instructional practice that included 1) reading of the Big Book selection or Anthology reading with partial implementation of the decodable books and the pre-reading and/or strategies and skills portion of the lesson, asking questions based on the strategies presented in the teacher's manual or other comprehension-related questions in the absence of any modeling by the teacher of those strategies; 2) reading on one day of the Big Book selection or Anthology reading with substantial implementation of the strategies and/or skills portion of the lesson or other reading comprehension-related questions; or 3) reading on one or two days of the Collection for Young Scholars Anthology or Big Book selection with substantial implementation of the strategies and/or skills portion of the lesson.27

Partial implementation means that no modeling takes place, not all of the questions presented in the section are asked, non-Open Court questions may be asked by the teacher, and/or the teacher may cover other strategies and/or skills not presented in the teacher's manual. Additionally, partial implementation can result when a teacher substantially implements the lesson but the students spend significant amounts of time during the activity off-task, or the teacher's behavior towards the students undermines their learning (i.e., the teacher engages in punishing behavior during instruction or feedback).

27 The Collection for Young Scholars version of Open Court is the version of the program that predates Open Court 2000. For the 2002-2003 school year, all schools previously using Collection for Young Scholars have adopted the newest version of the program, Open Court 2002. After a careful review of the Collection for Young Scholars program, it was determined that the quality of the lessons was lower than that provided in Open Court 2000, and even a perfect implementation of the program would lack many of the components necessary for high quality pedagogy.

In one classroom, in which the teacher's practice reflected only a medium quality of pedagogy, the teacher conducted the following lesson:

The class is gathered at the listening center listening to a story. The teacher points to the Big Book's pictures as the tape plays. The class is very quiet, listening to the story. Sara turns to look at the computer, then turns away. The class listens attentively. The teacher calls "Gabby," and she quiets down. Jasmine is looking at the book attentively.
T: See the wheels and levers?
He points to the picture. Cesar, Jasmine, Isaac, and Jesus are all paying attention. Darren looks around and appears to be daydreaming. The teacher stops the tape.
T: Isn't that interesting?
Ss: Yes.
The teacher explains the pictures in the book one more time. Sara and Barbara are talking. They are not listening. Justin is attentively looking at the book.
T: Excuse me, Gerardo.
Gerardo stops talking. The teacher explains to the class how ramps make labor easier. He points to the story, explaining different machines.
T: Have you seen any ramps for wheelchairs?
He tells the students about different ramps located around campus. The class appears very engaged.
T: Who knows what a staple remover is? I need Darien and Laura's bottoms on the rug.
The teacher explains what a Phillips head screwdriver is and says, "I know you don't understand that, do I need to clarify that?"
Ss: Yes.

The teacher should have engaged in this portion of the activity as follows:

Ask the children to tell what they know about a staple remover, a snowplow, and a screwdriver.
Invite them to describe what they look like and how they are probably used. Tell the children that there are many different types of screws and screwdrivers. Some kinds of screwdrivers, for example, you turn by hand. Others are powered by electricity. Share this information with the children, showing them the actual screws and screwdrivers, if possible: There are two main types of screwdrivers. Each one works on a different type of screw. The flathead has a thick, flat head that fits inside a slot in the screw. The Phillips has four points that fit inside a little cross-shaped place in the screw. All screws, however, have ridges on the long part. These ridges are called threads.

Next, the teacher should have engaged the students in a browsing activity, set a purpose for reading, and had the students begin reading. Once the class had begun reading, he should have modeled two reading strategies, clarifying and predicting. Instead, the teacher continued the activity as follows:

The teacher starts to read from the Big Book, and then stops until the class quiets. He has Samuel move from the front row because he is too tall. Shorter kids are in the front of the rug. Jasmine is looking around the room. The teacher says her name and she turns around. He continues to read and explain the pictures in the story, and the class appears to be very attentive.

In this activity, the teacher appears to have combined two separate Building Background lessons into one day. He asked questions from the previous lesson related to wheels and levers and moved directly into the correct lesson focusing on screwdrivers. More importantly, while he did present information from portions of the Building Background section (activating prior knowledge and background information), he did not engage the students in any of the reading strategy activities identified for that day's lesson. Had he been following the teacher's manual more closely, he would have modeled both clarifying and predicting strategies.
In fact, where the teacher "explained to the class how ramps make labor easier," he should have modeled the strategy of clarifying as follows:

I'm confused by what Bea means when she says the wedge is like a tiny ramp. Let me look at the picture again. Oh, I see. The wedge does have an angle, like a ramp does. By looking again at the picture, I was able to clarify the words that were confusing to me.

Yet the teacher missed this opportunity by relating the information to the students without modeling or directing their attention to the text. Similarly, this teacher did have his students use a decodable book, and here again the quality of his pedagogy was medium. He had his students read the story, and he read the story to the students. He did not engage the students in any of the activities associated with the reading that would allow the students to focus on the high-frequency words introduced in the story, clarify any reading difficulties they encountered, ensure that they were focusing on the words in the story rather than on the pictures, or point to any words they could read. Overall, the types of limitations presented by this teacher's practice are consistent with the practice of teachers whose pedagogy was only of a medium quality.

Forty-two percent of treatment teachers in the fall and 37% of treatment teachers in the spring engaged in teaching practice that was of a medium-low quality. Similarly, 66% of control teachers in the fall and 37% of control teachers in the spring provided medium-low quality pedagogy to their students.
For a teacher to receive a rating of medium-low, she had to engage in practice exemplified by the following features: 1) reading on only one of the two days observed of the Big Book selection or the Anthology with partial implementation of the strategies and/or skills presented in the teacher's manual or other questions reflecting reading comprehension; 2) reading on one or two days of the decodable books with substantial implementation but no Big Book or Anthology reading; or 3) reading of the Big Book or Anthology with one day focusing on reading for fluency and the other day reflecting partial implementation of the strategies and/or skills provided in the teacher's manual or questions reflecting reading comprehension.

An example of medium-low quality pedagogy is reflected in the following observation. On the first day of the observation, the teacher has his students read a decodable to practice "voicing." They do none of the other activities associated with the decodable book for that lesson. The only other reading-related activity for the first day is having the students listen to a tape of the Anthology story they are reading. On the second day, the teacher does not have the students read a decodable, but has his students read an Anthology silently by themselves for ten minutes. During the central reading activity for the day, beginning a new Anthology story, the teacher engages the students in some of the pre-reading activities associated with the story and partially implements the comprehension strategies presented for that lesson:

T: I need you to take out your anthology. I need you to go to the table of contents.
He opens the book and shows them.
T: What page is The Kite on?
Ss: 38.
T: What's the name of the next story?
Ss: The Garden.
T: The book is written by Arnold Lobel. Who is the author?
Ss: No response.
T: The author is Arnold Lobel. Does the story have the same characters as The Little Red Hen? Let's turn to page 56 and see.
T: What's the title?
S: The Garden.
T: Who's the author? Put your fingers under the title.
Ss: By Arnold Lobel.
T: Remember to look at the pictures. They will help you to figure out what the story is about. Who are the characters?
He begins reading the first paragraph with the class.
T: Today we will make connections. A good reader makes connections. When you read the story it helps you know what the story is about and how the characters feel. In the story Toad has a garden and it's hard work. I have a garden and I work hard in my garden. Does anybody have a connection?
S: My dad has a garden and I help him.
The teacher and students continue to read together.
T: Is the frog nice?
Ss: Yes.
S: He gave the toad a hoe.
The teacher is using the Open Court teacher's manual.
T: We are using another strategy, summarizing. We say the important things about the story. Let's look at the picture on p. 58. What does Toad have?
S: A bag of seeds.
T: Good. Remember, pictures help you understand the story. Look at the picture on page 59. What does Toad look like?
S: Looks like he's falling.
T: Let's read and see if that's right.
The story says that Toad says, "seeds start growing." He points out the exclamation point. Students say that means to speak loudly.
T: Let's make connections.
S: At another school, I planted flowers.
The teacher reads from the teacher's manual and lets them know why they need to summarize. He then asks the students to summarize.
S: Toad planted seeds and watched them grow.
T: Let's go to the next page. Let's look at the picture. Let's read the story.
Toad is upset that the seeds won't grow. He reads that part loudly to stress the exclamation mark. The teacher asks students to make connections. The student says that she gets upset when her sister hits her.
T: Next page.
Four to five students are very engaged.
T: Look at the picture. What came out?
Ss: The sun.
T: Let's read together.
Students and teacher continue to read together.
T: Do you think frog knows how to take care of a garden?
Half of the students raise hands.
T: How do you know this?
The teacher helps the students say, "because he is happy that the sun has come out."
T: Is this fantasy or real?
Ss: Fantasy.
Half of the students are engaged. The lesson is moving a little too fast.
T: Look at the next picture. What instrument is Toad holding?
Ss: Violin.
S: This story is long.
T: Toad plays the violin because he thinks this will make the seeds grow. Look at the next picture.
Ss: The seeds are growing.
They then read the text together.
T: Look at page 72. What's happening?
S: Frog and Toad are looking at the garden.
T: Let's read.
Only the students in the back row facing the teacher are reading loudly. Students on the sides are quiet.
T: Let's mark the page.
The teacher calls the students to the rug. Before they leave for lunch the teacher asks the students the title of the story, the author, and the two main characters. The students provide answers.
T: Who wants to hear the story?
He turns on the cassette recorder and tells students to follow the story with their eyes.
T: Remember to use your fingers to trace the story.
Students are quiet and very engaged.
T: Antonio, are you on the right page?
The student pays attention.
T: I like the way Cathy is following the story with her finger.
The class is quiet. A couple of students are losing interest. The tape ends.
T: Tomorrow we will reread the story, so put your marker in the book. Of the two stories that you read, which one did you like best? Who liked The Kite best?
He writes /// on the board.
T: Who likes The Garden best?
Students are excited and many hands are in the air. The teacher writes ///// ///// // on the board.

On the surface, it would appear that this lesson reflects a relatively high quality pedagogy. Yet it was only of a medium-low quality for a number of reasons.
To begin with, although the teacher asks the students the name of the story, who the author is, and to browse, he first tells them the name of the author and only then asks them for the information. He also did not provide the students any time to engage in the activity of browsing. Similarly, he should have discussed with them what they think this story might have to do with the theme of the unit, Keep Trying. He should have had the children search for clues that tell them something about the selection and for any problems, such as unfamiliar words, that they notice while reading. Additionally, while he asks the students whether the story has the same characters as the story The Little Red Hen, he should have asked the students if the story had the same characters as The Kite. He should also have reminded the students that Frog and Toad behave and speak like people and that the story is a fantasy.

This teacher does model the strategy of making a connection, and he almost helps the students to do so. When he asks the students if they have any connections, he should have prompted them more directly by asking what kinds of hard work they have done. The teacher goes on to ask questions that are not part of the Open Court lesson as well as to partially implement the strategies presented in that day's lesson. Yet, even though this teacher did not strictly adhere to the lesson presented in the teacher's manual, he still would have provided students with a medium quality pedagogy had he not both spent an excessive amount of time on this lesson and lost the attention of so many of the students. By spending 48 minutes on this lesson, he lost at least half the class. In fact, within the first 28 minutes, the teacher had the attention of only four or five students. Over the course of the activity it became clear that fewer than half of the students remained engaged in the lesson.
As a result of the teacher's failure to have the students engage in Anthology reading on both days, to engage all of the students in the entire activity, and to fully implement a high quality pedagogy, this teacher provided his students with only a medium-low quality pedagogy over the course of the two-day observation.

In the fall, 35% of treatment and 18% of control teachers provided only low quality pedagogy to their students. In the spring, 19% of treatment and 8% of control teachers received ratings of low. In order to receive a rating of low, a teacher had to engage in the following types of activities during the time set aside for decodable and Big Book or Anthology reading: 1) not have students read on either of the two days observed; 2) have the students read only non-Open Court materials for both days and do so solely for fluency purposes; 3) have the students read on one or two days, but only decodable books for fluency; 4) have the students read on one or two days, but only decodable books with partial implementation of the strategies and/or skills presented in the teacher's manual or other reading comprehension type questions; or 5) have the students read on one or two days from the Big Book or the Anthology, but only for fluency.

An example of low quality pedagogy is reflected in the following observation: On the first day observed, the students spent ten minutes reading their decodable books during sustained silent reading time. No other reading took place that day. On the second day, the teacher again has the students spend ten minutes reading decodable books during sustained silent reading time while the teacher prepares the morning lesson. The only other reading that takes place that day is of a decodable book. The reading of the decodable took place as follows:

The teacher tells her students they are going to do a new decodable. When they go back to their seats they can put their marker in their workbooks and they will do the next page later.
The teacher says she will write the title of the new decodable on the board. She writes At the Vet. The students read as the teacher writes the title. The teacher tells the students the definition of the word vet: a doctor for animals. She goes on to explain abbreviations—Ms. for Misses, Dr. for doctor. The teacher then writes the high-frequency words that are going to be introduced in the new book. Students read the words as the teacher writes them. The teacher reads the title and author and tells students she is going to give them the new book.
S: We can browse.
Students get their books and open them up.
T: I like the way Yesenia is browsing. She is looking at the pictures to figure out the story.
The teacher starts asking questions.
T: Why do you think the cats look sad?
Ss: Because they are mad?
The teacher tries to elicit other guesses.
S: Because they are sick.
T: Yes, I saw you found that word in the first line. So where does the girl take the cats?
S: A doctor, a cat doctor.
The class laughs.
T: What is the vet?
S: Pills (in Spanish).
T: Yes, what's that called?
S: Medicine.
T: Yes, like I had to take my medicine today. Let's look at page 8. Are the cats happy?
S: The cats are playing with a bell.
The teacher repeats what the student has just said and says, "Yes, they are happy." She then tells the class to read the first page silently.
Ss: How about the title?
T: Oh, yes, read the title silently.
The class reads aloud.
T: No, silently.
The students become quiet. All students have books opened to the correct page—all appear to be participating and engaged. The teacher tells them to read silently on page 5.
S: I'm not done.
T: Point to the words, very good. I can see if my boys and girls are pointing and reading silently. If you are looking at someone else, you are not reading. Now, boys and girls, we are going to read together. Have your finger ready.
The students read together aloud. The teacher doesn't read with the students aloud.
She chimes in when the class has difficulty with a word. About half of the class is reading the book fluently; the others are quiet. The teacher observes but doesn't read with the students aloud.
T: Now, we are going to take turns.
A student reads the title.
T: I like the way he blended. He didn't know vet and he blended it. Fred, you want to try? We can help you.
S: Reads word by word but gets through the page.
T: Let's give him a big applause. This is the first time he reads so much.
The class claps enthusiastically. All students are engaged. The teacher calls on another boy to read. She reminds students not to look at the reader and to read from their own books. She calls Tiffany to the front [because she is creating a distraction]. Kathy doesn't speak English. She is able to read the page. The teacher explains that because she can read in Spanish, she is able to read this page in English. The class claps for her too.

According to the teacher's manual, this decodable presented no new high-frequency words. Instead, the high-frequency words within the story were there for review. Next, this teacher wrote the title of the story on the board, whereas the teacher's manual called for the teacher to "call on a volunteer to read the title of the selection." Similarly, while the teacher provided the definition of a veterinarian for the students, the directive in the teacher's manual was for the teacher to "ask whether any of the children know what a veterinarian is. If children are not familiar with a vet, tell them that they will learn what this type of person does when they read this book." Here, the teacher's actions directly undermined the students' development as independent readers by preventing them from accessing their own existing knowledge and by not teaching them how to access information they do not yet know from the text they are reading. Instead, the teacher has given the students the message that they are to rely on the teacher for information they do not possess.
While this lesson was not well implemented, there were some things the teacher did correctly. First, the teacher does support students' blending of words they do not know. And although she does not follow the teacher's manual recommendation to "make sure children are paying attention to the words in the story, ask [questions] having children point to the word or sentence in the story that answers each question," she does ask her students questions and seek responses based on what they have read. Overall, the combination of silent reading on the first day of observation and the partial implementation of the decodable book on the second day reflects practice commonly defined as being of a low quality.

Research Question 5: What is the relationship between implementation of the Waterford courseware, the primary reading program, and student achievement?

Two sets of analyses were conducted to examine the effectiveness of the Waterford program by comparing treatment and control groups on multiple measures. The first analysis was part of the second interim report and consisted of a series of t-tests that examined the gains of treatment and control students on the Woodcock Reading Mastery Tests—Revised (WRMT-R). This analysis included student level demographic variables only (see Appendix B). The second analysis was an expansion of and improvement over the first. Using Hierarchical Linear Modeling (HLM), we compared treatment and control students while controlling for the quality of Open Court pedagogy and school characteristics (see Appendix C). In this analysis, we again compared students on the WRMT-R and, in addition, we compared first grade students on the SAT/9 Reading, Language, and Spelling tests.28

In kindergarten, no differences were found between the treatment and the control group. In first grade, treatment students had larger gains than control students in Letter Identification.
Treatment students also had larger gains than control students on the Synonyms test, but only in classrooms with higher quality Open Court pedagogy. No differences were found between the groups on any of the other WRMT-R tests or the SAT/9 tests.

28 For descriptive statistics and student level analysis of the SAT/9 results, see Appendix D.

In order to explain 1) the differences that we found between the groups in first grade and 2) the lack of differences on most of the tests between students exposed to the Waterford courseware and students who were not, we looked within the treatment group to see whether factors related to the implementation of the Waterford courseware had an impact on scores. HLM was used to analyze the relationship between Waterford usage and student achievement while controlling for the quality of Open Court pedagogy. This methodology is particularly appropriate for these data because it simultaneously takes into account the effect of student background variables (e.g., student ELD level) and teacher/instructional variables (e.g., quality of pedagogy) on student achievement. Using HLM, we conducted analyses that controlled for various student and teacher/instructional characteristics.29 The student background variables included language classification (ELL or EO), ELD level (ELD 1-2 or ELD 3-4), time spent on Waterford, and average level of engagement while using Waterford.30 Time spent on Waterford was the total number of minutes a student used the Waterford program during the school year. Average level of engagement was each student's average level of engagement over four days of observation, measured as "fully engaged," "minor distractions," "distracted," or "disengaged" based on the criteria described on pages 31-35. The teacher/instructional variables related to the Waterford program included Waterford usage and classroom level of engagement while using the Waterford courseware.
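The cross-level logic of these HLM models can be illustrated with a small sketch: a student-level slope (here, the effect of time on the courseware) is itself modeled as a function of a classroom-level variable (here, quality of Open Court pedagogy). All coefficient values below are invented for illustration; they are not estimates from this evaluation.

```python
# A two-level (HLM) prediction with a cross-level interaction, written out
# as a plain function. ALL coefficient values here are invented for
# illustration only; they are not estimates from this evaluation.

def predicted_gain(minutes_waterford, oc_quality,
                   gamma00=5.0,    # average gain (intercept)
                   gamma10=-0.5,   # average slope of Waterford time, per 100 minutes
                   gamma11=0.3):   # moderation of that slope by OC pedagogy quality
    """Level 1: gain = b0 + b1 * (minutes / 100)
    Level 2:  b1 = gamma10 + gamma11 * oc_quality  (quality centered at 0)."""
    slope = gamma10 + gamma11 * oc_quality
    return gamma00 + slope * (minutes_waterford / 100)

# In an average-quality classroom (oc_quality = 0), 100 extra minutes on
# the courseware predicts a smaller gain; in a higher-quality classroom
# (oc_quality = 2), the same slope turns positive, mirroring the kind of
# moderation reported below for Letter Identification.
print(predicted_gain(100, 0) - predicted_gain(0, 0))      # -0.5
print(predicted_gain(100, 2) - predicted_gain(0, 2) > 0)  # True
```

The point of the sketch is only that a single student-level slope can change sign across classrooms once a classroom-level moderator enters the model, which is how "moderated by" should be read throughout this section.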
Waterford usage was measured as "high," "medium," or "low."31 Classroom level of engagement was the average of the classroom's level of engagement over the four days of observation.32

29 An alpha level of .05 was used in the HLM analysis. See Appendix E for the models tested.
30 Language classification and ELD level were included in the HLM analysis because they were the two most important demographic variables in predicting outcomes in exploratory analyses as well as in the t-test analyses in Appendix B.
31 We determined the level of classroom usage by the number of days the courseware was used over the 4 days of observation, as well as the number of students in the class who used the courseware each day, following the criteria outlined in the table below:

            17 or more students    11-16 students    10 or fewer students
  4 days    high                   medium            low
  3 days    medium                 medium            low
  2 days    medium                 low               low
  1 day     low                    low               low

32 Classroom level of engagement was determined by the proportion of students engaged and off-task in each class. In a classroom with a "high" level of engagement, all the students who used the Waterford program were either engaged or experienced minor distractions. In a classroom with "medium" engagement, more than half, but not all, of the students using the program were engaged or experiencing minor distractions. In a classroom with "low" engagement, less than half of the students using the Waterford program were engaged or experiencing minor distractions; that is, more than half of the students were distracted or completely off-task.

Quality of Open Court pedagogy was another teacher/instructional variable included in the analyses. In kindergarten, quality of Open Court pedagogy referred to the quality of pedagogy during the Sounds and Letters section of Open Court. In first grade, it referred to the quality of pedagogy during portions of phonics and reading comprehension instruction. In addition, in kindergarten we included the amount of time spent over four days on the Sounds and Letters section. Students using any other primary reading program were excluded from these analyses. The outcome measures for kindergarten included four tests of the WRMT-R (Visual Auditory Learning, Letter Identification, Word Identification, and Word Attack). For first grade, we included all WRMT-R tests and the SAT/9 Reading, Language, and Spelling tests.33

33 We also intended to include the WCART as an additional outcome measure but, due to the small number of cases, we were unable to do so. See Appendix F for descriptive statistics and other information about the WCART scores.

Kindergarten Findings

Table 20 summarizes the student factors, teacher/instructional factors, and cross-level interactions that were significantly related to kindergarten student gains on four of the WRMT-R tests: Visual Auditory Learning, Letter Identification, Word Identification, and Word Attack.

• Visual Auditory Learning. On average, ELL students had larger gains than EO students. This difference, however, was of marginal significance (p = .052).

• Letter Identification. Time using the Waterford courseware had a negative effect on gains: students who spent more time on Waterford had smaller gains than those who spent less time using the program. This effect was moderated by the quality of Open Court pedagogy. As the quality of Open Court pedagogy increased, the effect of time spent on Waterford became less negative and moved in a positive direction. That is, on average, time spent using the courseware had a negative impact on gains in Letter Identification in classrooms where the quality of Open Court pedagogy was average. In classrooms with higher quality Open Court pedagogy, the negative effect of time spent on Waterford diminished and moved in a positive direction.
In classrooms with higher levels of engagement while using the Waterford courseware, the effect of time spent on the Waterford courseware became positive. Finally, in classrooms where more time was spent on Open Court, time spent on the Waterford courseware was associated with lower gains.

• Word Identification. The quality of Open Court pedagogy had a marginally significant effect on gains (p = .051): students in classrooms with higher quality Open Court pedagogy had larger gains than students in classrooms with lower quality Open Court pedagogy. Also, EO students had larger gains than ELL students. This advantage of EO students in Word Identification became smaller as Waterford usage increased. On average, the time any individual student spent using the Waterford courseware had no effect on gains. However, for students in classrooms with higher quality Open Court pedagogy, time spent using the courseware had a negative effect. Similarly, student level of engagement had no effect on gains on average. However, in classrooms with higher quality Open Court pedagogy, level of engagement also had a negative effect on gains in Word Identification. Finally, in classrooms where more time was spent on Open Court, the effect of student level of engagement increased. Overall, quality of Open Court pedagogy had a positive impact on gains in Word Identification, whereas time spent on the courseware was not beneficial and was even detrimental in certain classrooms. This could be explained in part by the fact that word identification is not a focus of Level One of the Waterford program; thus time spent on the courseware did not reinforce this skill, or was time taken away from Open Court instruction on this skill.

• Word Attack. There were no student or classroom level effects on gains on the Word Attack test. We did find a cross-level interaction: time spent using the courseware did not have an effect on gains, except in classrooms with higher quality Open Court pedagogy.
In that case, the effect of time spent on Waterford was negative. Also, in classrooms where more time was spent on Open Court instruction, EO students had larger gains than ELL students.

Table 20
Student and/or Teacher/Classroom Factors Related to Differences in Gains in Kindergarten

Visual Auditory Learning
  Student factors: Language Classification (ELL+)
  Teacher/classroom factors: None
  Cross-level interactions: None

Letter Identification
  Student factors: Time on Waterford (-)
  Teacher/classroom factors: None
  Cross-level interactions: Time on Waterford (-) moderated by Quality of OC Pedagogy (+); Time on Waterford (-) moderated by Classroom engagement (+); Time on Waterford (-) moderated by Time spent on OC (-)

Word Identification
  Student factors: Language Classification (EO+)
  Teacher/classroom factors: Quality of OC Pedagogy (+)
  Cross-level interactions: EO (+) moderated by Waterford usage (-); Time on Waterford moderated by Quality of OC Pedagogy (-); Time on Waterford moderated by Time on OC (+); Student level of engagement moderated by Quality of OC Pedagogy (-); Student level of engagement moderated by Time on OC (+)

Word Attack
  Student factors: None
  Teacher/classroom factors: None
  Cross-level interactions: Time on Waterford moderated by Quality of OC Pedagogy (-); ELD level moderated by Time on OC (+)

Overall, in kindergarten, English language proficiency (as determined by language classification and ELD level) was an important predictor of gains. Students with lower English language proficiency had larger gains than those with higher proficiency on the Visual Auditory Learning test. In Word Identification, students with higher proficiency had larger gains. This finding was consistent in both the HLM analysis and the t-test analysis in Appendix B. The quality of Open Court pedagogy was an important factor as well. In general, high quality Open Court pedagogy had a positive effect on gains. Time spent on Waterford, on the other hand, had a negative effect on gains.34 Possible explanations for these results include the significant alignment between the curriculum presented in the Level One courseware and Level K of Open Court.
The skills presented in Level One are also presented in Level K. Additionally, significantly more time is spent on these activities during Open Court Sounds and Letters instruction (on average between 30 and 50 minutes a day) than on the Waterford courseware (15 minutes). On the Letter Identification test (a skill that is a focus of Level One), time spent using the courseware was beneficial in those classrooms where the students were receiving higher quality pedagogy on the same skill set. However, on both Word Identification and Word Attack, which measure word recognition and decoding (skills that are not a focus of Level One but are included in the Sounds and Letters component of Open Court), time spent using the courseware was detrimental in classrooms with higher quality Open Court pedagogy. It might be that time taken away from high quality pedagogy affected the students' level of achievement on these skills. On the other hand, where the quality of pedagogy is anything less than high, time spent using the courseware generally does not make a difference.

First Grade Findings

Table 21 summarizes the student factors, teacher/instructional factors, cross-level interactions, and random effects that were significantly related to first grade student gains on all eight WRMT-R tests, as well as scores on the SAT/9 subtests.

• Visual Auditory Learning, Letter Identification, and Synonyms. No student or classroom variables had an effect on gains.

34 It is important to note that Waterford usage at the classroom level was not a significant factor related to achievement in the HLM analysis, even though its student level counterpart was. This may be due to the limited scale of the Waterford usage variable (3-point scale: high, medium, low).
In order to examine whether Waterford usage at the classroom level had an effect on achievement, we also conducted independent samples t-tests comparing the students in the kindergarten classrooms with high Waterford usage and those with low Waterford usage. Students in the low usage classrooms had larger gains on Word Identification and Word Attack, as shown in the table below.

                        Waterford Usage    N      Mean    SD     t value
  Word Identification   Low                53     15.4    6.4    4.056*
                        High               100    14.8    8.5
  Word Attack           Low                47     8.5     2.3    4.610*
                        High               62     8.7     3.6
  *p<.05

• Word Identification and Word Attack. ELD 3-4 students had larger gains than ELD 1-2 students.

• Antonyms. EO students had larger gains than ELL students. Similarly, ELD 3-4 students had larger gains than ELD 1-2 students. Also, students in classrooms with a higher level of engagement on the Waterford courseware had larger gains.

• Analogies. EO students had larger gains than ELL students, and ELD 3-4 students had larger gains than ELD 1-2 students. For students in classrooms with average quality Open Court pedagogy, time on the courseware had no effect on gains. However, in classrooms with higher quality Open Court pedagogy, time on the Waterford courseware had a negative effect on gains.

• Passage Comprehension. In classrooms with higher engagement on the courseware at the classroom level, ELL students had larger gains than EO students.

• SAT/9 Reading. In all three reading subtests—Word Study, Word Reading, and Reading Comprehension—ELD 3-4 students outperformed ELD 1-2 students. In all but the Word Reading subtest, EO students outperformed ELL students. In classrooms with higher engagement on the Waterford courseware, however, ELL students performed better than EO students (except in Word Reading). We also identified a random effect of time spent on the Waterford courseware on the Total Reading, Word Study, and Reading Comprehension scores.
On average, time spent on the courseware had no effect on these scores; however, it did have a random effect, which varied across classrooms: in some classrooms, the effect of time spent on the Waterford courseware was positive, and in other classrooms the effect was negative.35 Students in classrooms with a higher level of engagement on the courseware performed better on the Word Reading subtest than students in classrooms with a lower level of engagement.

35 Time spent on Waterford had a random effect on the SAT/9 Reading score. For every increase of 100 minutes spent on the Waterford courseware, students scored up to 2 NCEs higher in some classrooms and up to 1.5 NCEs lower in other classrooms. Also, for every 100-minute increase on the Waterford courseware, students scored up to 2.3 NCEs higher or up to 1.9 NCEs lower on the Word Study subtest, and up to 1.6 NCEs higher or up to .8 NCEs lower on the Reading Comprehension subtest.

• SAT/9 Language and SAT/9 Spelling. EO students outperformed ELL students on the Language test. ELD 3-4 students outperformed ELD 1-2 students in both Language and Spelling. Additionally, time spent on the Waterford courseware had a random effect on Spelling.
In some classrooms, the effect of time spent on the courseware was positive, and in other classrooms the effect was negative.36

36 For every increase of 100 minutes on Waterford, students scored up to 2.6 NCEs higher in some classrooms and up to 1.6 NCEs lower in other classrooms.

Table 21
Student and/or Teacher/Classroom Factors Related to Differences in the Outcomes in First Grade

Visual Auditory Learning
  Student factors: None
  Classroom factors: None
  Cross-level interactions: None
  Random effect: None

Letter Identification
  Student factors: None
  Classroom factors: None
  Cross-level interactions: None
  Random effect: None

Word Identification
  Student factors: ELD Level (ELD 3-4+)
  Classroom factors: None
  Cross-level interactions: None
  Random effect: None

Word Attack
  Student factors: ELD Level (ELD 3-4+)
  Classroom factors: None
  Cross-level interactions: None
  Random effect: None

Antonyms
  Student factors: Language Classification (EO+); ELD Level (ELD 3-4+)
  Classroom factors: Classroom level of engagement on Waterford (+)
  Cross-level interactions: None
  Random effect: None

Synonyms
  Student factors: None
  Classroom factors: None
  Cross-level interactions: None
  Random effect: None

Analogies
  Student factors: Language Classification (EO+); ELD Level (ELD 3-4+)
  Classroom factors: None
  Cross-level interactions: Time on Waterford moderated by Quality of OC Pedagogy (-)
  Random effect: None

Passage Comprehension
  Student factors: None
  Classroom factors: None
  Cross-level interactions: Language Classification moderated by classroom level of engagement on Waterford (ELL+)
  Random effect: None

SAT/9 Total Reading
  Student factors: Language Classification (EO+); ELD Level (ELD 3-4+)
  Classroom factors: None
  Cross-level interactions: Language Classification moderated by classroom level of engagement on Waterford (ELL+)
  Random effect: Time on Waterford (+/-)

SAT/9 Word Study
  Student factors: Language Classification (EO+); ELD Level (ELD 3-4+)
  Classroom factors: None
  Cross-level interactions: Language Classification moderated by classroom level of engagement on Waterford (ELL+)
  Random effect: Time on Waterford (+/-)

SAT/9 Word Reading
  Student factors: ELD Level (ELD 3-4+)
  Classroom factors: Classroom level of engagement on Waterford (+)
  Cross-level interactions: None
  Random effect: None

SAT/9 Reading Comprehension
  Student factors: Language Classification (EO+); ELD Level (ELD 3-4+)
  Classroom factors: None
  Cross-level interactions: Language Classification moderated by classroom level of engagement on Waterford (ELL+)
  Random effect: Time on Waterford (+/-)

SAT/9 Language
  Student factors: Language Classification (EO+); ELD Level (ELD 3-4+)
  Classroom factors: None
  Cross-level interactions: None
  Random effect: None

SAT/9 Spelling
  Student factors: ELD Level (ELD 3-4+)
  Classroom factors: None
  Cross-level interactions: None
  Random effect: Time on Waterford (+/-)

In first grade, English language proficiency (determined by language classification and ELD level) was also an important predictor of achievement.
Students with higher English language proficiency performed better than students with lower proficiency on most tests (the exceptions were Visual Auditory Learning, Letter Identification, Synonyms, and Passage Comprehension). Overall, these findings are consistent with the analysis comparing treatment and control students.37 Classroom level of engagement during Waterford courseware usage had a positive effect on two outcomes: Antonyms and SAT/9 Word Reading. Also, in classrooms with a higher level of engagement on the Waterford courseware, ELL students performed better than EO students on the Passage Comprehension and SAT/9 Reading tests. The amount of time any individual student spent on the courseware had a random effect on most SAT/9 tests: in some classrooms, the effect of time spent on Waterford was positive, and in other classrooms the effect was negative. One explanation for this random effect may be that time spent using the courseware interacted with the quality of Open Court pedagogy, as we found to be true in kindergarten. However, because of the way we defined Open Court pedagogy, we may not have adequately captured instruction of the more basic reading skills. Because the Level Two courseware teaches word recognition and beginning reading comprehension, for first grade we defined the quality of instruction by focusing on reading words in the context of sentences and on reading comprehension skills and strategies. We did not include those components of the curriculum that focus most heavily on phonics, blending, decoding, spelling, or other basic skills.38 It may be that the quality of Open Court pedagogy on these other skills was related to the direction of the effect of time spent using the Waterford courseware, which would explain why the effect was positive in some classrooms and negative in others.
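A classroom-varying ("random") slope of this kind can be sketched directly: each classroom's effect of courseware time is the average effect plus a classroom-specific deviation. The classroom names and deviation values below are hypothetical; only the rough magnitudes (about +2 to -1.5 NCEs per 100 minutes on the SAT/9 Reading score) come from the report's footnote.

```python
# A random (classroom-varying) slope, written out directly: classroom j's
# effect of courseware time is the average effect plus a classroom-specific
# deviation u1j. Classroom names and deviation values are hypothetical;
# only the rough magnitudes (about +2 to -1.5 NCEs per 100 minutes on the
# SAT/9 Reading score) come from the report.

GAMMA10 = 0.0  # average effect per 100 minutes: essentially no overall effect
CLASSROOM_DEVIATIONS = {"room_a": 2.0, "room_b": -1.5, "room_c": 0.4}  # u1j

def nce_change(classroom, minutes):
    """Predicted NCE change attributable to `minutes` on the courseware."""
    slope = GAMMA10 + CLASSROOM_DEVIATIONS[classroom]
    return slope * (minutes / 100)

# The same 100 minutes predicts a higher score in one classroom and a
# lower score in another, which is what a significant random slope means.
print(nce_change("room_a", 100))  # 2.0
print(nce_change("room_b", 100))  # -1.5
```

This is why an average effect near zero can coexist with meaningful classroom-to-classroom differences: the positive and negative classroom slopes cancel in the overall average.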
This analysis does not further explain the differences found between treatment and control students on Letter Identification and Synonyms, since no student or classroom level variables had an effect on gains on these two tests.

37 In the analysis comparing treatment and control students, students with higher English language proficiency performed better than students with lower English language proficiency on the Synonyms and Passage Comprehension tests as well. Additionally, in that analysis, students with lower proficiency had larger gains than students with higher proficiency in Visual Auditory Learning and Letter Identification.
38 We will expand our definition of quality of pedagogy in next year's analysis in order to address this issue.

CONCLUSIONS AND RECOMMENDATIONS

This is the third in a series of reports containing findings from the first year of implementation of the Waterford Early Reading Program in kindergarten and first grade. This report focuses on the overall implementation of the program during the 2001-2002 school year and its relationship to student achievement. The findings address five research questions:

1. How is the Waterford program being implemented?
2. To what extent is the Waterford courseware being used?
3. To what extent are students engaged while using the Waterford courseware?
4. What is the quality of pedagogy during reading/language arts time in sample classrooms?
5. What is the relationship between implementation of the Waterford courseware, the primary reading program, and student achievement?

With respect to the first question, overall, teachers do not utilize the program in a way that maximizes the courseware's usefulness. In the majority of classrooms, the computers are positioned in a way that allows for a greater number of distractions.
Contrary to the recommendations provided in the Getting Started Manual, computers are typically set up so that students sit next to each other with nothing to prevent them from looking at each other's screens or talking to each other. Additionally, many teachers' knowledge of the program features is incomplete and insufficient to meet their students' needs. Teachers must know the program well enough to let their students take advantage of its capacity for individualized instruction. In particular, teachers must know that they are the ones who determine what level to start their students on, whether a student should see specific activities or stay on a set of activities for additional support, and when a student should advance to the next level. Yet many teachers were unaware that they set the level of the courseware for their students or that they had the ability to determine when to advance a student from one level to the next. Additionally, while the courseware provides teachers with tools they can use to determine whether their students are making progress or need additional attention, many teachers did not use these resources. Twenty percent of kindergarten and 33% of first grade teachers did not look at their students' time reports. Similarly, less than half of the kindergarten and first grade teachers checked to see whether their students had done the writing. Less than 50% of first grade teachers listened to their students' recorded readings. Overall, it appears that teachers are not using the information provided by the courseware to make decisions about instruction and are not fully implementing the courseware component of the program. There are a few possible explanations for this finding.
First, while teachers were provided training on these program features, as some of the teachers noted during their interviews, the training was too short and did not provide them with as much information as they felt they needed in order to realize all of the program's benefits. Second, it is possible that teachers received conflicting messages about how to use the program. On the one hand, for the program to operate most effectively, teachers have to be knowledgeable enough about the courseware to make decisions for their students. Yet teachers are also told that the program is entirely free-standing, so that the teacher does not have to do anything but turn it on. Most importantly, teachers do not understand how the two programs might align in terms of the activities, skills, and strategies presented, and how they can use them to meet their students' needs and improve their instruction and students' learning.

With regard to the second question, the program is not being used for the amount of time recommended. The majority of teachers state that they have students use the program five days a week, and the majority believe their students use the program for at least the recommended number of minutes each day. Yet 22% of kindergarten and 67% of first grade teachers also indicated that not every student uses the computer on a daily basis. This is reflected in both the usage data and the observation data. Usage data show that students used the program less than would be expected if they had used it every day for the recommended amount of time. Also, based on our observations, we determined that in the majority of classrooms the level of usage was medium. Given that this was the first year the program was present in these classrooms, this seems to be a positive finding. It is likely that usage will increase as teachers become more comfortable integrating the program into their daily routine.
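The classroom usage rating referred to here combined the number of observation days (out of four) on which the courseware was used with the number of students using it each day. The criteria can be written out directly; the function below is our encoding of that published rule, not one of the study's instruments.

```python
# Our encoding of the report's classroom usage criteria: the rating crosses
# the number of observation days (out of 4) on which the courseware was
# used with the number of students who used it each day. An illustration
# of the published rule, not one of the study's instruments.

def classify_usage(days_used, students_per_day):
    """Return the classroom usage rating: 'high', 'medium', or 'low'."""
    if days_used <= 1 or students_per_day <= 10:
        return "low"
    if students_per_day >= 17:
        # 17 or more students: 4 days -> high; 2-3 days -> medium
        return "high" if days_used == 4 else "medium"
    # 11-16 students: 3-4 days -> medium; 2 days -> low
    return "medium" if days_used >= 3 else "low"

print(classify_usage(4, 20))  # high
print(classify_usage(4, 12))  # medium
print(classify_usage(2, 12))  # low
```

Note that "high" requires both daily use and most of the class using the courseware, which is why a classroom can run the program every day and still be rated medium.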
For the third question (To what extent are students engaged while using the courseware?), we found that, on any given day, approximately 75% of the students were either engaged or experienced only minor distractions. The level of engagement varied by classroom: in some classrooms, none of the students were distracted or disengaged, while in others the majority of students were off-task. We did find that English language learners were more likely to spend time off-task than English Only students. More specifically, the lower the student's level of English proficiency, the higher the chance that the student spent time disengaged or off-task. This may be because, although the courseware has features that may be attractive to language learners (extensive visual support that corresponds to the oral language students hear, singing and chanting activities to build oral language vocabulary, and activities that are simple and direct), it may lack some of the language support that these students need. While a student can play an activity repeatedly to ensure that concepts are understood, the student cannot ask the courseware to explain something in a different way or to use different language. Additionally, if students do not understand the instructions or information provided by the courseware, they cannot engage in the activity without external support from someone who can provide the information to them. We also examined engagement at the classroom level. Here we found that, overall, engagement at the classroom level was also of a medium quality. For kindergarten, fewer than 20% of classrooms had high classroom level engagement; for first grade, 25% to 33% of classrooms did. Differences between kindergarten and first grade may be due to the fact that kindergarten students seem to have more difficulty staying engaged for extended periods of time.
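The classroom engagement rating works the same way as the usage rating: a simple threshold rule over observed counts. The function below encodes the report's definitions; treating "exactly half engaged" as low is our assumption, since the report does not pin down that boundary.

```python
# Our encoding of the report's classroom engagement definitions: "high"
# when every student using the courseware was engaged or had only minor
# distractions, "medium" when more than half (but not all) were, and
# "low" otherwise. Treating exactly half as "low" is our assumption; the
# report does not specify that boundary case.

def classify_engagement(n_engaged, n_using):
    """n_engaged: students engaged or with only minor distractions.
    n_using: students observed using the courseware."""
    if n_using <= 0:
        raise ValueError("no students observed using the courseware")
    if n_engaged == n_using:
        return "high"
    if n_engaged > n_using / 2:
        return "medium"
    return "low"

print(classify_engagement(10, 10))  # high
print(classify_engagement(6, 10))   # medium
print(classify_engagement(4, 10))   # low
```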
For both grades, though, engagement was probably affected by the location of the computers, the lack of dividers between students, and possibly the fact that these classrooms are crowded and the computers are not really set off from the rest of the classroom.

Our data for the fourth question revealed that, overall, pedagogy was of a medium quality or lower in both the treatment and control groups. Fewer than 20% of kindergarten treatment and control teachers provided their students with medium-high or high quality pedagogy. For first grade, the quality of pedagogy was even lower: more than half of the first grade teachers provided a medium-low or low quality of pedagogy to their students. At the kindergarten level, typical practice was exemplified by teachers who followed portions of the Sounds and Letters section but implemented the activities only partially as they conducted the lesson over the course of a two-day observation. For first grade, this meant that, more often than not, students did not engage in reading activities on either of the two days of observation. Moreover, when reading activities were conducted by the teacher, they were done with only partial implementation of the lesson as detailed in the teacher's manual. This meant that decodable books were read for fluency and not for word recognition, high-frequency word review, and blending. Similarly, when students read Big Book or Anthology selections, the teacher did not model or engage students in the strategies and/or skills presented in the teacher's manual or pose other questions reflecting reading comprehension.

Question five focuses on the relationship between the implementation of the Waterford courseware, the primary reading program, and student achievement, in the context of low implementation. Given this low level of implementation, it is not surprising that we found that, in kindergarten, time spent on the Waterford program had no impact or a negative impact on gains.
On Letter Identification, students who spent more time using the courseware had smaller gains than students who spent less time using it, except in classrooms with higher quality Open Court pedagogy. In those classrooms, more time spent using the courseware actually translated into greater gains. One possible explanation for this interaction between the quality of pedagogy and Waterford usage is that the courseware can only support students' learning where the initial instruction is of a higher quality; in the absence of higher quality pedagogy, the program has little to support. If students are receiving strong Open Court instruction in letter identification, a skill that is taught throughout the Sounds and Letters component of the curriculum, and they are also receiving reinforcement from the Level One courseware, they are probably able to build on their knowledge as they move back and forth between the two curricula. On two of the other tests, Word Identification and Word Attack, time spent using the courseware on average had no effect on gains. However, in the presence of higher quality pedagogy, the effect of time spent was negative. Again, the areas of emphasis of the Level One courseware might explain this. The Level One courseware does not focus on word recognition and decoding; these skills appear in the Level Two courseware. Open Court, on the other hand, focuses on both letter and word recognition in Level K. Consequently, students who spent time on the Waterford courseware during Open Court instruction focusing on word identification and decoding did not have the same consistent exposure to this skill set.
Although the courseware randomly calls students for their sessions, ensuring that no student misses the exact same part of regular instruction every day, as long as the two programs run concurrently, students are going to miss some portion of the lesson that focuses on word identification and decoding.

In first grade, the use of the courseware had limited effects on the outcomes. On the Antonyms and SAT/9 Word Reading tests, classroom level of engagement during Waterford courseware usage had a positive effect. Also, on the Passage Comprehension and SAT/9 Reading tests, ELL students performed better than EO students in classrooms with a high level of engagement on the Waterford courseware. The effect of time spent on the courseware on most of the SAT/9 tests varied from classroom to classroom: in some classrooms, more time using the courseware translated into higher scores, whereas in other classrooms it resulted in lower scores. These findings are not surprising given the overall low level of implementation in first grade classrooms.

Recommendations

Given the findings from the first year of implementation of the Waterford Early Reading Program, a variety of steps should be taken to improve the quality of instruction:

1. Central, local district, and school administrators should take steps to ensure that the program is fully implemented at the classroom level.
2. Professional development should be provided to teachers so that they understand how to manage the Waterford courseware and integrate it into their daily reading/language arts lessons.
3. Teachers should be required to know the content of the Waterford courseware so that they are aware of the specific skills covered in the different activities and of how and when to make changes to their students' individual programs, so that the courseware truly works to meet the individual needs of each student.
Program knowledge will also allow them to provide sufficient language support to their limited English proficient students.
4. Literacy coaches should be knowledgeable about the content and appropriate use of the program in order to provide support to teachers.
5. Professional development provided to teachers by local districts and school site administrators should focus on an increased understanding, in the context of the Open Court curriculum, of the strategies and skills students need in order to become proficient readers. Professional development should also focus on teaching teachers how to model for, prompt, and support students as they struggle to become readers.

Next Steps

As we move into the second year of the study, we plan to incorporate valuable information gained from the first year. Below are a number of modifications and additions that we will include in the analysis of the second year data.

1. Reduce the number of WRMT-R tests administered to kindergarten students to Visual Auditory Learning, Letter Identification, Word Identification, and Word Attack.
2. Investigate the possibility of using the Open Court assessments in order to gain an additional outcome measure.
3. Modify the Waterford observation protocol in order to capture the different types of activities presented by the courseware and the extent to which students engage in them.
4. Expand the three-point scale used to determine Waterford courseware usage at the classroom level in order to create a more nuanced scale.
5. Develop a scale to evaluate the quality of implementation of the Waterford program at the classroom level.
6. Expand the definitions of quality of Open Court pedagogy for the first grade curriculum to include a more comprehensive analysis of the "beginning to read" activities found on the Level Two courseware.

REFERENCES

Angrist, J., & Lavy, V. (1999). New evidence on classroom computers and pupil learning. Working Paper 7424.
APPENDIX A

Waterford/Open Court Alignment

Levels One and Two of the Waterford courseware align with parts of the Open Court curriculum for kindergarten and first grade. The Level One focus on print concepts, phonological awareness, and letter recognition is covered in the Open Court Level K Sounds and Letters section. For each of the pieces of the courseware (Name that Letter, Daily Activities, Main Lessons, and Play and Practice) there are parallel types of activities within the Open Court Level K Sounds and Letters component. As demonstrated below, a match can be made reflecting the same skills being learned in both programs.

In the beginning Waterford courseware lessons, there are six different games that help students learn to spell their own first names. In the Hen Name Game, students help the mother hen's eggs hatch by clicking the letters of their names in the correct order. As they do, chicks hatch from the shells. If students have difficulty choosing the correct egg, the farmer helps them concentrate on fewer eggs at a time. A model of the student's name is also given. (Getting Started)

In Open Court Level K, Unit 1, Lesson 1 of Book A, the first activity students are asked to do is a name activity. The lesson is set forth as follows:

§ Prepare a Name Necklace for each child and one for yourself. Each necklace should have the child's first name written on an oak tag rectangle, with yarn tied through holes at each end so the necklaces can be worn.

§ Explain that every person has a name. Put on your Name Necklace and introduce yourself. Go around the room and ask the children to state their names, including their last names if they choose to do so.
§ Then provide each child with his or her own Name Necklace. Hold up each necklace, read the name aloud as you point to the word, and have the child come up and get the necklace. Encourage them to put on the necklace and say their names aloud as they point to their written name.

§ When all the children have their necklaces on, demonstrate how to turn to the right and have the children introduce themselves to the person on the right. Then demonstrate turning to the left, and have the children repeat the procedure.

For the next lesson, students are asked again to use their name necklaces. Additionally, students continue to wear their name necklaces and are directed to rely on them during later activities.

The Level One courseware has students engage in ABC and letter song activities on a daily basis. These range from a variety of presentations of the ABC song in both capital and lowercase letters to letter sound songs covering alphabet sounds, tongue twisters, and vowel songs. In Open Court Level K, students begin the Letter Recognition section by learning the alphabet song. Over subsequent lessons, students teach the Leo the Lion puppet letters it doesn't know by playing the Sing Your Way to Z game, during which they sing the alphabet song until they reach the letter that is unfamiliar to the puppet. Additionally, vowel songs are introduced in the Phonemic Awareness section of Level K.

In Level One of the courseware, students are exposed to Sound Sense activities, which focus on rhyme, alliteration, blending, and phoneme deletion. In Open Court Level K, rhyming activities occur in the Letter Recognition portion of the Sounds and Letters component. For example, Level One has an activity called Make it Rhyme in which students are required to choose the correct rhyming word. This activity allows students to practice recognizing and producing rhymes.
In Open Court Level K, students engage in a similar Making Rhymes activity in which the teacher says a word, see, and asks the students to say a word that rhymes with it. The teacher then repeats the word see and the word suggested by the children, and asks for another word that rhymes with see. The teacher's manual then directs the teacher to repeat all three words and ask the children if the words all rhyme. The manual also presents seven other words to be used for rhyming. In fact, some of the Sing a Rhyme nursery rhymes presented in Level One are also found in Level K.

Finally, the Readiness Activities presented in Level One prepare students for reading and writing, "by learning print concepts (such as knowing that sentences are read left to right and top to bottom, or that words are separated by spaces)" and students "build basic concepts and vocabulary (including recognizing position words, identifying shapes, recognizing numbers, and sequencing events)." Here again, these types of activities are found in the Sounds and Letters component of Level K. For example, Level One presents Introduction and Screening and Instruction activities. In these activities, students receive a basic introduction to a concept and complete a brief screening exercise. In one activity, students are asked to remember the objects that have been removed from a character's purse. In a similar activity in Level K, students are asked to remember words from a poem in which the teacher intentionally substitutes the wrong word.

Level One Letter Picture activities parallel the Letter Names and Letter Shapes activities presented in the Letter Recognition section of Level K. In both programs there is a Find a Letter activity. In Level One, students "practice the name, shape, and sound of letters" by clicking on the target capital or lowercase letter they see in the two or more words on the screen that are related to other activities in the Main Lesson.
In Level K, the teacher reviews the letters that the children found the previous day. The teacher then points to an Alphabet Card and has the children find examples of matching letters around the room. The students are asked to point out and name their letters. These are added to the list of letters being found. This same activity overlaps with the Make a Scene activity in the Level One courseware.

The Read with Me Books parallel the use of the Alphabet Big Book, the reading selections from the Big Books, and the predecodable and decodable books throughout Level K. For Level One, these books help students "explore printed text and practice the sound of a particular letter in the context of words and sentences." In Level K, students are exposed to an Alphabet Big Book rhyme for each letter of the alphabet. Each rhyme focuses on the sound of one letter. For example, Level One has the story Five:

Five fish.
Five fat fish.
Five fat, feathered fish
Five fat, feathered, freckled fish,
Five fat, featherhead, freckled, frilly fish . . .
. . . flying!

Open Court Level K does the same type of presentation for each sound of the alphabet. For example, for the letter Nn:

Norman says Nelly is noisy
and natters all night with the nurse.
But Nelly says Norman is nosy
which Nelly says is much worse.
Nosy or noisy, which is worse?
Nobody here can choose.
Nothing else has happened.
That's the end of the news.

Where Level One has the Letter Checker, Letter Fun, and Fast Letter Fun, Level K uses activities like the Find a Letter, Secret Passletter, and What is the Letter activities within the Letter Recognition section of Sounds and Letters.

The Level Two courseware focuses on letter sounds, word recognition, and beginning reading comprehension. For the first half of the courseware (Units 1-5), in the Main Lessons, students focus on word recognition activities (Sing a Rhyme, Readiness Activities, Picture Story, Letter Pictures, Make a Scene).
In the second half of the courseware (Units 6-10), in the Main Lessons, students move to Automaticity, Reading Strategies, and Reading Comprehension and Writing Practice (Advanced Readiness Activity, Sing around the World, Read with Me Book, Letter Checker, Letter Fun, Fast Letter Fun). These same types of activities are covered throughout the Preparing to Read, Reading and Responding, and Integrating the Curriculum sections of Open Court.

The first half of a typical Waterford session bears many similarities to activities found in the Preparing to Read section of Open Court Level 1. A Waterford session starts off with Daily Activities, which moves quickly through three sections: Let's Read, Sound Sense, and Review. In Let's Read, students read 'Readable' or Traditional Tales, which emphasize a number of pattern words and power words. For example, in the Readable, "I Am Sam," the Pattern Words are Sam and am and the Power Word is I. In Open Court, students read Decodable books, which use the equivalent High-Frequency Words. In the Decodable, "The Map," the High-Frequency Words are is, the, and on. The next section of Daily Activities, Sound Sense, "promotes phonological awareness." One exercise, Barnyard Bash, "allows students to choose which letter they want to change in a word and then choose the new letter." In this way, students "practice manipulating phonemes to create new words." This activity is much like Open Court blending: typical blending involves switching out the first or last letters of words and swapping in new ones to create new words. The third and final section of Daily Activities, Review, uses various songs like "Vowels Side by Side." Similarly, Open Court often has songs in its Preparing to Read section, such as the "Short-Vowel Song."

Waterford continues with Introductory Lessons, in which students are introduced to new power words in new Readable books.
Once again, the equivalents in Open Court are High-Frequency Words in Decodable books, books that make up a fundamental component of nearly every Open Court lesson.

The Units section is the heart of a Waterford session. Especially in the first half of Level Two, much time is spent on the Word Recognition categories Sounds Fun!, Pattern Word Play, and Power Word Games. One of the three main activities in Sounds Fun! is the Sound Room, where a student can click an easel to "watch the letter being traced as the name is spoken." In Open Court, to learn a letter, in this case "A," the teacher will "trace over the 'A' several times and have the children write the letter with their fingers on a surface in front of them." In Sound Adventures, part of the Waterford courseware's Pattern Word Play, students can "Make a Word with Rusty," which "lets the students combine letter sounds and hear the words they have created." Rusty Raccoon, a cartoon animal, blends words together with refrigerator magnets. In Open Court, a teacher commonly uses a puppet, often Leo the Lion, as an aid to teaching oral blending, segmentation, and consonant restoration.

Waterford's Power Word Games and Word Master Games such as Word Eggspert might be seen as computer variations of dictation and spelling (in Open Court, found in the Integrating the Curriculum section). In Word Eggspert, a narrator says a word and the student is invited to choose the egg labeled with that word. In Rascal Presents a Word, the student hears a word used in a sentence that appears on the screen and is invited to click the word. In Tug-a-Word, the student clicks one of a few choices of words on screen to match the spoken word. In Open Court, the teacher dictates words and sentences, which the students write in their Reading and Writing Workbooks.
Also in the workbooks are exercises in which the student circles one of two word choices to complete such sentences as, "Would you like a _______ of milk?" (grass glass) and "Adam needs to put a ______ on the letter." (stamp clamp)

In another main Units section, Reading Strategies, the Context Clues activities utilize what Open Court would consider the Comprehension Strategy of Clarifying: "rereading, using context, or asking someone else" to figure out the meanings of unfamiliar words. For example, the activity Rusty and Rosy's Clues "help(s) the students learn how using clues in the story text can help them read and understand new words. Rusty reads a paragraph aloud, pointing out the words and phrases that help him know what the unfamiliar word (in this case) banquet means." Other similar activities that utilize clarifying are Use-A-Clue, Watch Me Read, and Mystery Word.

Also, the next Units section, Reading, Comprehension, and Writing Practice, shares many topics covered in Open Court's Comprehension, Skills & Strategies. Some subjects covered in Waterford under Comprehension Strategies for Level Two are: Set reading expectations, make predictions; Build background knowledge; Visualize or picture what is happening in the story; and Sum up, Remember order. Open Court addresses some of these skills within the Preparing to Read component and covers others within the Reading and Responding component. Teachers are asked to teach students to activate prior knowledge, build background, browse, set a purpose for reading, and review new or unfamiliar vocabulary within each Preparing to Read lesson accompanying every reading of the Anthology. In addition, within each Reading and Responding lesson, the teacher models one of a number of comprehension strategies: Ask questions; Clarify; Make connections; Make predictions; Summarize; and Visualize.
Some of the Waterford Comprehension Skills covered are: Distinguish between reality and fantasy; Recall details; Make inferences; Recognize cause and effect; and Compare and contrast. This list aligns with Open Court's covered Comprehension Skills: Distinguish reality from fantasy; Identify main ideas and details; Make inferences; Comprehend cause-and-effect relationships; and Compare and contrast items and events.

An example of an activity that draws on these reading comprehension strategies and skills is the Get Ready story "Seeing Fingers." Here, students are asked to "imagine what a morning would be like if they couldn't see. Then they write a poem about 'seeing without sight.'" In the activity Think About It, they "may be asked to answer questions, remember the order of the story events, or describe characters. For example, after reading 'Seeing Fingers,' the students answer some questions about what it might be like to live without sight." The screen asks students to check yes or no to a list of questions such as: "Can people who don't see recognize their family? Can people who don't see learn to read? Can people who don't see go to school? Can people who don't see have fun with their pets?" Students thus are coaxed into seeing another person's point of view.

Open Court has its own Think About It activity, under Theme Connections, in its Reading and Responding section. For example, after reading "Matthew and Tilly," a story about two friends, the Think About It activity is to "invite the children to think about the selection. You might ask the following questions: Have you ever fought with a friend? How did you feel?" Open Court also has the activity Making Connections, where students discuss and share their experiences, then draw conclusions like: "I think Tilly feels lonely. I guess she wants to play with Matthew. Sometimes I feel lonely when I play by myself." And: "Matthew said he was sorry. Tilly did, too. Now they get along again.
That happened to me and my friend. Have you ever had to tell a friend that you were sorry?"

In the last Units sections, Unit Review and Play and Practice, students learn a new grammar or punctuation skill in Skill Builder but for the most part review their Readable books and practice what they have just learned. In Open Court, Grammar, Usage and Mechanics are covered in the Integrating the Curriculum section. For example, Waterford Level 2 covers the -ing ending in Readable 13, "Who Is at the Door?," and Readable 28, "Brave Dave and Jane," while Open Court covers -ing twice in Level 1, Book 1 and three times in Book 2. Waterford covers quotation marks in Readable books 4a, "What Is It?," and 4b, "Dan and Mac," while Open Court covers quotation marks twice in Book 2.

As this analysis and Table 1 below reflect, the vast majority, if not all, of the content of Level One and Level Two of the Waterford courseware is covered within the Open Court curriculum, Level K and Level 1 (see Table 1).

Table 1
Alignment of Open Court Kindergarten and First Grade and the Waterford Early Reading Program Level One and Level Two

Columns: Skill | OC Kindergarten | OC First Grade | WERP Level One | WERP Level Two

Print/Book Awareness (Recognize and understand the conventions of print and books): Distinguish between capital and lowercase letters / Distinguish between capital and lowercase letters Capitalization Y Y Constancy of Words End Punctuation Y Y Y Y Follow Left-to-right, Top-to-bottom Y Y Letter Recognition and Formation Page Numbering Y Y Y Y Picture/Text Relationship Quotation Marks Y Y Y Y Relationship Between Spoken and Printed Language Sentence Recognition Table of Contents Y Y Y Y Y Y Word Length Y Y Understand the concept of a word Y Y Understand that words are separated by spaces and read one at a time in sequential order. Understand that words are separated by spaces and read one at a time in sequential order.
Y Y Blend onset and rime to make words Blend onset and rime to make words Y Recognize initial consonant Recognize initial consonant sounds; Identify beginning sounds consonant blends Word Boundaries Understand the concept of a Understand the concept of a word word Read sentences left to right, Read sentences left to right, top to bottom top to bottom Understand the concept of a Understand the concept of a letter letter Explore the connection between pictures and text Explore the connection between pictures and text Explore the relationship between speech and text Explore the relationship between speech and text Identify a sentence Phonemic Awareness (Recognize discrete sounds in words) Oral Blending: Words/Word Parts Oral Blending: Initial Consonants/Blends Y Oral Blending: Final Consonants Y Y Recognize final consonant sounds Oral Blending: Initial Vowels Y Y Recognize initial sounds (alliteration) Oral Blending: Syllables Y Y 94 Identify ending consonant blends; Identify ending consonant blends Recognize initial sounds (alliteration) Blend syllables to make words Oral Blending: Vowel Replacement Y Y Segmentation: Initial Consonant/Blends Y Y Recognize initial consonant sounds Recognize final consonant sounds; Identify ending consonant blends Segmentation: Final Consonants Segmentation: Initial Vowels Segmentation: Words/Word Parts Rhyming How the Alphabet Works Y Y Y Y Y Y Y Y Recognize rhyme Count syllables Recognize rhyme Letter Knowledge Y Y Learn the letters of the alphabet Learn the letters of the alphabet Letter Order Y Y Learn the letters of the alphabet Learn the letters of the alphabet Sounds in Words Letter Sounds Y Y Y Y Y Y Y Y Y Y Recognize (count) individual phonemes in words Phonics (Associate sounds and spelling to read words) Blending Sounds into Words Consonant Cluster Consonant Diagraphs Consonant Sound and Spelling Phonograms Syllables Vowel: Diphthongs Vowels: Long Sounds and Spellings Vowels: r-controlled Vowels: Short Sounds and Spellings Y 
Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Blend onset and rime to make words Blend sounds into words Identify digraphs Recognize final consonant sounds/Recognize initital consonant sounds Count syllables Recognize long vowels and their spellings Recognize short vowel sounds Recognize vowels and their spellings; Recognize short vowel sounds Comprehension Strategies (Selfmonitoring techniques) Asking Questions/Answering Questions Clarifying Predicting/Confirming Predictions Y Y Making Connections Summarizing Y Y Y Y Answer questions about the text Set Reading expectations, make predictions (Peek at the Story) Set reading expectations; make predictions (Peek at the Story) Connect experiences and knowledge with the text Sum up - Remember Order 95 Y Y Sum up - Five W's (who, what, when, where, why) Visualizing Comprehension Skills (Deciphering the meaning of text) Cause/Effect Y Y Visualize or picture what is happening in the story (Step into the Story) Y Y Recognize cause and effect Classify/Categorize Y Y Compare and Contrast Draw Conclusions Y Y Y Y Main Idea and Details Making Inferences Y Y Y Y Describe characters; Recall details Make inferences Reality/Fantasy Y Y Distinguish between reality Distinguish between reality and fantsy and fantasy Sequencing Vocabulary Y Y Recognize logical sequence of events Organize (map) stories Antonyms Y Y Understand opposites (antonyms) Comparatives/Superlatives (not taught in OC Kinder) Compound Words Connecting Words Context Clues Contractions High-Frequency Words Homophones/Homonyms Idioms Inflectional Endings Irregular Plurals Multiple Meaning Words Multisyllabic Words Position Words Prefixes Question Words Root Words Learn to sort items by categories Recognize objects that are similar or different Comparative Adjectives (i.e, fast, faster, fastest) Comparative Adjectives Compound Words Y Y Y Y Y Y Y Compare Characters; Compare or Contrast Use context to understand words meaning; make use of context Contractions Y Y Spell high-frequency 
words; Recognize high-frequency words Y Y Y Y Y Y Y Understand position words (over, under, through, top, beside, bottom) Y Y Y Y 96 Selection Vocabulary Suffixes Synonyms Y Y Y Y Y Time and Order Words (Creating Sequence) Y Y Utility Words (Body Parts, Colors, Common Classroom Objects, Days of the Week, Time of Day, Weather Words Word Families Writing/Composition Approaches Collaborative Writing Group Writing Process Brainstorming Drafting/Proofreading Publishing Revising Forms Biography/Autobiography Describe a Process Descriptive Writing Expository Folklore (Folktales, Fairytales, Talltales, Legends, Myths Information Text Journal Writing Letter Writing Narrative Personal Narrative Play/Dramatization Poetry Writers Craft Characterization Descriptive Writing Dialogue Effective Beginnings Effective Endings Event Sequence Figurative Language Identifying Thoughts and Feelings Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Learn about numbers; number words and sets Build Vocabulary Identify colors; color words; Learn to identify parts of the Identify colors; color words body Folk Tales; Traditional Stories Informational Text Y Y Y Y Y Y Y Y Y Build vocabulary; number words and sets; shapes; sizes Poetry (songs, rhymed verse, free verse) 97 Mood and Tone Plot (Problem/Solutions) Rhyme Setting Suspense and Surprise Topic Sentence Using Comparisons Purposes Determining Purposes for Writing Integrated Language Arts Grammar Parts of Speech Adjectives Adverbs Conjunctions Nouns Prepositions Pronouns Verbs Sentences Parts (Subjects/Predicates) Y Y Y Y Y Verb ( Action, Helping, Linking, Regular/Irregular Usage Adjectives Adverbs Nouns Pronouns Verbs Mechanics Capitalization (Sentence, Proper Nouns, Titles, Direct Address, Pronoun I) Punctuation (End punctuation, comma use, quotation marks, apostrophe, colon, semicolon, hypen, parentheses) Spelling Contractions Inflectional Endings Recognize Rhyme Y Y Y Y Y Y Y Y Y Y Y Identify adjectives 
Identify nouns Identify verbs Y Structure (Simple, Compound, Complex) Types (Declarative, Interrogative, Exclamatory, Imperative) Verb Tenses Y Y Y Y Y Y Y Y Y Y Y Y Use present and past tense Y Use irregular verbs Y Y Y Y Y Y Y Y Use capitalization Y Y Y Y Possessives Use Punctuation Y Y Contractions Y 98 Irregular Plurals Long Vowel Patterns Multisyllabic Words Phonograms r-controlled Vowel Spellings Short Vowel Spelling Sound/Letter Relationships Y Y Special Spelling Patterns (-ough, augh, -all, -al, -alk, -ion, -sion, tion) Listening/Speaking/Viewing Listening/Speaking Y Y Y Y Y Y Y Y Analyze and evaluate intent and content of Speaker's Message Answer Questions Y Y Y Compare Language and Oral Traditions Determine Purposes for Listening Follow Directions Y Y Y Y Y Y Learn about Different Cultures through Discussion Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Listen for Poetic Language (Rhythm/Rhyme) Participate in Group Discussions Respond to Speaker Speaking Compare Language and Oral Traditions Describe Ideas and Feelings Give Directions Learn about Different Cultures through Discussion Participate in Group Discussions Present Oral Reports Read Fluently with Expression, Phrasing, and Intonation Read Orally Share Information Summarize/Retell Stories Use Appropriate Vocabulary for Audience Viewing Y Y Y Y Y Y Y Y Appreciate/Interpret Artists' Techniques Compare Visual and Written Material on the Same Subject Y Y Y 99 Gather Information from Visual Images View Critically View Culturally Rich Materials Inquiry & Research/Study Skills Charts, Graphs, and Diagrams/Visual Aids Y Y Y Y Y Y Y Y Make signs, lists, and cards Skills Infused Throughout Skills Infused Throughout Curriculum Curriculum Nursery Rhymes (infused throughout; Songs and Games) Build background knowledge (Infused through Reading and Responding) Learn letters in students' own name (Names taught in Realistic Fiction Lesson 1, Book A) Learn letters in students' own names Plurals Spell decodable words Skills 
not explicitly taught in OC: Identify and create patterns; Learn to use a mouse (point, click, drag); Understand one-to-one relationships; Practice simple word processing skills; Recognize and identify everyday sounds; Learn new words by analogy; Practice visual memory (recall objects from a scene); Learn to use a simple paint program.

APPENDIX B

September 18, 2002 Report

FINDINGS

Of the 2000 students sampled for the study, 1713 students took the Woodcock Reading Mastery Tests-Revised in both Fall and Spring. The remaining 287 students were not tested in the Spring because they had either left the district or moved to a different school outside of our sample.[39] Forty-three of the students tested were in classrooms that started out as controls but received Waterford computers in the middle of the school year. These students in the "late treatment" group were not included in any of the analyses in this report. Table 4 shows the distribution of all the students tested by grade and condition. The total number of students for whom we have test scores is 1670 (the 1713 students tested minus the 43 late-treatment students).

Table 4
Students with Pre and Post Test Scores on the WRMT-R by Grade and Condition

Condition         Kindergarten   First Grade   Total
Treatment              421            446        867
Control                189            174        363
Non-matched            219            221        440
Late treatment          18             25         43
Total                  847            866       1713

Confirmation of sampling matches

In order to ensure that treatment and control students were comparable in terms of ethnicity, Title 1 participation, and English language proficiency, we checked our initial matches using end-of-year SIS data. English language proficiency was determined by students' Master Plan classification: Limited English Proficient (LEP), Initially Fluent English Proficient (IFEP),

[39] In order to make up for the large attrition in our sample that took place during the 2001-2002 academic school year, we have oversampled for the 2002-2003 study.
We tested 10 students in each treatment and control classroom, not including the kindergartners in our study who are now in first grade and who will also be tested in the Spring.

Redesignated Fluent English Proficient (RFEP), and English Only (EO).[40] For ELL students, we also looked at ELD levels.[41] We conducted chi-square tests to investigate the extent to which treatment and control students were similar in terms of the demographic variables listed above. When examined as a group, no significant differences were found between the treatment and control students in kindergarten or first grade.

Table 5
Matches for Treatment and Control Students

Variable                      Kindergarten     First Grade
Ethnicity                     No differences   No differences
Title 1 designation           No differences   No differences
Master Plan classification    No differences   No differences
ELD levels                    No differences   No differences

We also examined the extent to which treatment and control students matched within each track. In the kindergarten sample, there were no differences on any of the variables on Tracks A, B, C, and D. However, we did find significant differences among the kindergarten students in the LEARN calendar in terms of ethnicity (χ² (4, N=127) = 11.024, p = 0.026). In the treatment group, 50% of the students were Hispanic, 30% African-American, 18% White, and 2% Other. In the control group, however, 76% of the students were Hispanic, 20% African-American, and 4% Other. We also found differences in terms of Title 1 designation (χ² (2, N=129) = 11.17, p = 0.004): in the treatment group, 80% of the students were Title 1, compared to 62% of the students in the control group. We also found differences in the LEARN kindergarten group in terms of Master Plan classification (χ² (3, N=129) = 20.183, p < .001).
In the treatment group, 53% of the students were EO and 36% were ELL. In the control group, only 24% of the students were EO and 53% were ELL. No differences were found in the LEARN classrooms between treatment and control students in terms of ELD levels. Table 6 summarizes the kindergarten matches by track.

Table 6
Kindergarten Matches by Track

Ethnicity: No differences on Tracks A, B, C, and D; in LEARN, proportionately more Hispanics in Control and more African-Americans and Whites in Treatment.
Title 1: No differences on Tracks A, B, C, and D; in LEARN, proportionately more Title 1 students in Treatment.
ELD Level: No differences on any track or in LEARN.
Master Plan Classification: No differences on Tracks A, B, C, and D; in LEARN, proportionately more LEPs in Control and more EOs in Treatment.

When examining first grade matches by track, we found differences in Track A in Master Plan classification (χ² (2, N=126) = 7.099, p = .029).

[40] According to the Master Plan, LEP, IFEP, RFEP, and EO are defined as follows: LEP students are those identified as not having sufficient English academic language proficiency to successfully participate in a mainstream English program. IFEP students are those initially identified through the formal initial assessment process as having sufficient English academic proficiency to successfully participate in a mainstream English program. RFEP students are those who acquired English in school and subsequently passed assessments to redesignate as Fluent English Proficient. EO students are identified on the basis of parent responses to the Home Language Survey at the time of enrollment; English Only students speak various language forms, including mainstream and nonmainstream forms.

[41] ELD levels are English Language Development levels. Students range from level 1 through level 5 before being redesignated as RFEP.
In the treatment group, 70% of the students were LEPs, whereas in the control group, 92% of the students were LEPs. We also found differences in Track B in Title 1 designation (χ2 (1, N = 127) = 4.635, p = .031): all of the control students were Title 1, compared to 89% of the treatment students. We also found differences in LEARN in terms of Title 1 designation (χ2 (2, N = 126) = 11.95, p = .003): 97% of treatment students were Title 1, whereas 83% of the students in the control group were designated Title 1. Table 7 summarizes the first grade matches.

Table 7
First Grade Matches by Track
                             Track A                                Track B                                  Track C          Track D          LEARN
Ethnicity                    No differences                         No differences                           No differences   No differences   No differences
Title 1                      No differences                         Proportionately more Title 1 in Control  No differences   No differences   Proportionately more Title 1 in Treatment
ELD Level                    No differences                         No differences                           No differences   No differences   No differences
Master Plan Classification   Proportionately more LEPs in Control   No differences                           No differences   No differences   No differences

In addition to examining the comparability of the treatment and control students in terms of demographic variables, we also examined the comparability of the two groups in terms of reading ability at the beginning of the school year, as measured by the tests in the Woodcock Reading Mastery Test—Revised (WRMT-R). Table 8 provides the mean scores for kindergarten students in each group, and Figure 2 presents the average fall scores of treatment and control students side by side. While we found a statistically significant difference between treatment and control students on the Analogies test, the difference was of no practical significance, since the average score for both groups was less than 1.
Table 8
Kindergarten WRMT-R Fall Scores
Test                       Condition   N     Mean    SD      T value
Visual Auditory Learning   Treatment   411   52.28   26.12   .079
                           Control     185   52.10   25.99
Letter Identification      Treatment   408   12.46   11.68   1.403
                           Control     188   11.02   11.43
Word Identification        Treatment   401   0.80    3.17    2.494
                           Control     188   0.36    1.12
Word Attack                Treatment   403   0.17    1.25    2.274
                           Control     188   0.02    0.29
Antonyms                   Treatment   400   0.06    0.35    1.764
                           Control     186   0.02    0.18
Synonyms                   Treatment   394   0.02    0.14    2.131
                           Control     186   0.00    0.00
Analogies                  Treatment   397   0.14    0.88    3.105*
                           Control     186   0.01    0.07
Passage Comprehension      Treatment   394   1.05    1.97    1.331
                           Control     183   0.87    1.19
* p < .01

Figure 2. Kindergarten Fall Scores by Condition [bar chart of average fall WRMT-R scores by test for treatment and control students]

We conducted the same analyses to compare first grade treatment and control students in terms of reading ability at the beginning of the school year. We found no differences between treatment and control students' performance in the fall. See Table 9 and Figure 3.

Table 9
First Grade WRMT-R Fall Scores
Test                       Condition   N     Mean    SD      T value
Visual Auditory Learning   Treatment   415   77.81   24.70   -1.972
                           Control     162   82.28   23.78
Letter Identification      Treatment   430   30.30   7.48    -2.204
                           Control     174   31.72   6.27
Word Identification        Treatment   427   12.27   12.82   .829
                           Control     170   11.35   10.99
Word Attack                Treatment   425   5.44    6.39    .509
                           Control     168   5.15    5.71
Antonyms                   Treatment   424   0.93    1.44    -1.265
                           Control     170   1.11    1.97
Synonyms                   Treatment   422   0.41    0.84    .385
                           Control     172   0.38    0.78
Analogies                  Treatment   421   0.97    2.79    -3.75
                           Control     170   1.06    3.08
Passage Comprehension      Treatment   414   4.59    5.20    1.709
                           Control     170   3.79    4.81
* p < .01

Figure 3.
First Grade Fall Scores by Condition [bar chart of average fall WRMT-R scores by test for treatment and control students]

In kindergarten and first grade, our treatment and control students were comparable both in terms of demographics and reading ability. The only exception was the kindergarten LEARN track, which had differences in terms of ethnicity, Title 1 designation, and Master Plan Classification.

Kindergarten Gains

Having determined that the two groups were comparable, we then computed gain scores for each of the subtests of the WRMT-R by subtracting the pre-test score from the post-test score in order to determine the impact of Waterford on student achievement. Figure 4 shows the average gains of kindergarten students in the treatment and control groups.

Figure 4. Kindergarten Average Gains by Condition [bar chart of average gain scores by WRMT-R subtest for treatment and control students]

We conducted independent sample t-tests to investigate differences between the gains of treatment and control students. We discovered a statistically significant difference on the Antonyms test. In practical terms, however, the difference is not significant: treatment students gained only .22 points (less than 1 point) more than control students. Table 10 provides the mean gains for kindergarten students.
Table 10
Kindergarten Gain Scores: Descriptive Statistics and T-values
Test                       Condition   N     Mean Gain   SD      T value
Visual Auditory Learning   Treatment   410   31.19       20.59   .690
                           Control     184   29.88       23.28
Letter Identification      Treatment   407   17.97       10.24   -1.877
                           Control     187   19.67       10.39
Word Identification        Treatment   400   8.81        10.01   .896
                           Control     186   8.05        8.47
Word Attack                Treatment   328   4.16        5.77    1.985
                           Control     142   3.15        4.74
Antonyms                   Treatment   399   0.78        1.15    2.826*
                           Control     185   0.56        0.73
Synonyms                   Treatment   393   0.30        0.70    .868
                           Control     185   0.24        0.85
Analogies                  Treatment   396   0.86        3.02    2.09
                           Control     185   0.44        1.77
Passage Comprehension      Treatment   392   3.15        4.59    .074
                           Control     182   3.12        4.25
* p < .01

Kindergarten Gains by Track

In order to investigate the impact of Waterford in kindergarten in more detail, independent sample t-tests were also conducted within each track. Table 11 lists the tests on which treatment and control students' gains differed. In Track B, treatment students' gains were larger on the Visual Auditory Learning test. In Track C, however, control students had larger gains than treatment students on the Letter Identification and Passage Comprehension tests. In LEARN we found the greatest number of differences between the groups. This was not surprising, given that students in this track also differed significantly on most demographic variables. In LEARN, treatment students had larger gains than control students on the Word Identification, Word Attack, Antonyms, and Synonyms subtests. No differences were found between the groups in Tracks A and D.
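The gain-score machinery used throughout this section — post-test minus pre-test per student, compared across groups with an independent sample t-test — can be sketched as follows. The scores are illustrative, and the Cohen's d effect size is our illustrative addition for judging practical (as opposed to merely statistical) significance, not a statistic reported by the study.

```python
import math

def gain_scores(pre, post):
    """Gain for each student: post-test minus pre-test."""
    return [b - a for a, b in zip(pre, post)]

def pooled_t(x, y):
    """Independent-samples t statistic with pooled variance."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    pooled_var = ((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2)
    return (mx - my) / math.sqrt(pooled_var * (1 / nx + 1 / ny))

def cohens_d(x, y):
    """Standardized mean difference; in large samples a tiny effect
    can still reach statistical significance, so d helps separate
    practical from statistical significance."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    pooled_sd = math.sqrt(((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2))
    return (mx - my) / pooled_sd

# Illustrative pre/post scores for two small groups (not the study's data):
treatment = gain_scores(pre=[10, 12, 11, 9], post=[15, 19, 17, 17])
control = gain_scores(pre=[11, 10, 12, 9], post=[15, 15, 18, 14])
print(treatment)                               # → [5, 7, 6, 8]
print(round(pooled_t(treatment, control), 3))  # → 1.964
print(round(cohens_d(treatment, control), 3))  # → 1.389
```

The same comparison is repeated for each WRMT-R subtest and, in the by-track analyses, within each calendar track.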
Table 11
Kindergarten: Differences between Groups by Track
Track     Subtest                    Treatment Mean Gain   Control Mean Gain   T value
Track B   Visual Auditory Learning   32.838                21.640              2.655*
Track C   Letter Identification      18.671                24.707              -3.397*
          Passage Comprehension      1.329                 3.777               -2.890*
LEARN     Word Identification        11.645                6.455               2.861*
          Word Attack                6.298                 2.364               3.391*
          Antonyms                   1.117                 0.455               3.325*
          Synonyms                   0.468                 0.091               3.563*
* p < .01

Overall, the comparison of kindergarten treatment and control students provides no evidence that exposure to the Waterford program results in improved reading ability. A closer examination of gain scores by track provides conflicting evidence: in Tracks B and LEARN, treatment students had larger gains than control students on some of the tests; in Track C, the reverse was true. It is interesting to note that LEARN, the track in which treatment students had larger gains than control students on the greatest number of tests, is also the track where we found a mismatch in demographics: the LEARN treatment group had a greater number of English Only students and a greater number of Whites and African Americans. We explore this issue in greater detail later in the report.

First Grade Gains

The first grade treatment and control students were comparable in terms of demographics and reading ability at the beginning of the school year. Using independent sample t-tests, we then investigated differences in gains between first grade treatment and control students on the WRMT-R tests. As shown in Table 12 and illustrated in Figure 5, treatment students had larger gains than control students on the Visual Auditory Learning test. However, there were no differences between the groups on any of the other tests.
Table 12
First Grade Gain Scores: Descriptive Statistics and T-values
Test                       Condition   N     Mean Gain   SD      T value
Visual Auditory Learning   Treatment   415   22.16       20.19   2.903*
                           Control     162   16.73       20.19
Letter Identification      Treatment   430   6.30        6.03    1.671
                           Control     174   5.43        5.07
Word Identification        Treatment   427   27.60       12.78   -1.281
                           Control     170   29.04       11.51
Word Attack                Treatment   425   12.95       9.23    .023
                           Control     168   12.93       9.42
Antonyms                   Treatment   424   2.92        2.81    .489
                           Control     170   2.80        2.62
Synonyms                   Treatment   422   0.88        1.25    -.309
                           Control     172   0.91        1.38
Analogies                  Treatment   421   7.61        6.00    1.054
                           Control     170   7.05        5.46
Passage Comprehension      Treatment   413   11.46       7.33    -.997
                           Control     169   12.11       6.95
* p < .01

Figure 5. First Grade Average Gains by Condition [bar chart of average gain scores by WRMT-R subtest for treatment and control students]

In order to further examine the impact of Waterford in first grade, we compared gains by track. The only difference found between treatment and control students was in Track A, where treatment students had larger gains than control students on the Word Identification test (Treatment Mean Gain = 30.909, Control Mean Gain = 24.269, t = 2.834*).

Overall in first grade, the only difference between treatment and control students was that students exposed to the Waterford program had larger gains on the Visual Auditory Learning test, which is a test of Reading Readiness. Exposure to Waterford did not have an impact on students' word identification, decoding, or comprehension.

We then examined kindergarten and first grade gains in relation to other variables such as English language proficiency, ethnicity, Title 1 participation, and primary reading program. Before calculating gains for each subgroup, we looked for differences in pre-test scores.
No differences were found in kindergarten for any subgroup. The only difference in first grade was on the Visual Auditory Learning test, on which LEP control students outperformed LEP treatment students (Treatment Mean = 75.10, Control Mean = 82.29, t = -2.727*), and ELD 1-2 control students outperformed ELD 1-2 treatment students (Treatment Mean = 71.37, Control Mean = 81.24, t = -3.108*).

Gains and English Proficiency: Kindergarten

In order to investigate whether exposure to the Waterford program benefits students with different levels of English proficiency differently, we analyzed gains according to Master Plan classification and ELD levels. Table 13 shows the number of treatment and control students within each of the Master Plan categories in our kindergarten sample.

Table 13
Kindergarten Students by Condition and Master Plan Classification
Master Plan Classification   Treatment   Control   Total
EO                           92          35        127
IFEP                         37          14        51
LEP                          278         131       409
Missing                      14          9         23
Total                        421         189       610

We conducted independent sample t-tests to determine whether there were any interactions between condition (treatment or control) and English language proficiency. Figure 6 illustrates the relationship between condition and English language proficiency, focusing on EO and LEP students.

Figure 6. Comparison of EO and LEP Kindergarten Students on all WRMT-R Tests [bar chart of average gain scores for EO and LEP students in each condition]

We found no differences between the LEP students in the treatment group and LEP students in the control group. Similarly, we found no differences between the IFEP students in the treatment group and those in the control group.
However, we did find statistically significant differences on three of the tests between EO students in the treatment group and EO students in the control group. EOs in the treatment group had larger gains than EOs in the control group on Word Identification, Antonyms, and Analogies, as shown in Table 14.

Table 14
Differences between EO Students in the Treatment Group and EO Students in the Control Group
Test                  Condition   N    Mean    SD      T value
Word Identification   Treatment   89   12.64   12.23   2.779*
                      Control     35   7.37    8.19
Antonyms              Treatment   87   1.25    1.67    2.643*
                      Control     35   0.63    0.91
Analogies             Treatment   87   2.14    5.00    2.918*
                      Control     35   0.40    1.54
* p < .01

In order to further examine the relationship between condition and English proficiency, we compared the gains of EO and LEP students within each condition. Within the control group, we found no differences in the gains of EO and LEP students. Within the treatment group, however, we found that LEP students had larger gains than EO students on the reading readiness tests (Visual Auditory Learning and Letter Identification), and that EO students had larger gains than LEP students on all other tests (Basic Skills and Comprehension) except Synonyms. Table 15 shows the differences between EO and LEP students within the treatment group.

Table 15
Differences between EO and LEP Students in the Treatment Group
Test                       Master Plan   N     Mean    SD      T value
Visual Auditory Learning   EO            90    25.92   16.91   -3.739*
                           LEP           269   34.16   21.26
Letter Identification      EO            91    14.70   9.14    -3.521*
                           LEP           266   19.04   10.45
Word Identification        EO            89    12.64   12.23   3.909*
                           LEP           261   7.13    8.93
Word Attack                EO            65    6.42    7.56    3.192*
                           LEP           220   3.24    4.90
Antonyms                   EO            87    1.25    1.67    3.379*
                           LEP           262   0.62    0.91
Analogies                  EO            87    2.14    5.00    3.045*
                           LEP           259   0.46    2.00
Passage Comprehension      EO            85    5.07    6.08    3.994*
                           LEP           257   2.28    3.68
* p < .01

Overall, in kindergarten, Waterford benefited EO students in developing basic skills and reading comprehension. The program did not have an effect for LEP students.
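The subgroup breakdowns above amount to partitioning student records into (condition, Master Plan classification) cells and then comparing mean gains cell by cell. A sketch of that bookkeeping with hypothetical records (in practice each record would carry one gain score per WRMT-R subtest):

```python
from collections import defaultdict

# Hypothetical student records, not the study's data
records = [
    {"condition": "treatment", "master_plan": "EO",  "gain": 12},
    {"condition": "treatment", "master_plan": "EO",  "gain": 13},
    {"condition": "treatment", "master_plan": "LEP", "gain": 7},
    {"condition": "control",   "master_plan": "EO",  "gain": 7},
    {"condition": "control",   "master_plan": "LEP", "gain": 8},
]

# Partition gains into (condition, classification) cells
cells = defaultdict(list)
for r in records:
    cells[(r["condition"], r["master_plan"])].append(r["gain"])

mean_gain = {cell: sum(g) / len(g) for cell, g in cells.items()}
print(mean_gain[("treatment", "EO")])   # → 12.5
```

Cross-condition comparisons within a classification (e.g., EO treatment versus EO control) would then use the independent sample t-test described in the text.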
Gains and English Proficiency: First Grade

The same analyses were conducted to investigate the relationship between condition and language proficiency in first grade. Table 16 shows the distribution of first grade students by condition and Master Plan classification.

Table 16
First Grade Students by Condition and Master Plan Classification
Master Plan Classification   Treatment   Control   Total
EO                           114         28        142
IFEP                         46          19        65
LEP                          278         126       404
RFEP                         1           0         1
Missing                      7           1         8
Total                        446         174       620

We found no differences between the IFEP students in the treatment group and IFEP students in the control group. Similarly, we found no differences between the EO students in the treatment group and those in the control group. However, we did find a statistically significant difference on the Visual Auditory Learning test between LEP students in the treatment group and LEP students in the control group (Treatment Mean = 22.56, Control Mean = 15.85, t = 3.112*). Since LEP students comprise the majority of the sample, it makes sense that the difference found when comparing LEP treatment and control students is the same difference found when comparing all first grade treatment and control students.

We also examined language proficiency within each condition and discovered that, within treatment, EO students had larger gains than LEPs on Word Identification, Antonyms, Analogies, and Passage Comprehension, as shown in Table 17.

Table 17
Differences between EO and LEP First Grade Students within Treatment
Test                    Master Plan   N     Mean    SD      T value
Word Identification     EO            108   30.44   11.75   3.212*
                        LEP           269   25.89   12.70
Antonyms                EO            109   3.69    2.64    3.993*
                        LEP           266   2.44    2.78
Analogies               EO            108   9.77    6.49    4.887*
                        LEP           264   6.32    5.31
Passage Comprehension   EO            107   13.53   7.28    3.934*
                        LEP           258   10.33   7.00
* p < .01

Within the control group, there were no significant differences between EO students and LEP students. The relationship between condition and English language proficiency in first grade is illustrated in Figure 7.

Figure 7.
First Grade Gains of EO and LEP Students by Condition [bar chart of average gain scores for EO and LEP students in each condition]

Overall, EO students had larger gains on the Basic Skills and Comprehension tests than LEP students regardless of condition. With the exception of Visual Auditory Learning, Waterford did not make a difference for first grade students regardless of language proficiency.

Gains and ELD Level: Kindergarten

To summarize, in kindergarten we found no differences between LEP students in the treatment group and LEP students in the control group. In first grade, LEPs in the treatment group had higher gains on the Visual Auditory Learning test than LEPs in the control group, but showed no differences on any other tests. We then took a closer look at LEP students in terms of ELD level for a more comprehensive understanding of the relationship between use of Waterford and English language proficiency. Table 18 shows the distribution of kindergarten LEP students according to ELD level.

Table 18
Distribution of Kindergarten LEP Students by ELD Level
          Treatment   Control   Total
ELD 1-2   252         127       379
ELD 3-4   27          6         33
Total     279         133       412

We conducted independent sample t-tests to examine the relationship between ELD level and condition. This relationship is illustrated in Figure 8.

Figure 8.
Kindergarten Gains by ELD Level and Condition on all WRMT-R Tests [bar chart of average gain scores for ELD 1-2 and ELD 3-4 students in each condition]

In kindergarten, we found no statistically significant differences between ELD 1-2 treatment students and ELD 1-2 control students. Similarly, we found no statistically significant differences between ELD 3-4 treatment students and ELD 3-4 control students.

We also analyzed the gains according to ELD level within the treatment group. Not surprisingly, we found differences between ELD 1-2 and ELD 3-4 students: students in ELD 3-4 had larger gains than ELD 1-2 students on Word Identification (ELD 1-2 Mean Gain = 6.17, ELD 3-4 Mean Gain = 14.96, t = -3.617*) and Word Attack (ELD 1-2 Mean Gain = 2.73, ELD 3-4 Mean Gain = 7.60, t = -2.887*). No comparisons were made between ELD 1-2 and ELD 3-4 within the control group because there were only 6 ELD 3-4 students in that group.

Gains and ELD Level: First Grade

The same analyses were conducted to examine the relationship between condition and ELD level in first grade. Table 19 shows the distribution of first grade students by condition and ELD level.

Table 19
Distribution of First Grade LEP Students by ELD Level
          Treatment   Control   Total
ELD 1-2   212         93        305
ELD 3-4   66          32        98
Total     278         125       403

The only statistically significant difference between ELD 1-2 students in the treatment group and ELD 1-2 students in the control group was on the Visual Auditory Learning test (Treatment Mean = 23.64, Control Mean = 15.42, t = 3.188*). This is consistent with our finding that first grade treatment students in general had larger gains than first grade control students.
No differences were found between ELD 3-4 students in the treatment group and ELD 3-4 students in the control group.

Within treatment, there were differences between ELD 1-2 and ELD 3-4 students on Word Identification, Word Attack, Synonyms, Analogies, and Passage Comprehension (see Table 20). Not surprisingly, ELD 3-4 students had larger gains on these tests of Basic Skills and Comprehension.

Table 20
Differences between ELD 1-2 and ELD 3-4 First Grade Students within Treatment
Test                    ELD Level   N     Mean    SD      T value
Word Identification     ELD 1-2     205   24.70   12.64   -2.824*
                        ELD 3-4     64    29.77   12.18
Word Attack             ELD 1-2     203   10.87   8.85    3.698*
                        ELD 3-4     63    15.56   8.60
Synonyms                ELD 1-2     201   0.66    1.03    3.004*
                        ELD 3-4     64    1.13    1.24
Analogies               ELD 1-2     201   5.19    4.77    6.651*
                        ELD 3-4     63    9.92    5.37
Passage Comprehension   ELD 1-2     196   9.27    6.90    -4.481*
                        ELD 3-4     62    13.68   6.29
* p < .01

Within control, ELD 3-4 students had larger gains than ELD 1-2 students on Word Attack (ELD 1-2 Mean Gain = 10.50, ELD 3-4 Mean Gain = 15.91, t = -3.068*) and Antonyms (ELD 1-2 Mean Gain = 2.27, ELD 3-4 Mean Gain = 3.84, t = -3.529*). Figure 9 illustrates the relationship between ELD levels and condition in first grade.

Figure 9. First Grade Gains by ELD Level and Condition on all WRMT-R Tests [bar chart of average gains for ELD 1-2 and ELD 3-4 students in each condition]

In summary, as would be expected, ELD 3-4 students had larger gains than ELD 1-2 students regardless of condition. With the exception of Visual Auditory Learning, exposure to Waterford did not make a difference for LEP students regardless of ELD level.

Gains and Ethnicity: Kindergarten

Analyses were also conducted to examine the impact of Waterford on different ethnic groups. Table 21 shows the distribution of students by ethnicity and condition.
Table 21
Distribution of Kindergarten Students by Ethnicity and Condition
Ethnicity                 Treatment   Control   Total
American Indian/Alaskan   1           1         2
Asian                     2           1         3
African American          43          15        58
Hispanic                  348         169       517
White                     17          0         17
Filipino                  1           1         2
Total                     412         187       599

Independent sample t-tests were conducted to determine whether exposure to Waterford had an impact on each ethnic group. Due to the low number of cases, American Indian/Alaskan, Asian, White, and Filipino students were not included in the analysis. No statistically significant differences were found on any of the tests between African American students in the treatment group and African American students in the control group. Similarly, there were no differences between Hispanic students in the treatment group and Hispanic students in the control group.

We also examined the gains of African American and Hispanic students within each condition. Within treatment, we found that Hispanic students had larger gains than African Americans on Letter Identification (African American Mean = 14.09, Hispanic Mean = 18.98, t = -3.005*). The relationship between condition and ethnicity is illustrated in Figure 10.

Figure 10. Kindergarten Gains of African American and Hispanic Students by Condition [bar chart of average gain scores for African American and Hispanic students in each condition]

The finding that there were no differences between African American students in the treatment group and African American students in the control group was surprising, given that African Americans comprised approximately half of the EO population in kindergarten, and given that EO students in the treatment group had larger gains than EO students in the control group on Word Identification, Antonyms, and Analogies.
So we looked more closely at the group of kindergarten EO students in terms of ethnicity. Table 22 shows the distribution of kindergarten EO students by ethnicity.

Table 22
Distribution of English Only Kindergarten Students by Ethnicity
Ethnicity                 Treatment   Control   Total
American Indian/Alaskan   1           1         2
Asian                     1           0         1
African American          41          14        55
Hispanic                  36          20        56
White                     13          0         13
Total                     92          35        127

We then looked at the three tests where we had found differences between the EO students in the treatment and control groups: Word Identification, Antonyms, and Analogies. Figure 11 illustrates the average gains of EO students by ethnicity and condition on those tests.

Figure 11. Differences between Kindergarten EO Students by Ethnicity and Condition [bar chart of average gain scores on Word Identification, Antonyms, and Analogies for White, African American, and Hispanic treatment students and African American and Hispanic control students]

When we compared the EO African American students in the treatment group with those in the control group, we found no differences. Similarly, there were no differences between the EO Hispanic students in the treatment group and those in the control group. Thus, the overall difference between EO treatment students and EO control students was driven by the gains of the White students in the treatment group. Unfortunately, since there were no White students in the control group, we were not able to determine whether exposure to Waterford contributed to the White students' gains or whether their gains were due to other factors. Interestingly, those 13 White students were in the LEARN track, which would also explain why we found larger gains in that particular track.

Gains and Ethnicity: First Grade

The same analyses were conducted to examine the relationship between condition and ethnicity in first grade.
Table 23 shows the distribution of first grade students by ethnicity and condition.

Table 23
Distribution of First Grade Students by Ethnicity and Condition
Ethnicity          Treatment   Control   Total
Asian              3           2         5
African American   52          14        66
Hispanic           380         155       535
White              1           1         2
Filipino           4           1         5
Total              440         173       613

Independent sample t-tests showed no differences between African American students in the treatment group and African American students in the control group. Hispanic students in the treatment group had larger gains than Hispanic students in the control group on the Visual Auditory Learning test (Treatment Mean = 21.86, Control Mean = 15.92, t = 3.016*). Again, since Hispanic students comprise the majority of the sample, this is consistent with the finding that overall first grade treatment students had larger gains than control students on that particular test. Within the treatment group, there were no differences between the African American and Hispanic students on any of the tests. Figure 12 illustrates the relationship between ethnicity and condition.

Figure 12. First Grade Gains of African American and Hispanic Students by Condition [bar chart of average gain scores for African American and Hispanic students in each condition]

Overall, exposure to the Waterford program did not make a difference for any particular ethnic group. The only exception was that in first grade, Hispanic students exposed to Waterford had larger gains on the Visual Auditory Learning test than Hispanic students who were not. Within the kindergarten treatment group, the only difference found was that Hispanic students had larger gains on Letter Identification than African American students.
Gains and Title 1 Participation

We would have liked to examine the relationship between Title 1 participation and condition; however, due to the low number of students in our sample who were not Title 1 recipients, it was not possible to compare the impact of Waterford on students who are not Title 1 participants. Table 24 shows the distribution of kindergarten and first grade students by condition and Title 1 designation.

Table 24
Distribution of Kindergarten and First Grade Students by Condition and Title 1 Designation

Kindergarten
Title 1   Treatment   Control   Total
No        27          14        41
Yes       370         169       539
Missing   24          6         30
Total     421         189       610

First Grade
Title 1   Treatment   Control   Total
No        35          10        45
Yes       410         162       572
Missing   1           2         3
Total     446         174       620

Even though we could not compare students who were not Title 1 participants, we did compare Title 1 treatment students and Title 1 control students and found no significant differences on any of the tests in kindergarten. In first grade, we found a difference between treatment and control on the Visual Auditory Learning test (Treatment Mean = 22.27, Control Mean = 16.53, t = 2.918*), which is consistent with our previous first grade finding.

Gains and Reading Program: Kindergarten

We also investigated whether Waterford benefited students differently based on the primary reading program already being used in the classroom (e.g., Open Court, Success for All, other). In kindergarten, 89% of treatment students were in Open Court classrooms, and the remaining 11% were in Success for All classrooms. On the other hand, 95% of control students were in Open Court classrooms; the remaining control students used a program other than Open Court or Success for All, as summarized in Table 25.
Table 25
Distribution of Kindergarten Students by Condition and Reading Program
Reading Program   Treatment   Control   Total
Open Court        374         179       553
Success For All   47          0         47
Other             0           10        10
Total             421         189       610

When comparing kindergarten students in Open Court classrooms in the treatment group with students in Open Court classrooms in the control group, we found that Open Court treatment students had larger gains on Antonyms (Treatment Mean = .802, Control Mean = .572, t = 2.769*). The difference, however, is not practically significant. No comparison could be made between treatment and control for students in Success for All classrooms, since there were no students who used Success for All in the control group. We also compared students in Open Court and Success for All within the treatment group and found no significant differences between the groups. Figure 13 illustrates the relationship between condition and reading program in kindergarten.

Figure 13. Kindergarten Gains by Condition and Reading Program [bar chart of average gain scores for Open Court and Success for All treatment students and Open Court control students]

Gains and Reading Program: First Grade

We also examined the relationship between condition and reading program in first grade. All of the first grade control students and 79.5% of the treatment students were in Open Court classrooms. The remaining treatment students were either in Success for All classrooms (16%) or another program (4.5%). See Table 26.
Table 26
Distribution of First Grade Students by Condition and Reading Program
Reading Program   Treatment   Control   Total
Open Court        354         174       528
Success For All   72          0         72
Other             20          0         20
Total             446         174       620

When comparing treatment and control students who used Open Court, we found a difference only in Visual Auditory Learning (which, again, is consistent with previous first grade findings). When we looked within the treatment group and compared students in Open Court classrooms against students in Success for All classrooms, we found that students who were exposed to Waterford and Open Court had larger gains than those exposed to Waterford and Success for All on Word Identification (OC Mean = 28.12, SFA Mean = 24.55, t = 2.08*) and Word Attack (OC Mean = 13.41, SFA Mean = 10.57, t = 2.524*). Figure 14 illustrates the relationship between condition and reading program in first grade.

Figure 14. First Grade Gains by Condition and Reading Program [bar chart of average gain scores for Open Court and Success for All treatment students and Open Court control students]

Overall, exposure to Waterford did not have an impact on gains regardless of reading program. The only exception was in first grade, where treatment students in Open Court classrooms had larger gains than control students on the Visual Auditory Learning test.

Comparison of Matched Treatment and Non-matched Treatment

At the outset of the study, non-matched treatment classrooms were selected in order to reestablish the representativeness of the entire treatment group in the District.
Since the classrooms in the matched treatment group were selected to match eligible control classrooms within schools that did not qualify for Waterford, we expected that our matched treatment classrooms would be slightly different from the overall Waterford population. In order to determine to what extent our matched treatment group differed from the overall Waterford population in the district, we conducted chi-square tests to compare the two groups in terms of English language proficiency, Title 1 participation, and ethnicity. Even though some differences were found, none of them were statistically significant. Overall, our non-matched and matched treatment groups did not differ in terms of demographic variables, as summarized in Table 27.

Table 27
Matches for Matched and Non-matched Treatment Students

Variable                     Kindergarten     First Grade
Ethnicity                    No differences   No differences
Title 1 designation          No differences   No differences
Master Plan Classification   No differences   No differences
ELD Levels                   No differences   No differences

In addition to examining the comparability of the matched and non-matched treatment students in terms of demographic variables, we also examined the comparability of the two groups in terms of reading ability at the beginning of the school year, as measured by the tests in the WRMT-R. Figure 15 illustrates the average scores of matched and non-matched treatment students at the beginning of the school year.

Figure 15.
Average Fall Scores of Kindergarten Matched and Non-matched Treatment Students

[Bar chart: average fall scores on the WRMT-R tests for non-matched and matched kindergarten treatment students.]

In kindergarten, we found statistically significant differences between matched and non-matched treatment students on the most relevant test at the beginning of kindergarten: Letter Identification (Matched Treatment Mean = 12.78, Non-matched Mean = 10.17, t = -2.322*). This means that, as predicted, non-matched treatment students started out lower in terms of reading ability than the matched treatment group. Having determined the starting point of both groups, we then computed gains. Figure 16 illustrates the average gains of each group.

[Figure 16. Average Gains of Kindergarten Matched and Non-matched Treatment Students: average gain scores on the WRMT-R tests for non-matched and matched treatment students.]

When we compared gains between these two groups, we found that the non-matched students had larger gains in Letter Identification (Matched Treatment Mean Gain = 17.968, Non-matched Mean Gain = 21.521, t = 3.995*). The comparability of first grade students in the matched and non-matched treatment groups was also tested. No differences were found in students' performance at the beginning of the school year on any of the tests, as illustrated by Figure 17.

Figure 17.
Average Fall Scores of First Grade Matched and Non-matched Treatment Students

[Bar chart: average fall scores on the WRMT-R tests for non-matched and matched first grade treatment students.]

When we compared gains between the two groups, we found that non-matched treatment students had larger gains on the Analogies test (Treatment Mean Gain = 7.61, Non-matched Mean Gain = 6.117, t = -3.057*). See Figure 18.

[Figure 18. Average Gains of First Grade Matched and Non-matched Treatment Students: average gain scores on the WRMT-R tests for non-matched and matched treatment students.]

Overall, the non-matched and matched treatment groups were comparable both in terms of demographics and students' reading ability. This suggests that our findings about the effectiveness of Waterford, based on our comparison of the treatment and control groups, can be generalized to the entire district. It also suggests that it may not be necessary to sample and test a non-matched group in subsequent years of the study.

CONCLUSION

The findings for kindergarten are:

1. Overall, the comparison of treatment and control students provides very little evidence that exposure to the Waterford program results in improved reading ability. While the treatment students had larger gains than the control students on the Antonyms test, these differences were of no practical significance. There were no differences between the groups on any of the other tests.

2. Differences in student gains across tracks provide conflicting evidence.
In Tracks B and LEARN, treatment students had larger gains than control students on some tests; on Track C, the reverse is true.

3. No student gain differences exist between the LEP students in the treatment group and LEP students in the control group. Similarly, no differences were found between IFEP students in the treatment group and the control group. Statistically significant differences exist on three of the tests between EO students in the treatment group and EO students in the control group. The differences were driven by the gains of White students in the treatment group.

4. No statistically significant differences were found between African American students in the treatment group and those in the control group. Similarly, there were no differences between Hispanic students in the treatment group and those in the control group. Within the treatment group, Hispanic students did have larger gains than African American students on Letter Identification.

5. No significant differences exist on any of the tests between the treatment and control groups based on Title 1 participation.

6. Students in Open Court classrooms in the treatment group had larger gains on the Antonyms test than those students in the control group. This difference is not of any practical significance. No differences existed within the treatment group between those students in Open Court classrooms and those in Success for All classrooms.

The findings for first grade are:

1. Overall, the comparison of treatment and control students provides very little evidence that exposure to the Waterford program results in improved reading ability. Treatment students had larger gains than control students in the Visual Auditory Learning Test, which is a test of Reading Readiness. There were no differences between the groups on any of the other tests.

2.
The only difference between the treatment and control students, when compared by track, was that treatment students on Track A had larger gains than control students in the Word Identification Test.

3. No differences were found between the IFEP students in the treatment group and the IFEP students in the control group. Similarly, no differences were found between the EO students in the treatment and control groups. However, statistically significant differences on the Visual Auditory Learning test between LEP students in the treatment and control groups were found. This is consistent with the finding that, as a whole, first grade treatment students had larger gains than the control students on this particular test. Overall, EO students had larger gains than LEP students in the Basic Skills and Comprehension tests, regardless of condition.

4. No differences exist between African American students in the treatment group and those in the control group. Hispanic students in the treatment group had larger gains than those in the control group in the Visual Auditory Learning test. Again, this is consistent with the overall finding that students in the treatment group had larger gains on this particular test. There were no differences between African American and Hispanic students on any of the tests.

5. No significant differences exist on any of the tests between the treatment and control groups based on Title 1 participation.

6. Treatment students in Open Court classrooms had larger gains on the Visual Auditory Learning test (again consistent with previous first grade findings). Within the treatment group, those students exposed to Waterford in Open Court classrooms did have larger gains than those exposed to Waterford in Success for All classrooms on the Word Identification and Word Attack tests.

The findings for the Non-Matched Treatment are:

1.
There were no significant differences in terms of ethnicity, Title 1 designation, Master Plan Classification, or ELD levels for either kindergarten or first grade between the non-matched treatment sample and the matched treatment sample.

2. Overall, the non-matched and matched treatment groups were comparable in terms of students' gains.

Possible explanations for not finding gains associated with the use of Waterford:

a. Waterford implementation is not consistent, which may obscure any benefits of using the program. It is possible that students who fully utilize the program have larger gains, but we will not know that until we incorporate usage data into our analysis.

b. This analysis does not take into account other variables that can affect the results, such as teacher quality and teachers' attitudes towards Waterford.

In order to understand the findings and obtain a fuller picture of the effectiveness of the Waterford program, we will do the following for the final report:

§ Bring observation data to bear on the test scores.
§ Incorporate data reflecting the time spent using the program, program level, and program completion (based on classroom observations and usage reports generated by the Waterford Institute) to determine the relationship between usage of the program and gains.
§ Examine observation data to determine the interaction between the Waterford program and other reading programs, such as Success for All and Open Court. Specifically, the use of a control-treatment matched group will allow investigation of the supplemental effect of Waterford with these and other existing reading programs.
§ When SAT/9 data become available, compare the treatment and control groups on that measure as well.
§ Examine whether there is a positive correlation between students' scores on the Woodcock Test, the SAT/9, and the WCART tests.
§ Examine the relationship between time spent using the program and students' gain scores on all three tests.
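The planned correlational analyses listed above (for example, between Woodcock gain scores and SAT/9 scores, or between usage time and gains) amount to computing Pearson product-moment correlations on paired student scores. A minimal sketch follows; the score arrays are hypothetical illustrations, not study data.

```python
from scipy import stats

# Hypothetical paired scores for the same ten students (not actual study data):
# e.g., WRMT-R gain scores alongside SAT/9 NCE scores.
woodcock_gains = [12, 18, 9, 25, 14, 21, 7, 30, 16, 11]
sat9_nce = [48, 55, 41, 63, 50, 58, 39, 66, 52, 45]

# Pearson correlation coefficient and its two-tailed p-value.
r, p = stats.pearsonr(woodcock_gains, sat9_nce)
print(f"r = {r:.3f}, p = {p:.4f}")
```

A positive, significant r would indicate that students who gain more on the Woodcock tests also tend to score higher on the SAT/9; the same call applies to the usage-minutes analyses.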
We anticipate that this careful and exhaustive set of analyses will provide a comprehensive and thorough assessment of all aspects of this program in relation to its impact on student reading achievement.

APPENDIX C
Comparison of Treatment and Control Classrooms using Hierarchical Linear Modeling

Hierarchical Linear Modeling (HLM) was used to compare treatment and control students while controlling for the quality of Open Court pedagogy and school characteristics.42 This analysis is intended to supplement and expand on the t-test analyses in the second interim report.43 The student background variables included Language Classification (ELL or EO), ELD level (ELD 1-2 or ELD 3-4), and Condition (Treatment or Control).44 The teacher/instructional variables included the quality of Open Court pedagogy. In kindergarten, quality of Open Court pedagogy referred to the quality of pedagogy during the Sounds and Letters section of Open Court. In first grade, it referred to the quality of pedagogy during portions of phonics and reading comprehension instruction. In addition, in kindergarten we included the amount of time spent over four days on the Sounds and Letters section of Open Court. The school variable was the School Characteristics Index (SCI) score from the California Department of Education. This index is a composite measure of a school's background characteristics that predict achievement on the SAT/9. The outcome measures for kindergarten included four tests of the WRMT-R (Visual Auditory Learning, Letter Identification, Word Identification, and Word Attack). For first grade, we included all WRMT-R tests and the SAT/9 Reading, Language, and Spelling tests.45

42 An alpha level of .05 was used in the HLM analyses. No effect sizes were computed since there is no consistent or accepted way to compute effect sizes in HLM.
43 We were able to calculate effect sizes for the t-test analyses presented in Appendix B.
In kindergarten, Cohen's d statistics ranged from minimal to small (.06 - .19), favoring control students in Letter Identification (Cohen's d = .16). In first grade, effect sizes were minimal (ranging from .002 to .11) in all tests except Letter Identification (Cohen's d = .16 in favor of treatment) and Visual Auditory Learning (Cohen's d = .27 in favor of treatment).
44 Language classification and ELD Level were included in the HLM analysis because they were the two most important demographic variables in predicting outcomes in exploratory analyses as well as in the t-test analyses in Appendix B.
45 We also intended to include the WCART as an additional outcome measure, but due to the small number of cases, we were unable to do so. See Appendix F for descriptive statistics and other information about the WCART scores.

Kindergarten Findings

When controlling for classroom pedagogy and school characteristics, we found no differences between treatment and control students.46 Table 1 summarizes various student factors, teacher/instructional factors, school factors, and cross-level interactions that were significantly related to kindergarten student gains on four of the WRMT-R tests: Visual Auditory Learning, Letter Identification, Word Identification, and Word Attack.

• Visual Auditory Learning. There were no differences between treatment and control students. On average, ELL students had larger gains than EO students.
• Letter Identification. There were no differences between treatment and control students. On average, ELL students had larger gains than EO students. However, in classrooms with higher quality Open Court pedagogy, EO students had larger gains than ELL students.
• Word Identification. There were no differences between treatment and control students. On average, EO students had larger gains than ELL students.
• Word Attack. There were no differences between treatment and control students.
EO students had larger gains than ELL students, and ELD 3-4 students had larger gains than ELD 1-2 students.

46 The following model was tested (variables with *'s are grand mean centered):

Level 1: Y = π0 + π1(Condition) + π2(Language Classification) + π3(ELD Level) + r

Level 2:
π0 = β00 + β01(Time on OC)* + β02(OC Quality)* + u0
π1 = β10 + β11(OC Quality)*
π2 = β20 + β21(OC Quality)*
π3 = β30 + β31(OC Quality)*

Level 3:
β00 = γ000 + γ001(SCI)* + u0
β01 = γ010
β02 = γ020
β10 = γ100
β11 = γ110
β20 = γ200
β21 = γ210
β30 = γ300
β31 = γ310

Table 1
Student, Teacher/Classroom, and School Factors Related to Differences in the Outcomes in Kindergarten

Visual Auditory Learning: student factor: Language Classification (ELL+); no classroom, school, or cross-level effects; no program effect.
Letter Identification: student factor: Language Classification (ELL+); cross-level interaction: Language Classification moderated by quality of OC pedagogy; no program effect.
Word Identification: student factor: Language Classification (EO+); no classroom, school, or cross-level effects; no program effect.
Word Attack: student factors: Language Classification (EO+) and ELD Level (ELD 3-4 +); no classroom, school, or cross-level effects; no program effect.

These findings are consistent with the t-test analysis in Appendix B, which also detected no differences between the treatment and the control students on these tests. Also consistent with the t-test analysis findings, students with lower English language proficiency had larger gains in the Readiness tests (Visual Auditory Learning and Letter Identification), while students with higher English language proficiency had larger gains in the tests of Basic Skills (Word Identification and Word Attack).
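The Cohen's d effect sizes reported for the t-test comparisons (footnote 43) are conventionally computed as the difference between group means divided by a pooled standard deviation. A minimal sketch, using hypothetical gain scores rather than study data:

```python
import math

def cohens_d(group1, group2):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    m1 = sum(group1) / n1
    m2 = sum(group2) / n2
    # Sample variances (denominator n - 1)
    v1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

# Hypothetical treatment and control gain scores (not actual study data).
treatment = [18, 22, 15, 25, 20, 17, 23, 19]
control = [16, 21, 14, 24, 18, 15, 22, 17]
print(f"Cohen's d = {cohens_d(treatment, control):.2f}")
```

By the usual rule of thumb, values below .2 (as in most of the comparisons reported here) are considered minimal.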
First Grade Findings

When controlling for the quality of teacher pedagogy and school characteristics, we found a marginally significant difference between the treatment and control students on the Letter Identification test.47 On average, treatment students' gains were larger than control students' gains by less than one and a half points. We also found that treatment students in classrooms with higher quality Open Court pedagogy had larger gains than control students. No differences were found between the groups on any of the other Woodcock tests or the SAT/9 subtests. Table 2 summarizes various student factors, teacher/instructional factors, school factors, and cross-level interactions that were significantly related to first grade student gains on all eight WRMT-R tests, as well as scores on the SAT/9 subtests.

47 The following model was tested (variables with *'s are grand mean centered):

Level 1: Y = π0 + π1(Condition) + π2(Language Classification) + π3(ELD Level) + r

Level 2:
π0 = β00 + β01(OC Quality)* + u0
π1 = β10 + β11(OC Quality)*
π2 = β20 + β21(OC Quality)*
π3 = β30 + β31(OC Quality)*

Level 3:
β00 = γ000 + γ001(SCI)* + u0
β01 = γ010
β02 = γ020
β10 = γ100
β11 = γ110
β20 = γ200
β21 = γ210
β30 = γ300
β31 = γ310

• Visual Auditory Learning. No differences were found between treatment and control students. On average, ELD 1-2 students had larger gains than ELD 3-4 students.
• Letter Identification. There was a marginally significant difference between the treatment and the control students (p = .057). On average, treatment students had larger gains than control students (a 1.4 point difference). On average, ELD 1-2 students had larger gains than ELD 3-4 students.
• Word Identification. No differences were found between treatment and control students. On average, ELD 3-4 students had larger gains than ELD 1-2 students.
• Word Attack. No differences were found between treatment and control students.
On average, ELD 3-4 students had larger gains than ELD 1-2 students, and their gains were even larger in classrooms with higher quality Open Court pedagogy.
• Antonyms, Analogies, and Passage Comprehension. No differences were found between treatment and control students. On average, EO students had larger gains than ELL students, and ELD 3-4 students had larger gains than ELD 1-2 students.
• Synonyms. On average, there were no differences between treatment and control students. However, as quality of Open Court pedagogy increased, treatment students' gains increased. Also, ELD 3-4 students had larger gains than ELD 1-2 students.
• SAT/9 Reading. No differences were found between treatment and control students on any of the SAT/9 reading subtests. On average, EO students outperformed ELL students and ELD 3-4 students outperformed ELD 1-2 students. Also, students in classrooms with high quality Open Court pedagogy outperformed students in classrooms with lower quality Open Court pedagogy.
• SAT/9 Language and SAT/9 Spelling. No differences were found between treatment and control students. EO students outperformed ELL students and ELD 3-4 students outperformed ELD 1-2 students. Also, the School Characteristics Index had a marginal negative effect on these tests.
Table 2
Student and/or Teacher/Classroom Factors Related to Differences in the Outcomes

Visual Auditory Learning: student factor: ELD Level (ELD 1-2 +); no classroom, school, or cross-level effects; no program effect.
Letter Identification: student factor: ELD Level (ELD 1-2 +); no classroom, school, or cross-level effects; program effect: Yes.a
Word Identification: student factor: ELD Level (ELD 3-4 +); no classroom, school, or cross-level effects; no program effect.
Word Attack: student factor: ELD Level (ELD 3-4 +); cross-level interaction: ELD Level moderated by quality of OC pedagogy (+); no program effect.
Antonyms, Analogies, and Passage Comprehension: student factors: Language Classification (EO+) and ELD Level (ELD 3-4 +); no classroom, school, or cross-level effects; no program effect.
Synonyms: student factor: ELD Level (ELD 3-4 +); program effect: Moderated.b
SAT/9 Total Reading: student factors: Language Classification (EO+) and ELD Level (ELD 3-4 +); classroom factor: quality of OC pedagogy (+); no program effect.
SAT/9 Word Study: student factors: Language Classification (EO+) and ELD Level (ELD 3-4 +); classroom factor: quality of OC pedagogy (+); no program effect.
SAT/9 Word Reading and SAT/9 Reading Comprehension: student factors: Language Classification (EO+) and ELD Level (ELD 3-4 +); classroom factor: quality of OC pedagogy (+); no program effect.
SAT/9 Language and SAT/9 Spelling: student factors: Language Classification (EO+) and ELD Level (ELD 3-4 +); school factor: School Characteristics Index (-); no program effect.

Notes: a Marginally significant, favoring treatment students. Treatment students' gains on Letter Identification were on average 1.35 points larger than control students'. b Favoring treatment students in classrooms with high quality Open Court pedagogy.

These findings are consistent with the t-test analysis in Appendix B, with a few exceptions. The t-test analysis detected a statistically significant difference between the treatment and control group on the Visual Auditory Learning test. The HLM analysis, which takes into account the fact that students are nested within classrooms, did not find the difference to be significant. On the other hand, HLM identified a marginally significant difference favoring treatment students on the Letter Identification test. Also, HLM detected a cross-level interaction in the Synonyms test between condition and quality of Open Court pedagogy.
In classrooms with higher quality Open Court pedagogy, treatment students had larger gains than controls. The findings regarding the relationship between English language proficiency and achievement are also consistent with the t-test analyses. Students with lower English language proficiency had larger gains than students with higher levels of proficiency on Visual Auditory Learning and Letter Identification. Students with higher levels of proficiency performed better on all other tests. The quality of Open Court pedagogy was also an important factor. Students in classrooms with high quality Open Court pedagogy performed better on the SAT/9 reading tests.

The comparison of treatment and control students resulted in mixed findings. Overall, there is little evidence that use of the Waterford program resulted in improved reading ability. This is not surprising given the low level of implementation of the Waterford program in both kindergarten and first grade classrooms.

APPENDIX D
Comparison between Treatment and Control First Grade Students on the SAT/9

In addition to comparing treatment and control students' gains on the Woodcock Reading Mastery Tests-Revised, we also compared end-of-year achievement of first grade students in each group as measured by the Stanford Achievement Test, Ninth Edition (SAT/9). Since students take the SAT/9 for the first time in first grade, we only have end-of-year scores for the first grade students in the study. Of the 847 students in the study's first grade sample, 827 took the SAT/9 at the end of first grade. Table 1 shows the distribution of first grade students with SAT/9 scores by condition.

Table 1
First Grade Students with SAT/9 Scores by Condition

Treatment        425
Control          161
Non-matched      219
Late treatment    22
Total            827

We conducted independent sample t-tests using the Bonferroni adjustment to investigate differences between the treatment and the control group on relevant SAT/9 tests.
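The Bonferroni adjustment divides the overall alpha level by the number of comparisons, so each individual t-test is judged against a stricter threshold. A minimal sketch of this procedure with scipy, using hypothetical NCE scores rather than study data:

```python
from scipy import stats

# Hypothetical treatment and control SAT/9 NCE scores on several tests
# (not actual study data).
tests = {
    "Reading":  ([55, 60, 48, 62, 51, 58], [57, 61, 50, 63, 52, 60]),
    "Language": ([50, 54, 47, 59, 49, 55], [53, 56, 48, 61, 50, 58]),
    "Spelling": ([52, 57, 45, 60, 50, 56], [54, 59, 46, 62, 51, 57]),
}

alpha = 0.05
adjusted_alpha = alpha / len(tests)  # Bonferroni: .05 / 3 comparisons

for name, (treatment, control) in tests.items():
    # Independent-samples t-test on each outcome
    t, p = stats.ttest_ind(treatment, control)
    flag = "significant" if p < adjusted_alpha else "not significant"
    print(f"{name}: t = {t:.2f}, p = {p:.3f} ({flag})")
```

Because the threshold shrinks as the number of tests grows, the adjustment guards against declaring chance differences significant when many outcomes are compared at once.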
We found no statistically significant differences between treatment and control students on any of the tests.48 Figure 1 illustrates the mean NCE scores of treatment and control students in Reading, its three subtests (Word Study, Word Reading, and Reading Comprehension), Language, and Spelling.

48 We also computed effect sizes to supplement these analyses. The effect sizes were minimal (less than .2), with Cohen's d statistics ranging from .08 to .17 in favor of the control group.

[Figure 1. First Grade SAT/9 NCE in Spring 2002: mean NCE scores for treatment and control students on the SAT/9 Reading, Word Study Skills, Word Reading, Reading Comprehension, Language, and Spelling tests.]

We also compared SAT/9 scores by track. We found no differences between the groups on Tracks A, B, and LEARN. However, we did find differences between the groups on Tracks C and D. On Track C, control students outperformed treatment students on all tests (see Table 2). On Track D, control students outperformed treatment students on Total Reading, Word Study, and Reading Comprehension, as shown in Table 3.
Table 2
Differences between Track C Treatment and Control Students

Test                    Condition   N    Mean   SD     t value
Reading                 Treatment   83   51.3   18.6   -3.43*
                        Control     31   64.7   18.5
Word Study Skills       Treatment   85   54.4   19.8   -3.38*
                        Control     31   68.8   21.3
Word Reading            Treatment   87   50.9   20.7   -2.73*
                        Control     31   62.8   21.2
Reading Comprehension   Treatment   87   48.7   17.1   -3.66*
                        Control     31   61.7   16.3
Language                Treatment   87   43.3   16.0   -4.70*
                        Control     31   59.9   19.2
Spelling                Treatment   88   52.3   25.5   -3.02*
                        Control     31   67.7   21.3
* p < .05

Table 3
Differences between Track D Treatment and Control Students

Test                    Condition   N    Mean   SD     t value
Reading                 Treatment   79   48.3   19.9   -4.02*
                        Control     35   61.3   13.9
Word Study Skills       Treatment   80   48.3   20.1   -4.61*
                        Control     35   65.8   15.0
Reading Comprehension   Treatment   79   47.9   18.2   -2.75*
                        Control     36   57.4   14.2
* p < .05

We also examined SAT/9 achievement data according to Language Classification and ELD levels. We found no differences between the ELL students in the treatment group and the ELL students in the control group. Also, no differences were found between the EO treatment students and the EO control students. Within the treatment group, EO students outperformed ELL students on Total Reading, Reading Comprehension, and Language (see Table 4). Within the control group, we found no differences between the ELL and the EO students.

Table 4
Differences between EO and ELL Students in the Treatment Group

Test                    Language Classification   N     Mean   SD     t value
Reading                 ELL                       259   52.4   18.9   -2.66*
                        EO                        110   58.1   19.1
Reading Comprehension   ELL                       262   50.5   17.0   -3.07*
                        EO                        112   56.5   17.4
Language                ELL                       264   44.6   18.3   -2.88*
                        EO                        111   50.7   19.4
* p < .05

We also examined the relationship between ELD levels and condition. In the Language test, ELD 1-2 control students outperformed ELD 1-2 treatment students (Treatment Mean = 40.2, Control Mean = 46.2, t = -2.7*). No differences were found between ELD 3-4 students in each group.
Within both the treatment and the control group, ELD 3-4 students outperformed ELD 1-2 students on all the tests, as illustrated in Tables 5 and 6.

Table 5
Differences between ELD 1-2 and ELD 3-4 Students in the Treatment Group

Test                    ELD Level   N     Mean   SD     t value
Reading                 ELD 1-2     194   47.9   17.7   -6.97*
                        ELD 3-4     65    65.3   16.4
Word Study Skills       ELD 1-2     196   49.5   20.5   -6.04*
                        ELD 3-4     66    66.9   19.3
Word Reading            ELD 1-2     197   48.9   19.5   -6.34*
                        ELD 3-4     66    64.9   17.0
Reading Comprehension   ELD 1-2     197   46.7   15.8   -6.94*
                        ELD 3-4     65    62.2   15.2
Language                ELD 1-2     198   40.2   16.3   -7.17*
                        ELD 3-4     66    57.4   18.2
Spelling                ELD 1-2     198   47.2   22.8   -7.11*
                        ELD 3-4     66    69.9   21.9
* p < .05

Table 6
Differences between ELD 1-2 and ELD 3-4 Students in the Control Group

Test                    ELD Level   N    Mean   SD     t value
Reading                 ELD 1-2     83   52.3   15.8   -4.23*
                        ELD 3-4     32   66.2   15.8
Word Study Skills       ELD 1-2     83   55.0   18.0   -3.62*
                        ELD 3-4     32   69.1   20.2
Word Reading            ELD 1-2     84   51.8   17.7   -3.88*
                        ELD 3-4     32   65.6   15.7
Reading Comprehension   ELD 1-2     84   50.7   16.1   -3.95*
                        ELD 3-4     32   63.6   14.5
Language                ELD 1-2     84   46.2   17.9   -3.73*
                        ELD 3-4     29   60.1   15.4
Spelling                ELD 1-2     83   55.1   23.7   -3.18*
                        ELD 3-4     30   70.5   19.7
* p < .05

Analyses were also conducted to examine the impact of Waterford on different ethnic groups. No differences were found between African American students in the treatment group and those in the control group. Similarly, no differences were found between Hispanic students in each group. Within each group, we found no differences between Hispanic and African American students.

We also compared the groups according to Title I participation and found no differences between Title I treatment students and Title I control students. Similarly, we found no differences between Non-Title I participants in each group. Within the treatment and control group, we found no differences between students who participated in Title I and those who did not.

Finally, we compared the groups according to their primary reading program.
No differences were found between the treatment students in Open Court classrooms and the control students in Open Court classrooms. Within the treatment group, students in Open Court classrooms outperformed students in Success for All classrooms on three of the tests (see Table 7).

Table 7
Differences between Students in Open Court and SFA Classrooms within the Treatment Group

Test                Reading Program   N     Mean   SD     t value
Reading             OC                338   56.9   19.0   3.32*
                    SFA               70    48.7   18.2
Word Study Skills   OC                341   59.3   21.3   5.27*
                    SFA               70    46.7   17.4
Spelling            OC                345   58.5   24.3   3.29*
                    SFA               70    48.3   21.2
* p < .05

Comparison of matched treatment and non-matched treatment

We found no differences between matched treatment and non-matched treatment students on any of the SAT/9 tests.

APPENDIX E

(Variables with *'s are grand mean centered)

Kindergarten Analysis

Level 1: Y = β0 + β1(Language Classification) + β2(ELD Level) + β3(Time on Waterford)* + β4(Level of Engagement)* + r

Level 2:
β0 = γ00 + γ01(Time on OC)* + γ02(OC Quality)* + γ03(Engagement)* + γ04(W-Usage)* + u0
β1 = γ10 + γ11(Time on OC)* + γ12(OC Quality)* + γ13(Engagement)* + γ14(W-Usage)*
β2 = γ20 + γ21(Time on OC)* + γ22(OC Quality)* + γ23(Engagement)* + γ24(W-Usage)*
β3 = γ30 + γ31(Time on OC)* + γ32(OC Quality)* + γ33(Engagement)* + γ34(W-Usage)*
β4 = γ40 + γ41(Time on OC)* + γ42(OC Quality)* + γ43(Engagement)* + γ44(W-Usage)*

First Grade Analysis

Level 1: Y = β0 + β1(Language Classification) + β2(ELD Level) + β3(Time on Waterford)* + β4(Level of Engagement)* + r

Level 2:
β0 = γ00 + γ01(Classroom Engagement)* + γ02(Waterford Usage)* + γ03(OC Quality)* + u0
β1 = γ10 + γ11(Classroom Engagement)* + γ12(Waterford Usage)* + γ13(OC Quality)*
β2 = γ20 + γ21(Classroom Engagement)* + γ22(Waterford Usage)* + γ23(OC Quality)*
β3 = γ30 + γ31(Classroom Engagement)* + γ32(Waterford Usage)* + γ33(OC Quality)*
β4 = γ40 + γ41(Classroom Engagement)* + γ42(Waterford Usage)* + γ43(OC Quality)*
Model for SAT/9 Total Reading, SAT/9 Word Study, SAT/9 Reading Comprehension, and SAT/9 Spelling

Level 1: Y = β0 + β1(Language Classification) + β2(ELD Level) + β3(Time on Waterford)* + β4(Level of Engagement)* + r

Level 2:
β0 = γ00 + γ01(Classroom Engagement)* + γ02(Waterford Usage)* + γ03(OC Quality)* + u0
β1 = γ10 + γ11(Classroom Engagement)* + γ12(Waterford Usage)* + γ13(OC Quality)*
β2 = γ20 + γ21(Classroom Engagement)* + γ22(Waterford Usage)* + γ23(OC Quality)*
β3 = γ30 + γ31(Classroom Engagement)* + γ32(Waterford Usage)* + γ33(OC Quality)* + u3
β4 = γ40 + γ41(Classroom Engagement)* + γ42(Waterford Usage)* + γ43(OC Quality)*

APPENDIX F
Waterford Computer Adaptive Reading Test (WCART)

The Waterford Institute provided us with a file of all the students in the district with pretest and posttest WCART scores. Sixty-three students in our sample were included in that file. Only a small number of students were identified because many teachers did not administer the WCART in the fall, in the spring, or both. Some teachers administered the test but did not upload the results to the Waterford Institute. Interview data show that only 24 (55%) kindergarten teachers administered the WCART at the beginning of the year, and only 18 of those teachers uploaded the results. Thirty-eight (83%) of the first grade teachers administered the WCART at the beginning of the year, and 21 of them uploaded the results. Reasons for not administering the test and uploading the results included not knowing how to do it, not knowing what the WCART was, and technical difficulties. Thirty-four (81%) kindergarten teachers and 33 (73%) first grade teachers said they would administer the WCART at the end of the year. Because the WCART was not consistently administered, we were left with only 62 students with pretest and posttest scores. Table 1 shows their distribution by grade, mean scores (in percentages), and standard deviations.
The standard deviations are large and make interpretation of the scores difficult.

Table 1
WCART Mean Scores and Standard Deviations

Grade          N    Fall Mean (SD)   Spring Mean (SD)   Gain Mean (SD)
Kindergarten   26   13 (17)          55 (26)            42 (24)
First Grade    47   33 (25)          79 (8)             46 (24)

These scores were not included in the HLM analysis due to the small number of cases. We did find a significant correlation, however, between the number of minutes spent using the Waterford courseware and WCART gains in first grade (r = .536, p < .01). No relationship was found in kindergarten between the total number of minutes and WCART gains.
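The significance of a Pearson correlation such as the reported first grade result (r = .536, n = 47) can be checked by converting r to a t statistic, t = r * sqrt(n - 2) / sqrt(1 - r^2), evaluated against the t distribution with n - 2 degrees of freedom. The sketch below applies this standard conversion to the reported values:

```python
import math
from scipy import stats

# Reported values from the WCART analysis: r = .536 across n = 47 first graders.
r, n = 0.536, 47

# t statistic for testing H0: the population correlation is zero
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Two-tailed p-value from the t distribution with n - 2 degrees of freedom
p = 2 * stats.t.sf(t, df=n - 2)
print(f"t = {t:.2f}, p = {p:.5f}")
```

The resulting p-value is well below .01, consistent with the significance level reported in the text.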