Validation of the Engagement, Alignment, and Rigor (EAR)
Classroom Visit Protocol
&
Every Classroom, Every Day Efficacy Trial
Final Report on R305R070025
(Originally titled: Scaling Up the First Things First Reform Approach)
Edward L. Deci, Principal Investigator
Submitted to the Institute of Education Sciences
U.S. Department of Education
Edward L. Deci
Diane M. Early
J. Lawrence Aber
Richard M. Ryan
Juliette K. Berg
Stacey Alicea
Yajuan Si
Table of Contents
I. Overview of Accomplishments ................................................................................................. 5
Overview of the Reformulated Grant Research ........................................................................ 5
Accomplishments Regarding Validity of the EAR Classroom Visit Protocol Project ............. 5
Accomplishments Regarding the ECED Efficacy Trial Project ............................................... 7
II. Introduction to the Research ................................................................................................ 10
History of Every Classroom, Every Day ................................................................................ 11
Meta-Analyses Regarding Middle and High School Instruction ............................................ 14
Multi-Faceted Approaches to Improving High School Instruction ........................................ 17
Professional Development to Change Teacher Practices and Increase Student Achievement 23
Every Classroom, Every Day (ECED): The Model ................................................................ 26
Hypotheses .............................................................................................................................. 42
III. Method................................................................................................................................... 44
Study Design ........................................................................................................................... 44
School Selection...................................................................................................................... 45
Study Teachers ........................................................................................................................ 49
Study Students ........................................................................................................................ 54
Teacher Questionnaires ........................................................................................................... 61
Engagement, Alignment, and Rigor (EAR) Classroom Visit Protocol .................................. 68
Student Questionnaires ........................................................................................................... 76
Student Demographics ............................................................................................................ 82
Course Enrollment .................................................................................................................. 83
Standardized Test Scores ........................................................................................................ 83
Student Performance (Attendance, Credits, GPA) ................................................................. 89
Missing Data ........................................................................................................................... 89
Suitability of the Data for Analyses ........................................................................................ 95
Analysis Plan .......................................................................................................................... 96
Unconditional Models ........................................................................................................... 101
IV. Implementation ................................................................................................................... 105
Measuring Variation in Implementation ............................................................................... 105
Implementation Strengths and Weaknesses .......................................................................... 109
V. Results for Teachers’ Attitudes, Experience, and Observed Practice ............................ 111
Data Analytic Strategy .......................................................................................................... 111
Results for Math Teachers .................................................................................................... 114
Results for ELA Teachers ..................................................................................................... 126
VI. Impacts on Students ........................................................................................................... 128
Data Analytic Strategy .......................................................................................................... 128
Point-in-Time Analyses Predicting Students’ Attitudes Toward School ............................. 133
Point-in-Time Analyses Predicting Individual Student Survey Scales................................. 135
Point-in-Time Analyses Predicting Math and ELA Achievement ....................................... 136
Point-in-Time Analyses Predicting Performance (GPA, Credits, and Attendance) ............. 139
Growth Curves Predicting Students’ Attitudes Toward School ........................................... 142
Associations Between ECED Implementation and Student Outcomes Across All Study
Schools ............................................................................................................................ 147
Associations Between ECED Implementation and Student Outcomes in Intervention Study
Schools ............................................................................................................................ 150
VII. Implementation and Data Collection Challenges........................................................... 154
Challenges in Recruiting Schools to Participate ................................................................... 154
Challenges in Implementing ECED ...................................................................................... 156
Data Collection Challenges................................................................................................... 161
Implications of the Challenges for the Impact Evaluation.................................................... 163
VIII. Discussion ......................................................................................................................... 164
Summary of Most Important Findings .................................................................................. 165
Implications........................................................................................................................... 166
Strengths of the Design and Analyses .................................................................................. 171
Limitations ............................................................................................................................ 172
Future Analyses .................................................................................................................... 173
Conclusions and Recommendations ..................................................................................... 175
IX. References ............................................................................................................................177
Appendix 1: Change in Project Focus ..................................................................................... 184
Appendix 2: Findings From the First Component of Revised Project:
Validation of the EAR Classroom Visit Protocol ............................................................. 186
Appendix 3: Sample Memorandum of Understanding ......................................................... 226
Appendix 4: Recruitment and Participation Diagram ......................................................... 236
Appendix 5: Teacher Survey Items ......................................................................................... 237
Appendix 6: EAR Protocol Training for ECED Efficacy Trial Data Collectors ................ 248
Appendix 7: Student Questionnaire Administration Procedures ........................................ 249
Appendix 8: Student Questionnaire Items ............................................................................. 251
Appendix 9: Restructuring the Course Files for Use in Analysis ......................................... 254
Appendix 10: Test Scores Received for Grade Cohort 1....................................................... 256
Appendix 11: State-Specific Decisions Regarding Combining Test Scores ........................ 258
Appendix 12: Indicators of Variation in Implementation..................................................... 260
Appendix 13: Teacher-Level Correlations Among Outcome Variables ...............................264
Appendix 14: Student-Level Correlations Among Outcome Variables ...............................266
Appendix 15: School-Level Correlations Among Outcome Variables .................................268
Appendix 16: Child-Level Interactions ....................................................................................274
Endnotes......................................................................................................................................275
I. Overview of Accomplishments
Originally, this grant was funded as a Randomized Field Trial to evaluate the
effectiveness of the First Things First (FTF) approach to Comprehensive School Reform in high
schools that serve large percentages of disadvantaged students. That reform was designed and
implemented by the Institute for Research and Reform in Education (IRRE). In the first year of the grant, we encountered major difficulties in relation to the FTF project, so we worked with the program division of the National Center for Education Research (NCER) to reformulate the research. See Appendix 1 for a discussion of the
difficulties and the process of reformulating the grant research.
Overview of the Reformulated Grant Research
The reformulation was based on the idea that high-quality classroom instruction is the
core of effective schooling. Indeed, the National Research Council's Committee on Increasing High School Students' Engagement and Motivation to Learn (2004) argued forcefully that, although
school-level policies and efforts to restructure schools may benefit students in a myriad of ways,
student learning is most directly and deeply affected by how and what teachers teach. The
reformulation had two projects related to high-quality classroom instruction: (1) a validation
study of the Engagement, Alignment, and Rigor (EAR) Classroom Visit Protocol to assess the
quality of classroom instruction; and (2) a school-level randomized field trial (S-RFT) to
examine the efficacy of the instructional improvement strategy contained within FTF, referred to
as Every Classroom, Every Day (ECED) when it is implemented without the other two FTF
strategies, namely, Small Learning Communities and a Student and Family Advocacy System.
Accomplishments Regarding Validity of the EAR Classroom Visit Protocol Project
The project to assess inter-rater reliability and predictive validity of the EAR Classroom
Visit Protocol had two studies. In Study 1 of this first project, district personnel, professional
development providers, and outside educators conducted 2,171 EAR Protocol visits to all types
of courses in four racially and economically diverse high schools. Intra-class correlations
indicated adequate inter-rater reliability. The observations conducted in math classes (125 observations of 33 teachers) and English/Language Arts (ELA) classes (102 observations of 25 teachers) were analyzed in multi-level models (using Hierarchical Linear Modeling) to predict students' standardized achievement test scores, controlling for their previous year's scores.
Findings indicated that engagement, alignment, and rigor were each significantly (p < .05) or
marginally (p < .10) associated with math and ELA achievement scores, with standardized
coefficients ranging from .06 to .17. Additionally, students’ self-reports of their engagement in
school were predictive of test scores in models that also included perceived academic
competence as well as observed engagement, alignment, or rigor. Study 2 of this first project was
intended to replicate the predictive-validity findings of Study 1 using data collected in eight
racially diverse, low-income high schools. Study 2 included 261 observations of 63 math and 64
ELA teachers. Observed engagement again emerged as a significant predictor of test scores in
both subjects, but alignment and rigor did not. The results of the two studies of this first project
assessing the validity of the EAR Classroom Visit Protocol are summarized in a manuscript that
is under editorial review. The complete manuscript appears in Appendix 2 of this report.
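To clarify the analytic approach used in both studies, the general form of such a two-level model can be written in standard HLM notation. This is a schematic sketch; the exact covariates, centering, and random effects in the reported analyses may differ.

\[
\begin{aligned}
\text{Level 1 (students):}\quad & Y_{ij} = \beta_{0j} + \beta_{1j}\,\mathrm{PriorScore}_{ij} + r_{ij}, \qquad r_{ij} \sim N(0, \sigma^2) \\
\text{Level 2 (teachers):}\quad & \beta_{0j} = \gamma_{00} + \gamma_{01}\,\mathrm{EAR}_{j} + u_{0j}, \qquad u_{0j} \sim N(0, \tau_{00}) \\
& \beta_{1j} = \gamma_{10}
\end{aligned}
\]

Here, \(Y_{ij}\) is the achievement test score of student \(i\) taught by teacher \(j\), \(\mathrm{EAR}_{j}\) is the teacher's observed engagement, alignment, or rigor score, and \(\gamma_{01}\) is the coefficient of interest (the standardized coefficients of .06 to .17 reported above).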
The second project of the reformulated grant represents the core of the grant research and
of this Final Report. It sought to evaluate the Every Classroom, Every Day (ECED) instructional
improvement model for high school math and literacy instruction, with the goal of increasing
standardized achievement test scores in these two areas. As noted, the components of ECED
were first implemented as the Instructional Improvement strategy of the First Things First
approach to school reform, so this efficacy trial of ECED represents the first time these
components were implemented in the absence of First Things First’s other two components.
Accomplishments Regarding the ECED Efficacy Trial Project
Central to the intervention were the concepts of engagement, alignment, and rigor as
markers of high-quality instruction. Across math and literacy content areas, teachers received
professional development and ongoing coaching and supports from external change agents and
internal school personnel to teach in ways that are more Engaging for students, more Aligned
with local, state, and federal standards, and more Rigorous for all students (EAR). Math teachers
also received initial training and ongoing supports in IRRE’s “I Can…” Math Benchmarking
process and tools. Literacy teachers were provided with a two-year curriculum called Literacy
Matters and wrap-around supports in its use. Trained independent raters, blind to the intervention
status of the schools, observed classrooms as an important source of research data. Teachers and
students completed questionnaires, and school districts provided information about students,
including demographic characteristics, scores on standardized achievement tests in math and
English/language arts (ELA), progress toward graduation, grade point average (GPA), and
attendance.
Our primary hypothesis was that ECED’s instructional improvement interventions would
improve math and ELA achievement as measured by standardized test scores and analyzed
experimentally. Secondary hypotheses were that the intervention would improve other student performance outcomes, such as attendance, GPA, and progress toward graduation, and would enhance teacher and student attitudes. Finally, we hypothesized that both the fidelity of
implementation and the number of semesters students were in intervention schools would predict
better school outcomes in non-experimental analyses.
Twenty high schools (5 districts, 4 schools per district) were randomly assigned to either
the treatment (n = 10) or control (n = 10) condition, with two high schools from each district
being assigned to each condition. There were five primary types of data collected: (1) student surveys, (2) teacher surveys, and (3) classroom observations using the EAR Classroom Visit Protocol, each collected in four waves (once near the start and once near the end of each of the two academic years that the school participated); (4) variation-in-implementation interviews with math and literacy coaches and/or department chairs and the school principal or assistant principal; and (5) student records, collected twice, once at the end of each academic year that the school participated.
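To make the assignment procedure concrete, the following sketch illustrates stratified random assignment within districts. It is a minimal illustration with hypothetical names and seed, not the study's actual randomization code.

import random

def assign_within_districts(schools_by_district, seed=2008):
    """Randomly assign half of each district's schools to treatment.

    schools_by_district: dict mapping district name -> list of school IDs.
    Returns a dict mapping school ID -> 'treatment' or 'control'.
    """
    rng = random.Random(seed)
    assignment = {}
    for district, schools in schools_by_district.items():
        shuffled = schools[:]  # copy so the input is not modified
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        for school in shuffled[:half]:
            assignment[school] = "treatment"
        for school in shuffled[half:]:
            assignment[school] = "control"
    return assignment

# 5 districts x 4 schools yields 10 treatment and 10 control schools,
# with 2 schools per condition in each district, as in the ECED trial design.
districts = {f"district_{d}": [f"school_{d}_{s}" for s in range(4)]
             for d in range(1, 6)}
print(assign_within_districts(districts))

Stratifying by district in this way ensures the treatment and control groups are balanced on district-level factors such as state testing regimes and local policies.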
Findings from this evaluation provide some evidence that ECED was efficacious in
improving student achievement in math. Students in the treatment schools scored significantly (p
= .04) higher on standardized tests of math than did students in the control-group schools, after
controlling for pre-intervention math achievement and school district, although that result
became only marginal (p = .06) when student demographic controls were added to the models
(see p. 138). In addition, there was some evidence that fuller implementation of ECED was
linked to better student outcomes such that students in schools with higher implementation
scores had marginally higher math achievement (see p. 149), significantly higher grade point
averages (see p. 152), and significantly more credits toward graduation (see p. 152). Moreover, the number of terms students were enrolled in ECED schools moderated the association between implementation and attendance, such that students with more semesters in schools with higher ECED implementation had better attendance (see p. 149). In contrast to these impacts, the
ECED intervention did not improve achievement in English language arts (see p. 138), nor was
the degree of overall ECED implementation related to ELA achievement (see p. 149 & p. 152).
Preliminary analyses of the classroom observations for math indicated that across the two years,
schools that participated in ECED showed increased rigor of math classroom instruction relative
to control-group schools (see p. 123 & 125).
Although there was indication that the ECED intervention led to some positive outcomes,
including math achievement and rigor in math classrooms, there was also evidence that these
improvements might have come at some cost to students and teachers. Among students in ECED treatment schools, those who were enrolled more terms reported worse attitudes toward school, whereas among students in control-group schools, those who were enrolled more terms reported better attitudes toward school (see p. 135). This finding appears to be primarily driven
by a single component of student attitudes regarding school, namely perceived academic
competence (see p. 136), which could be a function of the significant increase in the rigor of
math instruction observed in the ECED treatment schools. Further, math teachers in treatment
schools reported less mutual support among colleagues than did those in control schools (see p.
117). And, across the two years of the study, math teachers in ECED schools who taught more
semesters of the courses targeted by the intervention reported that their districts’ leadership was
less committed to change, while the opposite was true for teachers in control schools (see p.
117). Finally, based on preliminary analyses of the classroom observations it appears that ECED
had a negative impact on observed student engagement in math classes (see p. 123 & p. 125). No
effects of ECED were seen for ELA teacher attitudes and experiences (see p. 127).
In sum, ECED Math appears to be a valuable path to improved student standardized test scores in math. In schools from five districts in four states, which were fraught with problems and served high percentages of students from economically disadvantaged homes, the use of the ECED math approach resulted in improved math scores relative to control-group schools. We found this effect with a relatively small sample size (10 schools per condition) and stratified
random assignment within districts, leaving only about 14 degrees of freedom (20 schools, less the 5 district strata and the treatment effect), which is extremely low for finding statistically significant effects. Further, of the 10 treatment schools, two had dropped out by the end of the first year, seriously weakening the intervention implementation. Still, there was indication that the approach did enhance achievement in math.
Nonetheless, the intervention did have some effects on teacher and student experiences and self-perceptions that are cause for concern, effects that might dampen ECED's positive effects over time, although it is also possible that these negative effects would fade as students and teachers adjusted to the new instructional strategies and experienced improvement in outcomes.
II. Introduction to the Research
In this Introduction, we first describe the history of ECED. Next, we review the existing
literature on models for improving instruction, followed by a review of the literature on effective
professional development with teachers. Following those literature reviews, this chapter presents
a detailed description of the Every Classroom Every Day (ECED) intervention.
Following the Introduction, this report includes a chapter that details the Method of the
ECED Efficacy Trial (Chapter III), a chapter describing variation in implementation of the
ECED strategies in the treatment schools (Chapter IV), one summarizing the findings regarding
changes in teachers’ attitudes and practices (Chapter V), another summarizing the findings
regarding impacts on students (Chapter VI), and one outlining challenges IRRE faced in
implementing ECED and challenges the research team faced in collecting the evaluation data
(Chapter VII). The final substantive chapter is the Discussion, which summarizes the findings in
the context of these challenges, the implications of the findings, the study’s limitations, and
future plans (Chapter VIII). The Reference list is presented in Chapter IX.
History of Every Classroom, Every Day
Every Classroom, Every Day is the instructional improvement component of the First
Things First (FTF) approach to school reform. First Things First (FTF)—developed by the
Institute for Research and Reform in Education (IRRE)—is a comprehensive educational reform
model with three key strategies: (1) Small Learning Communities (SLC), which involves
breaking large schools into smaller, thematically focused "schools-within-schools"; (2) the
Family and Student Advocate System, which involves small groups of 15 to 20 students within
an SLC who meet regularly with one teacher or administrator who serves as an advocate for the
students in the group for the entire time the students are in the school and is the liaison to the
students’ families; and (3) Instructional Improvement, which involves supporting teachers to
create engaging learning activities that are rigorous for all students and aligned with the district,
state, and national standards. Accordingly, engagement, alignment, and rigor (EAR) are the
central elements used in assessing the quality of instruction and providing feedback to schools.
To achieve improved instruction, administrators and teachers are provided with professional
development on identifying and observing what engaging, aligned, and rigorous instruction looks
like in the classroom and are trained to use a classroom observation protocol on an ongoing basis
to provide feedback and to guide instructional activities. The proximal goal of FTF is to improve
the quality of supports the teachers and students need to be successful. The distal goal is to
increase learning, with a particular emphasis on literacy and math as evidenced by higher scores
on standardized achievement tests, as well as improved attendance, decreased dropout, increased graduation rates, and increased enrollment in and completion of college. FTF also targets
enhancement of non-academic factors, such as relationships with teachers, school engagement,
and motivation. Improvements on these various outcomes are expected, in turn, to lead to
increased academic achievement, and greater numbers of graduates enrolling in post-secondary
education and going on to meaningful careers and citizenship.
FTF has been evaluated in two quasi-experimental efficacy studies (Gambone, Klem,
Summers, Akey, & Sipe, 2004; Quint, Bloom, Black, Stephens, & Akey, 2005), each concluding
that the FTF model was very promising, with significant gains in some districts. Gambone and
her colleagues compared outcomes in Kansas City, Kansas (KCK)—the first district to
implement FTF—to those from all schools in other districts within the state. The mean effect size
on achievement in KCK was .33, which compared favorably to the results of a meta-analysis of studies of the 29 most widely adopted comprehensive school reform (CSR) models, which found an effect size of .15 on student achievement (Borman, Hewes, Overman, & Brown, 2003). It is important to note, however, that
the Gambone et al. study and 97% of the studies included in the Borman et al. meta-analysis
were non-experimental, so it is likely that they overestimate the effect sizes. Gambone et al. also
found evidence that White, African American, and Latino students all benefited from FTF, but
students of Color benefited more than White students.
Quint and colleagues used interrupted time-series analyses to investigate the impact of
FTF in Kansas City, Kansas and four districts from other states at varied stages of FTF
implementation. Comparison schools were selected for each intervention site, matched on pre-intervention test scores and other variables such as school size, racial/ethnic make-up, and eligibility for free or reduced-price lunches. Findings indicated that in KCK high schools and middle schools, academic outcomes improved substantially relative to comparison schools. The
findings were inconclusive in the other four districts included in this study where FTF had been
implemented for a shorter period of time. Further, neither the Gambone et al. (2004) study nor
the Quint et al. (2005) study attempted to examine the individual components of FTF, such as the
instructional improvement strategy, which was examined in the current trial.
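For reference, interrupted time-series analyses of the kind mentioned above typically take a segmented-regression form. The following is a generic sketch, not necessarily Quint et al.'s exact specification:

\[
Y_t = \beta_0 + \beta_1 t + \beta_2 D_t + \beta_3 (t - t_0) D_t + \varepsilon_t
\]

where \(t\) indexes time (e.g., school year), \(D_t\) equals 1 once the reform begins at time \(t_0\) and 0 before, \(\beta_2\) captures the immediate shift in the outcome's level, and \(\beta_3\) captures the change in its trend relative to the pre-intervention trajectory.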
FTF’s Instructional Improvement strategy has evolved since it was first introduced.
During the past few years an enhanced version, pilot tested in a small number of schools, led to
substantial achievement gains when implemented within the context of the full FTF intervention.
For example, four high schools in Kansas City, Kansas implemented the enhanced strategies and
saw rapid acceleration in their student achievement gains both in mathematics and reading on the
state’s high-stakes assessments. On the Kansas 10th-grade math achievement test, the percentage
of students meeting or exceeding the state proficiency standard increased by over 30 percentage
points (18% to 54%) over a four-year period. On the Kansas 11th-grade reading test, the percentage of students meeting or exceeding the state proficiency standards during those same four years increased by 15 percentage points (from 46% to 61%) (Kansas State Department of Education, n.d.). However, these findings must be interpreted with caution because they are non-experimental and could be explained by a number of factors other than FTF's Instructional Improvement efforts. School test scores may have already been on upward trajectories, the population of students or teachers providing the instruction may have changed during implementation, or the schools may have implemented other changes that led to these gains.
The current study was designed to use a highly rigorous experimental design to evaluate
the efficacy of the FTF’s enhanced instructional improvement strategy, called Every Classroom
Every Day (ECED), in schools that are not implementing the full FTF reform model. Due to
pressures of the No Child Left Behind federal legislation, prior to the start of this study, many
districts and high schools had already undertaken some type of school restructuring effort (e.g.,
block schedules, reduced class sizes, 9th-grade academies) and viewed FTF as too comprehensive
for their current needs. Instead, they were looking for ways to systematically improve instruction
using a more targeted approach, especially in literacy and math. Based on the quasi-experimental
and pilot work described above, Every Classroom, Every Day appeared to be a promising avenue
to meet schools’ current needs. In sum, although there is some evidence that ECED strategies are
linked to higher test scores within the context of the larger FTF reform, there has not been a
rigorous experimental evaluation of their efficacy as an intervention. Further, prior to the current
study, the strategies had never been used in schools that were not implementing the full FTF
school reform model.
The ECED Efficacy Trial described in this report is a school-level randomized field trial
(S-RFT) involving 20 high schools. Half of the schools were randomly assigned to receive the
ECED supports (n = 10); the other half were assigned to a ‘business as usual’ control condition
(n = 10) in which schools continued with whatever school improvement efforts they already had
underway without adding any elements of ECED. The primary outcomes of interest were
students’ standardized test scores in math and English Language Arts (ELA), students’ attitudes
toward school, observed teacher instruction, teachers’ experiences of support for instructional
innovation and improvement at school, and the other student outcomes of grade point average,
attendance, and progress toward graduation.
Meta-Analyses Regarding Middle and High School Instruction 1
Several meta-analyses of studies conducted in middle and high schools suggest that
broad-based reform models employing multiple methods and components may result in better
literacy and math outcomes than those that are narrowly targeted on a single curriculum or
instructional technique. For example, a comprehensive meta-analysis of effective reading
programs for middle and high schools conducted by Slavin, Cheung, Groff, and Lake (2008)
systematically reviewed four types of approaches: (1) reading curricula, (2) mixed-method
models (i.e., methods that combine various instructional approaches such as large- and small-group instruction and computer activities), (3) computer-assisted instruction, and (4)
instructional process programs (i.e., methods that focus on providing teachers with extensive
professional development in implementing specific instructional methods such as cooperative
learning and individualized instruction). Thirty-three studies met the inclusion criteria, which
included the use of quasi-experimental or experimental designs (i.e., randomized or matched
control groups), study duration of at least 12 weeks, and valid achievement measures
independent of experimental treatments. Findings suggest that programs designed to change
daily teaching practices had substantially greater impacts on student reading comprehension
compared to those focused on curriculum or technology alone. Positive achievement effects were
found for mixed-method programs and instructional process programs, especially those
involving cooperative learning. Across nine studies involving approximately 10,000 students, the
weighted mean effect size for mixed-method models was +0.23 in predicting reading
comprehension test scores. Across seven studies of instructional processes, programs involving
cooperative learning approaches to school reading had a weighted mean effect size of +0.28.
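The weighted mean effect sizes reported in these reviews follow standard meta-analytic practice: study-level standardized mean differences are averaged with weights reflecting study size or precision. Schematically (a general formula, not Slavin et al.'s exact computation):

\[
\bar{d} = \frac{\sum_{i=1}^{k} w_i d_i}{\sum_{i=1}^{k} w_i}, \qquad w_i = \frac{1}{\widehat{\mathrm{Var}}(d_i)}
\]

where \(d_i\) is the effect size from study \(i\), \(k\) is the number of studies, and inverse-variance weighting gives larger, more precise studies greater influence on the pooled estimate.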
In a meta-analytic review of mathematics programs for middle and high schools, Slavin,
Lake, and Groff (2009) found similar results. Of the 26 studies that met inclusion criteria (e.g.,
use of a randomized or matched control group, a duration of at least 12 weeks, and equivalence at pretest), effect sizes were greatest for instructional process programs, such as cooperative learning and classroom motivation and management programs, and other approaches that focused
on changing teacher and student behaviors during daily lessons. For example, the median effect
size for cooperative learning programs for middle and high school studies in predicting math test
scores was +0.32. Studies of these instructional process programs were also more likely to have
used random assignment to treatments. These findings were further supported by another meta-review, focused solely on Algebra I outcomes and with similar inclusion criteria (e.g., quasi-experimental or experimental designs with comparison groups, targeted learning of algebraic
concepts, student academic achievement measures) by Rakes, Valentine, McGatha, and Ronau
(2010). They found that while a variety of math interventions resulted in increased student
achievement in algebra, instructional strategies (i.e., cooperative learning, mastery learning,
multiple representations, and assessment strategies) produced some of the largest effect sizes relative to other interventions.
Importantly, all three meta-reviews highlighted a number of limitations. First, the
majority of interventions that met inclusion criteria were relatively short-term interventions in the range of 12 weeks to 1 year, with no data to ascertain their long-term effects. Second, many of the interventions and their evaluations were poorly designed—small sample sizes, implementation at only one or a few schools—resulting in insufficient power to detect effects and limited
generalizability of impacts to other school contexts. Third, the vast majority of studies employed
matching and randomization of samples primarily at the student-level, limiting the ability of
evaluators to take into account the nested structure of the data (i.e., students are nested in
teachers, classrooms, and schools) and, in turn, prohibiting a rigorous assessment of teacher and
classroom-level impacts associated with interventions. Nonetheless, existing meta-reviews
provide some evidence that instructional process programs, specifically cooperative learning
approaches that target teachers’ instructional behaviors rather than math or literacy content
alone, produce better literacy and math outcomes for middle and high school students. Still,
findings suggest relatively small gains on traditional student-level academic outcomes. Given
findings across literacy and math reviews, Slavin and colleagues suggest educators and
researchers should: (1) pay more attention to classroom processes that maximize student
engagement and motivation rather than focusing solely on implementing and testing new
textbooks and curricula, and (2) consider the potential additive effects of creating multi-component interventions.
Multi-Faceted Approaches to Improving High School Instruction
Surprisingly, despite literature suggesting that student achievement at the high school level would benefit from more comprehensive interventions (Heller & Greenleaf, 2007; Quint, 2006), research on multi-faceted approaches to improving high school instruction is limited.
Scholars have identified four key, overlapping areas that should be addressed in order to build
high-quality secondary schools that will increase student achievement and graduation rates, and
prepare students for college work and citizenship: (1) clear definitions of teacher roles and
responsibilities, (2) clear definitions of skills that must be taught, (3) provision of ongoing
professional development in teaching skills, and (4) clear sets of state standards and
accountability, and district rules and regulations (Heller & Greenleaf, 2007; Quint, 2006). These
recommendations speak to the need for evidence-based models for high school improvement that
have the capacity to effect setting-level change. Few projects have bundled these intervention
elements while simultaneously working with school and district administrators to ensure successful implementation of school-wide comprehensive models. Moreover, the work that has been done focuses almost entirely on student-level changes, without paying attention to changes at the teacher and classroom levels. While students influence their classrooms and school-level
settings, those settings also affect students. Settings provide an important point of entry for
making meaningful changes in the lives of students, especially students of color and students
from disadvantaged backgrounds (Tseng & Seidman, 2007; French, Seidman, Allen, & Aber,
2006; Seidman, Allen, Aber, Mitchell, & Feinman, 1994).
In the past decade, a number of interventions targeting academic outcomes in high
schools have emerged. Several of these models have addressed previous design and analytic
limitations through rigorous efficacy trials; however, results from these quasi-experimental and
experimental evaluations have been mixed. While some interventions have shown significant and
promising preliminary academic gains for students, with effect sizes between .18 and .33 (Gambone et al., 2004; Quint, 2006; Quint et al., 2005; Lang et al., 2009), many have resulted in null findings (Cavalluzzo, Lowther, Mokher, & Fan, 2012; Corrin et al., 2012) or small effect
sizes (Corrin, Somers, Kemple, Nelson, & Sepanik, 2008; Kemple et al., 2008).
Quint (2006) highlighted three high school whole-school interventions that have undergone quasi-experimental impact evaluations with promising results: First Things First
(one component of which is Every Classroom, Every Day, as discussed above), Talent
Development (TD), and Career Academies (CA). These three interventions have been
implemented in more than 2,500 high schools across the United States. Additionally, they all
contain multiple intervention components intentionally linked to underlying theories of change
targeting whole school reform. Quint (2006) argued that these studies, although quasiexperimental in nature, provide compelling evidence that interventions targeting the whole
school can improve student achievement via the structural promotion of positive school climate,
focusing on students with poor academic skills, improving curricula and teacher instructional
content and practice, and preparing students for post-secondary education and employment.
In addition to the whole-school models highlighted by Quint (2006), some recent school-level intervention models have produced promising results. Lang and colleagues (2009) conducted a 3-year impact study of four intensive reading interventions for 1,265 9th-grade
struggling readers in 89 classrooms in seven high schools in Florida. In addition to instituting
new reading curricula, teachers underwent ongoing professional development and the research
team worked with school officials to address issues related to implementation and fidelity.
Students were identified as struggling readers based on the prior year’s state reading performance
test, and placed in one of two risk groups: Level 1 (i.e., high risk)—reading below a fourth-grade
reading level, or Level 2 (i.e., moderate risk)—reading between a fourth- and sixth-grade level.
Students were then randomly assigned, within school and within level, to classrooms where one of four intensive reading interventions was taught. A 2 × 4 (Risk Level × Intervention Group)
linear mixed model with random coefficients was used to model students’ gains in reading
developmental scale scores. Although gains made by students in the high risk group varied by
intervention, these students demonstrated improvements that were more than twice the
magnitude of the state benchmark for expected annual growth across all four interventions. For
the moderate risk group, results indicated that although gains were smaller and not significant across all four interventions, students in this group showed greater average gains in state test scores than other 9th-grade students statewide. It is important to note that
despite attempts to estimate causal effects, this study suffered from several limitations, including
cross-contamination concerns due to randomization at the classroom level, failure to account for the nested structure of the data in analyses, large amounts of missing data, and limited generalizability to
other educational contexts. Nonetheless, this study supports the use of school-wide models in
enhancing reading comprehension, particularly for low-achieving, at-risk students in high
schools.
Conversely, a small number of more recent studies of instructional interventions using
rigorous causal evaluation methods have resulted in mixed results or no impact on student
academic achievement outcomes. The Enhanced Reading Opportunities (ERO) study evaluated
two literacy interventions embedded in a larger school reform model targeting reading
comprehension and school performance among 2,916 low-performing 9th-graders in 34 high
schools in 10 school districts across the U.S. Students who scored between two and five years
below grade-level on reading comprehension standardized test scores were considered at-risk for
poor reading outcomes and were included in the study. Literacy programs were designed as full-year courses, which replaced a 9th-grade elective and supplemented the regular English
curriculum. Teachers were provided with ongoing professional development. The ERO
evaluation utilized a two-level random assignment research design. Within each district, high
schools were randomly assigned to use one of the two ERO literacy programs. Low-performing
students within each high school were then randomly assigned to enroll in ERO classes, or to
remain in a regularly scheduled elective class. In each of two cohorts, ERO programs produced a
relatively small but significant effect size of +.08 on reading comprehension (Kemple et al.,
2008; Corrin et al., 2008). Despite the significant change, 77% of ERO students in the second
cohort were still reading two or more years below grade-level after participation (Corrin et al.,
2008).
The Content Literacy Continuum (CLC) combines whole-school and targeted approaches
to supporting student literacy and content learning, placing emphasis on greater academic
support for students with increased academic needs. CLC is a high school-level equivalent to
tiered response-to-intervention (RTI) reading instruction frameworks used in elementary schools
with some success (see Faggella-Luby & Wardwell, 2011). CLC’s comprehensive framework
includes structural components (e.g., specialists working with school leaders to establish a
literacy leadership team or creating supplemental reading classes), professional development of
core content and reading teachers, and a targeted student tiered curriculum that addresses
students’ varying comprehension and reading skill levels. A rigorous experimental efficacy study
of CLC in 33 high schools in nine districts across four Midwestern states evaluated program
impacts on students’ reading comprehension test scores and accumulation of course credits in
core content areas using a cluster randomized trial design. Participating high schools within each
district were randomly assigned either to implement CLC or to continue with "business as usual." Student outcomes were compared between CLC and non-CLC schools. In both Years 1 and 2 of the study, there were no statistically significant differences in
reading comprehension scores or students’ accumulation of core credits between CLC schools
and non-CLC schools in either grade 9 or 10. Analyses also revealed no significant differences in effects for subgroups of students or districts (e.g., subgroups defined by 8th-grade reading proficiency, being over-age for grade, or special education status) (Corrin et al., 2012).
Another example of an instructional intervention evidencing null findings can be found in
the Kentucky Virtual Schools Hybrid program, which aimed to improve student math
achievement and course enrollment outcomes by enhancing hybrid (i.e., bundled online and face-to-face classroom) instructional practices via intensive teacher professional development. Forty-seven Kentucky high schools volunteered to participate in the impact evaluation. Schools were randomized at the school level to the hybrid program or "business as usual." Intent-to-treat
analyses were conducted using two-level hierarchical linear models that nested students within
schools, and assessed differences in outcomes between treatment and control schools. Outcome
measures included scores on a pre-algebra/algebra standardized college assessment test (i.e.,
PLAN) and 9th-grade students’ enrollment in 10th-grade math courses. The findings indicated that
the treatment had no statistically significant effect for either outcome. Additional exploratory
analyses further revealed no differences in impacts among subgroups of students or schools
(Cavalluzzo et al., 2012).
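As a schematic illustration of such an intent-to-treat, two-level analysis, the following sketch fits a random-intercept model nesting students within schools. Variable names and the data file are hypothetical; this is not the evaluators' actual code.

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical analysis file: one row per student, with the school's
# randomized condition (1 = hybrid program, 0 = business as usual).
df = pd.read_csv("students.csv")  # columns: school_id, treatment, plan_score

# Two-level model: students (level 1) nested within schools (level 2),
# with a random intercept for each school.
model = smf.mixedlm("plan_score ~ treatment", data=df, groups=df["school_id"])
result = model.fit()

# The coefficient on `treatment` estimates the intent-to-treat effect of
# assignment to the hybrid program on the math outcome.
print(result.summary())

Because schools, not students, were randomized, the treatment indicator is a school-level predictor, and the random intercept absorbs school-level clustering in the outcome.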
Across the ERO, CLC, and Kentucky Virtual Schools Hybrid studies, evaluators cited poor implementation and fidelity as the most likely reason for small effect sizes or null findings. ERO evaluators cited a number of implementation issues in year one that led to
long delays in enrolling students in ERO classes. Similar to RTI evaluations in middle schools,
CLC evaluators cited challenges to implementing the intervention’s structural components and
highlighted teacher training as needing significant improvement. For example, in year one of
CLC, roughly 75% of experimental schools implemented five or fewer of nine structural
components at an adequate level or better. In the case of the Kentucky hybrid intervention, evaluators cited poor fidelity of implementation across schools, low levels of engagement and participation among teachers and students assigned to the treatment, the voluntary nature of the study, and lack of generalizability beyond Kentucky. Indeed, past literature has
provided evidence of important potential barriers to the implementation of instructional
improvement models, including lack of model specificity, inconsistent policies across the school
or district, and student, teacher and leadership mobility (Berends, Bodilly, & Kirby, 2002;
Desimone, 2002).
Taken together, studies reviewed suggest that instructional improvement models must
address a number of existing limitations, most notably: (1) the need for clear theories of change
that incorporate increased attention to structural components of the instructional improvement
models that support reform efforts at all relevant setting levels (e.g., district, school, classroom,
teacher, and student), (2) implementation and fidelity challenges that may threaten the ability of
interventions to produce theorized meaningful and sustainable impacts for targeted populations,
and (3) the need to adopt appropriate and rigorous evaluation methods that take into account the
complex nature of whole school reform efforts (e.g., nested data structure, estimating impacts at
multiple ecological levels, estimating causal impacts, etc.).
Professional Development to Change Teacher Practices and Increase Student Achievement
At its core, the Every Classroom, Every Day model seeks to change instructional
practices in ways that are likely to benefit student learning, while carefully addressing several of
the critical shortcomings of past school reform efforts. Consistent with the literature reviewed
above, ECED has multiple components including use of instructional coaches, summer institutes,
classroom observations with feedback, multiple half-day professional development sessions, a
literacy curriculum, and restructuring of the pace, sequence, assessment, and grading of math
courses. Professional development, traditionally thought of as workshops, college courses, and
study groups for teachers, is now defined more broadly to encompass any activity aimed at
improving instruction or teachers’ skills and knowledge (Desimone, 2009). Using this broad
definition, ECED is a professional development model.
Even with this ever-expanding definition, several authors have argued that the field of
education has reached a near consensus regarding the key aspects or critical features of high
quality professional development that are likely to result in change in instructional strategies and
increased student learning (Darling-Hammond, Wei, Andree, Richardson, & Orphanos, 2009;
Desimone, 2009; Elmore, 2002; Garet, Porter, Desimone, Birman, & Yoon, 2001). These authors
have argued that the form of the professional development (e.g., workshop versus coaching) is
less important than the extent to which it embodies these critical features. The exact names and
framings of the critical features vary a bit from author to author, but all include the same core
ideas. For the current exploration, we will use the names assigned by Desimone (2009). She
makes a cogent case that the consensus surrounding these critical features is based both on
research and the experience of experts in the field. The five critical features according to
Desimone are: (1) content focus, (2) active learning, (3) coherence, (4) duration, and (5)
collective participation. Importantly, the Every Classroom, Every Day intervention evaluated in
this report strives to embody each of these key components of effective professional
development that together are thought to lead to changes in teacher practice and, ultimately, to
changes in student learning.
Content focus. Effective professional development is focused on extending and
intensifying teacher knowledge of a subject area, such as math, science, or English Language
Arts (ELA), and how children learn that specific content (Garet et al., 2001). Darling-Hammond
and colleagues (2009) argued that effective professional development should emphasize
“concrete, everyday challenges involved in teaching and learning specific academic subject
matter” (p. 10). Content focused professional development stands in contrast to more general
efforts to improve instruction via discussion of pedagogy that is not tied to specific content,
abstract educational principals, or non-content issues, such as classroom management, which
tend to be less effective.
Active learning. This critical feature refers to the extent to which teachers engaged in the
professional development have opportunities to work directly with the concepts, as opposed to
passively listening as material is presented. In effective professional development, teachers are
engaging with and analyzing the material through activities such as reviewing student work,
discussing a videotaped lesson, and working in groups to apply the information to their own
practice (Garet et al., 2001). Active learning stands in contrast to simply listening to a lecture or
reading a book.
Coherence. Teachers experience a wide range of professional development activities.
Coherence refers to the extent to which a professional development activity is aligned with other
professional development activities in which the teacher is participating and with the school and
district’s standards and culture. Professional development will not engender changed instruction
or improved student outcomes if teachers’ various professional development activities contradict
one another or if school and district administrators do not support the types of changes
encouraged by the professional development (Darling-Hammond et al., 2009).
Duration. As with all learning that promotes change in behavior, high quality
professional development must be sustained over time (Darling-Hammond et al., 2009). Longer
activities can provide more time to explore new ideas in-depth and multiple sessions devoted to
related concepts allow teachers to practice and discuss their experiences and receive feedback
(Desimone, 2009; Garet et al., 2001). Further, the cognitive psychology literature indicates that
repeated exposure, as well as a requirement to actually reproduce or use the new material,
optimizes memory and application (Roediger, Agarwal, McDaniel, & McDermott, 2011).
Collective participation. High quality professional development promotes professional
communication and collaboration among teachers by including all teachers (from a department,
school, or district). Collective participation allows teachers to support one another in changing
practice and sustaining that change by providing a shared set of goals and language. The call for
collective participation is based on the idea that educators will learn better if they are working in
concert with others who are facing the same issues and that learning is essentially a collaborative
process (Elmore, 2002). Additionally, teachers from the same setting often share curricula and
students, making it easier to apply the professional development to their specific setting (Garet et
al., 2001). Unfortunately, districts often undermine this critical feature by creating an array of
professional development opportunities from which teachers select the ones that appeal to them.
Thus, sessions may include teachers from many different schools, grades, and school cultures.
Such a system weakens the experiences of all teachers by preventing them from working toward
a common set of goals.
Every Classroom, Every Day (ECED): The Model
Every Classroom, Every Day (ECED) was designed to provide 9th- and 10th-grade
English/Language Arts (ELA) and math teachers, as well as instructional leaders, with two years
of intensive professional development and curricular support, using tools and processes
developed by the Institute for Research and Reform in Education (IRRE). In keeping with the
findings from the meta-analyses reviewed above (Rakes et al., 2010; Slavin et al., 2008; Slavin et
al., 2009) and the professional development literature (Desimone, 2009), ECED employs a broad
range of strategies and components, including instructional coaches, professional development
sessions that are content-focused and encourage active participation, and curricular and
assessment support to improve instruction. All literacy and math teachers within intervention
schools take part in the same activities, with a relatively long duration compared to most
interventions reviewed here (viz., two years). Coherence is created by a continual focus on three
instructional goals: (1) Engagement of all students in their learning; (2) Alignment of what students are being asked to learn with state and national standards and high-stakes assessments; and (3) Rigor in how all students are taught and the level of content all students are being asked to
learn. These instructional goals – referred to as EAR – form the core of all of the ECED training,
coaching supports, curricula, and instructional tools and processes.
ECED has three major components: EAR Classroom Visit Protocols, ECED Math, and
Literacy Matters. This section describes each component as IRRE intended it to be
implemented in the treatment schools in the ECED Efficacy Trial. Of course, there was variation
in the implementation at the schools that participated in the evaluation. Later in this report, we
describe that variation, both quantitatively and qualitatively.
EAR Classroom Visit Protocol. The EAR Classroom Visit Protocol is a cornerstone of
the ECED process. It is designed for use by school and district personnel, as well as technical
assistance providers, to equip districts, schools, departments, instructional coaches, and teachers
with up-to-date information about the quality of teaching and learning. It provides schools and
districts with a common language for discussing and promoting high-quality instruction. See
Early, Rogge, and Deci (2013; which appears in Appendix 2) for a description of the protocol
and its psychometric properties.
The EAR Classroom Visit Protocol is a 15-item observational tool completed by trained
observers during and following a 20-minute observation. Typically, teachers receive multiple 20-minute classroom visits across the school year to gain a full picture of instruction and student learning in their classroom(s). All users of the EAR Protocol must complete a rigorous set of trainings before using the tool. Data are uploaded to a central server that provides reports at
different levels (teacher, subject area, department, grade, school, district) for use in professional
development and reflective conversations with individuals and teams of teachers.
Classroom visitors use two items on the protocol to assess engagement: one measures the
percentage of students who are on-task and the second measures the percentage of on-task
students who are actively and intellectually engaged in the work. The first item is scored based
entirely on observations of students at work. The second item is scored using a combination of
observations of students at work and, when possible, brief conversations with some students. The
conversations, which take place only if they will not disrupt the class, include questions such as
“What does your teacher expect you to learn by doing this work?” and “Why do you think the
work you are doing is important?” To assess alignment, classroom visitors make eight binary
judgments about whether the learning materials, learning activities, expectations for student
work, and students’ class work reflect relevant federal and state standards and high stakes
assessments, as well as designated local curricula. Rigor is assessed with five indicators (three
binary, two percentages) that relate to both the cognitive level of the material and student work
expected and the extent to which students are required to demonstrate mastery of the content.
Items concern whether the learning materials and student-work products are appropriately
challenging, whether students are expected to meet/surpass state standards, and whether all
students have an opportunity to demonstrate proficiency and be retaught material to master.
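To make the structure of the instrument concrete, the sketch below models the 15-item layout just described (two engagement percentages, eight binary alignment judgments, and five rigor indicators) as a simple Python data structure. The field names and the alignment roll-up are our own illustrative assumptions, not the protocol's actual scoring rules.

    # Illustrative sketch only: a data structure mirroring the EAR Protocol's
    # 15-item layout. Names and aggregation are assumptions, not IRRE's scoring.
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class EARVisit:
        # Engagement: two percentage items (0-100)
        pct_on_task: float              # % of students on-task
        pct_actively_engaged: float     # % of on-task students intellectually engaged
        # Alignment: eight yes/no judgments about standards and curricula
        alignment_items: List[bool] = field(default_factory=list)    # len == 8
        # Rigor: three yes/no items plus two percentage items
        rigor_binary: List[bool] = field(default_factory=list)       # len == 3
        rigor_percentages: List[float] = field(default_factory=list) # len == 2

        def alignment_score(self) -> float:
            """Proportion of the eight alignment judgments marked 'yes'."""
            return sum(self.alignment_items) / len(self.alignment_items)

    visit = EARVisit(
        pct_on_task=85.0,
        pct_actively_engaged=60.0,
        alignment_items=[True] * 6 + [False] * 2,
        rigor_binary=[True, True, False],
        rigor_percentages=[70.0, 55.0],
    )
    print(f"Alignment: {visit.alignment_score():.0%}")  # -> Alignment: 75%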
Training. At the start of the first year of a school’s participation in ECED, instructional
leaders—including school-level instructional supervisors, math and literacy coaches, teacher
leaders, and district-level instructional staff—engage in four days of training to develop common
definitions and understanding of engagement, alignment, and rigor and to learn how to use the
EAR Classroom Visit Protocol. These trainings ensure that district efforts to improve instruction
in the participating schools, as well as the activities provided by IRRE, are being viewed through
the same lens. These trainings also engage instructional staff in action planning around data
emerging from EAR Classroom Visits, planning that helps shape later ECED activities. The
initial training consists of: (1) two full days of group instruction, including several classroom
visits followed by scoring discussions, (2) a two to three week window during which those
participating in the training make at least 10 practice visits as teams to calibrate their scoring,
and (3) two additional full days of group instruction focusing on calibration and use of the
subsequent data for instructional improvement. At the start of the second year of ECED
implementation, IRRE provides a 2-day condensed EAR Classroom Visit Protocol refresher
training to participating schools. In addition to these trainings for instructional leaders, all
instructional staff members in ECED treatment schools receive a 90-minute orientation to the
EAR Classroom Visit Protocol to build their awareness and begin building school-wide common
definitions of engagement, alignment, and rigor.
Use of EAR Protocol in ECED. Upon completion of the EAR Classroom Visit Protocol
training, participants visit at least five classes per week, using the EAR Protocol. Assuming that
five individuals are trained in each school and 28 weeks are available for visits each year, 700
EAR visits are made in each ECED treatment school each year. This tool is meant to monitor and
improve instruction throughout the school, not just in literacy and math classes, so schools are
encouraged to use the protocol in all subject areas. Data from these visits are uploaded to the
secure server and used to generate reports about the state of teaching and learning. Those reports
are a key source of information for ongoing teleconferences between school leaders and the
IRRE consultant, IRRE site visits, and school-level discussions about improving teaching and
learning. Further, this tool provides the entire school with a common lens and language for
identifying and discussing high quality instruction. (Note: As detailed in the Method chapter, in
order to assess changes in instruction, individuals unrelated to the schools made EAR Protocol
Visits to ELA and math classes in both treatment and control schools. Those visits were for
purposes of the evaluation only. Those data were not uploaded to the servers accessed by the
treatment schools and were not included in the data reports used for improving instruction.)
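The expected visit volume cited above is straightforward arithmetic; a quick check follows (the variable names are ours):

    # Sanity check of the stated visit volume, under IRRE's assumptions.
    trained_visitors_per_school = 5   # instructional leaders trained per school
    visits_per_visitor_per_week = 5   # each trained visitor makes >= 5 visits/week
    weeks_available = 28              # weeks available for visits each year

    annual_visits = (trained_visitors_per_school
                     * visits_per_visitor_per_week
                     * weeks_available)
    print(annual_visits)  # -> 700 EAR visits per treatment school per year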
ECED Math. The math component of ECED is not a curriculum but a system for
delivering math instruction and assessing student progress that is specifically targeted to state
and national math standards. Additionally, ECED math involves a school-based math coach and
site visits from IRRE that include professional development for teachers and coaches. ECED
Math was based on the work of James Henderson and Dennis Chaconas in Kansas City, Kansas
in the late 1990s and in Kansas City, Missouri starting in 2009. Two high schools in Kansas City,
Missouri using the math benchmarking system reported an increase of more than 23 percentage
points in the percentage of students scoring proficient or higher on their Algebra 1 test after three
years of implementation (Robertson, 2013). In keeping with the recommendations from meta-analyses conducted by Slavin and colleagues (2009) and Rakes and colleagues (2010), ECED
Math involves multiple strategies for reorganizing instruction and maintaining student
engagement, with a focus on mastery learning and improved assessment strategies, as well as
changing teacher and student attitudes toward mathematics. ECED Math could be used in any or
all math courses, but treatment schools participating in the ECED Efficacy Trial used it only in
Algebra 1 and Geometry classes. These are the classes in which the highest proportions of 9th- and 10th-graders are typically enrolled.
Organization of instruction and assessment. ECED Math begins with IRRE consultants
working with Algebra 1 and Geometry teachers and math coaches from treatment schools to
identify key standards or outcomes that their students must be able to demonstrate on high-stakes
accountability measures, such as state-mandated testing programs, the ACT, and the SAT, and
to be successful at the next course level. Once these teams of teachers make critical decisions
about what students must know and be able to do, IRRE consultants continue to assist with
prioritizing and grouping those standards into meaningful sequences of skills and units of study
referred to as benchmarks. Rather than using textbooks to determine the day-to-day classroom
activities, instruction in ECED Math courses is focused on a specific benchmark, phrased in
student-friendly terms called “I Can…” Statements. After each lesson or unit of instruction, each
student should be able to say, for example, “I Can…find the slope of a line,” or “I Can…solve
quadratic equations.” Once the initial curriculum work is completed, the teacher teams, with
ongoing support from IRRE consultants and local math coaches, design and develop a pacing
guide to ensure that all standards/benchmarks are addressed within a timeframe that allows
maximum exposure to the standards and testing formats prior to the state or district testing
window(s).
To ensure that all students have mastered the key standards (i.e., benchmarks) identified
by the teacher teams, the Algebra 1 and Geometry teachers develop a series of assessments that
are administered to students at the end of instruction around a particular “I Can…” statement. In
addition to the built-in opportunities teachers have to check each student’s understanding
throughout each lesson, these short (five-question) benchmark assessments provide a
culminating check for mastery that is aligned with the original set of standards/benchmarks.
Students must achieve mastery of at least 80% on each “benchmark assessment” (i.e., answer
four of the five items correctly). In addition, students are asked to show mastery again by taking
“capstone” assessments, which integrate a small number of related individual benchmarks into a
coherent application of logically related concepts and skills. These capstone assessments might
also include performance items that move beyond the selected-response format common to most
classroom chapter or unit tests, and provide exposure and practice to the types of assessments
required in high stakes testing programs. Benchmark assessments are typically given every three
to five days and capstone assessments are typically given every two to three weeks.
Students do not receive “credit” for a benchmark until they demonstrate “double
mastery”: first on the benchmark assessment and a second time on the capstone assessment.
When students do not master one or more benchmarks, the classroom teacher is expected to give
students corrective instruction, using small group or individual instructional strategies. A second
form of the benchmark assessment is administered by the classroom teacher for students who did
not master the original benchmark assessment. Evidence of student achievement is posted
publicly in the classroom in the form of a chart that indicates which benchmarks each student has
mastered. Additionally, students have cards indicating which benchmarks they have mastered so
that their other teachers and parents are aware of their progress and can encourage them to
master the needed benchmarks. This restructured instruction and assessment clearly defines the
teachers’ roles and responsibilities and delineates the skills that must be taught, all in keeping
with clear state and district accountability standards, as supported by the work of Heller and
Greenleaf (2007) and Quint (2006).
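The mastery rules described above reduce to two simple tests, sketched below in Python. The function names are illustrative; ECED Math specifies these rules in its materials, not in code.

    # Minimal sketch of the mastery rules described above; names are ours.
    BENCHMARK_ITEMS = 5        # each benchmark assessment has five questions
    MASTERY_THRESHOLD = 0.80   # mastery requires at least 80% (4 of 5 correct)

    def mastered(items_correct, items_total=BENCHMARK_ITEMS):
        """True if the score meets the 80% mastery threshold."""
        return items_correct / items_total >= MASTERY_THRESHOLD

    def has_credit(benchmark_correct, capstone_mastered):
        """'Double mastery': credit requires passing the benchmark assessment
        AND demonstrating the benchmark again on a capstone assessment."""
        return mastered(benchmark_correct) and capstone_mastered

    print(mastered(4))                             # True: 4/5 = 80%
    print(mastered(3))                             # False: 3/5 = 60%; triggers corrective instruction
    print(has_credit(5, capstone_mastered=False))  # False: no credit until the capstone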
Grading. One of the key features of the ECED Math program is the implementation of
mastery grading. Students are graded solely according to the number of benchmarks on which
they demonstrate double-mastery during the grading period. For example, if the district
determines that 90% is an A, then the student must demonstrate double mastery of at least 90%
of the benchmarks to receive that A. Individual schools can decide what percentage of
benchmarks a student must master for each grade, but ECED Math requires that the grades be
based solely on the percent of benchmarks mastered. Until students master enough benchmarks
to attain a C (typically 70%) they are assigned an I (incomplete). If a student still has an I when
the grading period ends, that I appears on his/her report card. Students are given multiple
opportunities, described below, to change that I to a C or higher. If all opportunities are
exhausted prior to attaining a C, the I is changed to an F.
This grading system represents a major shift in thinking for secondary math teachers,
students, and parents for two reasons. First, scores on different assessments are not averaged
together. So, a student who consistently answers 70% of the assessment items correctly will not
master any benchmarks and will not pass the class, even though that same student might have
received a C under more standard grading practices. The second major shift is that only
benchmark mastery—not homework, classwork, effort, or behavior—is used to determine
student grades. Some teachers find it challenging to motivate students when these other factors
are not part of their grade; however, ECED Math is based on the concept that only mastering the
content is truly important for determining success in math. IRRE provides multiple supports for
teachers (see below) as they make these significant changes.
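Because mastery grading departs so sharply from averaging, a small sketch may help. The 90% (A) and 70% (C) cutoffs below are the examples given in the text; the 80% (B) cutoff is our assumption, since districts set their own thresholds.

    # Hedged sketch of mastery grading as described above; cutoffs other than
    # the text's 90% (A) and 70% (C) examples are assumptions.
    def mastery_grade(double_mastered, total_benchmarks):
        """Grade depends solely on the percent of benchmarks double-mastered."""
        pct = double_mastered / total_benchmarks
        if pct >= 0.90:
            return "A"
        if pct >= 0.80:
            return "B"   # assumed cutoff; districts choose their own
        if pct >= 0.70:
            return "C"
        return "I"       # Incomplete; becomes an F only after all retake
                         # opportunities (described below) are exhausted

    print(mastery_grade(18, 20))  # 90% -> 'A'
    print(mastery_grade(13, 20))  # 65% -> 'I'

The contrast with averaging falls out directly: a student who answers 70% of the items on every five-question benchmark assessment never reaches the 80% per-benchmark threshold, masters nothing, and is left with an I, even though averaging would have yielded a C.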
Supports for struggling students. As noted above, a student who does not master a
benchmark on the first attempt receives additional support in the classroom and is then given a
second form of the benchmark assessment. If the student again does not master the benchmark at
the 80% level, she or he can participate in the “Benchmark Café”, where math teachers provide
additional individual assistance and validate student mastery of the benchmark. The Benchmark
Café is a room in the school where students can go if they need additional support in mastering
their benchmarks. Typically, certified teachers will take turns staffing the Benchmark Café, and
each school establishes a schedule for when it is open to students (e.g., during lunches, before
and after school, during vacation, etc.). At times the Benchmark Café is staffed by instructional
aides, college math students, or more advanced high school math students. In addition to the
Benchmark Café, ECED Math requires that schools provide summer school where students
receive additional instruction and opportunities to show mastery in order to raise their grade or
change an I to a C. Again, the summer program is staffed by math teachers or other qualified
individuals.
Supports for ECED Math teachers. ECED Math requires significant changes in the daily
practice of math teachers. Such changes require support and feedback, which are provided by
summer institutes led by IRRE; by math coaches at each school who make regular classroom
visits, provide feedback and facilitate meetings on an on-going basis; and by professional
development sessions led by IRRE consultants throughout the school year.
Summer institutes. Supports for ECED Math begin during summer prior to the first year
of implementation. Teachers slated to teach ECED Math, along with their school’s math coach,
participate in three full days of curriculum mapping and common assessment training with
experienced IRRE consultants. Once these teams of teachers make critical decisions about what
students must know and be able to do, IRRE consultants continue to assist with prioritizing and
grouping those standards into the benchmarks, or “I Can…” statements, described above. During the
second summer, a two-day summer institute is held to refine the “I Can…” statements, as well as the
benchmark and capstone assessments, as needed. Newly hired teachers are introduced to ECED
math during this time.
Coaching and use of EAR Protocol. IRRE’s approach to instructional coaching is based
largely on the work of Joyce and Showers (2002). They describe their model for professional
development as one that fulfills teachers’ needs to learn a new skill or body of knowledge, to
experience that new learning, to practice it, to reflect on the practice and receive feedback, and
finally to participate in coaching around the new learning in the context of their own classrooms.
In order to participate in ECED, each school must agree to employ a math coach with at
least 50% full-time equivalent devoted to coaching the ECED Math teachers. One of the main
strategies by which coaches support teachers is through the use of the Engagement, Alignment,
and Rigor Classroom Visit Protocol described above. Coaches are trained to score this 15-item
tool, during and following a 20-minute classroom visit. The information is uploaded to a
centralized database that allows coaches and others at the school to create various reports at the
teacher, department, or school level. Each coach is expected to make at least five EAR Protocol
visits each week. Coaches use the information they gather during EAR Protocol visits to support
individual teachers through the use of reflective questioning and one-on-one meetings. Coaches
and IRRE consultants make EAR Visits together during site-visits and use the information to
plan professional development activities and target supports.
In addition to EAR Protocol visits, coaches facilitate weekly meetings with ECED Math
teachers. The time is used to discuss emerging issues around use of ECED Math, including
reflection on lessons taught, data discussions based on student mastery levels, preview and
modeling of upcoming lessons, and discussion of modifications for struggling students. Coaches
also provide targeted support to teachers through strategies such as co-teaching, demonstrations
of lessons and strategies, and arranging for teachers to visit one another’s rooms.
Professional development led by IRRE. As part of the supports that IRRE provides to
ECED schools, IRRE consultants make four site-visits to each school each year. The site visits
last three days, and during that time each ECED Math teacher participates in a half-day
professional development session. In order to be considered for participation in the ECED
Efficacy Trial, districts had to agree to pay the cost of substitute teachers while their teachers
took part in these professional development activities. The content of the professional
development is determined jointly by the IRRE consultants, math coaches, and school and
district administrators. This decision is informed by the EAR Protocol data that have been
uploaded to the server. Examples of topics include using relevance to support student
engagement, increasing expectations by creating an effective learning environment, and
embedding learning through reflection.
Supports for math coaches. Math coaches are trained by IRRE, starting in the summer
before the start of implementation, and supported throughout the project via conference calls that
take place every other week and four three-day site visits each year. During the site-visits, the
coaches make EAR Protocol visits with the IRRE consultants, debrief about what they saw and
what supports teachers need, discuss reflective coaching strategies, and plan how the coaches can
best support each teacher. Additionally, IRRE consultants model the coaching process by
working directly with teachers to support improved instruction while the coaches observe. IRRE
consultants also observe the coaches during a coaching session with a teacher and give the
coach feedback for enhancement. Each site visit ends with a meeting that includes the principal
and other members of the school leadership to discuss progress and plan for continued
improvement. The work of the site-visits is sustained and supported through regularly scheduled
conference calls between the IRRE consultants and the math coaches.
Links to EAR. IRRE intentionally designed ECED Math to focus on its three core
instructional goals: Engagement, Alignment and Rigor (EAR). ‘I Can…’ statements are intended
to make math engaging and personally relevant by showing students what skills they will acquire
during each lesson. Alignment is assured because each school creates its own pace and sequence
based on its state and local standards. The double mastery system, coupled with grading based
solely on mastery, ensures rigor by holding all students to the same high standards for
understanding. The frequent benchmark assessments and the larger capstone assessments provide
continual checks for understanding and ensure that the information is being incorporated into the
students’ larger knowledge base of mathematics.
Literacy Matters 2: ECED’s Literacy Component. The centerpiece of ECED’s
approach to literacy is a two-year curriculum delivered in a stand-alone class that supplements
the regular 9th-and 10th-grade English curriculum. Additionally, Literacy Matters involves a
school-based literacy coach and site visits from IRRE that include professional development for
teachers and coaches. This multi-pronged approach to supporting improved literacy instruction is
in keeping with the Slavin and colleagues’ (2008) meta-analysis findings that indicated that
instructional process programs that provide teachers with extensive professional development to
implement specific instructional strategies are most effective. Further, as supported in the
literature reviewed earlier, the curriculum itself employs mixed-methods of small and large
group instruction, (Slavin et al., 2008), clearly defines teacher roles and responsibilities and
skills that must be taught (Heller & Greenleaf, 2007; Quint, 2006), and employs a school-wide
model specifically aimed at enhancing reading comprehension (Lang et al., 2009).
Secondary students’ literacy skills – their ability to read, write, speak, and listen – form a
fundamental building block both for academic achievement in high school and for life-long
success. Many of the schools also face the challenge of building these skills with students for
whom English is not their primary language. To meet these pressing needs, ECED provides a
two-year, research-based, structured literacy curriculum that supplements traditional English
courses by using authentic, real-world, expository texts and engaging activities that provide
students with additional time and opportunities to foster and strengthen their literacy skills and
habits on a daily basis.
The first year of the curriculum, called Reading Matters 3, aims to strengthen students’
abilities to comprehend and gather information, helping students to identify ways to make
learning easier. Students refine critical thinking skills and learn how to work well with others
through activities such as debates, exploring career interests, and writing speeches to express the
change they want to see in the world. The curriculum includes four units, each of which is
dedicated to a different text genre: (1) Who are we? (persuasive), (2) Our footprints on society
(expository), (3) The Change we want to be (research/analytical/persuasive), and (4) Learning
with others (research/analytical/descriptive). Each unit supports students in answering the
question: “What is our personal role and responsibility in seeing the need for and creating
positive changes in society?” Teachers use a collection of interdisciplinary instructional
strategies, called the Power 10, that equip students with transferable skills for comprehending,
organizing, and remembering information in multiple disciplines. Students work collaboratively
in small groups and teams on a daily basis to share their thinking, expand their ideas, and reach
consensus, as well as to listen to and present information to others. The curriculum includes
seven project-based, rubric-driven assessments, so that students receive feedback regarding
mastery.
The second year of the curriculum, called Writing Matters 4, aims to strengthen students’
abilities to share and communicate information with others, helping students to identify ways to
express and personalize their knowledge. Using six traits of writing, instructional strategies, and
relevant topics, students explore what writing is and who writers are, the art of sharing
narratives, how to analyze the audience, and how to make their voices heard through oral,
written, and visual communication. The curriculum includes four units, each of which is
dedicated to a different text genre: (1): Who are writers? What do they do? (descriptive), (2) If
walls could talk, if hearts could speak (narrative), (3) The audience – Who is really listening?!
(analytical/persuasive), and (4) A call to action – Making our voice heard (research/persuasive).
Each unit supports students in answering the question: “In a culture of visible communication,
how can I increase my communication proficiency so that I can confidently, creatively, and
curiously explore, interact with, express, and make sense of self and society to impact the
world?” The curriculum embeds multiple activities and opportunities to reflect and determine
growth, including the development of a writer’s portfolio, so that students make connections to
the skills they are developing and their growing proficiency as writers. As with the first year of
the curriculum, teachers use the Power 10 strategies to equip students with skills for
comprehending, organizing, remembering, and communicating information in multiple
disciplines. The curriculum includes five project-based, rubric-driven assessments, allowing
students to receive feedback towards mastery.
As a prerequisite of participation in the ECED Efficacy Trial, schools had to agree to
enroll all of their 9th- and 10th-grade students in this supplemental literacy course. The only
exceptions were students in self-contained special education and “newcomers” whose English
skills were too low to be enrolled in the regular high school curriculum. In schools on a
traditional six- or seven-period schedule, this course was intended to meet for one period per day
for the entire school year. In schools on an accelerated four-by-four block schedule, it was
intended to meet for one block (approximately 90 minutes) per day for one semester. During the
first year of implementation, both 9th- and 10th-grade classes used the 9th-grade Reading Matters
curriculum, because none of the students had been exposed to the curriculum at that point.
During the second year of implementation, 9th-grade classes used the 9th-grade Reading Matters
curriculum, and 10th-grade classes used the Writing Matters curriculum. Students were also
expected to be enrolled in the regular 9th- or 10th- grade English/language arts course, again
either for one period per day all year or one block per day for one semester. Thus, the ECED
Literacy course effectively doubled the amount of ELA exposure each 9th- and 10th-grader
received.
Links to EAR. The Literacy Matters curriculum, like ECED Math, was designed with an
intentional focus on IRRE’s three core instructional goals: Engagement, Alignment, and Rigor
(EAR). Making the material personally relevant is a well-established path to encourage
engagement (National Research Council and the Institute of Medicine, 2004). Literacy Matters
uses texts and assignments that address personal responsibility, positive societal change, and
one’s personal impact on the world to ensure high personal relevance and engagement. IRRE
works with school districts and states to map the Literacy Matters curriculum onto their state and
local standards, ensuring all standards are met and supported. Rigor is ensured through
appropriately challenging texts, while the Power 10 strategies and assessment rubrics provide
students and teachers with on-going information about mastery.
Supports for Literacy Matters teachers. When schools start participating in ECED, the
literacy curriculum is new for all teachers and many of the Power 10 strategies are also new.
Thus, teachers require significant support for successful implementation. As with ECED Math,
supports for teachers using the Literacy Matters curricula come in the form of summer institutes
led by IRRE; ELA coaches at each school who make regular classroom visits, provide feedback,
and facilitate meetings; and professional development sessions led by IRRE consultants
throughout the school year.
Summer institutes. Supports for Literacy Matters begin during summer prior to the first
year of implementation. Teachers who will be teaching ECED Literacy, along with the school’s
literacy coach, participate in three full days of ECED Literacy introduction, modeling, and
practice, with experienced IRRE consultants. During the second summer, a two-day summer
institute introduces the second year of the curriculum and introduces newly hired teachers to
ECED Literacy.
Coaching and use of EAR Protocol. Literacy Matters’ coaching and use of the EAR
protocol closely parallel ECED Math, based largely on the work of Joyce and Showers (2002).
As part of the agreement to take part in ECED, each school must employ a literacy coach with at
least 50% full-time equivalent devoted to coaching ECED literacy teachers. Literacy coaches are
trained to use the Engagement, Alignment, and Rigor Protocol and use the information they
gather to support individual teachers using reflective questioning and one-on-one meetings. Each
coach is expected to make at least five EAR Protocol visits each week. Also, coaches and IRRE
consultants make EAR Visits together during site-visits and use the information to plan
professional development activities and to target supports.
As with ECED Math, literacy coaches also facilitate weekly meetings with ECED
Literacy teachers. The time is used to discuss emerging issues around use of the curriculum,
including reflection on lessons taught, preview and modeling of upcoming lessons, and
discussion of modifications for struggling students. Coaches also provide targeted support to
teachers through strategies such as co-teaching, demonstrations of lessons and strategies, and
arranging for teachers to visit one another’s rooms.
Professional development led by IRRE. As part of the supports that IRRE provides to
ECED schools, IRRE consultants make four site-visits to each school each year. The format of
these visits parallels the ECED Math site visits. They last three days, and during that time each
ECED Literacy teacher participates in a half-day professional development session. The exact
content of the professional development is determined jointly by the IRRE consultants, literacy
coaches, and school and district administrators. This decision is informed by the EAR Protocol
data that have been uploaded to the server. Examples of topics include using relevance to support
student engagement, increasing expectations by creating an effective learning environment, and
embedding learning through reflection.
Supports for literacy coaches. The supports for literacy coaches are similar to those for
math coaches. They are trained by IRRE, starting in the summer before implementation begins,
and supported throughout the project via conference calls and four site-visits each year. During
the site-visits, the coaches make EAR Protocol visits with the IRRE consultants, debrief about
what they saw and what supports teachers need, discuss reflective coaching strategies, and plan
how the coaches can best support each teacher. IRRE consultants model the coaching process by
working directly with teachers to support improved instruction, while the coaches observe. IRRE
consultants also observe the coaches during a coaching session with a teacher and give the coach
feedback for enhancement. Each site visit ends with a meeting that includes the principal and
other members of the school leadership to discuss progress and next steps. The work of the site-visits is sustained and supported through regularly scheduled conference calls between the IRRE
consultants and the literacy coaches.
Hypotheses
In order to evaluate the ECED approach to instructional improvement described above, a
school-clustered randomized trial was conducted in which 10 schools were randomly assigned to
receive all ECED support for two years and 10 were assigned to a ‘business as usual’ control
condition. The primary hypothesis was that student achievement in math and ELA, as measured
by standardized test scores, would increase as a result of participation in ECED. However, we
see achievement as related to many other aspects of education and hypothesized that other parts
of the achievement equation would be affected by ECED as well. Thus, we have six main
hypotheses driving the methods and analyses. They are presented here in the order in which they
are addressed in this report:
1. Participation in ECED will improve teachers’ attitudes toward their school and work,
including the extent to which they feel supported by their school and district
administrators, their self-reported engagement in teaching, and the extent to which they see
their colleagues as supportive.
2. Participation in ECED will increase engagement, alignment, and rigor, as measured by
the EAR Classroom Visit Protocol.
3. Students’ attitudes toward school (i.e., self-reported engagement, experience of teacher
support, perceived competence) will improve as a result of participation in ECED.
4. ECED will lead to increases in student achievement, as measured by math and ELA
standardized tests.
5. Student academic performance and commitment, as measured by grade-point average
(i.e., performance), attendance, and progress toward graduation (i.e., commitment), will
improve as a result of participation in ECED.
6. Within the ECED treatment schools, the extent to which the ECED components were
implemented as intended by IRRE will be associated with greater changes in teacher
attitudes, EAR Protocol scores, student achievement, student attitudes, and student
performance and commitment.
III. Method
Study Design
The Every Classroom, Every Day Evaluation is a school randomized field trial examining
the efficacy of a high school instructional improvement intervention. The intervention was
created and administered by the Institute for Research and Reform in Education (IRRE) and the
evaluation was conducted by a team of researchers through a US Department of Education,
Institute of Education Sciences grant to the University of Rochester. Twenty high schools (5
districts, 4 schools per district) were randomly assigned to either the treatment (n = 10) or control
(n = 10) condition, with two high schools from each district being assigned to each condition. 5
The five districts were spread across four states: two in California, one in Arizona, one in
Tennessee, and one in New York. Each school participated for a two year period. Schools in the
first recruitment group (n = 8) participated in 2009-10 and 2010-11. Schools in the second
recruitment group (n = 12) participated in 2010-11 and 2011-12. Five primary types of data were
collected. (1) Student surveys, (2) teacher surveys, and (3) classroom observations using the
Engagement, Alignment, and Rigor (EAR) Classroom Visit Protocol were collected in four
waves, once near the start and once near the end of each academic year that the school
participated. (4) Variation-in-implementation interviews with math and literacy coaches and/or
department chairs and the school principal or assistant principal, and (5) student records were
collected twice, once at the end of each academic year that the school participated. Each type of
data is discussed in
detail later in this report. Treatment schools received all ECED supports (e.g., site visits from
IRRE consultants, curriculum materials) free of charge 6. Control schools were given a $10,000
honorarium ($5,000 per year) to thank them for their participation in the data collection
activities.
This study sought to address many of the design concerns raised in the three meta-analyses regarding improving classroom instruction. The intervention and data collection lasted
two years. Although many school reformers might argue that this is still too short to see
meaningful change, it is considerably longer than most past work in this area. Additionally, there
are 10 schools in each group, spread across five districts and four states. This is a relatively large
and diverse sample compared to other randomized field trials of school improvement models,
allowing us to feel confident that the results can generalize to a wide variety of schools and
settings. Most importantly, schools were randomly assigned to condition, allowing for the
strongest form of causal inference.
School Selection
School recruitment. Initially, the goal was to include only schools that enrolled at least
220 9th-graders and had a minimum of 30% of students eligible for free/reduced price lunch. To
be considered for participation, a district needed to include at least four high schools that met the
recruitment criteria and were interested in participation.
As a first step in recruiting, the Common Core of Data (US Department of Education)
was used to create a list of all districts in the country that met the criteria for inclusion. This list
included over 150 districts throughout the US. Districts where IRRE or the research team had
personal contacts were contacted directly, generally via telephone. The remaining districts
received an email message describing the study, a letter containing a one-page description about
the study, and/or a phone call. Additionally, a letter was sent to each state superintendent of
education with at least one eligible district, outlining the study and encouraging them to contact
districts within their state that might benefit from participation.
Follow-up contact was made with districts that expressed interest. This generally took the
form of several phone calls between IRRE leadership, the research team, and the district’s
leadership. If the district remained interested, a site-visit was conducted. Each site visit involved
one member of IRRE’s leadership and one member of the research team. Site visits included a
meeting with the principal from each school considering participation to explain ECED’s
instructional improvement model and the research requirements (including the random
assignment), visits to at least two schools to see classrooms and meet a larger group of school
leaders, and a meeting between the research team member and the district’s research director.
After the site visit, interested districts signed a memorandum of understanding (MOU) outlining
all implementation and research requirements, including the random assignment procedures. A
sample MOU appears in Appendix 3. Following the recommendations of the Consolidated
Standards of Reporting Trials (CONSORT; Schulz, Altman, & Moher, 2010), a flow diagram indicating the
numbers of schools identified during each step of recruitment appears in Appendix 4.
Random assignment. So that districts and schools could be certain that a truly random
process was used to assign schools to the treatment versus control condition, the drawing was
broadcast via webcam. It involved placing four slips of paper in a bowl, each bearing a different
school’s name, and then having an individual unaffiliated with the project pull out the two
that would participate in the treatment condition. District and school leadership teams were
invited to watch. Random assignment took place as soon as the MOU was signed, generally in
the spring prior to the start of the district’s participation in the study. 7
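The drawing itself used slips of paper in a bowl, but its logic is equivalent to randomly sampling two of a district's four schools for treatment, as the illustrative Python sketch below shows (school names are placeholders):

    # Illustrative only: equivalent logic to the in-person drawing described above.
    import random

    def assign_district(schools):
        """Randomly select two of a district's four schools for treatment."""
        treatment = random.sample(schools, 2)
        control = [s for s in schools if s not in treatment]
        return treatment, control

    district = ["School A", "School B", "School C", "School D"]
    treatment, control = assign_district(district)
    print("Treatment:", treatment)
    print("Control:", control)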
Characteristics of participating schools. Based on data from the Common Core of Data
(CCD) during the year the school began participating (2009-10 for the first recruitment group
and 2010-11 for the second recruitment group), the schools were generally large, with an average
enrollment of over 1,300 (SD = 690; median = 1,151), but the range of school sizes was also
large (156 to 2,553). 8 As described above, we sought to include only schools with over 220 9th-graders. The average number of 9th-graders was over 350, but there were five schools that fell
below the 220 threshold. This was largely due to the open enrollment policies in several districts
through which students could choose to attend any school in the district, making it difficult for
districts to predict the following year’s enrollment. The racial/ethnic composition of the schools
varied widely. In ten schools over half the students were Latino, in five schools over half were
African American, and in three schools over half were White. In the remaining two schools there
was no racial/ethnic group that made up over half of the student body. On average, 70% of
students in the participating schools were eligible for free or reduced price lunch (FRPL) and
schools ranged from 46% to 98% FRPL (median = 69%). There was a wide range of
pupil/teacher ratios from 8.8 to 26.1 (mean = 18.3, median = 19.7, SD = 5.3).
Most of the schools (13 out of 20) served the traditional high school grades of 9th through
12th. Five schools were combined middle and high schools, serving either 6th- through 12th-grades 9 or 7th- through 12th-grades 10. Two schools—one treatment and one control, 11 from the
same district—opened their doors for the first time the year their participation in ECED began.
Those schools enrolled only 9th-graders during their first year of operation. During their second
year they enrolled 9th- and 10th-graders and were slated to add a grade each year until they were
serving 9th- through 12th-grade. Half of the schools were located in mid-size cities (n = 10; 5
treatment, 5 control). The remainder were in large suburbs (n = 5; 2 treatment, 3 control), large
cities (n = 3; 2 treatment, 1 control); or fringe rural areas (n = 2; 1 treatment, 1 control) (US
News and World Report, 2013).
Site attrition. Two treatment schools stopped their participation part way through the
project. The first 12 stopped its participation in the ECED supports after the first semester. Staff
buy-in was low at that school from the beginning, especially among teachers in the math
department. Further, the district superintendent who had originally supported ECED and
encouraged the schools to participate was let go during that first semester. That school did not
take part in data collection in the spring of the first year (Wave 2) or the fall of the second year
(Wave 3), but did participate in the fall of the first year (Wave 1) and the spring of the second
year (Wave 4). In Wave 4, the school administered surveys to 10th-grade students and allowed
EAR Classroom visits in 10th-grade rooms only, because 9th-grade students and teachers had had
no exposure to ECED. Additionally, in Year 2 school personnel participated in the variation-in-implementation interviews. They had not taken part in those interviews at the end of Year 1, but
did provide information about Year 1 during the Year 2 end-of-year interviews. The district
provided student records for both years for that school.
The second school to leave the project 13 participated for one year and then pulled out of
the ECED supports as the second year got underway. The school had not implemented the
reform well in Year 1 and the teachers had been unhappy that they were asked to participate in
this reform effort. That school had three different principals in the first year it participated and
was assigned a fourth new principal as the second year started. The new principal was not
interested in taking part in the supports. In the second year (Waves 3 and 4), the new principal
allowed the research team to collect student surveys, but he did not allow teacher surveys, EAR
Protocol visits, or interviews. The district did provide student data for both years at this school.
Leadership turnover. The participating schools and districts experienced a very high
level of leadership turnover during the course of the study. From the time we began working
with them in the spring prior to their first year of participation to the end of the second year of
participation, the superintendents in four of the five participating districts left their positions. The
fifth kept his position throughout the project, but announced his resignation toward the end of
the second year, after having serious health problems during that year. At the
principal level, there were one or more changes in principal in 11 of the 20 schools (six
treatment, five control) between the time the school was recruited into the study and the end of
their second year of participation. For more specifics on leadership changes and other
disruptions, see Chapter VII.
Study Teachers
Study teachers are individuals who taught a target course (Algebra 1, Geometry, 9th-Grade English, 10th-Grade English, or ECED Literacy) at any point during the four terms that
their school participated. 14 Demographic information about teachers was collected from both
district records and teacher surveys. Between these two sources we have demographic
information for between 74% and 95% of teachers, depending on the variable. Table 1 provides
demographic information for all study teachers, separated by subject taught and condition.
Math study teachers. Math study teachers are individuals who taught Algebra 1 or
Geometry at any point during the four terms that their school participated in the study. There
were 238 math study teachers in all; 116 in treatment schools and 122 in controls. 15 16 As seen in
Table 1, roughly half of math study teachers were male. Over half of math teachers were White,
with the non-White teachers roughly equally split among Latino, Black, and Asian. In general,
math teachers in the study had considerable teaching experience, with about two-thirds having more
than 6 years of experience. There were no significant demographic differences between math
teachers in treatment versus control schools. Turnover was high among math teachers in the
study. As seen in Table 1 (in the section called “Study semesters teaching at the school”) only
57% of math teachers at treatment schools and 62% of teachers at control schools taught at the
school during all four terms of the intervention. The remainder started after the intervention
began, left before the intervention ended, or both.
In addition to many teachers leaving or joining the target schools during the two years of
the study, a fairly high number of teachers remained at a target school throughout the four terms
their school participated, but only taught target math courses some terms. Thus, considering
whether study math teachers were teaching target math courses of Algebra 1 and Geometry in all
terms, turnover was even higher. The average number of terms a math study teacher taught a
target course was less than three and just over one-third taught a target math course during all
four terms that their school participated. This represents a combination of teachers leaving the
school and changes in teachers’ assignments. For instance, the fact that 57% of teachers in
treatment schools were at the school all four terms, but only 36% taught a target course all four
terms, means that 21% of teachers were present at the school but were not assigned to teach a
target course during at least one term. Both teacher turnover and teacher reassignment have
potentially major impacts on the intervention’s success. Teachers who were not teaching a target
course during a particular term did not participate in the ECED supports that term and therefore
we might anticipate that they benefitted less from the supports than they would have with four
terms of exposure. Similarly, teachers who were assigned to target courses after the first study
term would not have received the full training and supports and would not have participated in
the summer institute during which the ‘I Can…’ statements were created and the benchmark
assessments drafted.
English Language Arts (ELA) study teachers. ELA study teachers are individuals who
taught 9th- and/or 10th-Grade English 17 or ECED Literacy at any point during the four terms that
their school participated in the study. There were 298 18 19 ELA teachers in all; 166 in treatment
schools and 132 in control schools. As seen in Table 1, about two-thirds of ELA teachers were
female and about two-thirds of ELA teachers were White, with Hispanic/Latino teachers making
up the second largest racial/ethnic group. In general, ELA teachers had considerable teaching
experience, with about two-thirds having more than 6 years of experience. There were no
significant demographic differences between ELA teachers in treatment versus control schools.
However, the ELA study teachers at treatment versus control schools are not entirely
comparable because ECED Literacy was an additional course that was added in treatment
schools as part of the intervention; control schools did not include such a course. As such,
comparisons between them are non-experimental and should be interpreted with caution. Most
treatment schools selected 9th-/10th-grade English teachers to teach their ECED Literacy courses,
so 9th-/10th-grade English teachers in control schools are being used for comparison. We cannot,
however, know that these same individuals would have taught ECED Literacy had their school
been selected to participate in the ECED treatment condition.
Further, in treatment schools only teachers who taught ECED Literacy received the
ECED supports. Although there was high overlap between the 9th-/10th-Grade English teachers
and the ECED Literacy teachers in treatment schools, there were some 9th-/10th-Grade English
teachers at treatment schools who never taught ECED Literacy and therefore never participated
in ECED supports. Among the 166 ELA teachers at the treatment schools, 77 (46%) taught both
9th-/10th-grade English and ECED Literacy at some point during the study, 52 (31%) taught 9th-/10th-grade English but not ECED Literacy, and 37 (22%) taught ECED Literacy but not 9th-/10th-grade English. Thus, the 31% of ELA teachers at treatment schools who never taught ECED
Literacy did not have any exposure to the intervention. They are nonetheless included in our
intent-to-treat analyses because ECED is intended to be a school-level reform.
As with math, turnover was high among ELA teachers. As seen in Table 1 (in the section
called “Study semesters teaching at the school”) only 49% of ELA teachers at treatment schools
and 61% at control schools were teaching at the school during all four terms of the intervention.
The remainder started after the intervention began, left before the intervention ended, or both.
As with math teachers, considering whether the ELA study teachers were teaching target
ELA (i.e., 9th-/10th-grade English and ECED Literacy) all terms, turnover was even higher. The
average number of terms a target teacher taught a target ELA course was only about 2.6, and less
than one-third of ELA teachers taught a target ELA course during all four terms. This is due both
to teachers leaving the schools and changes in teachers’ assignments. For example, in treatment
schools, 49% of teachers were at the school all four terms, but only 33% taught a target course
all four terms, meaning that 16% of teachers were present at the school but were assigned to
teach only non-target courses during at least one term. As with math teachers, both teacher
turnover and teacher reassignment have potentially major impacts on the intervention’s success.
As with math, ECED supports were only provided to study teachers who were teaching ECED
Literacy that term or year. 20
Table 1. Characteristics of Study Teachers

                                          Math                          ELA
                                   Tx     Control    t      p     Tx     Control    t      p
N                                  116    122                     166    132
Gender (% male)                    43.8   52.1      1.27   .20    31.4   31.5       .02   .99
Race/ethnicity (%)
  White (non-Hispanic)             65.3   56.5     -1.24   .22    71.8   71.0       .22   .83
  Hispanic/Latino                  11.2   12.0       .13   .89    17.7   15.0      -.47   .64
  Black (non-Hispanic)             13.3   10.9      -.50   .62     4.8   10.3      1.65   .10
  Asian/Pacific Islander            8.2   17.4      1.59   .11     4.0    0.9     -1.38   .17
  Other                             2.0    3.3       .39   .69     1.6    2.8       .71   .48
Years teaching (%)
  < 1 year                          7.9    7.3                     7.8    4.6
  1-2 years                        10.9   10.4                    13.3   12.0
  3-5 years                        22.8   17.7                    20.3   19.4
  6-10 years                       22.8   26.0                    30.5   32.4
  11-20 years                      22.8   26.0                    19.5   23.1
  20 or more years                 12.9   12.5                     8.6    8.3
Study semesters teaching at the
  school: Mean (SD) 21             3.06   3.16       .64   .53    2.91   3.17      1.93   .06
                                  (1.15) (1.16)                  (1.16) (1.11)
% study semesters teaching at the
  school (any course, target or not)
  1 term                           12.1   13.9                    14.5   10.6
  2 terms                          26.7   18.0                    28.9   22.7
  3 terms                           4.3    6.6                     7.8    6.1
  4 terms                          56.9   61.5                    48.8   60.6
Study terms teaching target
  course: Mean (SD)                2.66   2.58      -.54   .59    2.55   2.55      -.01   .99
                                  (1.10) (1.21)                  (1.10) (1.12)
Study terms teaching target
  course (%)
  1 term                           12.9   23.0                    15.1   17.4
  2 terms                          44.0   33.6                    47.0   42.4
  3 terms                           6.9    5.7                     5.4    7.6
  4 terms                          36.2   37.7                    32.5   32.6
Study Students
The target population of students for this study was all students who were in 9th- or 10th-grade in either of the years of implementation and who met the inclusion criteria (N = 22,131;
n = 10,515 treatment and n = 11,616 control).
Inclusion/exclusion criteria. 22 Students in 9th- or 10th-grade were excluded from both
the intervention and the evaluation if they were (1) in a self-contained special education class,
meaning that they had a disability too severe for them to participate in the regular curriculum,
or (2) a ‘newcomer’ to the country with such limited English skills that they were excluded from
the regular curriculum, as defined by their school district. Across all districts, a total of 497
students (2.2% of otherwise eligible students) were excluded because they were in self-contained
special education; of these 261 were in treatment schools (2.4%) and 236 were in control schools
(2.0%). Across all districts, 292 students (1.3%) were excluded because they were newcomers;
of these 160 were in treatment schools (1.5%) and 132 were in control schools (1.1%). Note that
these are small sub-groups of the special education and English language learner (ELL)
populations; most special education and ELL students were included in this study.
Student rosters were obtained from the districts two times each year (four times across
the study) to ascertain exactly which students were in 9th- or 10th-grade and to organize the
survey administration. Rosters were generally received about one month after the start of the
school year and two months prior to the end of the school year. Any student who appeared on
any of the four rosters is considered part of the study population (aside from the two groups
described above), regardless of how long they attended the school. A few students may have
been excluded because they enrolled after the fall rosters were generated and left before the
spring rosters were generated. Although we think this group was very small, there is no way to
know its precise size. Additionally, this strategy means that some students are counted as part of
the study population who actually attended a study school for very few days or not at all, because
their names were on the roster the day it was created for the study. Some schools were slow to
remove students from rosters, so some students who enrolled but never attended remained on the
rosters and are therefore counted as study students.
At the start of the study, parents were sent a brief description of the study and given a
form to return if they did not want their child to participate in the questionnaires or to have their
child’s records released. Across all districts, parents of 366 students (1.7%) returned this form to
exclude their child; in treatment schools, parents of 114 students (1.1%) returned the form and in
control schools, parents of 252 (2.2%) returned the form. Because they are still part of the target
student population, all values for those students were considered missing and were imputed.
Movement between schools. The total target student population of 22,131 actually
represents only 21,641 individual students. During the two years of the intervention, 476 students
(2.2%) moved from one target school to another target school one time, and 7 (<1%) students
moved from one target school to another target school two times. Data were collected from and
about these students in all schools whenever possible. For analysis purposes, they are treated as
different students and appear in the data set two or three times; the 476 students who appear
twice and the 7 who appear three times account for the 490 extra records (476 + 2 × 7 = 490;
21,641 + 490 = 22,131).
Student demographic characteristics. Students fell into three different grade cohorts:
Grade Cohort 1 included students in 9th-grade in Year 1 and/or 10th-grade in Year 2, Grade
Cohort 2 included students in 10th-grade in Year 1, and Grade Cohort 3 included students in 9th-grade in Year 2. Students who were in 9th-grade both years (n = 382) are part of the first grade
cohort, whereas students who were in 10th-grade both years (n = 170) are part of the second
grade cohort. The analyses in this report include only Grade Cohort 1 because those students had
the potential of being exposed to the intervention for two years, and therefore are the most likely
to evidence intervention effects. Table 2 presents demographic information about study students,
separated by grade cohort and condition. Demographic information comes primarily from
records provided by the districts. When demographic information was missing from the district
records, we used the students’ report of race/ethnicity and gender from the student surveys. This
happened in roughly 1.3% of cases. After combining data sources, race/ethnicity was missing for
2.8% of cases, gender was missing for 2.5% of cases, and other demographic variables were
missing for roughly 4.1% of cases.
No attempt was made to collect student survey data from students who left a participating
school and did not enroll in a different participating school. School records were requested for all
students in the target student population; however, districts often could not provide records about
course enrollments for students who left during the year and those students did not typically take
the standardized tests. The last section of Table 2 indicates how many terms students were
enrolled in the study schools. For Grade Cohort 1 students in treatment schools to have the full
benefit of ECED, they would need to have been enrolled for all four terms of the project.
Among students in Grade Cohort 1 (9th-grade in Year 1 and/or 10th-grade in Year 2), just over
half (54%) were enrolled all four terms, meaning 46% either arrived after the first wave of data
collection, left prior to the last wave of data collection, or both.
As seen in Table 2, there are some differences between the study students in the treatment
and control conditions, with control schools enrolling slightly more students of color, students
who were eligible for free or reduced price lunch, and English language learners. To account for
these differences, these demographic variables are used as covariates in the analyses regarding
the impacts of ECED on student outcomes (see Chapter VI).
Table 2. Demographic Characteristics of Study Students

                              Grade Cohort 1              Grade Cohort 2              Grade Cohort 3
                              (9th-Grade in Y1)           (10th-Grade in Y1)          (9th-Grade in Y2)
                              Tx     Ctrl    t      p     Tx     Ctrl    t      p     Tx     Ctrl    t      p
N                             3,999  4,434                3,108  3,539                3,408  3,643
Gender (% male)               53.1   52.3   -0.74   .46   50.1   51.3    0.89   .37   49.7   50.6    0.72   .47
Race/ethnicity (%)
  Hispanic                    49.7   52.0    2.10   .04   50.0   53.8    3.10   .00   50.6   58.7    6.73   .00
  Black, non-Hispanic         22.5   26.0    3.66   .00   20.9   22.9    1.91   .06   22.2   19.4   -2.87   .00
  White, non-Hispanic         17.1   11.8   -6.84   .00   17.9   11.2   -7.73   .00   17.0   11.3   -6.72   .00
  Asian/Pacific Islander       8.2    7.9   -0.40   .69    8.9    9.3    0.60   .55    7.2    7.9    1.05   .30
  Other                        2.6    2.3   -0.75   .43    2.3    2.8    1.19   .24    3.0    2.7   -0.75   .46
Free/Reduced Price Lunch
  (either year) (%)           72.7   78.6    6.17   .00   68.6   75.9    6.60   .00   69.7   74.7    4.57   .00
ELL (Y1) (%)                  19.9   24.3    4.49   .00   19.8   26.4    6.29   .00   NA     NA      NA     NA
Special Education (Y1) (%)     5.4    5.6    0.43   .67    4.2    5.2    1.95   .05   NA     NA      NA     NA
Mean age at baseline in       14.69  14.71   1.37   .17   15.63  15.65   1.19   .24   13.64  13.61  -2.01   .05
  years (SD) 23               (.66)  (.70)                (.59)  (.63)                (.57)  (.54)
Mean terms enrolled (SD)      3.06   3.01   -1.81   .07   1.86   1.85   -1.51   .13   1.84   1.82   -1.77   .08
                              (1.18) (1.17)               (.44)  (.45)                (.37)  (.38)
Terms enrolled (%)
  1 term                      16.5   16.7                 16.4   18.3                 16.0   17.6
  2 terms                     17.2   18.1                 83.6   81.7                 84.0   82.4
  3 terms                     10.1   12.5                 NA     NA                   NA     NA
  4 terms                     56.2   52.7                 NA     NA                   NA     NA
Enrollment in Literacy Matters. As described in Chapter II, as part of the ECED
treatment all target students in treatment schools were supposed to be enrolled in a Literacy
Matters 24 course. Target students in Grade Cohort 1 at treatment schools on a traditional
schedule or an AB block schedule were supposed to be enrolled in one period or block of
Literacy Matters during each of the four terms. Those in treatment schools on a 4 X 4 block
schedule were supposed to be enrolled for one block for one semester each year. 25 As seen in
Table 3, only about one-quarter of the 3,999 students in Grade Cohort 1 at a treatment school
took the full amount of Literacy Matters prescribed by IRRE. Over 40% of students were
enrolled for less than the full four terms, and therefore could not have taken the full amount of
Literacy Matters prescribed. Another 10% were enrolled in one of the two schools that left the
study. Those schools did not offer Literacy Matters in Year 2. As seen in the last column of
Table 3, looking just at those students who were enrolled for the full four terms of the project,
almost half took the full amount of Literacy Matters prescribed by IRRE. As described in more
detail later in this report (see pp. 109 and 159), two schools that were on a traditional schedule
only offered Literacy Matters for one term each year. Many of the students who did not take the
full amount of Literacy Matters were in one of those two schools. Note that all students are
included in the intent-to-treat analysis regardless of whether or not they were enrolled in Literacy
Matters, making the comparison very conservative. Of course, we expect the strongest effects for
students who were exposed to the full amount intended by IRRE.
Table 3. Enrollment in Literacy Matters Among Grade Cohort 1 Students in Treatment Schools

                                                              Percent of All      Percent of Students in
                                                              Grade Cohort 1      Grade Cohort 1 and
                                                              Students at         Enrolled All Four
                                                              Treatment Schools   Terms in a Treatment
                                                    Number    (n = 3,999)         School (n = 2,247)
Enrolled in treatment school all four terms and…
  took amount of Literacy Matters prescribed        1,036     25.9                46.1
  took some Literacy Matters but less than
    prescribed                                        596     14.9                26.5
  took no Literacy Matters                             68      1.7                 3.0
  at one of the two schools that left ECED 26         408     10.2                18.2
  unknown, did not receive all course schedules
    but did take at least some ECED Lit                53      1.3                 2.4
  unknown, did not receive all course schedules,
    no ECED Lit on schedules received                  86      2.2                 3.8
Enrolled at treatment school less than four terms   1,752     43.8                -
Enrollment in target math. There was no requirement that students in ECED treatment
schools be enrolled in particular math courses, but only Algebra 1 and Geometry courses
employed the ECED Math strategies and only teachers of Algebra 1 and Geometry courses
received the ECED supports. In most high schools it is standard practice for the majority of 9th- and 10th-graders to be enrolled in Algebra 1 and/or Geometry; thus, we anticipated that most
target students would be enrolled in those courses. As seen in Table 4, about one-third of
students in Grade Cohort 1 took the expected pattern of at least one term of Algebra 1 and one
term of Geometry. Approximately another 20% took either Algebra 1 or Geometry, but not both.
As noted above, a large percentage of students were not enrolled all four terms (44% in treatment
and 47% in control), so they could not take the expected patterns of courses. Because this is a
school-level intervention, all target students are included in the intent-to-treat analyses regardless of whether they were enrolled in targeted courses. That said, we would anticipate the effects to be
strongest for those students who were enrolled in the target courses of Algebra 1 and Geometry.
Table 4. Enrollment in Algebra 1 and Geometry Among Grade Cohort 1 Students in Treatment and Control Schools

                                         Treatment                                Control
                                                   Percent of   Percent of                 Percent of   Percent of
                                                   All Grade    Students in                All Grade    Students in
                                                   Cohort 1     Grade Cohort 1             Cohort 1     Grade Cohort 1
                                                   Students     and Enrolled All           Students     and Enrolled All
                                                   (N =         Four Terms                 (N =         Four Terms
                                         Number    3,999)       (N = 2,247)      Number    4,434)       (N = 2,337)
Enrolled in treatment school all
four terms and…
  took one or more Algebra 1 class
    and one or more Geometry class       1,224     30.6         54.5             1,377     31.1         58.9
  took one or more Algebra 1 class
    and no Geometry class                  466     11.7         20.7               378      8.5         16.2
  took no Algebra 1 class and one or
    more Geometry class                    364      9.1         16.2               372      8.4         15.9
  took neither Algebra 1, nor
    Geometry                                49      1.2          2.2                35      0.8          1.5
  unknown, did not receive all course
    schedules but did take at least
    some Algebra 1 or Geometry              99      2.5          4.4               139      3.1          5.9
  unknown, did not receive all course
    schedules, no Algebra 1 or
    Geometry on schedules received          45      1.1          2.0                36      0.8          1.5
Enrolled at treatment school less
  than four terms                        1,752     43.8                          2,097     47.3
Teacher Questionnaires
Administration schedule and procedures. Teacher questionnaires were administered
four times, at the beginning and end of each year the school participated in the study. At each
wave, we attempted to include (1) all teachers who were teaching a target course that wave, and
(2) all teachers who had taught a target course during a previous wave and were still employed at the school. Target courses were Algebra 1, Geometry, ECED Literacy, 9th-Grade ELA, and 10th-Grade ELA.
The fall (Wave 1 and Wave 3) teacher surveys were administered on paper. Teachers
received their survey either from the individual conducting the EAR Protocol Classroom Visits
(see below), or it was placed in their mailboxes at school. Teachers were given stamped,
addressed envelopes to return surveys. The teacher’s name appeared on a cover letter requesting
their participation. Teachers were asked to remove the cover letter prior to returning the survey.
The survey itself contained only the teacher’s study identification number. Teachers were
reminded to complete the surveys on several occasions via email from the research project
director or from the ECED research contact within the school. 27
Spring surveys (Wave 2 and Wave 4) were administered on-line. 28 Teachers received the
web address for the survey (i.e., the URL) and their personal access code via both email and notes
placed in their school mailboxes. The survey access codes were different from the teacher study
identification number and each one was used only one time. Teachers who did not complete the
survey were reminded on several occasions via email and prompts from their ECED research
contact within the school or from the research project manager.
Response rates. Response rates varied considerably by administration wave and school.
Table 5 shows the response rates for each wave, separately by treatment and control schools, as
well as the reasons for non-response when known. Considering only teachers who were at the
school when the surveys were administered, the response rate for math teachers ranged from
52% at Wave 1 control schools to 74% at Wave 4 treatment and control schools. For ELA, they
ranged from 43% at Wave 1 control schools to 72% at Wave 4 control schools. Response rates
were lower when looking at all teachers in the study, rather than just those who were teaching at
the school at the time of administration. This was due to turnover. Many teachers were not
employed at a study-school in all waves and therefore could not have participated in the survey.
Table 5. Teacher Survey Response Rates

Math Teachers (Tx n = 116, Ctrl n = 122 at each wave)

                                           Wave 1         Wave 2         Wave 3         Wave 4
                                           Tx     Ctrl    Tx     Ctrl    Tx     Ctrl    Tx     Ctrl
Participated (%) 29                        50.0   40.2    50.9   48.4    39.7   48.4    55.2   59.0
Reasons for non-participation (%)
  not teaching target course 30            14.7   14.8     6.9   10.7     3.4    4.8     1.7    0
  school did not allow 31                   0      0      11.2    0      11.2    0       0      0
  no longer/not yet at school              24.1   22.1    20.7   18.9    26.7   23.0    25.9   20.5
  other 32                                 11.2   23.0    10.3   22.1    19.0   23.8    17.2   20.5
Participation among those teaching at
  the school at the time of
  administration (%) 33                    65.9   51.6    64.1   59.6    54.1   62.8    74.4   74.2

ELA Teachers (Tx n = 166, Ctrl n = 132 at each wave)

                                           Wave 1         Wave 2         Wave 3         Wave 4
                                           Tx     Ctrl    Tx     Ctrl    Tx     Ctrl    Tx     Ctrl
Participated (%)                           34.3   34.1    48.8   53.8    38.0   44.7    44.0   55.3
Reasons for non-participation (%)
  not teaching target course               26.5   22.0     8.4   16.7     4.9    6.1     4.2    0.8
  school did not allow                      0      0      11.4    0      10.8    0       1.8    0
  no longer/not yet at school              30.7   21.2    22.9   15.9    27.7   23.5    29.5   22.7
  other                                     8.4   22.7     8.4   13.6    18.7   25.8    20.5   21.2
Participation among those teaching at
  the school at the time of
  administration (%)                       49.6   43.3    63.3   64.0    52.5   58.4    62.4   71.6
Teacher questionnaire items and scale development. The items on the teacher
questionnaires came primarily from IRRE’s past research. Appendix 5 lists all items in the
teacher questionnaire, along with the order in which the item appeared on the questionnaires, the
response options, the construct each was originally intended to measure, and the waves in which
the item was included. Most of these items had been administered to teachers participating in
IRRE’s First Things First supports for many years and had been revised repeatedly. The items
specific to ECED implementation and professional development were written for the current
project. Factor analyses using the ECED data resulted in a slightly different set of scales than the
ones that had previously been used by IRRE.
Data reduction and scale development were conducted on items included in the teacher
survey. As noted above, the Wave 1, Year 1 survey for teachers in schools in the first recruitment
group contained a subset of the full set of items administered in subsequent waves. Thus, Wave 2
data were used for the scale development. Due to the small sample size (n = 129) relative to the
number of items, EFAs were conducted on the full sample of teachers in the first recruitment
group using most items. The factor structure was then confirmed using Wave 2 data from
teachers in the schools in the second recruitment group. All analyses were conducted in MPlus
(Muthén & Muthén, 1998-2009; Version 6.12). Standard errors were adjusted to take into
account school cluster sampling.
The initial EFA revealed that not all items loaded on their pre-hypothesized factors. The
seven items making up the original construct of ‘Perceived Value of Professional Development’
loaded cleanly onto one factor (α =.93) and the three ‘Perceived Competence’ items loaded
cleanly onto another (α =.88). These factors were left intact and removed from the exploratory
analyses. Based on the initial EFA, we determined that the rest of the teacher items fell into two
categories: (1) teachers’ ratings of support (teacher collective commitment, support from school
administration, support from district administration); and (2) beliefs about change and morale
(commitment to change, confidence in change, and teacher morale). A four-factor solution was
chosen for the model that included items in the first category. These factors were: (1) teacher collective commitment, composed of 3 items (α =.82); (2) teacher mutual support, composed of 3 items (α =.85); (3) support from district administration, composed of 3 items (α =.77); and (4) support from school administration, composed of 3 items (α =.83). A three-factor solution was chosen for the model with items in the second category. These were the same as the original constructs: (1) district/school commitment to change, composed of 3 items (α =.87); (2) confidence in change, composed of 4 items (α =.86); and (3) individual teacher morale, composed of 3 items (α =.70). Three items were dropped due to low loadings or cross-loadings. 34
Correlations between the factors ranged from .21 for individual teacher morale and teacher
mutual support to .76 for support from school administration and support from district
administration.
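For reference, internal consistency values like those reported above can be computed directly from item responses. Below is a minimal sketch in Python; the five-respondent, three-item matrix is invented for illustration and does not come from the study data.

    import numpy as np

    def cronbach_alpha(items: np.ndarray) -> float:
        """Cronbach's alpha for an (n_respondents x k_items) score matrix."""
        items = np.asarray(items, dtype=float)
        k = items.shape[1]
        item_variances = items.var(axis=0, ddof=1)       # variance of each item
        total_variance = items.sum(axis=1).var(ddof=1)   # variance of the scale sum
        return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

    # Hypothetical responses from five teachers to a three-item scale
    responses = np.array([[4, 4, 5], [3, 3, 4], [5, 4, 5], [2, 3, 2], [4, 5, 4]])
    print(round(cronbach_alpha(responses), 2))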
Four factors loaded onto a second-order factor with good model fit: (1) Support from
district administration; (2) Support from school administration; (3) Commitment to change; and
4) Confidence in change. This factor—which we call Administrative Support for Instructional
Innovation and Improvement—measures teachers’ beliefs that (1) administrators are responsive
to teachers’ needs; and (2) efforts are being made to improve teaching and learning. The
remaining four factors did not fit together well into a second-order factor. These will be analyzed
separately. The primary impact analyses on the teacher survey were therefore conducted on one
second-order factor and four first-order factors. See Table 6 for the final structure and factor
loadings for the separate and second-order factors.
Table 6. Standardized Results from Final CFA from Teacher Questionnaire Items

                                                                      Loading   SE
Teacher collective commitment
  Teachers at this school do what is necessary to get the job done
  right.                                                              0.816     0.032
  Teachers at this school don't give up when difficulties arise.      0.710     0.044
  Teachers at this school go beyond the call of duty to do the best
  job they can.                                                       0.824     0.024
Teacher mutual support
  Teachers in this school encourage each other to do well.            0.867     0.038
  Teachers in this school share resources with each other.            0.707     0.047
  Teachers in this school go out of their way to help each other.     0.815     0.035
Support from district administration
  District administrators are attentive to school personnel and
  provide the encouragement and support they need for working with
  students.                                                           0.797     0.035
  District administrators allow school staff to try educational
  innovations that the teachers believe would be helpful for the
  students.                                                           0.694     0.050
  District administrators are responsive to the needs of teachers
  and staff for professional development.                             0.729     0.032
Support from school administration
  School administrators understand and respond to the teachers'
  needs.                                                              0.830     0.023
  School administrators support teachers making their own decisions
  about their students.                                               0.753     0.041
  School administrators help teachers and staff get what they need
  from the district office.                                           0.805     0.023
Commitment to change
  How committed do you think the School Board is to making changes
  that will improve instruction and achievement in your district?     0.797     0.049
  How committed are the staff in the District Office to improving
  teaching and learning in the district?                              0.863     0.038
  How committed is your superintendent to strengthening the quality
  of instruction within your district?                                0.845     0.031
Confidence in change
  How confident are you that your school is making changes that
  will improve the performance of your students?                      0.956     0.012
  How confident are you that instruction can be improved in your
  school to ensure that all students experience high quality
  teaching and learning every day?                                    0.628     0.031
  How confident are you that your school is improving instruction
  in ways that can be sustained over time?                            0.914     0.024
  How committed is your principal to supporting changes in the
  school that will improve the quality of teaching in all
  classrooms?                                                         0.672     0.036
Individual teacher morale
  I look forward to going to work in the morning.                     0.685     0.071
  When I am teaching, I feel happy.                                   0.632     0.075
  When I am teaching, I feel discouraged. (reversed)                  0.729     0.067
Perceived value of professional development
  Helped me to increase student engagement in my classes.             0.798     0.026
  Helped me to better understand the subjects I teach.                0.716     0.042
  Enhanced my classroom management skills.                            0.742     0.038
  Helped me to challenge and encourage all students to work at or
  above grade level.                                                  0.819     0.026
  Increased my use of effective instructional strategies for
  improving academic achievement.                                     0.873     0.011
  Increased the extent to which my instruction is aligned with the
  course standards and curriculum.                                    0.695     0.048
  Are likely to have a lasting impact on my instructional
  practices.                                                          0.867     0.020
Perceived competence
  I am very confident in my abilities as a teacher.                   0.825     0.037
  I think I am a very skilled teacher.                                0.848     0.029
  I feel very competent as a teacher.                                 0.871     0.027
Administrative support for instructional innovation and improvement
(second-order factor)
  Support from district administration                                0.883     0.037
  Support from school administration                                  0.837     0.046
  Commitment to change                                                0.701     0.064
  Confidence in change                                                0.761     0.056
Note. Results shown are from Wave 2 (Spring of Year 1) of the first and second
recruitment groups; N = 281.
Some of the original constructs were not included in the EFA. Items contributing to the
Relative Autonomy Index (RAI) were left out of the exploratory analyses because they form a well-validated scale intended to measure a continuum of motivation (Grolnick & Ryan, 1989) and will therefore be analyzed separately. 35 Additionally, the items
measuring amount and type of professional development and the items measuring
implementation of ECED were not included because they were not intended to form a factor (i.e.,
not expected to be correlated with one another). Rather, they were intended to provide
information on variation in experience.
Engagement, Alignment, and Rigor (EAR) Classroom Visit Protocol
The EAR Protocol is a 15-item observational tool completed by trained observers after a
20-minute observation designed to measure classroom-level Engagement, Alignment, and Rigor.
Typically the tool is used by instructional leaders within their own school/district and by IRRE
consultants working with schools, as a means of providing feedback to teachers and guiding
professional development. Use of this tool by leaders and IRRE consultants for purposes of
instructional improvement in the treatment schools was one component of the ECED
intervention. Additionally, the research staff used the EAR Protocol in order to obtain
independent, objective, systematic information from both treatment and control schools. The 15
items of the EAR Protocol appear in Table 7. For more details about the tool, its psychometric
properties, and scoring, see Early et al. (2013, which appears in Appendix 2). Table 7 also
provides means and standard deviations for the 3,558 EAR observations conducted in the
classrooms of target teachers while instructing target classes as part of the ECED Efficacy Trial.
This information is provided simply to give a picture of the type of distribution we see across a
large number of observations. As described below, for analysis purposes, EAR observations
were scored, and the scores were averaged for each teacher, at each wave. In addition to scoring
the 15 items, observers recorded: the number of students present, the learning materials used
(e.g., calculators, journals, smart boards), and the learning activities used (e.g., cooperative
learning strategies, individual projects, lecture).
Engagement. The EAR Protocol includes two items to assess engagement (labeled E1
and E2 on Table 7): the first item measures the percentage of students who are on-task and the
second measures the percentage of on-task students who are actively and intellectually engaged
in the work. This second item is scored using a combination of observations and, when possible,
brief conversations with students. The conversations, which take place only if they will not
disrupt the class, include questions such as “What does your teacher expect you to learn by doing
this work?” and “Why do you think the work you are doing is important?” Using the scoring
method presented in Early et al. (2013), the final Engagement score is the mean of the proportion
of students on task (E1) and the proportion of students actively engaged in the work (E1 * E2).
Thus, the final formula is (E1 + (E1 * E2))/2.
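As an illustration, this scoring rule can be written directly in code. The sketch below (Python) is ours, not IRRE's; E1 and E2 are expressed as proportions rather than the percentages recorded on the protocol form.

    def engagement_score(e1: float, e2: float) -> float:
        """Mean of the proportion on task (E1) and the proportion of all
        students actively engaged in the work (E1 * E2)."""
        return (e1 + e1 * e2) / 2

    # Example: 80% of students on task, 60% of those actively engaged
    print(engagement_score(0.80, 0.60))   # (0.80 + 0.48) / 2 = 0.64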
Alignment. Observers make eight binary judgments about whether the learning
materials, learning activities, expectations for students, and students’ class work reflect relevant
federal, state, and local standards, as well as designated curricula. Only four of the eight items
(A1c, A2c, A3, and A4) are included in the final Alignment score due to low variance on the
others. Again using the scoring method presented in Early et al. (2013), the final Alignment
score is the proportion of positive answers on those four indicators.
Rigor. This construct is assessed with five judgments (three binary, two percentages) that
relate to both the difficulty of the material presented and the extent to which students are
required to demonstrate mastery of the required material. Items concern whether learning
materials and instruction are appropriately difficult, whether students are expected to
meet/surpass state standards, and whether they have an opportunity to demonstrate proficiency.
Four of the five Rigor items (R1, R2, R3, and R4) are used to calculate the final Rigor score (see
Early et al., 2013). In order to combine the rigor indicators on a common scale, they are
standardized using estimates of population means and standard deviations from 1,551
observations conducted by the IRRE intervention team in 19 high schools in six school districts
across the country between 2004 and 2010. After standardization, the four items are averaged
together.
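A companion sketch for the Alignment and Rigor scores follows. The reference means and standard deviations below are placeholders; the actual norms come from the 1,551 IRRE calibration observations described above and are not reproduced here.

    import numpy as np

    def alignment_score(a1c: int, a2c: int, a3: int, a4: int) -> float:
        """Proportion of positive answers on the four binary Alignment items."""
        return (a1c + a2c + a3 + a4) / 4

    def rigor_score(r_items, ref_means, ref_sds) -> float:
        """Standardize R1-R4 against the reference norms, then average."""
        z = (np.asarray(r_items, dtype=float) - np.asarray(ref_means)) / np.asarray(ref_sds)
        return float(z.mean())

    # Placeholder norms for (R1, R2, R3, R4) -- illustrative values only
    ref_means = [0.90, 0.50, 0.30, 40.0]
    ref_sds = [0.30, 0.50, 0.45, 40.0]
    print(alignment_score(1, 1, 1, 0))                    # 0.75
    print(rigor_score([1, 1, 0, 55.0], ref_means, ref_sds))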
Table 7. EAR Classroom Visit Protocol Items (n = 3,558)

Item                                                                  Mean    SD
Engagement
  E1   % of students on task                                          79.20   20.37
  E2   % of students actively engaged in the work requested.          72.09   22.92
       Product of E1 * E2 36                                          60.59   27.01
Alignment
  A1a  The learning materials did(1) / did not(0) reflect content
       standards guiding this class.                                  .96     .19
  A1b  The learning materials were(1) / were not(0) aligned with
       the designated curriculum to teach those standards.            .96     .19
  A1c  The learning materials were(1) / were not(0) aligned with
       the pacing guide of this course or grade level curriculum.     .90     .30
  A2a  The learning activities did(1) / did not(0) reflect content
       standards guiding this class.                                  .95     .22
  A2b  The learning activities were(1) / were not(0) aligned with
       the designated curriculum to teach those standards.            .94     .23
  A2c  The learning activities were(1) / were not(0) aligned with
       the scope and sequence of the course according to the
       course syllabus.                                               .89     .31
  A3   The student work expected was(1) / was not(0) aligned with
       the types of work products expected in state grade level
       performance standards.                                         .79     .41
  A4   Student work did(1) / did not(0) provide exposure to and
       practice on high stakes assessment methodologies.              .42     .49
Rigor
  R1   The learning materials did(1) / did not(0) present content
       at an appropriate difficulty level.                            .93     .26
  R2   The student work expected did(1) / did not(0) allow
       students to demonstrate proficient or higher levels of
       learning according to state grade level performance
       standards.                                                     .53     .50
  R3   Evaluations/grading of student work did(1) / did not(0)
       reflect state grade level performance standards.               .32     .47
  R4   % of students required to demonstrate whether or not they
       had mastered content being taught.                             39.35   39.20
  R5   % of students demonstrated threshold levels of mastery
       before new content was introduced.                             15.64   25.42
Classroom Observer Training. Training on the EAR protocol is typically led by IRRE
and consists of (1) two full days of group instruction, including several classroom visits with
discussion of scoring, (2) a two to three week window during which individuals participating in
the training make practice visits as teams to calibrate their scoring, and (3) two additional full
days of group instruction focusing on calibration and use of the data for school improvement.
Generally, these training sessions are conducted in school districts with all or most of the
participants being school district employees, such as assistant principals or academic coaches,
who will be using the protocol as part of their district’s instructional improvement efforts. See
Appendix 6 for details about each data collector's training experiences.
The seven individuals who independently collected EAR Protocol data for ECED were
all highly experienced educators. All had been classroom teachers and most had been school and
district administrators (e.g., superintendent, principal, or assistant principal). Several were
currently consulting with districts across the nation. None had any type of relationship (past or
present, employee or consultant) with the districts in which they collected data. 37 All were blind
to the treatment status of the schools in which they collected EAR data.
EAR Classroom Protocol Schedule. At each wave, classroom observers spent two
(typically consecutive 38) weeks in each district, conducting EAR Protocol visits. Typically, there
were two classroom visitors in each district for both weeks, but in districts with a small number
of teachers there was occasionally one classroom visitor one week and two the other week. The
research project director and a research associate created schedules for each visitor for each day,
indicating which teachers should be visited during each period. The goal was for each observer to
conduct 12 observations on each observation day; however, due to teacher absences and other
scheduling conflicts, observers often completed 10 or 11 visits in a day.
At each wave, the goal was to observe each teacher who was teaching a target class a
minimum of two times. The target classes were Algebra 1, Geometry, ECED Literacy, and 9th- and 10th-grade English/Language Arts. If the teacher taught multiple sections of target classes,
the observers attempted to visit each section of his/her target classes at least once. Note that the
individuals targeted for EAR visits were slightly different from those targeted for teacher
surveys. EAR Protocol visits were only made to target classes, so teachers who were not
teaching a target class during a particular term were not visited. This is because the EAR
Protocol is specific to instruction in a particular course, so including non-target classes would not
have been meaningful. However, the survey was given not only to teachers who were teaching a target class but also to any teacher who had taught a target class during a previous wave.
The surveys cover general experiences and attitudes and were not tied to individual courses, so
survey responses from teachers who were not currently teaching a target course are meaningful
for conveying teachers’ attitudes and feelings in the participating schools.
Table 8 shows the mean number of observations completed of each target teacher at each
wave. The mean was always between 2 and 3. As seen on the last line of Table 8, when teachers
were at the school and teaching a target course, they were almost always observed. However, as
noted earlier, the high amount of turnover—both in terms of who was teaching in the schools and
who was teaching target courses—led to large amounts of missing data (see the second to last
line of Table 8).
Table 8. Number of EAR Classroom Visit Protocols Collected

Math Teachers (Tx n = 116, Ctr n = 122 at each wave)

                                           Wave 1         Wave 2         Wave 3         Wave 4
                                           Tx     Ctr     Tx     Ctr     Tx     Ctr     Tx     Ctr
Observed (%) among all study teachers 39   59.5   55.7    56.9   65.6    50.0   56.6    64.7   63.9
Observations per teacher
  Mean                                     2.73   2.49    2.67   2.64    2.59   2.58    2.76   2.50
  SD                                       .85    .87     .87    .85     .88    1.13    .94    .79
  Range                                    1-4    1-5     1-5    1-5     1-5    1-7     1-5    1-4
Reasons for no observation (%)
  not teaching target course               15.5   19.7    11.2   14.8    11.2   19.7     6.9   12.3
  school did not allow observations 40      0      0      11.2    0      11.2    0       1.7    0
  no longer/not yet at school              24.1   22.1    20.7   18.9    26.7   23.0    25.9   20.5
  other 41                                  0.9    2.5     0      0.8     0.9    0.8     0.9    3.3
Observed (%) among teachers teaching
  target course 42                         98.6   95.8    83.5   98.8    80.6   98.6    96.2   95.1

ELA Teachers (Tx n = 166, Ctr n = 132 at each wave)

                                           Wave 1         Wave 2         Wave 3         Wave 4
                                           Tx     Ctr     Tx     Ctr     Tx     Ctr     Tx     Ctr
Observed (%) among all study teachers      41.6   51.5    48.8   64.4    45.2   53.0    51.8   61.4
Observations per teacher
  Mean                                     2.48   2.34    2.58   2.36    2.77   2.56    2.58   2.37
  SD                                       .95    .82     .92    .90     1.07   1.13    .96    .93
  Range                                    1-4    1-4     1-5    1-5     1-5    1-6     1-6    1-6
Reasons for no observation (%)
  not teaching target course               26.5   22.7    12.7   17.4    14.5   22.0    15.7   15.9
  school did not allow observations         0      0      11.4    0      10.8    0       1.8    0
  no longer/not yet at school              30.7   21.2    22.9   15.9    27.7   23.5    29.5   22.7
  other                                     1.2    4.5     4.2    2.3     1.8    1.5     1.2    0
Observed (%) among teachers teaching
  target course                            97.2   91.9    75.7   96.6    78.1   97.2    94.5   100
Inter-rater agreement. To assess inter-rater agreement, data collectors visited some
classrooms in pairs during each wave of data collection. They stayed in the same classroom for
20 minutes and then completed their scoring of that classroom, independently and without
discussion. Once each person had finished scoring, they discussed any discrepancies in their
scores and settled on a set of consensus scores. They did not change their own scores after the
discussion began.
Across the four data collection waves, 281 visits were made in which two data collectors
were present. The intra-class correlations (one-way random, single measures) across those pairs
of original scores (not consensus scores) were Engagement = .73, Alignment = .63, and Rigor =
.70. See Table 9 for the intra-class correlations (one-way random, single measures) by wave. As
would be expected, the intra-class correlations were higher when observers’ scores were compared
with the consensus scores agreed upon by the pair of observers after discussion. Across all visits,
the intra-class correlations (one-way random, single measures) with the consensus scores were
Engagement = .91, Alignment = .82, and Rigor = .86. 43 As a reference, Cicchetti (1994) referred
to ICCs below .40 as “poor”, those between .40 and .59 as “fair”, those between .60 and .74 as
“good”, and those between .75 and 1.00 as “excellent.”
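The ICC variant used here can be computed from one-way ANOVA mean squares. A minimal sketch follows (Python; the paired scores are invented for illustration):

    import numpy as np

    def icc_oneway_single(scores: np.ndarray) -> float:
        """ICC(1,1): one-way random effects, single measures
        (Shrout & Fleiss, 1979)."""
        scores = np.asarray(scores, dtype=float)
        n, k = scores.shape
        row_means = scores.mean(axis=1)
        ms_between = k * ((row_means - scores.mean()) ** 2).sum() / (n - 1)
        ms_within = ((scores - row_means[:, None]) ** 2).sum() / (n * (k - 1))
        return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

    # Hypothetical Engagement scores from five doubly-observed classrooms
    pairs = np.array([[0.64, 0.70], [0.55, 0.50], [0.80, 0.78],
                      [0.40, 0.52], [0.71, 0.69]])
    print(round(icc_oneway_single(pairs), 2))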
Table 9. EAR Intra-Class Correlations (one-way random, single measures) By Wave

              Wave 1    Wave 2    Wave 3    Wave 4    Overall
n             77        61        59        84        281
Engagement    0.697     0.671     0.691     0.799     0.726
Alignment     0.552     0.707     0.788     0.563     0.633
Rigor         0.540     0.694     0.752     0.752     0.695
Student Questionnaires
Administration schedule and procedures. Student questionnaires were administered
four times, at the beginning and end of each year that the school participated in ECED. As noted
earlier, students in Grade Cohort 1 (9th-grade in Year 1 and/or 10th-grade in Year 2) participated
in all four waves of student questionnaires if they were enrolled during all four waves. The other
two grade cohorts were each given surveys two times. Grade Cohort 2 was in 10th-grade in Year
1 and took part in the surveys in Year 1 only (Waves 1 and 2). Grade Cohort 3 was in 9th-grade
in Year 2 and took part in the surveys in Year 2 only (Waves 3 and 4). Table 10 indicates which
survey data we attempted to collect for each grade cohort. Student surveys were typically
administered on-line, during the school day. Often surveys were administered during English or
Literacy Matters classes because most students were enrolled in those, but that decision was
entirely up to the schools. See Appendix 7 for a more detailed description of the administration
procedures.
Table 10. Data Collection Plan

                                       Wave 1    Wave 2    Year 1     Wave 3    Wave 4    Year 2
                                       Surveys   Surveys   Student    Surveys   Surveys   Student
                                                           Records                        Records
Grade Cohort 1 (9th-grade in Year 1
  and/or 10th in Year 2)               X         X         X          X         X         X
Grade Cohort 2 (10th-grade in Year 1)  X         X         X
Grade Cohort 3 (9th-grade in Year 2)                                  X         X         X
Response rates. Every effort was made to ensure that all students had a chance to
participate in the surveys. Each participating school designated a research contact. The research
project director or research associate met with each research contact at the start of each
administration to review the methods, distribute the tickets and strategize about administration.
During the administration windows, the research project director or research associate
communicated regularly with research contacts, providing lists of students who had not yet been
given a chance to take the survey, reminding them of the importance of a high response rate, and
troubleshooting as needed.
Nonetheless, student survey response rates varied dramatically by administration wave
and school. Table 11 shows the response rates for each grade cohort, as well as the reasons for
non-response when known. Considering only students who were enrolled at the time the rosters
were created (last line of Table 11), the response rate for Grade Cohort 1 ranged from a low of
64% at Wave 2 treatment schools to a high of 83% at Wave 1 treatment schools.
Table 11. Student Survey Response Rates

Grade Cohort 1 (9th-grade in Year 1)

                                       Wave 1         Wave 2         Wave 3         Wave 4
                                       Tx     Ctrl    Tx     Ctrl    Tx     Ctrl    Tx     Ctrl
Total cohort (N)                       3,999  4,434   3,999  4,434   3,999  4,434   3,999  4,434
Participated (%)                       66.9   60.4    50.2   51.2    54.3   57.1    54.1   54.2
Reasons for non-participation (%)
  parent refusal                       1.4    2.5     0.7    1.2     0.7    2.6     0.7    1.4
  student refusal                      0.5    0.6     2.3    0.1     0.8    0.7     2.4    0.7
  excluded by school 44                1.1    1.0     0.5    0.0     0.6    0.0     0.5    0.2
  enrolled but not attending 45        1.8    0.6     0.9    0.2     0.2    0.3     0.6    1.3
  disenrolled 46                       1.0    1.1     2.5    1.4     1.0    0.8     1.1    1.1
  school did not administer this
    wave 47                            0.0    0.0     11.5   0.0     8.4    0.0     0.0    0.0
  unknown                              7.5    13.4    11.4   21.4    6.4    13.1    13.6   10.3
  not enrolled this wave               19.8   20.2    21.2   22.5    25.8   26.5    27.3   29.6
Participated (%) among only those
  enrolled at time rosters were
  created                              83.4   75.7    63.7   66.0    73.2   77.8    74.4   76.9

Grade Cohort 2 (10th-grade in Year 1)

                                       Wave 1         Wave 2
                                       Tx     Ctrl    Tx     Ctrl
Total cohort (N)                       3,108  3,539   3,108  3,539
Participated (%)                       76.2   63.0    60.3   63.2
Reasons for non-participation (%)
  parent refusal                       2.0    1.1     1.1    1.5
  student refusal                      0.4    1.4     1.1    1.4
  excluded by school                   3.8    0.1     0.1    0.0
  enrolled but not attending           1.7    0.2     0.3    0.3
  disenrolled                          1.1    0.8     2.8    2.0
  school did not administer this wave  0.0    0.0     11.1   0.0
  unknown                              9.3    25.1    13.6   20.2
  not enrolled this wave               6.4    8.3     10.2   11.1
Participated (%) among only those
  enrolled at time rosters were
  created                              81.4   68.7    67.1   71.1

Grade Cohort 3 (9th-grade in Year 2)

                                       Wave 3         Wave 4
                                       Tx     Ctrl    Tx     Ctrl
Total cohort (N)                       3,408  3,643   3,408  3,643
Participated (%)                       69.0   75.9    62.1   73.8
Reasons for non-participation (%)
  parent refusal                       1.3    2.3     0.9    0.9
  student refusal                      0.4    1.4     1.0    1.3
  excluded by school                   0.4    0.1     0.5    0.2
  enrolled but not attending           0.5    0.2     0.3    1.0
  disenrolled                          2.3    0.5     1.3    1.1
  school did not administer this wave  12.9   0.0     13.3   0.0
  unknown                              6.7    13.4    11.1   11.5
  not enrolled this wave               6.4    7.5     9.6    10.0
Participated (%) among only those
  enrolled at time rosters were
  created                              73.7   82.1    68.7   82.0
Student questionnaire items and scale development. The items on the student
questionnaire came primarily from IRRE’s past research. They had been administered to students
participating in IRRE’s First Things First supports for many years and had been revised
repeatedly. See Appendix 8 for a complete list of the items and the original construct each was
intended to measure.
Although the items in the student survey have been used extensively by IRRE, a
complete factor analysis using past data was not available. For that reason, we conducted our
own data reduction and scale development using factor analysis on items included in the student
survey. 48 All analyses were conducted in MPlus (Muthén & Muthén, 1998-2009; Version 6.12).
Exploratory factor analysis (EFA) was first conducted in a randomly selected half of the students
in the schools in the first recruitment group in Wave 1 (Fall of 2009). The factor structure was
then confirmed on a second random half of the students in schools in the first recruitment group.
To further support the results, the same analyses were conducted using the students in the
schools in the second recruitment group. Standard errors were adjusted to take into account
school cluster sampling.
Through an initial EFA, we determined that the student survey items fell into two
categories: (1) students’ ratings of their teachers (teacher expectations, teacher rigor, teacher
support); and (2) students’ ratings of themselves (engagement in school, perceived competence).
Exploratory analyses were conducted on each set of items separately. A two factor solution was
chosen for each model. The two factors that emerged from the items measuring students’ ratings
of their teachers were: (1) Positive teacher support/expectations, composed of 9 items (α =.76);
and (2) Lack of teacher support, composed of 5 items (α =.76). The two factors that emerged
from the items measuring students’ ratings of themselves were: (1) Engagement in school
composed of 6 items (α = .61); and (2) Perceived competence composed of 6 items (α =.80). Five
items were dropped due to low loadings or cross loadings. 49 Correlations between the factors
ranged from .51 for perceived competence and lack of teacher support to .77 for perceived
competence and engagement in school. These four factors were then combined and tested as a
second order factor. Results showed an adequately fitting second order factor, representing
students’ attitudes towards school. The goal of obtaining a second order factor was to avoid a
multiple comparisons problem when examining impacts. The primary impact analysis on student
survey items were therefore conducted on this second-order factor. Follow-up sensitivity
analyses were also conducted on the separate factors. See Table 12 for the final factor structure
and loadings for the separate factors and second order factor.
Table 12. Standardized Results from Final CFA from Student Questionnaire Items

                                                                      Loading   SE
Positive teacher support/expectations
  My teachers show us examples of the kinds of work that can earn
  us good grades.                                                     0.650     0.012
  My teachers make it clear what kind of work is expected from
  students to get a good grade.                                       0.732     0.009
  My teachers expect all students to do their best work all the
  time.                                                               0.661     0.011
  My teachers expect all students to come to class prepared.          0.579     0.015
  I am asked to fully explain my answers to my teachers’ questions.   0.342     0.012
  Our classroom assignments and homework make me think hard about
  what I’m learning.                                                  0.372     0.012
  My teachers make sure I understand before we move on to the next
  topic.                                                              0.620     0.008
  My teachers care about how I do in school.                          0.735     0.012
  My teachers like to be with me.                                     0.517     0.010
Lack of teacher support
  My teachers like the other students in my classes better than
  me. (R)                                                             0.672     0.009
  My teachers interrupt me when I have something to say. (R)          0.692     0.008
  My teachers don’t make clear what they expect of me in school. (R)  0.636     0.007
  My teachers are not fair with me. (R)                               0.818     0.007
  My teachers’ expectations for me are not realistic. (R)             0.542     0.009
Engagement in school
  It is important to me to do the best I can in school.               0.726     0.011
  I work very hard on my schoolwork.                                  0.749     0.011
  I don’t try very hard in school. (R)                                0.631     0.012
  I pay attention in class.                                           0.724     0.009
  I often come to class unprepared. (R)                               0.411     0.011
  A lot of the time I am bored in class. (R)                          0.399     0.035
Perceived competence
  When I’m doing a class assignment or homework, I understand why
  I’m doing it.                                                       0.606     0.008
  I feel confident in my ability to learn at school.                  0.751     0.006
  I am capable of learning the material we are being taught at
  school.                                                             0.622     0.011
  I feel able to do my schoolwork.                                    0.726     0.009
  I feel good about how well I do at school.                          0.708     0.011
  I know what kind of work it takes to get an A in my classes.        0.563     0.010
Second-order factor
  Positive teacher support/expectations                               0.819     0.007
  Lack of teacher support                                             0.683     0.018
  Engagement in school                                                0.818     0.011
  Perceived competence                                                0.867     0.008
Note. Results from Wave 1 from both recruitment groups and all grade cohorts; n = 9,983.
Student Demographics
As a condition of participation, each district agreed to provide students’ school records to
the research project. 50 These records were provided attached to student IDs. Student IDs were
transformed into study IDs by a consultant (i.e., not a member of the research team), so they
could be linked to student questionnaire responses and other data. Records included:
demographic characteristics, 7th- through 10th-grade standardized test scores, course schedules,
attendance, credits towards graduation, and grade point averages. Additional details about each
type of record are provided in the next sections. See Chapter VII for a discussion of the
challenges encountered in obtaining the school records.
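The ID transformation just described amounts to a one-way crosswalk held only by the consultant. A hypothetical sketch follows (Python; the IDs and field names are made up, not the project's actual format):

    import secrets

    def build_crosswalk(student_ids):
        """Map each district student ID to a fresh, random study ID.
        Only the consultant retains this mapping."""
        return {sid: f"S{secrets.token_hex(4)}" for sid in student_ids}

    crosswalk = build_crosswalk(["D1-000123", "D1-000124"])      # made-up IDs
    records = [{"student_id": "D1-000123", "gpa": 3.2}]          # made-up record
    deidentified = [dict(r, student_id=crosswalk[r["student_id"]]) for r in records]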
For each study student for each year of participation, we requested the following data:
grade in school, gender, ethnicity/race (including Hispanic origin), date of birth, free or reduced
price lunch status, English language learner status, and special education status. Because
different districts maintain these types of records in different formats, some response categories
had to be collapsed to make the variables comparable across districts. 51
As noted above, we obtained demographic information from the districts for almost all
study students. When the information was missing from the district files for a student who had
completed a survey (~1.3% of cases), we used his or her gender and race/ethnicity as reported on
the student survey. After filling in data from the student surveys, race/ethnicity was missing for
2.8% of cases, and gender was missing for 2.5% of cases. We did not use student reports of
special education, English fluency or free/reduced price lunch because the questions asked did
not map directly onto the district data received. Thus, demographic variables other than
race/ethnicity and gender were missing for roughly 4.1% of cases.
Course Enrollment
For each of the four terms that a district participated, we requested each study student’s
course schedule, including name of course, teacher ID, period/block, and final grade. Districts
provided relatively complete records for students who were enrolled for the entire school year.
On the other hand, course file information was often missing for students who had only been
enrolled for part of the year. Additionally, because this information was often housed in multiple databases within the district (e.g., often the final grades and period information were not in the same database), districts often sent multiple data files. Information within the files was often contradictory, requiring extensive cleaning in order to create a single, cohesive data set. See
Appendix 9 for details on how the course files were restructured for analysis.
Standardized Test Scores
The five participating districts were in four different states. This meant that there were
four different sets of standardized tests administered to the students in this study. We requested
that the districts send all available 7th-, 8th-, 9th- and 10th-grade math and ELA scale scores for all
study students. 52 Below is a description of each state’s tests, followed by a description of how
they were combined to make the scores comparable across tests and districts. Appendix 10
shows the types of tests for which we received scores in each district, as well as the number and
percentage of scores received for students in Grade Cohort 1.
State testing systems.
Districts 1 and 2 (California). Two types of standardized tests scores were requested and
received from Districts 1 and 2: California Standards Test (CST) and California High School
Exit Exam (CAHSEE). The CST is administered in the spring of each year in grades 2 through 11 and includes math and ELA tests each year. In math in 7th- and 8th-grade, students either took a
grade-level test (e.g., 7th-grade math) or a subject specific test (e.g., Algebra 1) depending on
their course enrollment. In 9th- and 10th-grade, they took either ‘General Math’ or a subject
specific test such as Algebra 1, again depending on their course enrollment. In English, they took
a grade specific test each year.
District 3 (Tennessee). Students in District 3 took the Tennessee Comprehensive
Assessment Program (TCAP) Achievement Tests in the spring of each year from 3rd- through
8th-grade. Starting in 9th-grade students took TCAP End of Course exams in various subjects,
depending on course enrollments. There were separate tests for 9th-, 10th- and 11th-grade English,
as well as for Algebra 1 and 2. The state of Tennessee did not administer a standardized
Geometry test during the time District 3 was participating in ECED. However, District 3 did
administer its own geometry test, created by the district rather than the state, to all students
enrolled in Geometry. ECED requested and received those scores.
District 4 (Arizona). Students in District 4 took the Arizona Instrument to Measure
Standards (AIMS) exam in grades 3 through 8. Typically it includes reading, writing, and math;
however the 8th-grade writing test was suspended after 2009. Starting in the spring of grade 10,
students took the AIMS Exit Exam in reading, writing, and math. Students who did not pass the
first administration of the AIMS Exit Exam continued to take it each semester until they passed;
however we requested and received only scores from the first time the student took the test.
Arizona does not have a state-wide standardized test in 9th-grade, but District 4 9th-grade students
took the Stanford 10, a nationally standardized test, in math, language, and reading.
ECED requested and received 7th- and 8th- grade AIMS test scores, 9th-grade Stanford 10
scores, and scores from the first administration of the AIMS Exit exam (10th-grade). However,
District 4 was a ‘high school only’ district meaning that it oversaw the local high schools only
and the students entered the high schools from five separate elementary feeder districts. District
4 did not routinely receive 7th- and 8th-grade scores from these feeder districts and had to request
them especially for ECED. Some feeder districts were reluctant, resulting in significant missing
data, although we did receive at least some 7th- or 8th-grade data from each of the five feeder
districts.
District 5 (New York). Students in District 5 took part in the New York State Testing
Program (NYSTP) in grades 3 through 8 in ELA and math. Starting in 9th-grade, students had to
pass a certain number of Regents Exams in order to be eligible for graduation. The exact Regents
exams taken each year depended on the student’s course schedule. Most students took a math
exam in 9th- and/or 10th- grade; however, very few students took an ELA exam in those grades.
Because ECED intended to use ELA achievement on standardized tests in the 10th-grade as a primary outcome, we arranged for 10th-grade students in District 5 to take the Gates-MacGinitie Reading Test (GMRT) at the end of Years 1 and 2. The GMRT is a group-administered, paper-and-pencil test designed to assess student achievement in reading at all levels (kindergarten through adult). It includes separate tests of comprehension and vocabulary, which can be combined into an overall reading score. ECED paid the costs of purchasing and scoring these exams. Unfortunately, despite a prior written commitment to administer this test to all 10th-graders each year, the actual administration/response rate was low (i.e., 27% of Grade Cohort 1 students took the GMRT in Year 2).
Combining test scores. As is clear from the description of each state’s testing system,
there was wide variation in the timing and content of the tests across states, but our analyses
required that they be combined into comparable scores indicating student performance, relative
to peers, in each subject. To that end, for each student in Grade Cohort 1, we sought to calculate
six test scores: math at baseline, math at Year 1, math at Year 2, ELA at baseline, ELA at Year 1,
and ELA at Year 2. We did not intend to calculate Year 2 test scores for Grade Cohort 2 or Year
1 test scores for Grade Cohort 3 because those students were not in the study at that time. The
general rules used to create common test scores across districts are outlined below. Following the
general rules is a description of the rationale for these rules and descriptions of district specific
decisions that were made in order to apply the general rules. 53
The general rules applied for combining tests onto a common scale were: (1) standardize
each test within test name and district, but across administration years 54 and grade cohorts; (2)
for math, when students had more than one score at baseline, Year 1, or Year 2, use the lowest
level test 55 (e.g., if a student had both Algebra 1 and Geometry achievement scores use the
Algebra 1 score); (3) for Year 1 and Year 2 ELA, when a student had more than one test in a
single year, use the one that corresponded to his or her grade level (e.g., use the 10th-grade test for a 10th-grader who took both the 9th- and 10th-grade tests); if both were at the same level (e.g., 9th-grade reading and 9th-grade language), use the average of the two; (4) for baseline ELA, when students
had more than one test, take the average of all tests (e.g., mean of 7th- and 8th-grade).
Additionally, for math, we created three indicator variables (baseline, Year 1, and Year 2) to
indicate the subject and level of the test used for that student’s math score (e.g., 7th-grade math,
Algebra 1, Geometry, Algebra 2).
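To make the rules concrete, here is a hedged sketch of how they might be applied to a long file of test records. The column names and the level ordering are illustrative, not the project's actual file layout.

    import pandas as pd

    # Illustrative ordering of math tests from lowest to highest level
    MATH_LEVEL = {"7th-grade math": 0, "8th-grade math": 1, "General Math": 2,
                  "Algebra 1": 3, "Geometry": 4, "Algebra 2": 5}

    def standardize_within(df: pd.DataFrame) -> pd.DataFrame:
        """Rule 1: z-score each test within test name and district,
        pooling administration years and grade cohorts."""
        g = df.groupby(["district", "test_name"])["scale_score"]
        return df.assign(z=(df["scale_score"] - g.transform("mean")) / g.transform("std"))

    def math_score(student_tests: pd.DataFrame) -> float:
        """Rule 2: when a student has several math tests in a period,
        keep the lowest-level one (e.g., Algebra 1 over Geometry)."""
        levels = student_tests["test_name"].map(MATH_LEVEL)
        return student_tests.loc[levels.idxmin(), "z"]

    def ela_score(student_tests: pd.DataFrame, grade: str) -> float:
        """Rules 3-4: prefer the grade-matched ELA test; average any
        remaining ties (at baseline, average all available tests)."""
        matched = student_tests[student_tests["grade_level"] == grade]
        return (matched if len(matched) else student_tests)["z"].mean()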
The decision about how to treat math scores for Year 1 and Year 2 stemmed from three
competing concerns. First, we did not want to discard data if at all possible because that would
result in imputing scores in cases where the district had provided data. We believed that the
actual score on a standardized test would be better information than the imputed score. Second,
we were concerned that math course taking – and therefore math test taking – might be
influenced by the intervention itself. Third, we were concerned that combining different level
math tests would introduce error because the same student might have received a higher score if
she or he had been given a lower level test. The first concern was addressed by keeping at least
one score per student, even if that student took a test taken by few other students. The second
concern was addressed by reviewing test taking in each district and finding no clear pattern of
test taking across treatment and control schools. In some districts students took higher tests in
treatment schools, in some districts students took higher tests in control schools, and some
showed no difference. Further, analyses of math achievement will control for the level of the
tests (e.g., Algebra 1, Geometry, etc.) to account for the fact that different districts administered
different tests and test level often depended on students’ course schedules. We were only able to
partially address the third concern. We did this by selecting the lowest level math score for
students who took more than one test in a given year. However, across students we still
combined multiple levels of tests (standardized within test) into the same variable. There was no
way to fully address that concern.
The approach for combining math baseline scores and its rationale were slightly different
from the approach and rationale for Years 1 and 2. At baseline, we often had a score for the same
student in both 7th- and 8th-grade. In those cases, we used the 7th-grade score for baseline. This
was to minimize the range of tests included. In all districts, 7th-graders take a 7th-grade math test, but in many districts the math test taken in 8th-grade depended on the student's course taking, with relatively large groups of students taking the Algebra 1 test in 8th-grade. In general, we would
expect that those are the district’s more advanced 8th-graders, but they are taking a harder test
than those taking the 8th-grade math tests, possibly resulting in lower scores despite being
advanced students. In order to minimize the number of different tests being combined into a
single score, we opted to take the lowest math tests at baseline, meaning that the baseline score
often comes from the 7th-grade (more than one year before the start of the study). When we did
not have a 7th-grade test score, we used the 8th-grade score. Thus, multiple tests were sometimes
still being combined, but this strategy minimized the concern.
Most students had only one ELA test in Year 1 and one ELA test in Year 2, so we used
that one. A few students, however, had more than one. When the two tests were at different
levels (e.g., 9th-and 10th-grade), we used the one that matched the students’ grade to avoid
combining tests within grade cohort. When the two tests were at the same level (e.g., 9th-grade
reading and 9th-grade language), we took the mean of the two scores, to represent the broadest
conceptualization of ELA and minimize error.
At baseline, when students had more than one ELA score (7th- and 8th-grade) we elected
to average the two together, after standardization, to minimize error. Unlike math, in 8th-grade
most students took the same ELA tests and the level of the test was not linked to the student’s
course taking or past ELA achievement. Additionally, the nature of ELA tests is somewhat
different from math. In ELA, each year builds systematically on the previous year, with no qualitative shift in content, making averaging across years appropriate. For that reason, it seemed
like the average of the two scores would result in the least error. In math, on the other hand,
averaging different tests together seemed more problematic because quite abrupt changes in
content could be found across course and test.
After establishing these general rules for combining scores, we were still faced with
several decisions resulting from the unique testing system in each state. Appendix 11 details
state-specific decisions to address each state’s unique issues.
Student Performance (Attendance, Credits, GPA)
Attendance. We requested number of days enrolled and number of days present for each
student for each year the district participated; however, District 4 was not able to provide
attendance data at the student level, so attendance data for those students were imputed. 56
Credits earned. In addition to passing certain required courses and exam(s), students
must earn a certain number of credits in order to graduate from high school. The number needed
varies by district. Additionally, each district specifies a certain number of credits a student must
earn each year to be considered on-track for graduation. Thus, the proportion of credits earned
out of the number needed to be on-track for graduation can be used as an indicator of progress
toward graduation. We requested that each district provide the number of credits each student
earned by the end of each year they participated. Four of the five districts provided the needed
information for both years. District 2 was only able to provide that information for Year 2. Rather than imputing data for an entire district for Year 1, we analyzed Year 2 credits data only.
Grade point average (GPA). We requested that each district provide each student’s
total, unweighted grade point average at the end of each year of participation. This value
includes all courses that the student took, rather than just English and math courses included in
the course file. Four of the five districts provided the needed information for both years. District
4 was only able to provide that information for Year 2. As with credits, rather than imputing data
for an entire district for Year 1, we analyzed Year 2 GPA data only.
Missing Data
Amount of missing data. Table 13 through Table 15 show the percentage of students
and teachers (math and ELA separately) for whom we obtained valid (non-missing) data on some
of the key variables. The student table includes students who were in Grade Cohort 1. As seen in
Table 13, we obtained key demographic variables such as ethnicity and free/reduced price lunch
for almost all students. The percentage of valid data for surveys and test scores were much lower
and varied considerably by district. Among teachers, there is a relatively high amount of missing
data of all types.
Note that values on all three tables are based on ‘intent to treat.’ That is, they reflect all
students or teachers who were enrolled/employed in a study school at any point in the two years
their school participated. At each wave, some of the missing data results from the fact that some
students were not enrolled and some teachers were not employed (i.e., had left the school or had
not yet enrolled/begun employment) and therefore were not available to provide data. Those data
are counted as missing at those time-points. Among students, 46% enrolled after the first wave of
data collection, left prior to the last wave of data collection, or both. Among math teachers, 41%
were not employed at the school during one or more terms of the study. Among ELA teachers,
46% were not employed for one or more terms.
Table 13. Percentage of Grade Cohort 1 Students for Whom We Have Valid Data on Key Variables

             Total             Free/Red.  Wave 1  Wave 4  Baseline  Yr 1  Yr 2  Baseline  Yr 1  Yr 2  GPA    Credits  Atten-
             Students  Ethnic. Lunch      Survey  Survey  ELA       ELA   ELA   Math      Math  Math  Yr 2   Yr 2     dance⁵⁷
District 1     2,628     99      98         63      59      82       82    75     82       78    71    71      71       81
District 2     1,590     93      95         68      55      77       76    76     77       76    73    85      80       72
District 3     1,264     89      86         60      43      72       67    58     72       57    61    90      91       63
District 4     2,222     98      98         68      60      60       69    73     60       69    71    67      47        0
District 5       729     99      94         49      38      70        0    32     71       54    49    85      85       86
Treatment      3,999     97      96         67      54      75       68    68     75       69    69    77      75       56
Control        4,434     96      94         60      54      71       68    68     70       70    67    77      66       55
Overall        8,433     96      95         63      54      73       68    68     73       70    68    77      70       56

Table 14. Percentage of Math Teachers for Whom We Have Valid Data on Key Variables

             Total              Wave 1  Wave 4  Wave 1     Wave 4
             Teachers  Ethnic.  Survey  Survey  EAR Visit  EAR Visit
District 1      61       72       36      69      49         74
District 2      51       71       39      57      67         65
District 3      38      100       53      71      58         63
District 4      51       98       55      47      63         67
District 5      37       60       46      38      51         46
Treatment      116       85       50      55      60         65
Control        122       75       40      59      56         64
Overall        238       80       45      57      58         64

Table 15. Percentage of ELA Teachers for Whom We Have Valid Data on Key Variables

             Total              Wave 1  Wave 4  Wave 1     Wave 4
             Teachers  Ethnic.  Survey  Survey  EAR Visit  EAR Visit
District 1      83       68       22      52      35         49
District 2      53       74       43      62      72         62
District 3      45       98       53      62      56         64
District 4      73       99       34      43      41         63
District 5      44       46       27      25      34         41
Treatment      166       81       34      44      42         52
Control        132       75       34      55      52         61
Overall        298       78       34      49      46         56
Multiple imputation. We used multiple imputation (Rubin, 1987) to handle missing
values; this approach generates multiple completed datasets and propagates the uncertainty
induced by the missing data. For student data, we used a combination of a latent class approach
(Si & Reiter, 2013) for categorical data and the R package "mi" for continuous data to generate
the multiply imputed datasets. For teacher data, we used the latent class approach exclusively,
treating all variables as categorical. The following steps were taken to generate five multiply
imputed datasets for both students and teachers.
For student data, we implemented a two-step imputation procedure because there was a
mix of categorical and continuous variables. We first imputed the student demographic and
survey data using the latent class approach proposed by Si and Reiter (2013), which can flexibly
and efficiently deal with a large number of categorical variables with complex dependency
structures. This approach assumes that the students are divided into several latent classes. Within
each class, the variables are conditionally independent; the dependence among variables is
captured by the mixture over classes. The number of classes and the class assignments are
determined by the data. The missing values are then imputed within each class. As informative
background variables, we included school district, school code, and treatment status. We also
included the following student demographic variables: grade, gender, ethnicity (5 categories),
age at baseline (in years), English language learner, special education, and free or reduced price
lunch. Keeping all survey items in their original nominal scales, we used the latent class
approach to impute the missing values of all the categorical variables using Markov chain Monte
Carlo computation for 5,000 iterations. Five completed datasets were generated.
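One publicly available implementation of this Dirichlet process mixture of products of multinomials (DPMPM) model is the NPBayesImputeCat R package; the sketch below uses its documented wrapper and is purely illustrative, since the study's own code, tuning values, and variable set may have differed.

```r
# Illustrative latent-class (DPMPM) imputation of categorical data, in
# the spirit of Si & Reiter (2013), via NPBayesImputeCat.
# X: a data frame in which every column is a factor (district, school,
# treatment status, demographics, survey items), with NA for missing.
library(NPBayesImputeCat)

imp <- DPMPM_nozeros_imp(X = X,
                         nrun = 5000,   # total MCMC iterations, as in the report
                         burn = 2500,
                         thin = 5,
                         K = 30,        # maximum number of latent classes
                         aalpha = 0.25, balpha = 0.25,
                         m = 5,         # five completed datasets
                         seed = 1)
# imp$impdata holds the list of m completed data frames (see package docs)
```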
Conditional on the imputed demographic information from one randomly chosen dataset
of the five generated above, we then used the R package "mi" (developed by Andrew Gelman's
research group at Columbia University) to impute the continuous variables, which included
students' test scores, attendance, GPA, and credits earned. "Mi" uses a chained-equations
algorithm: the user specifies the conditional distribution of each variable with missing values,
conditioned on the other variables in the data, and the imputation algorithm sequentially iterates
through the variables to impute the missing values using the specified models.
We used the imputed demographic information, treated as unordered categorical, as
covariates in the iterative regression models that imputed the student outcomes. We also included
an interaction term between state and treatment status as a covariate. We first imputed the
baseline math and ELA test scores, as well as the indicator for the type of math test taken. We
ran "mi" for five chains with 50 iterations each and obtained five completed datasets that
included the students' baseline demographic characteristics and baseline test scores. We then
sequentially imputed Year 1 and Year 2 outcomes conditional on each completed baseline
dataset. For the Year 1 outcome variables, we imputed standardized math test scores (as well as
the indicator for type of math test taken), standardized English test scores, standardized
proportion of days present, standardized GPA, and credits earned divided by credits needed to be
on track for graduation. We treated the standardized variables as continuous and the
proportion-of-credits variable as non-negative continuous. Based on each baseline dataset, we
ran "mi" with 50 iterations for one chain and obtained one completed dataset that included Year 1
outcomes and baseline data. Next, conditional on each completed Year 1 dataset, we imputed the
Year 2 outcome variables, defined in the same way as in Year 1, again running "mi" for 50
iterations with one chain. Finally, we obtained five completed datasets that included students'
demographic characteristics, baseline test scores, Year 1 and Year 2 test scores, Year 1 and Year
2 standardized attendance, Year 1 and Year 2 standardized GPA, Year 1 and Year 2 credits
earned, and the three (baseline, Year 1, Year 2) math test type indicators.
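A minimal sketch of one link in this chain follows, using the mi package's current interface (which may differ from the release used at the time); the data frame and settings are illustrative.

```r
# Illustrative chained-equations imputation of one year's continuous
# outcomes with "mi", conditional on already-completed covariate data.
library(mi)

mdf  <- missing_data.frame(year1_data)     # declare variable types/models
imps <- mi(mdf, n.chains = 1, n.iter = 50) # one chain, 50 iterations
year1_completed <- complete(imps, m = 1)   # extract completed data
# Repeat per completed baseline dataset, then condition the Year 2 run
# on each completed Year 1 dataset, yielding five completed datasets.
```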
For the imputation of teachers’ demographic and survey data, we assumed the variables
were nominal and imputed the data using the latent class approach. The imputation procedure
followed the one used for the student demographic and survey data.
Suitability of the Data for Analyses
The primary purpose of the ECED Efficacy Trial was to estimate the impacts of ECED
on student attitudes towards school, scores on standardized tests of achievement, and
performance such as attendance and progress toward graduation, and on teacher attitudes and
experiences and their observed classroom practices. The data collection described here was
successful in providing data that allow for unbiased comparisons between treatment and control
schools. In all cases the two conditions were treated comparably, and when baseline differences
between conditions were identified, they were controlled in the analyses. Thus, we are
confident that the data serve the intended purpose and are appropriate for treatment-versus-control,
point-in-time analyses of all outcomes of interest.
Further, because most data were collected with identical measures across the two years of
the study, most of the data are also appropriate for longitudinal growth curve analyses. The one
exception is the student test scores. Students in the four different states took different tests, and
there was between-state variation in the goals of the tests (e.g., end-of-course versus high school
exit exams). Further, students took tests covering different topics (e.g., Algebra 1 versus
Geometry) across the baseline, 9th-, and 10th-grade years. In order to combine across the varied
testing systems and time-points, we had to standardize within test. Thus, we cannot be certain
that the scores are directly comparable across time, so the test scores are not appropriate for
growth curve analyses. In Chapter VI, where student impact results are reported, point-in-time
analyses are presented for achievement scores, but growth curve analyses are not.
Additionally, it should be noted that although there is no reason to believe these data are
biased, they do contain significant measurement error. Throughout this report we described
various difficulties encountered in the structural implementation of ECED (e.g., "dosage" issues
due to teacher and student movement, and scheduling of targeted classes) and in data collection,
including major issues such as two schools ceasing participation, high turnover at the
administrative and teacher levels, between-state differences in testing systems, and
district-maintained data systems that were difficult to navigate. In all cases, our solutions have
been conservative and have erred in the direction of minimizing bias, at times at the expense of
adding measurement error. This approach lowers our chances of finding ECED impacts, but
increases our confidence in any impacts we do see.
Analysis Plan
The primary question this study sought to answer was: Is the ECED intervention
efficacious in changing students’ math and ELA achievement, school performance and
commitment, and attitudes towards school? Additionally, because teachers were key to the
success of students, we sought to learn if the ECED intervention was efficacious in improving
instruction and teachers’ experience of support, competence, and engagement.
As a first step in devising an analysis plan, we considered differences between the
treatment and control groups on baseline characteristics, using the imputed data. As seen in
Table 16, the randomization did not lead to entirely equivalent groups. For that reason, covariates
were added to all models to account for pre-existing differences.
Table 16. Treatment versus Control Baseline Characteristics Using Imputed Data

                                                   Treatment      Control        Total
Baseline characteristic                            (n = 3,999)    (n = 4,434)    (N = 8,433)
Students
  Gender (%)
    Girls                                           46.8           47.9           47.4
    Boys                                            53.2           52.1           52.6
  Race/ethnicity (%)
    Hispanic a                                      49.6           52.0           50.9
    Non-Hispanic Black a                            22.5           25.8           24.2
    Non-Hispanic White b                            17.2           11.8           14.3
    Asian/Pacific Islander                           8.2            8.0            8.1
    American Indian/Multi-Racial                     2.6            2.4            2.5
  Free/reduced price lunch (%) a                    80.3           85.4           83.0
  Special education (%)                              5.5            5.6            5.6
  ELL services (%) a                                20.0           24.0           22.3
  Mean age in years (SD)                            14.70 (.75)    14.70 (.78)    14.70 (.76)
  Mean attitudes toward school (SD)                  3.18 (.37)     3.17 (.35)     3.17 (.36)
  Mean math achievement (SD) a                      -0.15 (.96)    -0.07 (1.0)    -0.10 (.98)
  Mean ELA achievement (SD) b                        0.04 (.97)    -0.09 (.95)    -0.03 (.96)
Teachers
  Mean years of teaching (SD)                        4.68 (1.46)    4.81 (1.38)    4.74 (1.42)
  Mean teacher mutual support (SD)                   3.31 (.60)     3.32 (.58)     3.31 (.59)
  Mean teacher collective commitment (SD)            3.37 (.58)     3.36 (.57)     3.37 (.58)
  Mean support from school administration (SD)       3.06 (.62)     3.03 (.61)     3.05 (.62)
  Mean support from district administration (SD)     2.78 (.68)     2.77 (.68)     2.77 (.67)

Note. a Baseline covariate significantly higher for control than treatment group: Hispanic, t(8341)
= 2.22, p < .05; Black, t(8405) = 3.64, p < .001; free/reduced lunch, t(8217) = 5.67, p < .001;
ELL, t(8415) = 4.06, p < .001; math achievement, t(8403) = 3.88, p < .001. b Baseline covariate
significantly higher for treatment than control group: White, t(7900) = -7.10, p < .01; ELA
achievement, t(8304) = -6.08, p < .001.
Teacher impact analyses and variation in implementation analyses. For the teacher
analyses, we used an intent-to-treat approach including all target teachers regardless of how
many study terms they taught in a participating school, how many study terms they taught a
target class, or the extent to which their school successfully implemented ECED. Math and ELA
teachers were always analyzed separately, both because the intervention and supports are quite
different for math and ELA teachers and because the ELA teachers are not truly comparable in
the treatment versus control schools. That is, the analyses of data from ELA teachers are not
experimental because we cannot know which teachers in control schools would have taught the
ECED Literacy course had their school been selected to participate in the intervention. Further,
as described in the section defining the ELA study teachers, in order to make the groups as similar
as possible, all 9th- and 10th-grade English teachers were included from both experimental and
control schools, even though the 9th- and 10th-grade English teachers at treatment schools who
were not also teaching ECED Literacy had no exposure to the ECED supports.
To test the effects of ECED on teacher outcomes at the end of Year 1, the end of Year 2,
and across the two years we estimated a series of 2-level hierarchical linear models and
longitudinal growth curve models accounting for the nesting of teachers within schools.
Outcomes included teacher responses to questionnaires and engagement, alignment, and rigor (as
measured by the EAR Protocol). For the teacher questionnaire analyses, we first considered the
second order factor measuring perceived support for instructional innovation and improvement.
That analysis was followed by analyses testing each of the separate teacher questionnaire scales:
(1) teacher collective commitment, (2) teacher mutual support, (3) support from district
administration, (4) support from school administration, (5) commitment to change, (6)
confidence in change, (7) individual teacher morale, (8) professional development, (9) perceived
competence, and (10) the relative autonomy index. The primary predictor of interest was
experimental condition (treatment versus control). Control variables included district, baseline
score, gender, race/ethnicity, and years of teaching experience. The models testing engagement,
alignment, and rigor also controlled for the number of times the teacher was observed, because
more observations may increase the reliability of the scores.
We followed the intent-to-treat analyses with analyses in which the overall variation in
implementation variable replaced the treatment condition variable. The intent-to-treat analyses
are the most stringent type to answer the impact question. However, there was large variation in
implementation, including that two treatment schools stopped participating before the
intervention was complete, making it important to understand the association between
implementation and student and teacher outcomes. Thus, following the intent-to-treat analyses, a
parallel set of analyses was conducted in which the experimental condition was replaced by the
overall indicator of implementation. That indicator is described in detail in Chapter IV. We
selected the overall value, rather than one of the six values specific to subject and year, because
the six were highly inter-correlated (α = .98). The results from all teacher analyses are reported
in Chapter V.
Student impact analyses and variation in implementation analyses. As with the
teacher analyses, for student analyses, we used an intent-to-treat approach in which all Grade
Cohort 1 students in all 20 schools were included in the analyses, regardless of how many study
terms they were enrolled in a participating school, the extent to which their school successfully
implemented ECED, or whether they were enrolled in the courses targeted by ECED. We
focused on the Grade Cohort 1 students because they could have been exposed to the
intervention for as much as four terms, whereas the other two grade cohorts had a maximum of
two terms of exposure. 58
As with the teacher analyses, to test the effects of ECED on student outcomes at the end
of Year 1 and the end of Year 2, we estimated a series of 2-level multi-level linear models
accounting for the nesting of students within schools. The six student outcomes were: math test
scores, ELA test scores, students’ attitudes toward school (second order factor from the student
questionnaire), grade point averages, attendance, and credits towards graduation. For the student
questionnaire analyses, we first considered the second order factor measuring students’ attitudes
toward school. That analysis was followed by analyses testing each of the separate student
questionnaire scales (1) positive teacher support/expectations, (2) lack of teacher support, (3)
engagement in school, (4) perceived competence, and (5) relative autonomy index. The primary
predictor of interest was experimental condition (treatment versus control groups). Control
variables included district, baseline score, gender, race/ethnicity, free/reduced price lunch
eligibility, special education, and receipt of English Language Learner services.
Those analyses were followed by a series of longitudinal growth curve models. These
were three-level models, accounting for repeated assessments nested within students and
students nested within schools, using the same predictors and control variables. However, these
models were tested only for students' attitudes toward school, followed by the separate student
questionnaire scales. As noted above, the test scores are not necessarily comparable across time
and are therefore not suitable for this type of analysis. Additionally, we did not gather baseline
information on the performance variables (i.e., GPA, attendance, and credits toward graduation),
so growth curve models could not be performed.
Parallel to the teacher analyses, following the student intent-to-treat analyses, we
conducted analyses to estimate the effects of the intervention accounting for level of
implementation by replacing the treatment variable with the overall variation in implementation
variable. Additionally, within the treatment schools only, we used the overall indicator of
implementation to predict teacher and student outcomes. The results from all student analyses are
reported in Chapter VI.
Unconditional Models
The variance in the outcomes was partitioned into their within-school and betweenschool components by fitting an unconditional model with no predictors (Bryk & Raudenbush,
1992). Intraclass correlation coefficients (ICC), a measure of the ratio of the variance that lies
between schools to the total variance, were calculated for each of the outcome variables. The
lower the proportion of school-level variance explained in a measure, the more power we have to
detect effects. The ICC(2) takes the group sample size into account and is an estimate of the
reliability of the group-mean ratings. The ICC(2) can be calculated by applying the SpearmanBrown formula to the ICC(1). 59 An ICC(2) between .70 and .85 is generally considered to
indicate acceptable reliability (Ludtke, Trautwein, Kunter, & Baumert, 2006). The lower the
reliability of the measure, the less sensitive the measure can be to intervention impacts.
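In formula terms (a standard statement of these two indices, with τ00 the between-school variance, σ² the within-school variance, and n̄ the average number of respondents per school):

$$\mathrm{ICC}(1) = \frac{\tau_{00}}{\tau_{00} + \sigma^{2}}, \qquad
\mathrm{ICC}(2) = \frac{\bar{n}\,\mathrm{ICC}(1)}{1 + (\bar{n} - 1)\,\mathrm{ICC}(1)}$$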
Student outcomes. The ICC for students’ attitudes towards school (second order factor
from survey) was .01 in both years of the study, indicating that only 1% of the variance lay
between schools. The ICCs for the other survey scales ranged from 0.003 to 0.015. See Table 17.
These ICCs indicate that almost all of the variation in the survey outcomes lay between students
within schools rather than between schools, suggesting that there was little between-school
variation in these outcomes to be explained. Even so, a chi-square test of significance revealed
that variability at the school level was significantly different from zero for each of these
outcomes. For math and ELA achievement, the percentage of variance that lay between schools
ranged from 6% to 9% across the two years of the study, indicating greater but still modest
between-school variance (see Table 17). The ICCs for grade point average and attendance were
somewhat higher, ranging from .07 to .11 across the two years. The ICC for credits earned was
noticeably higher than any other, with between 20% and 52% of the variance lying between
schools across the two years.
The estimated reliability (ICC(2)) with which schools could be distinguished on students’
attitudes towards school was .77 in Year 1 and .83 in Year 2, indicating high reliability.
Reliability estimates for the survey outcomes that made up the second order factor were also
high for positive teacher support, lack of teacher support, and perceived competence in Year 2,
but were below the acceptable range for engagement in both years and for perceived competence
in Year 1. Reliabilities consistently improved in the second year of the study. The reliability
estimates for math and ELA achievement were .97 for both subjects in both years of the study,
indicating a very high level of reliability. Grade point average, credits earned, and attendance
also had very high reliabilities, ranging from .97 to 1 across the two years of the study.
Table 17. Intraclass Correlations and Reliability Estimates for Student Outcomes

                              ICC                  Reliability
Outcome                       Year 1    Year 2     Year 1    Year 2
Attitudes towards school      0.009     0.011      0.765     0.830
Positive teacher support      0.011     0.015      0.796     0.868
Lack of teacher support       0.009     0.012      0.773     0.836
Engagement                    0.003     0.004      0.564     0.624
Perceived competence          0.004     0.006      0.610     0.703
RAI                           0.006     0.008      0.697     0.764
Math achievement              0.085     0.063      0.972     0.966
ELA achievement               0.070     0.062      0.965     0.966
Grade point average           0.106     0.077      0.978     0.968
Credits earned                0.523     0.201      0.998     0.989
Attendance                    0.102     0.072      0.977     0.966
Teacher outcomes. As with the student outcomes, most of the variation in the teacher
survey outcomes lay between teachers within schools rather than between schools. The ICC for
perceptions of administrative support for instructional innovation and improvement (second
order factor) was .11 in Year 1 and .06 in Year 2, indicating that between 6 and 11 percent of the
variance lay between schools. The ICCs for the other survey scales ranged from 0.01 to 0.21. See
Table 18. A chi-square test of significance revealed that variability at the school level was
significantly different from zero for all outcomes except perceived competence, individual
teacher morale, commitment to change in Year 1, and support from district administration and
professional development in Year 2, indicating that these scales had so little school-level
variance that we are unlikely to find meaningful program impacts.
Reliability estimates for the teacher survey outcomes are shown in Table 18. The
estimated reliability (ICC(2)) with which schools could be distinguished on teacher support for
instructional innovation and improvement was .70 in Year 1 and .63 in Year 2, indicating
moderate to high reliability. With a few exceptions (i.e., teacher collective commitment, support
from school administration, confidence in change), reliability estimates for the individual survey
scales were low to moderate across the two years of the study.
Table 18. Intraclass Correlations and Reliability Estimates for Teacher Survey Outcomes

                                             ICC                  Reliability
Outcome                                      Wave 2    Wave 4     Wave 2    Wave 4
Perception of administrative support for
  instructional innovation and improvement   0.107     0.055      0.703     0.626
Teacher collective commitment                0.095     0.079      0.674     0.709
Teacher mutual support                       0.054     0.080      0.530     0.712
Support from district administration         0.212     0.008      0.842     0.180
Support from school administration           0.109     0.073      0.708     0.693
Commitment to change                         0.022     0.021      0.309     0.384
Confidence in change                         0.133     0.073      0.751     0.692
Individual teacher morale                    0.022     0.024      0.305     0.408
Professional development                     0.048     0.044      0.501     0.571
Perceived competence                         0.011     0.034      0.183     0.501
RAI                                          0.050     0.030      0.509     0.469
IV. Implementation
As with any school-level intervention, schools in the treatment condition varied with
regard to how well the ECED components were implemented. The first part of this chapter
describes how variation in implementation was measured. The second part focuses on which
aspects of the intervention were implemented with more and less success, as well as the
differences between treatment and control schools.
Measuring Variation in Implementation
Variation in ECED implementation in the 20 schools taking part in the ECED Efficacy
Trial was quantified using seven values. Broadly speaking, ECED activities can be broken into
three categories: English/Language Arts (ELA), math, and EAR protocol. Each category is
hypothetically independent of the others (i.e., it would be possible to fully implement ELA, with
relatively weak implementation of math), so we elected to create separate scores for each
category. Further, each school participated for two years, and implementation may have varied
by year. Thus the seven scores calculated for each school were: (1) Year 1 ELA, (2) Year 2 ELA,
(3) Year 1 math, (4) Year 2 math, (5) Year 1 EAR Protocol, (6) Year 2 EAR Protocol, (7)
Overall. Treatment and control schools were assigned scores using the same data and scoring
systems, making the scores in the two conditions directly comparable. The seven values have a
theoretical range of 0 to 100. There were four major steps involved in arriving at these scores:
(1) creating indicators and operational definitions to define full implementation; (2) gathering
data from multiple sources, including key-informant interviews, and linking them to the
operational definitions; (3) reliably coding the interviews that provided the bulk of the
information about implementation; and (4) combining all information to create final scores.
These steps are similar to the first four steps advocated by Hulleman, Rimm-Kaufman, and Abry
(2013), although our data collection was somewhat less structured. We address their fifth and
final step—linking the measure of implementation to outcomes—in the results chapters
(Chapters V and VI).
Creating indicators and operational definitions. IRRE senior staff worked with the
ECED research staff to create a list of the specific activities that would define ‘full
implementation’ of ECED in math, ELA, and use of the EAR protocol. As a group, they
identified 30 indicators of full implementation: 12 for ELA, 14 for math, and 4 for the EAR
Protocol. The research staff then identified ways to measure each of these 30 indicators. Most of
the information came from key informant interviews conducted each spring at each participating
school. The research staff created a scoring rubric to assign values to the interviews and other
sources of data (outlined below) for each of the full implementation indicators. IRRE’s senior
staff provided weights for each indicator. The ELA weights ranged from 4 to 11 and summed to
100. IRRE indicated that the math and EAR Protocol indicators were all equally important, so
those indicators were weighted equally, again summing to 100. See Appendix 12 for a list of
indicators, data sources, and weights.
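Schematically, a school's category score is the weighted sum of its indicator scores. The sketch below uses invented weights and indicator scores purely for illustration; the actual values appear in Appendix 12.

```r
# Schematic weighted combination of full-implementation indicators into
# a 0-100 category score. Weights and scores are invented for illustration.
weights   <- c(40, 30, 20, 10)  # per-indicator weights, summing to 100
indicator <- c(1, 0.5, 1, 0)    # each indicator scored 0 (absent) to 1 (full)
school_score <- sum(weights * indicator)  # 75 on the 0-100 scale
```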
Gathering data. Most of the information used to judge the extent to which the indicators
were met came from key informant, semi-structured, open ended interviews conducted in the
spring of each year. The interview protocols were written by the research team and the
interviews were conducted by the research project director and a research associate. Information
for the final math scores came primarily from interviews with the math coaches. Math
department chairs were interviewed when there was no coach. Some information for the math
scores also came from teacher questionnaires (e.g., their participation in ECED professional
development). Information for the ELA scores came primarily from interviews with the
Literacy/ELA coaches. ELA department chairs were interviewed when there was no coach.
Additionally, some information for ELA scores came from student records (e.g., the proportion
of students enrolled in ECED Literacy) and from teacher questionnaires (e.g., how many ECED
lessons each teacher covered). Information for the EAR Protocol scores came from interviews
with math coaches (or math chairs), ELA coaches (or ELA chairs), and the school principal or
assistant principal. Additional information came from IRRE’s EAR Protocol database, which
includes information about how many EAR visits were conducted and uploaded.
The same sources of information and coding systems were used for both treatment and
control schools, making the scores directly comparable. For some indicators of full
implementation, however, the control schools were automatically set to zero because the
indicator represented an IRRE support that was not offered at the control schools (e.g., number
of teachers who participated in IRRE’s summer training for ECED).
Interview coding. Two individuals worked independently to reduce each interview into a
series of very brief (i.e., yes/no) responses that directly addressed the full implementation
activities. They compared their responses regularly, ensuring over 90% agreement. Next, one of
those two individuals transformed the brief responses into numeric scores using the scoring
rubric created by the research team. As a check on this scoring, the research project director also
completed two rounds of scoring, each time scoring 10% of the responses. In the first round, the
project director’s codes agreed with the numeric scores 81% of the time. After some discussion
of coding rules and inconsistencies, the research project director coded 10% more responses. The
codes matched 92% of the time in the second round.
Combining all information into final scores. Once each indicator had been scored
using all available information, the scores were combined using the weights devised by the IRRE
senior staff. Table 19 shows descriptive statistics for the final scores on the six indicators,
separately for treatment and control schools. As would be expected, the values in the treatment
condition are much higher than in the control, but there is variation within both the treatment and
control conditions. Also, as would be expected, the eight treatment schools that remained in the
study both years had much higher implementation (mean = 71.64, SD = 5.47) than the two
treatment schools that stopped participation part way through the project (mean = 39.42, SD =
0.95).
As seen in Table 20, the correlations among the six variables are quite high. Thus, an
overall score—which is the mean of the six items—was calculated for each school. Cronbach’s
alpha for these six items together is .98.
Table 19. Descriptive Statistics for Variation in Implementation Scores

          Treatment (n = 10)                    Control (n = 10)
          Mean    SD     Range                  Mean    SD     Range              t
Y1 ELA    74.32   11.87  54.33 to 88.38         14.46   3.53    9.32 to 20.56     15.29***
Y1 Math   80.29    8.39  61.43 to 90.29         18.37   5.50    9.50 to 25.50     19.51***
Y1 EAR    51.59   11.78  37.50 to 71.67          0⁶⁰    0       --                13.85***
Y2 ELA    62.79   24.24  20.16⁶¹ to 90.98       16.17   4.13   10.56 to 23.28      6.00***
Y2 Math   66.15   26.84   4.14 to 84.07         16.54   6.78    6.50 to 28.36      5.67***
Y2 EAR    56.06   27.10   0 to 90.28             0      0       --                 6.54***
Overall   65.20   14.42  38.74 to 78.34         10.92   2.15    8.72 to 14.52     11.77***
*** p < .001
Table 20. Correlations Among Variation in Implementation Scores

           ELA Y1    Math Y1   EAR Y1    ELA Y2    Math Y2   EAR Y2
ELA Y1     1.00
Math Y1     .938***  1.00
EAR Y1      .928***   .950***  1.00
ELA Y2      .857***   .788***   .863***  1.00
Math Y2     .794***   .787***   .850***   .958***  1.00
EAR Y2      .823***   .812***   .913***   .956***   .959***  1.00
Overall     .938***   .927***   .967***   .953***   .941***   .961***
***p < .001
Implementation Strengths and Weaknesses
In general, the eight treatment high schools that participated in ECED for two years
implemented ECED’s math components fairly successfully. All eight schools organized
instruction around the “I Can…” statements and implemented the benchmarking and capstone
assessment system in Algebra 1 and Geometry. Most used the mastery grading system and all
had some system in place to help struggling students. The deployment of the math coach was
less successful, with only about one-third of schools reporting that they had a math coach who
actually spent the recommended one-half FTE on coaching. There was wide variation in the
amount of support coaches reported receiving from IRRE. Additionally, few schools held the
weekly meetings of the math teachers focused on instruction that are required for full
implementation.
Likewise, the eight treatment schools that participated in ECED for two years
implemented the ECED Literacy components fairly successfully. All offered the ECED Literacy
course both years; however, two schools only had room in student schedules to offer the course
for half of the recommended time. On average, treatment schools enrolled 88% of 9th- and
10th-graders in ECED Literacy in the first year and 84% in the second year. Each year, teachers
in about half the schools covered the full ECED Literacy curriculum. ELA coaching was stronger
than math coaching, with about two-thirds of schools reporting that they had an ELA coach who
devoted the recommended one-half FTE to coaching. As with math, there was wide variation in
the amount of support coaches reported receiving from IRRE and few schools held the required
weekly meetings of the ECED Literacy teachers focused on instruction.
Use of the EAR Protocols by school leaders to improve instruction was the weakest
aspect of the ECED implementation. The number of visits completed and uploaded to the server
varied widely by school. Full implementation, as defined by IRRE, would require a total of 700
EAR Protocol visits per year conducted by school or district leaders or IRRE consultants. In the
first year of implementation the average number of visits was 141 (SD = 144, range = 7 to 381)
across the ten schools. In the second year of implementation, the average number of visits was
233 (SD = 262, range = 0 to 691) across the eight schools that continued to participate. The two
schools that no longer participated did not use the EAR Protocol in the second year.
There was virtually no spill-over or contamination of ECED into the control schools. The
interviews revealed that the chairs, coaches, and school administrators at the control schools had
almost no awareness of the supports being received at the treatment schools, and none had made
any attempt to replicate ECED in their school. Of course, some components of ECED existed in
the control schools anyway. For instance, some control schools had math or ELA instructional
coaches or held regular meetings of math or ELA teachers focused on instruction. Thus, there is
variation in implementation scores among the control schools.
V. Results for Teachers’ Attitudes, Experience, and Observed Practice
This chapter presents the findings for teacher outcomes including teachers’ self-reports of
attitudes and experiences from the teacher questionnaires and teacher practice as observed using
the Engagement, Alignment, and Rigor Classroom Visit Protocol. We start by presenting the
overall data analytic strategy. Next, we present findings for math teachers' self-reports of
attitudes and experiences, followed by observed EAR in math classes. After the math teacher
findings, we report the findings for ELA teachers. Math and ELA teacher findings are reported
separately because the interventions for the two groups were quite different and because the
results for math teachers are experimental, whereas those for the ELA teachers are not.
Data Analytic Strategy
Point-in-time. For each outcome, we estimated a series of 2-level Hierarchical Linear
Models (HLM 6.02; Raudenbush & Bryk, 2002) with fixed effects to consider the impact of the
ECED treatment on teacher questionnaire responses and teacher practices at the end of Year 1
(Wave 2) and the end of Year 2 (Wave 4). (Note that a matrix presenting the correlations among
all teacher outcomes appears in Appendix 13). It is important to note that only 5 of the 11
outcomes from the teacher questionnaire were reliable enough at the school level to have a
strong possibility of detecting effects (see ICCs in Method Chapter). We tested impacts on all 11
outcomes with the knowledge that low school-level reliability would make finding impacts on
those outcomes more difficult. The models accounted for nesting of teachers within schools. The
Year 1 analyses included all teachers who taught a target course (i.e., Algebra 1 or Geometry for
math, ECED Literacy or 9th-/10th-grade English for ELA) during the first year of the study
(Wave 1 and/or 2). The Year 2 analyses included all teachers in the study, that is all teachers who
taught a target course in either math or English/ELA during either the first or second year (i.e., at
112
any point during the four waves. Imputed data were used for the analyses testing impacts on
teacher questionnaire responses. However, imputation has not yet been done on the teacherobservation data so non-imputed data were used for the analyses testing impacts on teacher
practices so there are some missing data in these analyses. 62 In this case, the Year 1 analyses
included all teachers who were observed in Wave 2 and the Year 2 analyses included all teachers
who were observed in Wave 4.
For each outcome, a series of six separate models was estimated. Table 21 lists variables
included in each model. The first two models included condition (treatment versus control)
(Model 1) and condition plus four dummy codes accounting for the five school districts (Model
2) at Level 2 (school). The next set of models added covariates at Level 1 (teacher). The third
model added the teachers’ baseline (Wave 1) responses to the dependent variable, if that variable
was collected at baseline. The fourth model added teacher baseline demographic covariates:
gender, race/ethnic background, and years of teaching. The fifth model added a variable
indicating the number of semesters the teacher taught a target class and, for EAR Protocols, a
control variable for number of times the teacher was observed. Finally, the last model tested for
moderation effects of number of semesters teaching a target class by including cross-level
interactions between this covariate and treatment condition. The number of semesters teaching a
target class and number of times the teacher was observed are endogenous to treatment and could
have been affected by the treatment. For that reason, in this report we typically present Model 4,
which contains only variables that are exogenous to treatment, and we note the findings for the
fifth and sixth models.
Table 21. Variables Included in Teacher Models

                                                Model
Variable                          Level           1   2   3   4   5   6
Condition                         2 (school)      X   X   X   X   X   X
Four dummy codes for district     2 (school)          X   X   X   X   X
Baseline (W1) response,
  if available                    1 (teacher)             X   X   X   X
Teacher demographics (gender,
  race/ethnicity, years of
  teaching)                       1 (teacher)                 X   X   X
Semesters teacher taught target
  class (plus number of times
  teacher was observed for EAR
  analyses)                       1 (teacher)                     X   X
Treatment X semesters             cross-level                         X
The models were estimated using maximum likelihood with robust standard errors. All
covariates were grand-mean centered, following guidelines by Enders and Tofighi (2007) for
cluster randomized studies where a Level 2 treatment effect is of interest. In interpreting the
results we consider an alpha level of p < .05 as statistically significant, but given the nature of
the design (resulting in only 14 degrees of freedom and therefore relatively low power to
estimate the intervention effect), we also note effects up to the .10 level, particularly in the case
of interactions (McClelland & Judd, 1993). Effect sizes were calculated by dividing the estimate
of the intervention effect by the raw standard deviation of the dependent variable for the control
group (a variant of Cohen's d known as Glass's Δ; Cohen, 1992).
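For concreteness, the fourth model can be sketched as follows. The study fit these models in HLM 6.02 with robust standard errors, so the lme4 code below is only an illustrative equivalent, and every variable name in it is hypothetical.

```r
# Illustrative two-level version of Model 4 (the study used HLM 6.02).
# d: teacher-level data with a school identifier; names are hypothetical.
library(lme4)

# Grand-mean center the continuous covariates, per Enders & Tofighi (2007)
d$baseline_c <- d$baseline - mean(d$baseline, na.rm = TRUE)
d$years_c    <- d$years_teaching - mean(d$years_teaching, na.rm = TRUE)

m4 <- lmer(outcome ~ treatment + district + baseline_c + gender +
             race_ethnicity + years_c + (1 | school),
           data = d, REML = FALSE)  # maximum likelihood estimates

# Effect size: intervention estimate over the control group's raw SD
# (the Glass's Delta variant described above); treatment coded 0/1
delta <- fixef(m4)["treatment"] /
  sd(d$outcome[d$treatment == 0], na.rm = TRUE)
```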
Growth curve models. Following the point-in-time analyses, growth curve analyses
were conducted in order to better understand the pattern of change in intervention impacts over
time. Estimates of intervention impacts on change in the teacher outcomes for which we had four
waves of data were calculated using a series of three-level hierarchical linear growth models in
HLM. In these models, Level 1 represents time (i.e., the repeated assessments of the outcomes of
interest for each teacher), Level 2 represents the teacher, and Level 3 represents schools. The
same teacher-level covariates as included in the point-in-time models were included at Level 2.
Level 3 included an intervention dummy and four district dummies representing the five districts
in the study. A series of unconditional models were first estimated and compared to determine
the most appropriate functional form for each of the outcomes. These were an intercept only,
intercept-slope, and intercept-slope-quadratic model. Models were compared using the likelihood
ratio test (Raudenbush & Bryk, 2002) based on change in the deviance estimate between models
generated in HLM and the number of parameters in the model. 63 The best-fitting model was used
to test program impacts.
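A sketch of that comparison follows, again in lme4 rather than HLM and with hypothetical names; the models are fit by maximum likelihood so that the likelihood ratio test is valid.

```r
# Comparing unconditional growth models by likelihood ratio test.
# long: one row per teacher per wave; waves nested in teachers in schools.
library(lme4)

m_int   <- lmer(y ~ 1 + (1 | teacher) + (1 | school),
                data = long, REML = FALSE)            # intercept only
m_slope <- lmer(y ~ time + (time | teacher) + (1 | school),
                data = long, REML = FALSE)            # adds linear slope
m_quad  <- lmer(y ~ time + I(time^2) + (time | teacher) + (1 | school),
                data = long, REML = FALSE)            # adds quadratic term

anova(m_int, m_slope, m_quad)  # chi-square tests on the change in deviance
```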
Variation in implementation. In order to test the degree to which variation in
implementation affected intervention impacts on the main outcomes of interest, follow-up
analyses were conducted for math teachers in which the treatment/control dummy variable was
replaced with the overall variation in implementation indicator variable (see Chapter IV for a
description of this variable). Because variation in implementation was not randomly assigned,
these analyses are non-experimental and are meant to complement the experimental results. In
the interest of parsimony, we limited our analyses here to math teachers—our experimental
group—and to main effects without interactions (Models 1 through 5).
Results for Math Teachers
Point-in-time analyses predicting math teacher attitudes and experiences. The
outcome for the first set of models was the math teachers’ perceptions of administrative support
for instructional innovation and improvement, which was the second order factor described in the
method chapter (Chapter III). That factor measures the extent to which teachers believe that
school and district administrators are responsive to their needs and that efforts are being made to
improve teaching and learning. Not all items from the second order factor appeared on the Wave
1 questionnaire, so the baseline response added here was only the mean of support from the
school administrators. Of the factors measured at Wave 1, support from school administration
was most highly correlated with both Wave 2 and Wave 4 administrative support for
instructional innovation and improvement.
Findings from these models largely indicated that ECED did not affect teachers’
perceptions of administrative support for instructional innovation and improvement. Table 22
presents the findings from the fourth model, the one that controls district, baseline, and teacher
demographics. The patterns of significance found in Model 4 were the same as found in each of
the other models. This model indicates that baseline response was a significant and fairly large
predictor of response at the end of the first year and the end of the intervention. However, there
was no evidence that the treatment had an effect on teachers’ perceptions of support from school
administration and beliefs that efforts were being made to improve teaching and learning, either
at the end of the first year or the end of the intervention.
Table 22. Predicting math teachers’ perceptions of administrative support for instructional
innovation and improvement (second order factor)
Year 1
n teachers = 178
j schools = 20
Estimate
Treatment (0 = control)
District 2
District 3
District 4
District 5
Baseline
Gender (0 = male)
Race/ethnicity
Years of teaching
0.04
-0.18
-0.32
-0.06
-0.14
0.50
0.17
0.07
-0.03
Year 2
n teachers = 239
j schools = 20
SE
p
0.14
0.19
0.30
0.22
0.20
0.09
0.14
0.05
0.03
0.799
0.366
0.291
0.778
0.508
0.000
0.238
0.181
0.322
Estimate
SE
p
-0.17
-0.06
-0.09
-0.06
-0.07
0.58
0.16
0.05
0.04
0.10
0.15
0.17
0.19
0.13
0.09
0.10
0.06
0.04
0.118
0.680
0.622
0.758
0.598
0.000
0.129
0.415
0.315
The fifth model added the number of terms that the teacher taught one or more target
classes during the intervention (range = 1 to 4). This control was added to account for the fact
that some teachers were only teaching a target course for a short time (e.g., 1 term), and therefore
if they were in a treatment school, their exposure to ECED was less than if they had taught a
target course all four terms. However, as noted earlier, this variable is not exogenous to treatment
and could, theoretically, be affected by the treatment itself. Findings from these models
paralleled those presented above and the added control was non-significant.
The sixth model added the interaction between the number of semesters teaching a target
course and treatment condition. This cross-level interaction was significant and negative for this
second order factor of administrative support for instructional innovation and improvement in
Year 1 (β = -.68, SE = .28, p = .016). Teachers in ECED schools who taught a target course in
both semesters reported less administrative support for instructional innovation and improvement
compared to teachers who only taught a target course one semester, while the opposite was true
in control schools. This interaction was not significant at the end of Year 2.
Comparable models were then estimated for each of the individual scales on the teacher
questionnaire: (1) teacher collective commitment, (2) teacher mutual support, (3) support from
district administration, (4) support from school administration, (5) commitment to change, (6)
confidence in change, (7) individual teacher morale, (8) professional development, (9) perceived
competence, and (10) relative autonomy index. The same set of six models was estimated for the
first four scales. Because Wave 1 did not include the last six scales, we could not include baseline
in those analyses, meaning only five models were estimated for each.
Again, the findings generally indicated that ECED had no impact on teachers’ attitudes
and experiences. Baseline scores, when available, were always significant predictors of the
outcome scores. The only outcome for which the treatment was significant was teacher mutual
support in Year 2. In the first five models, teachers in the control condition reported a greater
sense of support from colleagues than teachers in the treatment condition at the end of Year 2
(final model: β = -.20, SE = .09, p = .033; effect size = .36).
The final model for each of the individual subscales added the interaction between the
number of semesters teaching a target course and treatment condition. A significant cross-level
interaction was found for teacher collective commitment, support from school administration,
confidence in change, individual teacher morale, and RAI in Year 1, and commitment to change
in Year 2. Each of these cross-level interactions followed a similar trend; the more semesters
teachers taught target courses in intervention schools, the more negative their responses to each
of the outcomes were, while the reverse was true for teachers in control schools. It seems that
trying to facilitate change among teachers in difficult situations leaves them with somewhat less
118
positive attitudes than comparable control-group teachers in similar situations where the change
is not being facilitated.
Growth curve analyses predicting math teacher attitudes and experiences. Growth
curve analyses were subsequently conducted on the four questionnaire outcomes for which we
had four waves of data (i.e., teacher collective commitment, teacher mutual support, support
from district administration, support from school administration) in order to better understand the
pattern of change over time. Estimating impacts on growth in the second order factor was not
possible because this factor consisted of sub-factors for which we had only two waves of data.
Before testing the effects of the ECED intervention on growth in our four questionnaire
outcomes, three unconditional models without any covariates were compared for each outcome
to determine the appropriate functional form of the growth curve. For each outcome, the three
models compared were an intercept-only model, an intercept-slope model, and an
intercept-slope-quadratic model. Once the appropriate form of the growth curve was determined,
that model was then estimated.
Findings indicated that the intercept-slope model was a significantly better fit than the
intercept-only model for two of the four outcomes. For teacher collective commitment and
support from district administration the slope parameters were significant, indicating that these
outcomes showed meaningful growth across the four time points. The intercept-slope-quadratic
model was not a significantly better fit than the intercept-slope only model for any of the
outcomes, indicating linear rather than curvilinear change across time. For teacher mutual
support and support from school administration the intercept-slope model was not a significantly
better fit than the intercept-only model, indicating that these outcomes did not show meaningful
growth or change across the four time points, on average. However, the lack of average growth
could potentially mask divergent growth or change across the treatment and control groups.
Therefore, despite the lack of average growth in these two outcomes, intervention impacts on
linear growth in all four outcomes were estimated.
The findings generally paralleled the point-in-time findings, indicating that ECED
impacted only one of the four outcomes. No significant intervention-control differences were found in
the slopes of teacher collective commitment, support from district administration, and support
from school administration. Despite the non-significant slope parameter of teacher mutual
support, there was a significant negative intervention-control difference in this slope parameter
(β = -.67, SE = .03, p = .049). As shown in Figure 1, the ECED group showed a decline in
mutual support over the two years of the study, relative to the control group. This finding is in
keeping with the negative impact found in the point-in-time models for this outcome.
Figure 1. Impact of intervention on Teacher Mutual Support. [Line graph: teacher mutual
support (y-axis, approximately 3.16 to 3.38) across the study (fall of Year 1 through spring of
Year 2) for the intervention and control groups.]
Associations of variation in implementation with math teacher attitudes and
experiences. When the overall variation in implementation variable was used in place of the
treatment/control dummy variable in the point-in-time analyses of teacher questionnaire
responses, results suggested that variation in implementation was not associated with teachers’
attitudes and experiences. Table 23 presents results from the models predicting perceived
administrative support for instructional innovation and improvement at the end of Year 1 and the
end of Year 2. The degree to which ECED was implemented at the school-level was not
associated with teachers’ perceptions of support from school administration and beliefs that
efforts are being made to improve teaching and learning, either at the end of the first year or the
end of the intervention. In the fifth model, there was no association between the indicator for
number of terms the teacher taught a target class and perceived administrative support for
instructional innovation and improvement. Findings from comparable models estimating
associations between variation in implementation and each of the individual scales on the teacher
questionnaire (i.e., teacher engagement, collective engagement, support from district
administration, support from school administration, commitment to change, confidence in
change, individual engagement, professional development, perceived competence, relative
autonomy index) also indicated no effect of ECED treatment on these individual scales.
Table 23. Associations between variation in implementation and math teachers' perceptions of
administrative support for instructional innovation and improvement (second order factor)

                          Year 1                        Year 2
                          (n teachers = 178,            (n teachers = 239,
                           j schools = 20)               j schools = 20)
                          Estimate   SE     p           Estimate   SE     p
Overall implementation      0.00     0.00   0.479         0.00     0.00   0.307
District 2                 -0.19     0.21   0.387        -0.04     0.17   0.825
District 3                 -0.34     0.23   0.155        -0.07     0.18   0.700
District 4                 -0.09     0.21   0.689        -0.03     0.18   0.877
District 5                 -0.13     0.25   0.615        -0.07     0.18   0.715
Baseline                    0.50     0.11   0.000         0.58     0.09   0.000
Gender (0 = male)           0.17     0.13   0.199         0.15     0.11   0.169
Race/ethnicity              0.07     0.05   0.164         0.05     0.06   0.392
Years of teaching          -0.03     0.04   0.438         0.04     0.04   0.280
Point-in-time analyses predicting observed Engagement, Alignment, and Rigor in
math classes. In order to analyze the classroom observation data, for each teacher at each wave,
three scores were created (one each for E, A, and R). This was accomplished by first applying
the scoring methods outlined in Early et al. (2013) to calculate continuous E, A, and R scores for
each observation and then calculating each teacher’s average E, A, and R score across all
observations during that wave. As noted earlier, this section reports preliminary findings using
the unimputed data. Final analyses will be conducted using the imputed data when they become
available. Table 24 presents findings from the fourth model. At the end of Year 1, instruction in
math classes in treatment schools was rated as more aligned and more rigorous than the instruction in
the control schools, controlling for district, baseline, and teacher demographic variables. There was
no difference in observed engagement. However, at the end of Year 2, observed engagement was
significantly lower in treatment schools as compared to control schools, and rigor was significantly
higher. There was no significant difference in alignment. Baseline scores were significant predictors
of end of Year 1 and 2 engagement and alignment but not rigor scores (although at the end of Year 2,
baseline rigor approached significance). The other controls were largely non-significant. In the sixth
model (not tabled), no significant cross-level interactions between treatment and number of semesters
teaching a target class were found.
The first three models (not tabled) were not always consistent with the fourth model.
Specifically, in Year 1 the positive treatment effect on alignment reached significance only when
baseline was added to the model (Model 3). In Year 2 the negative treatment impact on observed
engagement reached significance only when baseline and demographic covariates were added to the
model. In addition, there was a significant positive treatment effect on alignment before the
demographic covariates were added to the model (Models 1-3).
Table 24. Predicting Observed E, A, and R for Math Teachers

Engagement
                          Year 1 (n teachers = 96,    Year 2 (n teachers = 77,
                           j schools = 19)             j schools = 19)
                          Est.    SE     p            Est.    SE     p
Treatment (0 = control)    0.02   0.03   0.56         -0.09   0.03   0.03
District 2                -0.03   0.05   0.60          0.12   0.05   0.04
District 3                -0.09   0.06   0.16          0.15   0.06   0.03
District 4                 0.00   0.06   0.94          0.21   0.06   0.00
District 5                -0.02   0.07   0.81          0.17   0.07   0.04
Baseline                   0.28   0.08   0.00          0.52   0.10   0.00
Gender                     0.04   0.03   0.19          0.04   0.03   0.28
Race                      -0.02   0.01   0.14          0.01   0.02   0.74
Yrs. teaching              0.01   0.01   0.42          0.00   0.01   0.73

Alignment
                          Year 1 (n teachers = 96,    Year 2 (n teachers = 77,
                           j schools = 19)             j schools = 19)
                          Est.    SE     p            Est.    SE     p
Treatment (0 = control)    0.09   0.03   0.02          0.07   0.04   0.13
District 2                 0.03   0.06   0.59          0.04   0.07   0.57
District 3                -0.14   0.07   0.06          0.18   0.08   0.04
District 4                -0.07   0.06   0.28          0.13   0.08   0.11
District 5                -0.01   0.07   0.85          0.09   0.09   0.38
Baseline                   0.20   0.09   0.03          0.24   0.12   0.05
Gender                    -0.03   0.03   0.34          0.05   0.04   0.27
Race                      -0.03   0.02   0.04          0.01   0.02   0.80
Yrs. teaching              0.00   0.01   0.84          0.00   0.02   0.95

Rigor
                          Year 1 (n teachers = 96,    Year 2 (n teachers = 77,
                           j schools = 19)             j schools = 19)
                          Est.    SE     p            Est.    SE     p
Treatment (0 = control)    0.23   0.07   0.01          0.25   0.10   0.03
District 2                 0.30   0.14   0.06         -0.04   0.15   0.77
District 3                -0.50   0.14   0.00          0.16   0.18   0.40
District 4                -0.16   0.17   0.36          0.33   0.19   0.10
District 5                -0.23   0.15   0.14          0.22   0.21   0.33
Baseline                   0.18   0.11   0.11          0.21   0.12   0.09
Gender                     0.09   0.07   0.20          0.08   0.10   0.41
Race                      -0.06   0.05   0.25         -0.01   0.05   0.82
Yrs. teaching              0.02   0.03   0.51          0.01   0.04   0.88
Growth curve analyses predicting EAR for math teachers. Growth curves were
modeled for each of the three EAR outcomes in order to better understand the pattern of change
over time. Unconditional models suggested that an intercept-slope model was not a significantly
better fit than the intercept-only model for any of the three outcomes, nor were the average
slopes significant. As was done with the teacher questionnaire, intervention impacts on linear
growth were estimated for each of the three outcomes despite the finding that these outcomes did
not show meaningful change across the four time points, on average.
Consistent with the point-in-time analyses, there was a negative effect of ECED on the
slope of observed engagement over the course of the study (β = -.03, SE = .01, p = .037). As
shown in Figure 2, observed engagement declined among teachers in the treatment group and
increased among teachers in the control group. There was also a trend-level positive effect of
ECED on the slope of rigor (β = .06, SE = .03, p = .097) such that teachers in the control group
showed a decline in observed rigor over two years, relative to the treatment group. See Figure 3.
No significant effect of the intervention was found on change in alignment.
[Figure: line graph of observed engagement (approximately 0.66 to 0.71) for the control and intervention groups across Time (Fall Wave 1 -- Spring Wave 4).]
Figure 2. Impact of Intervention on change in Observed Engagement.
[Figure: line graph of rigor (approximately 0.00 to 0.29) for the control and intervention groups across Time (Fall Year 1 -- Spring Year 2).]
Figure 3. Impact of Intervention on change in Rigor.
Results for ELA Teachers
Point-in-time analyses predicting ELA teacher attitudes and experiences. The same
point-in-time models were estimated for the ELA teachers as were estimated for math teachers.
As explained in the Method section (see p. 51), these comparisons are non-experimental because it was not possible to know which teachers (including additional ones hired for that purpose) would have taught ECED Literacy in the control schools had that course been offered; thus, there is not a set of teachers in the control group that is comparable to the one in the experimental group. Table 25 presents the point-in-time model testing impacts of ECED on perceived administrative support for instructional innovation and improvement (the fourth model, controlling for district, baseline, and teacher demographics). Again, teachers who taught a target ELA class (i.e., ECED Literacy and/or 9th- or 10th-grade English) during the first year are included in the Year 1 analyses.
The Year 2 analyses include all ELA study teachers, that is, all teachers who taught a target ELA
course at any point during the four waves. As with math, there were no differences on the
administrative support variable between the teachers in the treatment and control conditions
either at the end of Year 1 or the end of the intervention. Baseline score was the only significant
predictor. Additionally, terms teaching a target course and the interaction between terms and
condition (not tabled) were non-significant. When comparable models were estimated for each of
the individual scales, there were no significant between-group differences.
Table 25. Predicting support for instructional innovation and improvement (Second Order Factor) for ELA teachers

                          Year 1 (n teachers = 218,      Year 2 (n teachers = 298,
                          j schools = 20)                j schools = 20)
                          Estimate  SE     p             Estimate  SE     p
Treatment (0 = control)    0.02     0.10   0.845         -0.10     0.10   0.327
District 2                -0.16     0.13   0.232         -0.02     0.18   0.917
District 3                -0.20     0.16   0.214         -0.09     0.18   0.631
District 4                 0.20     0.18   0.278         -0.07     0.15   0.648
District 5                -0.26     0.13   0.071         -0.20     0.18   0.272
Baseline                   0.44     0.10   0.000          0.41     0.09   0.000
Gender (0 = male)          0.07     0.10   0.517         -0.01     0.09   0.947
Race/ethnicity             0.04     0.06   0.491          0.00     0.05   0.988
Years of teaching         -0.02     0.04   0.567         -0.01     0.03   0.812
Growth curve analyses predicting ELA teacher attitudes and experiences. The same
set of longitudinal analyses was conducted for ELA teachers as was presented above for math
teachers, using the four questionnaire outcomes for which we had four waves of data. No
impacts of the intervention were found on growth in the outcomes over two years. 64
VI. Impacts on Students
Data Analytic Strategy
Point-in-time impacts. As with teacher models, for each outcome, we estimated a series
of 2-level Hierarchical Linear Models (HLM 6.02; Raudenbush & Bryk, 2002) with fixed effects
to consider the impact of the ECED treatment on the outcomes at the end of Year 1 (Wave 2) and
the end of Year 2 (Wave 4) using the imputed data. (Note that there is a matrix showing
correlations among all student outcome variables in Appendix 14. Further, there is a matrix that
shows correlations among all outcomes—both student and teacher—aggregated to the school
level, in Appendix 15). Most of the student outcomes had high school-level reliability (see ICCs
in the Method Chapter), indicating good potential for detecting school-level effects. The only
exceptions were from the student questionnaire: student engagement in both years and perceived
competence in Year 1 had low school-level reliability. Thus, school-level effects would be
difficult to find because these reliabilities were below the cut-off considered acceptable;
however, they were not so low that detecting effects would be impossible. The models accounted
for nesting of students within schools. The Year 1 analyses included all students who were in the
9th-grade and were enrolled in a target school during the first year of the study (Wave 1 and/or
2). The Year 2 analyses include all students who were enrolled in target schools at any point in
the study and were in 9th-grade in the first year and/or 10th-grade in the second year.
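In generic two-level notation (an illustrative sketch of the structure just described, not the exact specification estimated in HLM 6.02), the point-in-time models can be written as

    Level 1 (student i in school j):  $Y_{ij} = \beta_{0j} + \sum_k \beta_{kj} X_{kij} + r_{ij}$

    Level 2 (school j):  $\beta_{0j} = \gamma_{00} + \gamma_{01}\,\mathrm{TREAT}_j + \sum_m \gamma_{0m}\,\mathrm{DISTRICT}_{mj} + u_{0j}$,  with  $\beta_{kj} = \gamma_{k0}$,

where the $X_{kij}$ are the student-level covariates added in Models 3 through 5, the student-level slopes are treated as fixed, and $\gamma_{01}$ captures the school-level intervention effect of interest.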
A series of six separate models were estimated for most outcomes. The first two models
included condition (treatment versus control) (Model 1) and condition plus four dummy codes
accounting for the five school districts (Model 2) at Level 2 (school). The next set of models
added covariates at Level 1. The third model added the student's baseline (Wave 1) score on the dependent variable, when available. Baseline was available for the survey outcomes and for math and ELA achievement, but not for GPA, credits earned, or attendance. The fourth model added
student baseline demographic covariates: gender, race/ethnic background, special education,
free/reduced price lunch, and receipt of English language learner (ELL) services. The fifth model
added a variable indicating the number of semesters the student was enrolled in a study school to
account for variation in students’ potential exposure to the intervention. In the case of math
achievement, an additional covariate was added to the fifth model indicating the type of math test
taken (e.g., Algebra 1, Geometry, etc.) to account for the fact that different districts administered
different tests and test level often depended on students’ course schedules. It is important to note
that the variables added in Model 5 are not exogenous, because they were measured during the
intervention and could have been affected by the intervention. Finally, the last model tested for
moderation effects of gender, race/ethnicity, baseline when available, and number of semesters
in study school by including cross-level interactions between these covariates and treatment
condition. See Table 26 for a list of variables included in each model.
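To make the specification concrete, the following is a minimal sketch of the Model 4 set-up in Python's statsmodels (the analyses themselves were run in HLM 6.02, and this sketch omits details such as robust standard errors; the file, variable, and column names are hypothetical):

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical flat analysis file: one row per student, with a school
    # identifier and school-level treatment and district indicators.
    df = pd.read_csv("student_outcomes.csv")

    # Grand-mean center the student-level covariates (Enders & Tofighi, 2007).
    # (Race/ethnicity dummies are omitted here for brevity.)
    for col in ["baseline", "female", "sped", "frl", "ell"]:
        df[col + "_c"] = df[col] - df[col].mean()

    # Model 4: treatment and district dummies at Level 2 (school); centered
    # baseline and demographics at Level 1 (student); random school intercept.
    model = smf.mixedlm(
        "outcome ~ treatment + district2 + district3 + district4 + district5"
        " + baseline_c + female_c + sped_c + frl_c + ell_c",
        data=df,
        groups=df["school_id"],
    )
    result = model.fit(reml=False)  # maximum likelihood estimation
    print(result.summary())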
Throughout this chapter, we will present tables of the findings from Model 4, because it
includes the demographic variables but does not include the potentially endogenous variables or
the interactions. In each table, the district variables compare each district to District 1. The
race/ethnicity variables compare each group to White students. Free/reduced price lunch is coded
so that 0 means that the student did not receive that service either year, and 1 means he or she
received it one or both years. Special education and English language learner services reflect the
students’ status at Year 1 (baseline) because those variables could theoretically be affected by the
intervention so we did not want to include the Year 2 measure as a control.
Table 26. Variables Included in Student Models

                                                               Model
Variable                                   Level          1  2  3  4  5  6
Condition                                  2 (school)     X  X  X  X  X  X
Four dummy codes for district              2 (school)        X  X  X  X  X
Baseline, if available                     1 (student)          X  X  X  X
Student demographics (gender, race/
  ethnicity, special education, free/
  reduced price lunch, ELL services)       1 (student)             X  X  X
Semesters enrolled in target school
  (+ type of math test for math
  achievement only)                        1 (student)                X  X
Moderation (gender, race/ethnicity,
  baseline, semesters X condition)         cross-level                   X
As with the teacher models, maximum likelihood parameter estimates with robust
standard errors were used to estimate the parameters. All covariates were grand-mean centered,
following guidelines by Enders and Tofighi (2007) for cluster randomized studies where a Level
2 treatment effect is of interest. In interpreting the results, we consider an alpha level of p < .05 as statistically significant, but given the nature of the design (which left only 14 degrees of freedom, and therefore relatively low power, for estimating the intervention effect), we note effects up to the .10 level, particularly in the case of interactions
(McClelland & Judd, 1993). Effect sizes were calculated by dividing the estimate of the
intervention effect by the raw standard deviation of the dependent variable for the control group
(a variant of Cohen’s d, attributed as Glass’s ∆; Cohen, 1992).
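In symbols, the effect size reported for a given outcome is

    $\Delta = \hat{\gamma}_{01} / SD_{\mathrm{control}}$,

where $\hat{\gamma}_{01}$ is the estimated intervention effect from the model and $SD_{\mathrm{control}}$ is the raw standard deviation of the dependent variable among control-group students (notation here follows the illustrative two-level sketch given earlier).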
Although cross-level interactions between treatment status and each of the baseline
covariates (i.e., gender, race, baseline value) were estimated for each outcome in the sixth model,
no consistent pattern emerged. Given the lack of power at the school level to estimate these
cross-level interactions and the lack of a consistent pattern, this report does not present the few
significant but idiosyncratic interactions that did emerge with baseline variables (See Appendix
16 for a table of all significant interactions). There was one cross-level interaction that did
emerge repeatedly, namely the interaction between treatment and number of semesters enrolled
in a study school. This is not a baseline characteristic and is therefore non-experimental. This
report does, however, present this one cross-level interaction when it was significant because it
seemed to represent an interpretable pattern.
Growth curve models. Growth curve analyses were subsequently conducted on the
survey outcomes in order to better understand the pattern of change in intervention impacts over
time. As noted in the Method Chapter, the measure of student achievement changed across
waves, so we are not able to estimate growth curve models for student achievement. And, growth
curve analyses were not conducted for the measures of performance because we did not have
baseline data for those variables and therefore only had two time points. Estimates of
intervention impacts on change in the student survey scales from baseline (Wave 1) to the end of
the study (Wave 4) were calculated using a series of three-level hierarchical linear growth
models in HLM. In these models, Level 1 represents time (i.e., the repeated assessments of the
outcomes of interest for each student), Level 2 represents the student, and Level 3 represents
schools. The same student-level covariates as included in the point-in-time models were included
at Level 2. Level 3 included an intervention dummy and four district dummies representing the
five districts in the study. In addition, we examined cross-level intervention by baseline covariate
interactions for the appropriate growth parameters. A series of unconditional models was first
estimated and compared to determine the most appropriate functional form for each of the
outcomes. These were an intercept only, intercept-slope, and intercept-slope-quadratic model.
Models were compared using the likelihood ratio test (Raudenbush & Bryk, 2002) based on
change in the deviance estimate between models generated in HLM and the number of
parameters in the model. 65 The best-fitting model was used to test program impacts. As with the
point-in-time analyses, cross-level interactions between treatment status and each of the baseline
covariates (i.e., gender, race, free or reduced price lunch, special education, ELL, baseline value)
were estimated for each outcome in the sixth model, but no consistent pattern emerged. For that
reason we are not presenting those interactions (see Appendix 16 for a table of all significant
interactions). We do, however, present cross-level interactions between treatment and terms
enrolled in the study school.
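In illustrative notation paralleling the earlier two-level sketch (again, not the exact HLM specification), the growth models have the form

    Level 1 (wave t):  $Y_{tij} = \pi_{0ij} + \pi_{1ij} a_{tij} + \pi_{2ij} a_{tij}^2 + e_{tij}$

    Level 2 (student i):  $\pi_{pij} = \beta_{p0j} + \sum_q \beta_{pqj} X_{qij} + r_{pij}$

    Level 3 (school j):  $\beta_{p0j} = \gamma_{p00} + \gamma_{p01}\,\mathrm{TREAT}_j + \sum_m \gamma_{p0m}\,\mathrm{DISTRICT}_{mj} + u_{p0j}$,

where $a_{tij}$ indexes the wave (centered at Wave 1) and the quadratic term $\pi_{2ij}$ is retained only when the unconditional model comparisons favor it.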
Variation in implementation analyses. In order to test the degree to which variation in
two years of ECED implementation affected intervention impacts on the main outcomes of
interest at the end of the study, follow-up analyses were conducted in which the
intervention/control dummy variable was replaced with the overall variation in implementation
indicator variable (see Chapter IV for a detailed description of this variable). Because variation
in implementation was not randomly assigned, these analyses are non-experimental and are
meant to complement the experimental results.
The analysis strategy was identical to the point-in-time student impact analyses
conducted using the intervention/control dummy variable. The same series of 2-Level
Hierarchical Linear Models with fixed effects were estimated. In these analyses, the models
tested the degree to which greater implementation of the ECED treatment was associated with
more positive student questionnaire responses, student achievement, and school performance at
the end of Year 2.
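In terms of the two-level sketch given earlier (again, illustrative notation), this amounts to replacing the school-level treatment dummy with the implementation score:

    $\beta_{0j} = \gamma_{00} + \gamma_{01}\,\mathrm{IMPL}_j + \sum_m \gamma_{0m}\,\mathrm{DISTRICT}_{mj} + u_{0j}$,

where $\mathrm{IMPL}_j$ is the overall variation in implementation indicator for school j.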
A second set of non-experimental follow-up analyses was conducted to test the degree to
which variation in implementation was related to student outcomes within ECED intervention
schools. A similar series of 2-Level fixed effects point-in-time models was estimated (Models 1
through 5) using the variation in implementation indicator at Level 2 in place of the treatment
indicator, but the analyses were conducted in the ten treatment schools only.
Point-in-Time Analyses Predicting Students’ Attitudes Toward School
The outcome for the first set of models was the students’ attitudes toward school, which
was the second order factor described in the Method chapter (Chapter III). That factor combines
the extent to which students see their teachers as being supportive, the students report being
engaged in school, and their perceived competence. Table 27 presents the findings from the
fourth model that controls district, baseline, and teacher demographics. The patterns of
significance were the same in each of the earlier models (Models 1 through 3). There was no
evidence that the treatment had an effect on students’ attitudes toward school. However, the
baseline response was a significant and fairly large predictor of responses at the end of the first
year and the end of the intervention. Girls reported more positive attitudes toward school at both
time points. Relative to White students, Black and Asian/Pacific Islander students reported more
positive attitudes toward school at the end of the second year. Free/reduced lunch, special
education, and ELL status were not significant predictors. In Model 5 (not tabled), the indicator
for number of semesters enrolled in a study school was not a significant predictor either.
Table 27. Predicting Students’ Attitudes Toward School (Second Order Factor)

                          Year 1 (n students = 7,354,    Year 2 (n students = 8,433,
                          j schools = 20)                j schools = 20)
                          Estimate  SE      p            Estimate  SE      p
Treatment (0 = control)    0.011    0.010   0.296        -0.013    0.017   0.464
District 2                -0.056    0.020   0.015        -0.055    0.035   0.145
District 3                -0.023    0.015   0.149        -0.023    0.030   0.472
District 4                -0.053    0.013   0.001        -0.045    0.027   0.118
District 5                -0.055    0.015   0.003        -0.049    0.030   0.120
Baseline                   0.626    0.016   0.000         0.567    0.015   0.000
Gender (0 = male)          0.040    0.008   0.000         0.046    0.006   0.000
Hispanic                  -0.006    0.010   0.554         0.009    0.011   0.421
Black                      0.008    0.010   0.393         0.031    0.011   0.007
Asian/Pacific Isl.         0.046    0.016   0.005         0.075    0.019   0.000
Amer. Indian/Other        -0.049    0.031   0.113        -0.006    0.023   0.789
Free/reduced lunch        -0.011    0.009   0.200        -0.007    0.010   0.458
Special education          0.026    0.019   0.181         0.029    0.018   0.102
ELL                        0.010    0.013   0.439         0.021    0.015   0.176
In the sixth model that added interactions, the number of semesters enrolled in a study
school moderated the effect of treatment on students’ attitudes toward school (β = -.02, SE = .01,
p = .066) such that there was a negative effect of treatment on attitudes toward school for
students who were enrolled more terms in treatment schools. Students who were enrolled the
most terms in control schools had the most positive attitudes toward school. See Figure 4. It is
important to note that this is not a causal moderation because the number of semesters enrolled in
a study school is not exogenous to treatment (i.e., student mobility might be affected by the
intervention).
[Figure: line graph of students' attitudes toward school (approximately 3.00 to 3.20) by number of semesters in study, for the control and intervention groups.]
Figure 4. Interaction of ECED intervention and number of semesters enrolled in a study school on Year 2 students’ attitudes toward school.
Point-in-Time Analyses Predicting Individual Student Survey Scales
Follow-up models were then estimated for each of the individual scales on the student
questionnaire: (1) positive teacher support/expectations, (2) lack of teacher support, (3)
engagement in school, (4) perceived competence, (5) relative autonomy index. The same series
of six models was fitted. In the fourth model without interactions, the effect of treatment was
non-significant both at the end of Year 1 and Year 2 for each of the scales. As with the second
order factor, the baseline score was always a significant and large predictor in the models
predicting these individual scales. Girls tended to report more positive experiences than boys.
Relative to White students, Black and Asian/Pacific Islander students tended to report more
positive attitudes toward school at the end of the second year. Student free/reduced lunch, special
education, and English language learner services were generally not significant predictors.
As with the second order factor, in the fifth model, number of semesters in a study school
was not associated with any of the scales. However, in the sixth models where the interactions
were added, the number of semesters in a study school moderated the effect of treatment on
perceived competence (β = -.02, SE = .01, p = .038) such that a negative effect of treatment on
perceived competence was found for students who were enrolled more terms in treatment
schools.
Point-in-Time Analyses Predicting Math and ELA Achievement
In Model 3 that includes treatment, district covariates, and baseline test score, a
significant effect of treatment was found for math achievement such that students in treatment
schools had higher math test scores than their counterparts in control schools. This finding was
marginal in Year 1 (β = .18, SE = .09, p = .056, E.S. = .18) and significant in Year 2 (β = .15, SE
= .07, p = .043, E.S. = .16). Treatment was not a significant predictor of ELA achievement. Table
28 presents the findings from the fourth model which adds student-level demographic covariates.
The effect of ECED treatment is marginal at both Year 1 and Year 2 in this model, indicating that adding the demographic covariates slightly diminished the association between treatment and test score; the effect sizes were small (.16 at Year 1 and .14 at Year 2). No significant
effects of treatment were found for ELA test scores in either year. Based on these analyses, we
concluded that there is some evidence that intervention was effective at increasing math scores,
but that ELA scores were unaffected by the treatment.
As would be expected, baseline test scores were a significant and fairly large predictor of
test scores in each year. Additionally, as seen in Table 28, when controlling for all these other
variables, several demographic covariates were significantly associated with achievement.
In Model 5, the number of semesters enrolled in a study school was not a significant
predictor of math or ELA achievement, nor was test type a significant predictor of math
achievement. The sixth model revealed no significant interactions between demographic
characteristics and ECED treatment condition, indicating the treatment was equally effective at
increasing math scores across all these groups and that ECED did not affect ELA scores in any
demographic subgroup.
Table 28. Predicting Math and ELA Achievement

Math achievement
                          Year 1                       Year 2
                          Estimate  SE      p          Estimate  SE      p
Treatment (0 = control)    0.162    0.081   0.066       0.130    0.063   0.058
Covariates
  District 2               0.127    0.124   0.323      -0.060    0.092   0.526
  District 3               0.184    0.132   0.184       0.188    0.093   0.062
  District 4               0.251    0.125   0.065       0.390    0.094   0.001
  District 5               0.207    0.130   0.134       0.334    0.100   0.005
  Baseline                 0.522    0.015   0.000       0.396    0.015   0.000
  Gender (0 = male)       -0.006    0.020   0.783      -0.071    0.021   0.002
  Hispanic                -0.144    0.040   0.001      -0.131    0.047   0.018
  Black                   -0.190    0.038   0.000      -0.241    0.052   0.001
  Asian/Pacific Isl.       0.017    0.050   0.736       0.051    0.066   0.458
  Amer. Indian/Other      -0.210    0.075   0.007      -0.107    0.081   0.201
  Free/reduced lunch      -0.012    0.028   0.661      -0.005    0.032   0.879
  Special education       -0.305    0.050   0.000      -0.221    0.055   0.001
  ELL                     -0.054    0.029   0.066      -0.025    0.027   0.355

ELA achievement
                          Year 1                       Year 2
                          Estimate  SE      p          Estimate  SE      p
Treatment (0 = control)    0.032    0.125   0.803       0.062    0.053   0.255
Covariates
  District 2               0.048    0.122   0.701      -0.015    0.084   0.862
  District 3               0.023    0.129   0.864      -0.137    0.084   0.127
  District 4              -0.058    0.123   0.646       0.011    0.085   0.896
  District 5               0.251    0.318   0.466      -0.148    0.107   0.188
  Baseline                 0.676    0.012   0.000       0.608    0.020   0.000
  Gender (0 = male)        0.022    0.019   0.257      -0.012    0.024   0.627
  Hispanic                -0.092    0.043   0.051      -0.172    0.051   0.011
  Black                   -0.147    0.050   0.018      -0.151    0.049   0.012
  Asian/Pacific Isl.      -0.002    0.052   0.972      -0.032    0.050   0.526
  Amer. Indian/Other      -0.075    0.077   0.341      -0.178    0.081   0.042
  Free/reduced lunch      -0.061    0.026   0.034      -0.099    0.026   0.000
  Special education       -0.290    0.052   0.000      -0.171    0.070   0.041
  ELL                     -0.082    0.028   0.006      -0.047    0.034   0.197
Point-in-Time Analyses Predicting Performance (GPA, Credits, and Attendance)
Table 29 presents the findings from the fourth models, with student-level demographic covariates, predicting grade point average (GPA), proportion of credits earned toward graduation,
and attendance. The findings indicate that the ECED intervention did not affect grade point
average (GPA) at the end of the second year of the study or the proportion of credits students
earned at the end of Year 1 or the end of Year 2. Similarly, there were no effects on student
attendance in either year.
Each of the student baseline covariates were significantly related to GPA and credits
earned. Additionally, the number of semesters enrolled in a study school, added as a control in
the Model 5, was a significant predictor of GPA, credits earned, and attendance in both years. In
Model 6, only one significant cross-level interaction was found. Students who were enrolled the
most semesters in a treatment school had the highest attendance rate (see Figure 5).
Again, because the number of semesters students enrolled in a study school is not exogenous to
treatment, this moderation effect is non-experimental.
Table 29. Predicting GPA, Credits Earned, and Attendance

GPA
                          Year 1                 Year 2
                          Est.   SE     p        Est.   SE     p
Treatment (0 = control)    0.15   0.13   0.28     0.11   0.10   0.31
Covariates
  District 2               0.12   0.20   0.56     0.20   0.16   0.24
  District 3              -0.02   0.20   0.92     0.04   0.17   0.82
  District 4               0.17   0.23   0.47     0.24   0.16   0.16
  District 5               0.15   0.20   0.47     0.24   0.16   0.16
  Gender (0 = male)        0.28   0.02   0.00     0.32   0.02   0.00
  Hispanic                -0.14   0.05   0.01    -0.20   0.04   0.00
  Black                   -0.24   0.04   0.00    -0.27   0.04   0.00
  Asian/Pacific Isl.       0.34   0.06   0.00     0.37   0.05   0.00
  Amer. Indian/Other      -0.40   0.09   0.00    -0.41   0.08   0.00
  Free/reduced lunch      -0.04   0.03   0.18    -0.16   0.03   0.00
  Special education       -0.23   0.07   0.01    -0.26   0.05   0.00
  ELL                     -0.02   0.03   0.51    -0.09   0.03   0.00

Credits Earned
                          Year 1                 Year 2
                          Est.   SE     p        Est.   SE     p
Treatment (0 = control)    0.13   0.08   0.16     0.07   0.07   0.35
Covariates
  District 2              -0.86   0.13   0.00     0.16   0.11   0.15
  District 3               0.26   0.13   0.07     0.31   0.11   0.01
  District 4              -0.29   0.13   0.05    -0.28   0.11   0.03
  District 5               0.12   0.13   0.39     0.20   0.11   0.09
  Gender (0 = male)        0.07   0.01   0.00     0.06   0.01   0.00
  Hispanic                -0.06   0.02   0.00    -0.01   0.02   0.63
  Black                   -0.08   0.02   0.00    -0.05   0.02   0.01
  Asian/Pacific Isl.       0.05   0.02   0.05     0.14   0.03   0.00
  Amer. Indian/Other      -0.09   0.04   0.01    -0.08   0.04   0.05
  Free/reduced lunch       0.04   0.01   0.00    -0.10   0.02   0.00
  Special education       -0.09   0.02   0.00    -0.04   0.02   0.08
  ELL                      0.00   0.01   0.80     0.03   0.02   0.03

Attendance
                          Year 1                 Year 2
                          Est.   SE     p        Est.   SE     p
Treatment (0 = control)    0.23   0.14   0.12     0.14   0.15   0.38
Covariates
  District 2               0.01   0.22   0.97    -0.02   0.21   0.92
  District 3               0.01   0.23   0.97     0.06   0.21   0.77
  District 4               0.02   0.29   0.95     0.03   0.26   0.92
  District 5              -0.10   0.22   0.65    -0.02   0.21   0.93
  Gender (0 = male)        0.04   0.02   0.14    -0.03   0.03   0.28
  Hispanic                 0.06   0.05   0.24     0.11   0.05   0.05
  Black                    0.09   0.05   0.05     0.06   0.05   0.19
  Asian/Pacific Isl.       0.40   0.07   0.00     0.36   0.07   0.00
  Amer. Indian/Other      -0.07   0.08   0.40    -0.35   0.09   0.00
  Free/reduced lunch       0.08   0.03   0.02    -0.07   0.06   0.23
  Special education       -0.05   0.06   0.37    -0.01   0.08   0.95
  ELL                      0.09   0.03   0.01     0.00   0.04   0.96
[Figure: line graph of Year 2 attendance by number of semesters in study, for the control and intervention groups.]
Figure 5. Interaction of ECED intervention and number of semesters enrolled in a study school on Year 2 attendance.
Growth Curves Predicting Students’ Attitudes Toward School
Growth curve analyses were subsequently conducted on student survey outcomes in order
to better understand the pattern of change over time in students’ attitudes toward school.
Unconditional models. Before testing the effects of the ECED intervention on growth in
student attitudes, three unconditional models without any covariates were compared for each
outcome to determine the appropriate functional form of the growth curve. For each outcome, the
three models compared were: an intercept only model, an intercept-slope only model, and an
intercept-slope-quadratic model. Once the appropriate form of the growth curve was determined,
the appropriate model was then used to test program impacts over time. With respect to the
student survey outcomes, the intercept-slope model was a significantly better fit than the
intercept-only model in all cases, indicating that these outcomes showed meaningful growth across the four time points. Further, the intercept-slope-quadratic model was a significantly better fit than the intercept-slope model, and the quadratic parameter itself was significant,
indicating curvilinear rather than linear change across time.
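The comparisons just described are deviance-based likelihood ratio tests between nested models. The arithmetic can be sketched as follows (the deviance values and parameter count below are placeholders for illustration, not output from these analyses):

    from scipy.stats import chi2

    # Deviances from two nested growth models estimated by maximum likelihood
    # (placeholder values; an intercept-slope model versus a quadratic model).
    deviance_reduced = 15210.4
    deviance_full = 15142.9
    extra_parameters = 4  # assumed difference in estimated parameters

    lr_statistic = deviance_reduced - deviance_full
    p_value = chi2.sf(lr_statistic, df=extra_parameters)
    print(f"chi-square({extra_parameters}) = {lr_statistic:.1f}, p = {p_value:.4f}")

A significant chi-square favors retaining the more complex (here, quadratic) model.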
Given the results from the unconditional models, intervention effects on the student
survey outcomes were estimated for the intercept, slope, and quadratic parameters. The intercept
was centered at Wave 1 and the effect of the intervention on the intercept represents
intervention-control differences at baseline. The estimate on the slope parameter represents
intervention-control differences in linear change in the outcome across the four waves. The
estimate on the quadratic parameter represents intervention-control differences in rate of
acceleration or deceleration of the trajectory over the waves of data. Intervention by covariate
interactions were estimated for the quadratic parameter. Results are reported in Table 30 and
summarized below. Figures depicting significant impacts on student growth show smooth curves
that reflect the underlying mathematical function rather than the observed data.
Impacts on growth in students’ attitudes toward school. A non-significant treatment
effect on the intercept parameter indicates that there was no intervention-control difference at
baseline for students’ attitudes toward school, as expected given random assignment (see the first
section of Table 30 labeled “Intercept”). Looking at the second section on Table 30 (labeled
“Slope”), there was a trend-level positive treatment effect on the slope parameter. Looking at the
third section on Table 30 (labeled “Quadratic”), there was a significant negative treatment effect
on the quadratic parameter. As shown in Figure 6, the ECED treatment group declined slightly
less in the first year of the study, and continued to decline in the second year, while the control
group leveled out in the second year.
As shown in Table 30, there were effects of some baseline demographic covariates on the
intercept such that girls reported more positive attitudes toward school at baseline, as did
Asian/Pacific-Islander and Black (relative to White) students. American-Indian and Hispanic
(relative to White) students reported less positive attitudes toward school.
In addition, number of semesters in the study, added in Model 5, was significantly
associated with the intercept and the slope. Students who were enrolled for more semesters in a
study school reported more positive attitudes toward school at baseline compared to students
who were enrolled fewer semesters in a study school. While a significant negative slope
indicated that this difference got smaller over time, a positive trend-level association between
number of semesters in study and the quadratic term indicated that the rate of decline in attitudes
toward school was slower for students spending more time in study schools.
Model 6 indicated a significant cross-level interaction between number of semesters
enrolled in a study school and treatment (β = -.002, SE = .001, p < .001), suggesting that while all
students experienced an overall decline in attitudes toward school over two years, students who
were enrolled the most semesters in control schools experienced a positive shift in attitudes
toward school in the second year (see Figure 7).
Table 30. Predicting Growth in Student Attitudes Toward School

                            Est.    SE     p
Intercept
  Treatment (0 = control)    0.01   0.01   0.35
  Child covariates
    Gender (0 = male)        0.06   0.01   0.00
    Hispanic                -0.04   0.01   0.01
    Black                    0.03   0.01   0.02
    Asian/Pacific Isl.       0.02   0.02   0.20
    Amer. Indian/Other      -0.06   0.03   0.04
    Free/reduced lunch      -0.01   0.01   0.21
    Special education       -0.03   0.02   0.12
    ELL                      0.02   0.01   0.10
Slope
  Treatment (0 = control)    0.01   0.01   0.08
  Child covariates
    Gender (0 = male)        0.02   0.01   0.01
    Hispanic                -0.01   0.01   0.67
    Black                   -0.01   0.02   0.63
    Asian/Pacific Isl.       0.02   0.02   0.35
    Amer. Indian/Other      -0.06   0.03   0.08
    Free/reduced lunch      -0.01   0.01   0.57
    Special education        0.03   0.02   0.21
    ELL                      0.00   0.01   0.98
Quadratic
  Treatment (0 = control)   -0.01   0.00   0.01
  Child covariates
    Gender (0 = male)       -0.01   0.00   0.04
    Hispanic                 0.00   0.00   0.30
    Black                    0.00   0.01   0.37
    Asian/Pacific Isl.       0.00   0.01   0.74
    Amer. Indian/Other       0.02   0.01   0.06
    Free/reduced lunch       0.00   0.00   0.56
    Special education       -0.01   0.01   0.45
    ELL                      0.00   0.00   0.76
[Figure: line graph of students' attitudes toward school (approximately 3.01 to 3.20) for the control and intervention groups across Time (Fall of Year 1 to Spring of Year 2).]
Figure 6. Impact of intervention on students’ Attitudes Toward School.
Figure 7. Cross-level interaction between number of semesters in the study and treatment
status on change in students’ Attitudes Toward School.
Associations Between ECED Implementation and Student Outcomes Across All Study
Schools
Student attitudes toward school. Findings from the student models suggested that
variation in implementation did not consistently predict students’ attitudes toward school as
reported in the student questionnaires. Table 31 presents the findings predicting students’
attitudes toward school—the second order factor—at the end of Year 2, controlling for district,
baseline, and student demographics (Model 4). The variation in implementation estimate was
non-significant. Follow-up models for the individual student scales (i.e., positive teacher
support/expectations, lack of teacher support, self-report of engagement in school, perceived
competence, relative autonomy index) indicated no associations between variation in
implementation and the outcomes.
In each of these models, the addition of the indicator variable for number of semesters
students were enrolled in a study school did not alter the findings, nor was the estimate of this
indicator significant in any of the models (Model 5). However, number of semesters enrolled did
moderate the association between variation in implementation and students’ attitudes toward
school, teacher support, lack of teacher support, and perceived competence (Model 6). A
negative effect of being in schools with higher ECED implementation was found for students
who were enrolled more terms in treatment schools. Students who spent the most time in schools
with the least implementation had the most positive reports of their schools. In addition, baseline
levels of the outcome moderated the relationship between variation in implementation and
students’ attitudes toward school. The slope of the relationship between baseline and students’
attitudes toward school was steeper for students in low implementation schools, indicating a
stronger positive relationship.
Table 31. Association Between Variation in Implementation and Students’ Attitudes Toward School (Second Order Factor) at the End of Year 2

                          Estimate  SE     p
Overall implementation     0.00     0.00   0.50
District 2                -0.05     0.03   0.06
District 3                -0.02     0.03   0.45
District 4                -0.04     0.03   0.11
District 5                -0.05     0.03   0.10
Baseline                   0.57     0.01   0.00
Gender (0 = male)          0.05     0.01   0.00
Hispanic                  -0.01     0.02   0.81
Black                      0.08     0.02   0.00
Amer. Indian/Other         0.01     0.01   0.46
Asian/Pacific Isl.         0.03     0.01   0.02
Free/reduced lunch        -0.01     0.01   0.51
Special education          0.03     0.02   0.07
ELL                        0.02     0.01   0.10
Student achievement. The inclusion of the variation in implementation variable
produced similar results to the models testing treatment/control differences in predicting
students’ math and ELA test scores at the end of Year 2. Students in schools that implemented
ECED to a greater extent had marginally higher math achievement at the end of Year 2,
controlling for pre-intervention math scores and baseline demographic covariates (see Table 32).
However, variation in implementation was not significantly related to students’ ELA scores at
the end of the study. The addition of the math test type did not change the math finding, nor was
this a significant predictor of achievement. The addition of the indicator variable for number of
semesters students spent in the study did not alter the findings, nor was the estimate of this
indicator significant in either model.
Table 32. Associations Between Variation in Implementation and Year 2 Math and ELA Achievement

                          Math achievement        ELA achievement
                          Estimate  SE    p       Estimate  SE    p
Overall implementation     0.00     0.00  0.07     0.00     0.00  0.42
Covariates
  District 2              -0.08     0.09  0.42    -0.02     0.09  0.80
  District 3               0.17     0.09  0.09    -0.14     0.09  0.12
  District 4               0.36     0.10  0.00     0.00     0.09  0.98
  District 5               0.34     0.10  0.01    -0.15     0.11  0.20
  Baseline                 0.40     0.01  0.00     0.61     0.02  0.00
  Gender (0 = male)       -0.07     0.02  0.00    -0.01     0.02  0.63
  Hispanic                -0.13     0.05  0.02    -0.17     0.05  0.01
  Black                   -0.24     0.05  0.00    -0.15     0.05  0.01
  Asian/Pacific Isl.       0.05     0.07  0.46    -0.03     0.05  0.52
  Amer. Indian/Other      -0.11     0.08  0.20    -0.18     0.08  0.04
  Free/reduced lunch       0.00     0.03  0.88    -0.10     0.03  0.00
  Special education       -0.22     0.06  0.00    -0.17     0.07  0.04
  ELL                     -0.03     0.03  0.35    -0.05     0.03  0.20
Note: The estimates and SEs of 0.00 are due to rounding.
Student performance and commitment. Variation in implementation was not
significantly related to students’ GPA, credits earned, or attendance at the end of the study (see
Table 33 for the Model 4 results). The addition of the math test type did not change the findings,
nor was this a significant predictor of performance or attendance. The addition of the indicator
variable for number of semesters the student was enrolled in a study school did not alter the
findings, but this indicator was significant and positive for all three outcomes. Further, the
number of semesters enrolled moderated the association between variation in implementation
and attendance (β = .001, SE = .00, p = .003), such that the slope of the relationship between
semesters and attendance was steeper for students in high implementation schools. Students who
were enrolled the most terms in schools with the highest implementation had the highest
attendance.
Table 33. Associations Between Variation in Implementation and Year 2 Performance

                          GPA                   Credits Earned        Attendance
                          Est.   SE    p        Est.   SE    p        Est.   SE    p
Overall implementation     0.00  0.00  0.11      0.00  0.00  0.11      0.00  0.00  0.31
Covariates
  District 2               0.18  0.16  0.28      0.15  0.10  0.18     -0.04  0.21  0.83
  District 3               0.01  0.16  0.93      0.30  0.10  0.01      0.04  0.21  0.85
  District 4               0.20  0.15  0.21     -0.30  0.11  0.02     -0.01  0.25  0.98
  District 5               0.25  0.16  0.14      0.20  0.10  0.08     -0.01  0.21  0.96
  Gender (0 = male)        0.32  0.02  0.00      0.06  0.01  0.00     -0.03  0.03  0.28
  Hispanic                -0.20  0.04  0.00     -0.01  0.02  0.63      0.11  0.05  0.05
  Black                   -0.27  0.04  0.00     -0.05  0.02  0.01      0.06  0.05  0.19
  Asian/Pacific Isl.       0.37  0.05  0.00      0.14  0.03  0.00      0.36  0.07  0.00
  Amer. Indian/Other      -0.41  0.08  0.00     -0.08  0.04  0.05     -0.35  0.09  0.00
  Free/reduced lunch      -0.16  0.03  0.00     -0.10  0.02  0.00     -0.07  0.06  0.23
  Special education       -0.26  0.05  0.00     -0.04  0.02  0.08      0.00  0.08  0.96
  ELL                     -0.09  0.03  0.00      0.03  0.02  0.03      0.00  0.04  0.96
Note: The estimates and SEs of 0.00 are due to rounding.
Associations Between ECED Implementation and Student Outcomes in Intervention Study
Schools
Student survey outcomes. Variation in implementation did not consistently predict
students’ attitudes toward school within ECED intervention schools. Table 34 presents the
findings predicting students’ attitudes toward school—the second order factor—at the end of
Year 2, controlling for district, baseline, and student demographics (Model 4). The variation in
implementation estimate was non-significant. Follow-up models were then estimated for the associations between variation in implementation within intervention schools and each of the individual student scales. Only lack of teacher support showed a positive association with implementation (β = .005, SE = .002, p = .058), indicating that greater implementation was associated with more teacher support at the end of Year 2. The addition of
the indicator for number of semesters in a study school did not change the findings, nor was the
estimate significant.
Table 34. Associations Between Variation in Implementation in Intervention Schools and Students’ Attitudes Toward School (Second Order Factor) at the End of Year 2

                          Est.    SE     p
Overall implementation     0.00   0.00   0.26
Covariates
  District 2              -0.11   0.05   0.07
  District 3              -0.07   0.05   0.20
  District 4              -0.10   0.05   0.10
  District 5              -0.03   0.04   0.52
  Baseline                 0.55   0.01   0.00
  Gender (0 = male)        0.05   0.01   0.00
  Hispanic                -0.01   0.02   0.66
  Black                    0.03   0.02   0.10
  Asian/Pacific Isl.       0.01   0.04   0.83
  Amer. Indian/Other       0.09   0.02   0.00
  Free/reduced lunch      -0.01   0.01   0.35
  Special education        0.03   0.02   0.19
  ELL                      0.04   0.02   0.02
Student achievement. Within the intervention group, variation in implementation was
not significantly related to math or ELA achievement at the end of the study (see Table 35). The
addition of the math test type and number of semesters in the study did not make a difference,
nor were the estimates of these indicators significant.
Table 35. Associations Between Variation in Implementation Within Intervention Schools and Year 2 Math and ELA Achievement

                          Math achievement        ELA achievement
                          Est.   SE    p          Est.   SE    p
Overall implementation     0.00  0.00  0.63       -0.01  0.00  0.11
Covariates
  District 2              -0.27  0.15  0.15        0.07  0.06  0.30
  District 3               0.16  0.16  0.39        0.17  0.06  0.05
  District 4               0.38  0.17  0.09       -0.08  0.11  0.51
  District 5               0.33  0.14  0.08       -0.01  0.00  0.11
  Baseline                 0.41  0.02  0.00        0.62  0.04  0.00
  Gender (0 = male)       -0.06  0.04  0.10       -0.03  0.03  0.34
  Hispanic                -0.15  0.06  0.01       -0.18  0.06  0.01
  Black                   -0.26  0.06  0.00       -0.15  0.05  0.01
  Asian/Pacific Isl.       0.05  0.09  0.57       -0.03  0.06  0.60
  Amer. Indian/Other      -0.17  0.09  0.06       -0.21  0.14  0.14
  Free/reduced lunch      -0.01  0.04  0.87       -0.12  0.03  0.00
  Special education       -0.13  0.06  0.04       -0.09  0.13  0.48
  ELL                      0.00  0.04  0.94       -0.04  0.04  0.34
Student performance. Variation in implementation within intervention schools was
significantly associated with Year 2 GPA and credits earned, such that students in schools with
greater implementation had higher GPAs and earned more credits toward graduation than
students in schools with lower implementation, controlling for district and student demographics
(see Table 36). Variation in implementation was not significantly related to student attendance at
the end of Year 2.
Table 36. Associations Between Variation in Implementation Among Intervention Schools and Year 2 Performance

                          GPA                   Credits Earned        Attendance
                          Est.   SE    p        Est.   SE    p        Est.   SE    p
Overall implementation     0.02  0.01  0.04      0.02  0.00  0.00      0.01  0.01  0.32
Covariates
  District 2              -0.01  0.19  0.96     -0.11  0.08  0.24     -0.04  0.24  0.88
  District 3              -0.13  0.20  0.56      0.15  0.08  0.14      0.23  0.25  0.41
  District 4              -0.19  0.20  0.41     -0.52  0.08  0.00     -0.25  0.37  0.54
  District 5               0.39  0.17  0.07      0.31  0.07  0.01      0.18  0.20  0.41
  Gender (0 = male)        0.34  0.03  0.00      0.09  0.02  0.00     -0.07  0.04  0.13
  Hispanic                -0.27  0.05  0.00      0.02  0.03  0.56      0.06  0.06  0.35
  Black                   -0.31  0.05  0.00     -0.05  0.03  0.05     -0.02  0.06  0.71
  Asian/Pacific Isl.       0.36  0.07  0.00      0.18  0.04  0.00      0.35  0.09  0.00
  Amer. Indian/Other      -0.50  0.11  0.00     -0.04  0.05  0.40     -0.51  0.13  0.00
  Free/reduced lunch      -0.19  0.04  0.00     -0.13  0.02  0.00     -0.09  0.07  0.18
  Special education       -0.22  0.07  0.00     -0.01  0.03  0.71      0.04  0.10  0.71
  ELL                     -0.12  0.04  0.01     -0.01  0.02  0.81     -0.03  0.06  0.63
VII. Implementation and Data Collection Challenges
We encountered numerous problems and setbacks with this grant. Indeed, as described in
Appendix 1, the grant was awarded to do an effectiveness trial of First Things First, but the
funding to do the FTF intervention did not materialize and the four districts that had agreed to
participate all withdrew because of changes at their top administrative levels. Thus, we could not
evaluate FTF with an effectiveness trial, as we had originally proposed. We therefore reformulated
the project after discussions with NCER to focus on an efficacy trial of ECED, which is the
instructional improvement component of FTF, done without FTF’s structural changes.
In carrying out ECED, we also encountered challenges, both in implementing the ECED
supports and in collecting the needed evaluation data. This section outlines the types of
difficulties experienced and concludes by discussing their implications for the impact evaluation
of ECED.
Challenges in Recruiting Schools to Participate
Recruiting schools to take part in the ECED Efficacy Trial proved more difficult than
anticipated. In the years just prior to recruiting for the ECED Efficacy Trial, IRRE experienced
rapid growth and found that there were more schools and districts wanting their support than
they could serve. At that time, No Child Left Behind (NCLB) was mandating large-scale or
whole-school reforms for all of the nation’s low-performing schools, making districts eager to
participate in this type of reform effort. Thus, we anticipated that many schools would be
interested in participating in this project. Unfortunately, by the time recruitment for ECED began
in 2008-09, most low-performing schools had already started one or more reform efforts. Many
had instituted 9th-grade academies, some type of classroom walkthroughs by administrators and
instructional leaders, and/or special additional classes for struggling students such as Read 180.
Many had moved to a block schedule and were creating communities of practice or instituting
mentoring systems for teachers. Those types of reforms were all compatible with ECED, but they
were generally just taking hold in the districts, making them hesitant to embark on another
reform that might distract from their recent efforts.
Additionally, because NCLB was mandating reforms in many schools, district
administrators were being bombarded by individuals wanting to sell them reform packages,
professional development models, and new curricula for struggling students. The ECED supports
were being offered free of charge, but it was often difficult to get the attention of the district
leaders because they were wary of individuals trying to sell them supports.
Most districts that expressed initial interest remained interested as they learned the
details. However, the addition of a full required course for all 9th- and 10th-graders (i.e., ECED
Literacy) did prevent some districts and schools within districts from electing to participate.
Student schedules—especially the schedules of the college-bound students—are often quite full
and in many cases there was not space for an additional requirement.
We accepted for participation all schools/districts that were interested following the site visit. If we had had more and better-prepared schools/districts to choose from, we would most likely
have rejected some because of the magnitude of challenges we knew the schools would face in
implementing the reform and fulfilling the research requirements. For instance, as noted earlier,
one district does not regularly administer ELA tests to 10th-grade students. Given that that was a
main outcome of this study, we might have excluded them if we had more interested districts to
choose from that did administer this assessment. That same district included two schools that did
not exist at the time of recruitment, making it impossible to meet with their leadership and ensure
buy-in, but only two other schools within that district agreed to participate, and we needed four.
Another district had agreed to participate in Recruitment Group 1 but then several members of
the district leadership team, including the superintendent, left their positions before the
intervention began. In order to keep them in the study we made an agreement with the interim
leadership for them to begin participating in the project one year later, as part of Recruitment
Group 2. Additionally, it was clear that that district did not have a good data system so it would
be very difficult to get the school/district records data we needed. Still another district that we
included was in turmoil when we were negotiating to begin the project there, and indeed the
superintendent left the district early in the first term of ECED implementation. The new district
administration had not been part of the recruitment phase and demonstrated little support for the
project. In the end, we included all schools/districts that were interested in participating if they
signed agreements to fulfill their implementation and research commitments. This led to some
major implementation challenges (outlined below), but does provide us with a very rigorous test
of the model because we did not limit the study to schools that we were highly confident could
implement with fidelity.
Challenges in Implementing ECED
Once schools and districts agreed to participate, IRRE encountered problems in fully
implementing the model. High turnover in district and school leadership was one of the largest
problems that limited implementation. As noted in Chapter III, four of the five superintendents
were replaced between the time we recruited the district to participate and the end of the second
year of implementation. Two of these four were forced out due to political and legal difficulties
in the district. The fifth superintendent—the only one that stayed in place throughout the
project—announced his resignation as the second year was coming to a close, after experiencing
serious health problems during the second year. 66 Further, in all five districts, the individual who
had been most instrumental in coordinating and engaging with ECED at the beginning of the
project—often the director of high schools or assistant superintendent of curriculum—was either
reassigned to another position within the district or left the district entirely during the course of
the project.
At the principal level, 11 out of the 20 participating schools (six treatment, five control)
experienced one or more changes in principals during the course of the study. 67 To be more
specific, six principals (two treatment and four control) were brand new as their school began
participation, meaning they had not participated in the recruitment efforts and were unfamiliar
with the project when it started. At one of the treatment schools there were three different
principals during Year 1 and a fourth principal was assigned as the second year began. That
school stopped participating in ECED after Year 1. Five schools (four treatment, one control)
had a change in principal between the end of the first year and the start of the second year. One
treatment school changed principals during the second year of the study.
Additionally, four schools learned that there would be major changes to their schools as
the second year came to a close, causing significant disruption in two of the schools. One of the
control schools learned late in the second year that the principal and all other administrators were
being replaced due to low test scores. At another control school that had fairly stable leadership
during the study, the staff learned right at the end of the second year that their school would be
closed entirely when the school year ended. Two treatment schools learned that their principals
were leaving as the second year ended, but those changes did not appear to cause major
disruption.
Further, in three districts (12 schools) the district finances and union contracts led to all
teachers being “laid off” or “furloughed” in the spring of each year. Most teachers were re-hired
prior to the start of the next school year with no actual disruption in their employment, but the
threat of not having a job or being reassigned made the day-to-day working conditions of all
teachers very stressful for a significant portion of each year. Not surprisingly, this high level of
leadership turnover and upheaval severely limited some districts’ abilities to focus on
instructional improvement or to be open to changing long-standing practices that required
considerable effort. This instability thus caused some school-level personnel to question whether ECED
remained a high priority within the administration.
As explained elsewhere in this report, two treatment schools left the project prior to its
completion. One left after the first semester of participation. In that district, the superintendent
had been the individual most engaged with ECED and had encouraged the schools to take part.
He was dismissed by the school board after only 14 months of leadership, in the midst of a
highly contentious battle with the teachers’ union. Although the principal at the school that
withdrew had expressed enthusiasm for ECED when the school agreed to participate, his support
quickly faded when the superintendent was dismissed. Due to a hiring freeze, the school did not
have a literacy coach until about one month into the school year, meaning that she had not
participated in the summer training activities. Additionally, the math coaching duties were given
to the math chair. He was given some reduction in his teaching load in order to take on this
additional role, but he never fully supported the ECED Math and was openly hostile toward the
IRRE consultants working with the school. In sum, this school never embraced ECED and
stopped participating after a single semester of weak implementation. Nonetheless, because it
was randomly selected into the treatment condition, it is included in the intent-to-treat analyses
and we were able to gather some Wave 4 data there.
The second school that stopped participation had three different principals during the year
it participated. Additionally, an assistant principal who had been one of ECED’s biggest
supporters died during the first year. A fourth principal was assigned to that school at the start of
the second year. He was entirely unfamiliar with ECED, and the teachers were against ongoing
participation because they found it quite demanding. Thus, as with the first school that stopped
participation, this school had implemented weakly in the first year and stopped participating altogether in the second year. It is, however, included in the intent-to-treat analyses.
As noted earlier in the discussion on recruitment, the addition of an entire course to all
9th-and 10th-graders’ schedules proved challenging. This was especially true for the two districts
in California. The course requirements for entrance into California’s public universities dictate
the student schedules for college-bound students, leaving little-to-no room for an additional
course. One district 68 resolved this by offering only one-half of the ECED Literacy curriculum
and one-half of the regular 9th- and 10th-grade English curriculum. 69 This decision was made in
conjunction with IRRE, after the MOU was signed. In the other district, one school 70 simply did
not enroll students whose schedules were already full. It made this decision without consulting
IRRE. The other treatment-group school in that district did enroll all 9th- and 10th-graders in the
first year, but in the second year only enrolled those who had not passed the state’s standardized
test in ELA the previous year. Additionally, the 9th- and 10th-grade English and ECED Literacy classes became a single, year-long block course in which teachers were supposed to cover the English curriculum on certain days and the ECED Literacy course on other days. This should have been enough time to cover the full curricula for both courses; however, it was clear from informal conversations with the staff that the regular English curriculum was given priority over the ECED Literacy material. These decisions were made in consultation with IRRE and the research team, but the schools indicated that they would cease participating in the project if these accommodations were not made.
As noted earlier, the implementation of the ECED Math was fairly successful at the eight
schools that participated for two years and almost all used the mastery grading system. However,
the mastery grading system was challenging for most schools, both because it was difficult to
explain to students and parents and because teachers, students, and parents believed that it
resulted in a higher rate of failure. Data received from the schools does seem to bear this out. 71
Excluding the two schools that just stopped participating, students in treatment schools received
more Fs and Is 72 in Algebra 1 and Geometry at the end of every term than students in control
schools. The largest difference was in Wave 4, Algebra 1 when 48% of students in treatment
schools did not pass (i.e., received an F or I) while only 26% of students in control schools did
not pass (t = -16.19, p < .001).
A final implementation challenge had to do with teacher turnover and the reassigning of
teachers. Ideally, the same individuals would have taught the target classes throughout the length
of the project, meaning that they would have participated in the initial summer institute where
the ECED philosophy and strategies are explained in-depth and would have received IRRE’s
support for a full two years. Of course, we knew in advance there would be teacher turnover as a
result of teachers leaving the schools, but the rate of teachers leaving was much higher than we
had anticipated (see Table 1, p. 53). Further, we had not anticipated that schools would change
teachers’ assignments as often as they did, including during a school year, resulting in a low
number of teachers teaching a target class in all four waves. For instance, one school that was on
a block schedule assigned half of their students to take English in the fall and ECED Literacy in
the spring and the other half to take ECED Literacy in the fall and English in the spring. The
teachers followed this same pattern. So, when the spring term started, none of the teachers had
ever taught ECED before and none had attended the summer professional development sessions
designed to introduce them to the ECED curriculum. As noted in the Method Chapter, only about
one-third of the study teachers taught a target course during all four waves. Because the project
was intended to work with the same group of teachers for two full years, teachers leaving the
schools and teachers changing assignment led to weakened implementation and extensive
missing data.
Data Collection Challenges
IRRE has a proprietary, on-line system for collecting questionnaire responses. As
described in Appendix 7, schools were given ‘tickets’ for each 9th- and 10th-grade student to use
in taking the on-line survey. The tickets included an access code that IRRE could use to link the
students’ responses to their study identification numbers. Schools decided which teachers and
classes to use to administer the surveys and had to schedule time for each teacher who was
administering surveys to spend time in a computer lab. The research project director worked
closely with each school to facilitate this process, but each school was required to have an
individual who was tasked with making sure that the surveys ran smoothly and that each student
had an opportunity to participate. The success of this system for administering surveys depended
largely on the commitment of the individual given this responsibility. Some worked hard to
ensure that each student was given an opportunity to participate; others simply placed the tickets
in the teachers’ boxes and provided no follow-up to ensure that the surveys were completed.
Thus, as seen in the Method Chapter, survey response rates were typically between about two-thirds and three-quarters of students who were enrolled and eligible.
Much of the key data for this project—test scores, student course schedules, student
demographics—was housed in district databases. As a condition of participation, each district
promised to provide the needed data to the evaluation research team. In the end, each district did
provide most of the needed data, but receiving it took much longer than anticipated. In the most
extreme case, one district that participated in the second recruitment group (2010-11 and 2011-12) did not provide the needed records data for either year until October 2012. This was roughly
15 months after the project was scheduled to receive the 2010-11 data and three months after it
was scheduled to receive the 2011-12 data. Often the delays were caused by staffing shortages.
One district did not have a research staff at all. The individual who maintained their student
records had very limited knowledge of how to extract information from their database. Several
districts experienced significant staffing shortages and turnover within the research department
during the project. And, several districts adopted new student records data systems during the
project, meaning that they were unfamiliar with the new systems as they were trying to extract
information.
In addition to delays, the records data received from the district often contained
inconsistencies and missing information. For example, one district operated only high schools
and all students came from “feeder” primary districts. That district did not routinely obtain 7th- or
8th-grade test scores and had to request them as part of this project. A large proportion of the
target students were missing from those files. Additionally, districts often sent separate files
containing course-period and course-grade information. Those two files would not always agree
about which courses a student had taken and would occasionally indicate that a student had taken
a course from a teacher who was not listed as teaching that course according to other information
provided by the district. This type of discrepancy required extensive hand cleaning on the part of
the research project director. In the end, all districts provided data and only a few data points
were missing entirely within a district. However, no district could provide complete data on all
students in the study and most had large gaps. School records data about students who left the
district during the course of the year, which are needed for intent-to-treat analyses, proved
especially difficult to obtain and often had to be imputed. This was true even for data regarding
the portion of the year that they were present and for variables that would have been meaningful
(e.g., course enrollments, attendance, free/reduced lunch).
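The flavor of this reconciliation work can be conveyed with a short sketch. The example below, with hypothetical file and column names, merges a district's course-period file with its course-grade file and flags the student-course records on which the two disagree; the actual hand cleaning was, of course, more involved.

```python
# Sketch: flagging disagreements between two district extracts.
import pandas as pd

periods = pd.read_csv("course_periods.csv")  # student_id, course_id, teacher_id
grades = pd.read_csv("course_grades.csv")    # student_id, course_id, grade

merged = periods.merge(
    grades, on=["student_id", "course_id"], how="outer", indicator=True
)

# Records present in only one of the two files disagree about which
# courses a student took and would require review before analysis.
discrepancies = merged[merged["_merge"] != "both"]
print(f"{len(discrepancies)} student-course records need hand review")
```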
Implications of the Challenges for the Impact Evaluation
Collectively, these challenges represent the real world of low-performing high schools
serving ethnically diverse and low-income students. The schools that participated in the study
were not optimally prepared to either implement ECED or to participate in the impact evaluation.
There was extensive turnover in district and school leadership and in teachers. For all these
reasons, ECED was not fully implemented in most districts and schools. Further, a large amount
of data needed for the impact evaluation was missing, necessitating use of a complex multiple-imputation strategy. As described in the data analysis section, through various procedures, we
have taken steps to ensure the internal validity of the impact evaluation. While impact estimates
may be biased downward (i.e., in the direction of finding no impacts when they might exist),
they do not appear to be biased by accidentally favoring the treatment or control group of
schools.
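As a minimal illustration of the general technique (not of the multi-stage procedure actually used, which is described on p. 92 and draws on the methods of Si & Reiter, 2013), a basic multiple-imputation analysis can be run with the chained-equations (MICE) implementation in statsmodels. The file and variable names here are hypothetical.

```python
# Sketch: multiple imputation by chained equations, with estimates
# pooled across imputed data sets using Rubin's (1987) rules.
import pandas as pd
import statsmodels.api as sm
from statsmodels.imputation.mice import MICE, MICEData

df = pd.read_csv("student_records.csv")  # hypothetical numeric analysis file

imp = MICEData(df)  # one chained-equations model per incomplete variable
mice = MICE("math_score ~ treatment + pretest_math", sm.OLS, imp)

results = mice.fit(n_burnin=10, n_imputations=20)
print(results.summary())
```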
VIII. Discussion
The current study examined the efficacy of Every Classroom, Every Day (ECED), an
instructional improvement approach designed by the Institute for Research and Reform in
Education. Whereas ECED was designed to be part of a more comprehensive whole school
intervention called First Things First, in this study ECED was evaluated as a stand-alone
intervention. This study is one of very few randomized field trials done in the area of educational
reform that have involved multiple school districts and have randomized at the level of the high
schools. The experiment was longitudinal and involved four waves of data collection, at the start
and end of two consecutive school years. The analyses used a multilevel design accounting for
students and teachers nested within schools.
Central to the intervention were the concepts of engagement, alignment, and rigor as
markers of high quality instruction. Teachers received professional development and on-going
supports to teach in ways that are more Engaging for students, more Aligned with local, state,
and federal standards, and more Rigorous for all students (EAR). Trained independent raters,
blind to the intervention status of the schools, observed classrooms as an important source of
research data, and trained school leaders and consultants observed classrooms as a means of
targeting professional development. Teachers and students completed questionnaires and school
districts provided information about students, including demographic characteristics, scores on
standardized achievement tests in math and English/language arts (ELA), progress toward
graduation, grade point average, and attendance.
Our primary hypothesis was that ECED’s instructional improvement interventions would
improve math and ELA achievement as measured by standardized test scores. Secondary
hypotheses were: that the intervention would improve other student performance outcomes such
as attendance, grade point average, and progress toward graduation, and would enhance teacher
and student attitudes. Finally, we hypothesized that both the fidelity of implementation and the
number of semesters students were in intervention schools would predict better school outcomes
in non-experimental analyses.
Summary of Most Important Findings
Findings from this evaluation provide some evidence that ECED was efficacious in
improving student achievement in math. Students in the treatment schools scored significantly (p
= .04) higher on standardized tests of math than did students in the control-group schools, after
controlling for pre-intervention math achievement and school district, although that result
became only marginal (p = .06) when student demographic controls were added to the models
(see p. 138). In addition, there was some evidence that fuller implementation of ECED was
linked to better student outcomes such that students in schools with higher implementation
scores had marginally higher math achievement (see p. 149), significantly higher grade point
averages (see p. 152), and significantly higher credits toward graduation (see p. 152). As well,
the number of terms students were enrolled in ECED schools moderated the association between
implementation and attendance, such that those students with more semesters in schools with
higher ECED implementation had higher attendance (p. 149). In contrast to these impacts, the
ECED intervention did not improve achievement in English language arts (see p. 138), nor was
the degree of ECED implementation related to ELA achievement (see p. 149 & p. 152).
Preliminary analyses of the classroom observations for math indicated that across the two years,
participation in ECED increased the rigor of classroom instruction relative to control-group
schools (see p. 123 & p. 125).
Although there was indication that the ECED intervention led to some positive outcomes,
including math achievement and rigor in math classrooms, there was also evidence that these
improvements came at some cost to students and teachers. Among students in ECED treatment
schools, students who were enrolled more terms reported worse attitudes toward school; whereas
among students in control schools, those who were enrolled more terms reported better attitudes
toward school (see p. 135). This finding appears to be primarily driven by a single component of
student attitudes toward school, namely perceived academic competence. Students who were
enrolled for more terms in treatment schools reported lower perceived academic competence (see
p. 136). This decline in perceived academic competence for students with longer exposure to the
ECED intervention could be a function of the increased rigor observed in math classes, the
overall raising of academic expectations, and the more mastery-based feedback provided to students.
Similarly, math teachers in treatment schools reported less mutual support among colleagues
than did those in control schools (see p. 117). And, across the two years of the study, math
teachers in ECED schools who taught more terms of courses targeted by the intervention
reported that their districts’ leadership was less committed to change, while the opposite was true
for teachers in control schools (see p. 117). Finally, based on preliminary analyses of the
classroom observations it appears that ECED had a negative impact on observed student
engagement in math classes (see p. 123 & p. 125). No effects of ECED were seen for ELA
teacher attitudes and experiences (see p. 127).
Implications
Math achievement. STEM (Science, Technology, Engineering, and Math) instruction
and achievement are increasingly high priorities for our nation’s schools, as evidenced by
President Obama’s “Educate to Innovate” campaign (The White House, n.d.). ECED Math
appears to be a promising strategy to address that priority. Through ECED Math, teachers work
as teams to make math relevant and accessible to all students through ‘I Can…’ statements,
assess students regularly to ensure they are mastering the content, and provide students multiple
opportunities for relearning. Since the initiation of this ECED efficacy trial, IRRE has begun
implementing similar benchmarking strategies across other courses within the high school
curriculum in other sites where it is providing support. The results from this evaluation support
the idea that this type of instruction, assessment, and teacher-student interaction may be
beneficial with regard to student achievement.
English Language Arts achievement. The ability to read, write, and communicate is
also an important skill for today’s high school students. Indeed, the Common Core State
Standards emphasize the key role these skills play in college and career readiness and the
interwoven nature of these skills (Common Core State Standards Initiative, n.d.). Unfortunately,
this evaluation provided no evidence that ECED’s ELA intervention had an impact on students’
ELA achievement. The cornerstone of ECED’s ELA component is the Literacy Matters
curriculum that focuses on expository reading and writing, as well as skills that cut across the
curriculum such as critical thinking and skills for comprehending, organizing, and remembering
information. It is meant to complement the 9th- and 10th-grade English curriculum that has
traditionally focused on literature.
One possible explanation for the lack of ELA achievement improvements is that ECED
Literacy was not implemented with sufficient fidelity. As noted elsewhere in this report, only
26% of the students were enrolled all four terms and took the prescribed amount of ECED
Literacy (see p. 59). Further, turnover was high among ECED literacy teachers, meaning that few
teachers received all the supports intended to help them fully implement the curriculum (see p.
53). For example, an ELA teacher who was hired to teach ECED literacy during the second year
of the intervention would not have had the 3-day professional development workshop during the
summer before the first year, nor the other first-year supports, so he or she would be likely to
have implemented the literacy curriculum less efficaciously. However, these limitations also
affected math. Only 31% of students were enrolled all four terms and took both the math classes
in which ECED was working. And, in the treatment schools, math and ELA teachers were
employed an equal number of terms, so there is no evidence that turnover was greater in ELA
than math. Nonetheless, ECED did impact math achievement, but not ELA achievement.
Another possible explanation is that the skills taught in Literacy Matters are not those
tested by the ELA achievement tests. Those tests may be more closely aligned to the regular
English curriculum that is in use in both treatment and control schools than to the supplemental
Literacy Matters curriculum. According to IRRE, ECED Literacy was designed to align with
national standards similar to those represented in the new Common Core State Standards, and its
assessments were more performance-based, in contrast to the district- or state-created standards
and assessments in place at the time of this study. Of course, based on this evaluation, it is also
possible that ECED Literacy simply does not have the intended benefits for students’ ELA
achievement.
Student and teacher attitudes and self-reported experiences. As noted, ECED was
designed to be the instructional improvement component of the First Things First (FTF)
approach to school reform. Prior to this evaluation, no school had implemented these
instructional improvement strategies without also implementing the other two major components
of FTF, namely creating small learning communities and implementing the student and family
advocacy system. In math, the ECED instructional improvement strategy alone paid off in terms
of improvement in test scores, and fuller implementation of this intervention predicted better
student performance and persistence (e.g., GPA, credits earned), all of which are primary
concerns of today’s schools. But, test scores and performance indicators are only one facet of
high quality education. On the negative side, these instructional improvements did not have the
intended benefits for students’ attitudes toward school or teachers’ self-reported experiences, and
they do appear to have been detrimental to some important aspects of the school experience, such
as student engagement in math classes, students’ perceptions of their own academic competence,
and teachers’ experience of mutual support. Rigor in math classes appears to have increased, and
perhaps that was what led the students to perceive themselves as less competent. Further, the
expectable stress of implementing interventions such as ECED can exacerbate already toxic
conditions in struggling schools without deft leadership from principals and district staff—
conditions that were apparent in the majority of the treatment schools in this study.
Viewed together, the results of the current evaluation seem to suggest that to have larger
and broader effects on school outcomes, along with more positive experiences and self-perceptions for students and teachers, additional supports for teachers and students are needed
that focus on the interpersonal side of education. One approach would be to include changes to
the design and/or implementation of ECED itself. An alternative would be to implement
ECED—which is the instructional improvement component of FTF—within the context of the
other two FTF components (viz., small learning communities and the advocacy system) that
were designed to improve student engagement and school climate. These components could
create the necessary conditions for students and teachers to get to know each other better, for
each student to have an advocate within the faculty, and for both groups to feel supported. By
improving the sense of community and shared goals among teachers and students, there would
seem to be a greater likelihood both that ELA outcomes would be enhanced and also that the
improvements in math test scores that were observed would be sustainable.
A third potential approach would be to focus more attention, earlier in the site selection
process, on ensuring greater stability and quality in the leadership ranks of the districts and
schools—both treatment and control schools—because the effective implementation and the
evaluation of the intervention both rely on a threshold of leadership stability and support that was
clearly absent in many cases in the schools in this trial. Without a threshold level, there is no
hope of obtaining a fair and rigorous test of the intervention’s potential impacts. Indeed, other
school reform models have been criticized for failing to couple their focus on quality curricula
with effective district- and school-level expectations and supports for teachers and
administrators. Contextual supports have the potential to create school-level “buy-in” and
increase the likelihood of effective implementation that can lead to sustainable systemic change
and desirable student outcomes over time (Berends et al., 2002; Desimone, 2002). Although
much time and effort was spent by IRRE working on leadership and teacher development, as
well as providing targeted and research-based curricular material and supports, the instability and
level of turnover of teachers and administrators did not allow these resources to be directed at the
same people consistently even for the two years of the intervention. This third potential approach
does not mean to imply that some districts or schools should be denied access to interventions;
rather, it means that they may need special preliminary work to ready them to take on the
challenges of an intervention such as ECED.
Performance and commitment. No Child Left Behind and Race to the Top have
encouraged schools to place increasing attention on student achievement as measured by
standardized test scores. However, there is some evidence that other aspects of school
performance—including course grades, credits toward graduation, and attendance—are equally
or possibly even more important outcomes for predicting success in life (Farrington et al., 2012).
Each of these is a marker for important personal attributes such as effort, perseverance, and self-regulation, all qualities that are critical for success in post-secondary education and careers.
The evaluation provided some evidence that the extent to which ECED was implemented
as intended by IRRE was positively related to students getting better grades, receiving more
course credit toward graduation, and having higher attendance. However, these findings must be
interpreted with caution. We did not obtain baseline information about these variables, so it is
possible that these differences existed prior to implementation of ECED. This would seem to be
especially likely given the baseline differences in level of economic disadvantage (see p. 57).
Further, there were some inconsistencies in the patterns of findings. For grade point average and
credits toward graduation, the relations were significant when all 20 schools were included in the
analyses, but not significant when only treatment schools were considered (see p. 149 vs. p. 152).
Of course, there were only 10 schools in the analyses that included only treatment schools,
making the power quite low. For attendance, higher scores on the ECED implementation
variable significantly predicted better attendance across the 20 schools for students who
were enrolled a longer time (see p. 149), but there was no main effect of implementation scores
across the 20 schools (see p. 149) and no effect when the sample was limited to
the 10 treatment schools (see p. 152). Thus, although there is some evidence that ECED improves
school performance, which in turn is tied to success in life, we cannot draw firm conclusions.
Strengths of the Design and Analyses
As one can see from Chapters III, V, and VI, this project employed state-of-the-art design
and analytic strategies. First, the school randomized field trial design with particular emphasis on
creating unbiased estimates of effects allows us to draw causal conclusions about the impacts of
ECED. Second, for our intent-to-treat analyses we were faced with a large amount of missing
data. We addressed that problem using sophisticated multi-stage multiple imputation techniques
(see p. 92). Multiple imputation reduces the bias introduced by missing data. Third, to properly
account for the random assignment at the school-level and our interest in teacher and student
data, we used multi-level modeling. Finally, longitudinal growth curve analyses were used for
teacher and student questionnaire data and EAR classroom observations, where the data were
directly comparable across time. The growth curves allowed us to compare students and teachers
in control versus treatment schools on rates of change over time.
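To give a concrete, simplified picture of the multilevel strategy, the sketch below fits a two-level model with a random intercept for each school and a school-level treatment indicator. It assumes a hypothetical one-row-per-student file and omits the teacher level, the district strata, and the imputation machinery described above.

```python
# Sketch: students nested within schools, school-level treatment effect.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("analysis_file.csv")  # hypothetical: one row per student

model = smf.mixedlm(
    "math_score ~ treatment + pretest_math",  # fixed effects
    data=df,
    groups=df["school_id"],  # random intercept for each school
)
result = model.fit()
print(result.summary())
```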
Limitations
As detailed in Chapter VII, the ECED Efficacy Trial experienced numerous challenges in
recruiting, implementation, and data collection. One of the largest was missing data, which
resulted from imperfect record keeping in the school districts, uneven administration of student
surveys at participating schools, limited commitment to the project by some teachers, and
mobility. Administrators, teachers, and students were all highly mobile and that mobility led to
extensive missing data as well as weak implementation in some schools.
The difficulty in combining test scores across state systems and across content is another
limitation of this study. The fact that schools were randomized to implement ECED or not within
districts means that intervention and control schools’ test scores were treated identically,
protecting the internal validity of the study. The fact that this study took place in four different
states, spread across the country, strengthens its external validity. However, each had its own
testing schedule and system, forcing us to combine scores across systems. We cannot be certain
that each system was testing the same type of material or that instructional quality was equally
linked to each set of tests. Further, within states, different students took different tests depending
on their course enrollments. This is the nature of high school testing, but poses a problem for
researchers looking for a common metric. Students in more advanced courses are given more
advanced tests; it is not possible to know how those students would have scored had they been
given the less advanced tests. Chapter III (p. 83) provides extensive details about how this
challenge was mitigated, but we acknowledge that some of the findings, especially some of the
non-significant findings, may be related to less-than-perfect outcome measures.
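One common device for this problem, sketched below purely for illustration, is to standardize scores within each state-by-test-by-year cell before pooling; Chapter III (p. 83) describes the approach this study actually took, and the column names here are assumptions.

```python
# Sketch: placing scores from different state testing systems on a
# common metric by z-scoring within state, test, and year.
import pandas as pd

scores = pd.read_csv("test_scores.csv")  # state, test_name, year, raw_score

scores["z_score"] = (
    scores.groupby(["state", "test_name", "year"])["raw_score"]
    .transform(lambda s: (s - s.mean()) / s.std())
)
```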
Future Analyses
This report addresses only the main research questions posed by this study, but there are
many more questions that could be explored in the future. For instance, there is extensive
additional work that could be done with the classroom observations of Engagement, Alignment,
and Rigor. As noted in the teacher results chapter (Chapter V), once the multiple imputation is
complete for the EAR data, the impact of ECED on E, A, and R for both math and ELA will be
explored using the same analytic strategy as used elsewhere in this report, including both point-in-time and growth curve analysis. Further, IRRE’s theory of change posits that EAR should
causally mediate the relation between ECED implementation and achievement. This hypothesis
can be explored in the future. Analyses by Early and colleagues (2013) indicate that observed
engagement at Wave 1 predicts math and ELA test scores at the end of Year 1. Those analyses
could be expanded to account for the multiple teachers that each student experienced across the
two years of the study and to better understand links between observed E, A, and R and test
scores, as well as attitudes, grades, credits, and attendance. Finally, it would be valuable to
understand what types of conditions and teacher characteristics lead to greater E, A, and R.
Analyses could be conducted that link various professional development experiences as reported
on the teacher questionnaire and student and teacher demographic characteristics to changes in E,
A, and R.
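If such a mediation analysis were pursued, one simple starting point (ignoring, for simplicity, the multilevel structure and multiple imputation that a full analysis would need to respect) is the product-of-coefficients approach sketched below with hypothetical variable names.

```python
# Sketch: does observed EAR quality carry the relation between
# implementation and math achievement? (Product-of-coefficients.)
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("school_level_file.csv")  # hypothetical aggregate file

# Path a: implementation -> observed EAR quality
a_path = smf.ols("ear_quality ~ implementation", data=df).fit()
# Path b: EAR quality -> achievement, controlling for implementation
b_path = smf.ols("math_score ~ ear_quality + implementation", data=df).fit()

indirect = a_path.params["implementation"] * b_path.params["ear_quality"]
print(f"estimated indirect effect: {indirect:.3f}")
```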
A second line of future inquiry is whether there are certain students or teachers who
benefited more from ECED than others. The analyses presented in this report use primarily an
intent-to-treat framework, with some analyses including variation in implementation. These are
the appropriate analyses to answer impact questions and tell us the net impact we would expect
to see in a typical situation where schools, teachers, and students vary in commitment to the
intervention and where a host of real-world circumstances limit the extent to which any
intervention could be implemented with fidelity. However, this is a conservative approach that
precludes us from knowing the circumstances under which the ECED supports are beneficial.
The intent-to-treat approach, for instance, means that the two schools that stopped participating
in the project are treated like all the others in the treatment condition, although of course they
received much lower scores on variation in implementation (i.e., fidelity). Likewise, students at
treatment schools who never enrolled in ECED Literacy, never took one of the math courses
targeted by ECED, were enrolled very few days, or had very low attendance, were treated the
same as students who received the full dosage of ECED. Additionally, teachers who started at a
treatment school late in the project and received little of the support, as well as teachers who
implemented few of the ECED strategies, were treated the same as those who took part for four
terms and implemented most components. And, regular 9th- and 10th-grade English teachers at
treatment schools who never taught ECED Literacy are included with the ECED Literacy
teachers. ‘Treatment on the treated’ analyses could be conducted that focus on teachers and
students who actually received the treatment. Although these would be non-experimental
analyses and would not permit causal conclusions, they would allow us to test various path
models with data from individuals who experienced the treatment. Likewise, the current
experimental analyses could be expanded to include additional student-level interactions that
more fully explore possible differential effects of ECED. For instance, students who had
different attitudes toward school or who differed with regard to motivation (as measured by the
relative autonomy index) at the start of the project might have different outcomes as a function of
their participation.
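As a sketch of what defining the ‘treated’ sample might look like, the filter below keeps only treatment-school students who met some minimum dosage criteria. The thresholds and column names are illustrative assumptions, not rules used in this study.

```python
# Hypothetical 'treatment on the treated' sample definition.
import pandas as pd

students = pd.read_csv("students.csv")  # hypothetical student-level file

treated = students[
    (students["condition"] == "treatment")
    & (students["terms_enrolled"] >= 3)       # enrolled most of the study
    & (students["eced_courses_taken"] >= 1)   # actually took an ECED course
    & (students["attendance_rate"] >= 0.80)   # present enough to be dosed
]
print(f"{len(treated)} of {len(students)} students meet the dosage criteria")
```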
Student and teacher mobility is a third line of inquiry that could be addressed in the
future. As noted elsewhere in this report, both student and teacher mobility were high and it is
possible that the turnover affected the implementation, the outcomes, or both. Future analyses
could consider the role mobility played in ECED’s implementation and the relation of that to the
outcomes.
Conclusions and Recommendations
ECED Math appears to be a valuable path to improved student standardized test scores in
math. In schools from five districts in four states, which were fraught with problems and served
high percentages of students from economically disadvantaged homes, the use of the ECED Math
approach resulted in improved math scores relative to those in control-group schools. We found
this effect with a relatively small sample size (10 schools per condition) and a stratified random
assignment within districts, leaving only about 14 degrees of freedom. Further, of the 10
treatment schools, two had dropped out by the end of the first year, weakening the intervention
implementation. Still, there was indication that the approach did enhance achievement in math.
Nonetheless, the intervention did have some negative effects on teacher and student
experiences and self-perceptions, which over time might have interfered with the stability of the
positive effects. Thus, we would recommend either that ECED be implemented within the
framework of a larger school reform effort, such as First Things First, which focuses on teacher
and student support and school climate, or that various teacher and student supports be added to
the ECED intervention. With such supports, teachers and students within ECED would feel better
about themselves and about their work, and those feelings might help to buttress the positive
achievement effects that are likely to result from implementation of ECED.
IX. References
Berends, M., Bodilly, S., & Kirby, S. N. (2002). Facing the challenges of whole-school reform:
New American Schools after a decade. Santa Monica, CA: RAND Corporation. Retrieved
from: http://www.rand.org/pubs/monograph_reports/MR1498.html
Borman, G. D., Hewes, G. M., Overman, L. T., & Brown, S. (2003). Comprehensive school
reform and achievement: A meta-analysis. Review of Educational Research, 73(2), 125-230.
doi: 10.3102/00346543073002125
Bryk, A. S., & Raudenbush, S. W. (1992). Hierarchical Linear Models in social and behavioral
research: Applications and data analysis methods (First Edition). Newbury Park, CA: Sage
Publications.
Cavalluzzo, L., Lowther, D. L., Mokher, C., & Fan, X. (2012). Effects of the Kentucky Virtual
Schools' hybrid program for algebra I on grade 9 student math achievement. Final report
(NCEE 2012-4020). Washington, DC: National Center for Education Evaluation and
Regional Assistance, Institute of Education Sciences, U.S. Department of Education.
Retrieved from: http://ies.ed.gov/ncee/edlabs/regions/appalachia/pdf/20124020.pdf
Cicchetti, D. V. (1994). Guidelines, criteria, and rules of thumb for evaluating normed and
standardized assessment instruments in psychology. Psychological Assessment, 6(4), 284-290. doi: 10.1037/1040-3590.6.4.284
Cohen, J. (1992). A power primer. Psychological Bulletin, 112(1), 155-159. doi: 10.1037/0033-2909.112.1.155
Common Core State Standards Initiative (n.d.). Retrieved from: http://www.corestandards.org/
Corrin, W., Lindsay, J. J., Somers, M. A., Myers, N. E., Meyers, C. V., Condon, C. A., & Smith,
J. K. (2012). Evaluation of the Content Literacy Continuum: Report on program impacts,
program fidelity, and contrast (NCEE 2013-4001). Washington, DC: National Center for
Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S.
Department of Education. Retrieved from: http://files.eric.ed.gov/fulltext/ED538060.pdf
Corrin, W., Somers, M. A., Kemple, J. J., Nelson, E., & Sepanik, S. (2008). The Enhanced
Reading Opportunities Study: Findings from the Second Year of Implementation (NCEE
2009-4036). Washington, DC: National Center for Education Evaluation and Regional
Assistance, Institute of Education Sciences, U.S. Department of Education. Retrieved from:
http://www.mdrc.org/sites/default/files/full_554.pdf
Darling-Hammond, L., Wei, R. C., Andree, A., Richardson, N., & Orphanos, S. (2009).
Professional learning in the learning profession: A status report on teacher development in
the United States and abroad. Dallas, TX: NSDC. Retrieved from:
http://learningforward.org/docs/pdf/nsdcstudy2009.pdf.
Desimone, L. (2002). How can comprehensive school reform models be successfully
implemented? Review of Educational Research, 72(3), 433-479.
doi: 10.3102/00346543072003433
Desimone, L. M. (2009). Improving impact studies of teachers’ professional development:
Toward better conceptualizations and measures. Educational Researcher, 38(3), 181-199.
doi: 10.3102/0013189X08331140
Early, D. M., Rogge, R. D., & Deci, E. L. (2013). Engagement, alignment, and rigor as vital signs of
high-quality instruction: A classroom visit protocol for instructional improvement and
research. Manuscript submitted for publication.
Elmore, R. F. (2002). Bridging the gap between standards and achievement: The imperative for
professional development in education. Washington, DC: Albert Shanker Institute. Retrieved
from:
http://www.gtlcenter.org/sites/default/files/docs/pa/3_PDPartnershipsandStandards/TheImper
ativeforPD.pdf
Enders, C. K., & Tofighi, D. (2007). Centering predictor variables in cross-sectional multilevel
models: A new look at an old issue. Psychological Methods, 12(2), 121-138. doi:
10.1037/1082-989X.12.2.121
Faggella-Luby, M., & Wardwell, M. (2011). RTI in a middle school: Findings and practical
implications of a tier 2 reading comprehension study. Learning Disability Quarterly, 34(1),
35-49.
Farrington, C.A., Roderick, M., Allensworth, E., Nagaoka, J., Keyes, T.S., Johnson, D.W., &
Beechum, N.O. (2012). Teaching adolescents to become learners. The role of noncognitive
factors in shaping school performance: A critical literature review. Chicago: University of
Chicago Consortium on Chicago School Research.
French, S. E., Seidman, E., Allen, L., & Aber, J. L. (2006). The development of ethnic identity
during adolescence. Developmental Psychology, 42(1), 1-10. doi: 10.1037/0012-1649.42.1.1
Gambone, M.A., Klem, A. M., Summers, J. A., Akey, T. A., & Sipe, C. L. (2004). Turning the
tide: The achievements of the First Things First education reform in the Kansas City, Kansas
Public School District. Philadelphia: Youth Development Strategies, Inc. Retrieved from:
http://www.ydsi.org/ydsi/pdf/turningthetidefullreport.pdf
Garet, M. S., Porter, A. C., Desimone, L. M., Birman, B., & Yoon, K. S. (2001). What makes
professional development effective? Analysis of a national sample of teachers. American
Educational Research Journal, 38(3), 915–945. doi: 10.3102/00028312038004915
Grolnick, W. S., & Ryan, R. M. (1989). Parent style associated with children's self-regulation
and competence in school. Journal of Educational Psychology, 81, 143-154. doi:
10.1037/0022-0663.81.2.143
Heller, R., & Greenleaf, C. L. (2007). Literacy instruction in the content areas: Getting to the
core of middle and high school improvement. Washington, DC: Alliance for Excellent
Education. Retrieved from:
http://carnegie.org/fileadmin/Media/Publications/PDF/Content_Areas_report.pdf
Hulleman, C. S., Rimm-Kaufman, S. E., & Abry, T. (2013). Innovative methodologies to explore
implementation: Whole-part-whole--Construct validity, measurement, and analytical issues
for intervention fidelity assessment in education research. In T. Halle, A. Metz, & I.
Martinez-Beck (Eds.), Applying implementation science in early childhood programs and
systems (pp. 65-93). Baltimore: Paul H. Brookes.
Joyce, B., & Showers, B. (2002). Student achievement through staff development (3rd ed.).
Alexandria, VA: Association for Supervision and Curriculum Development.
Kansas State Department of Education. Assessment report. Retrieved from:
http://www.ksde.org/Default.aspx?tabid=233
Kemple, J. J., Corrin, W., Nelson, E., Salinger, T., Herrmann, S., & Drummond, K. (2008). The
Enhanced Reading Opportunities study (NCEE 2008-4015). Washington, DC: National
Center for Education Evaluation and Regional Assistance, Institute of Education Sciences,
U.S. Department of Education.
Retrieved from: http://www.air.org/files/ERO_Full_Report_Year2011.pdf
Lang, L., Torgesen, J., Vogel, W., Chanter, C., Lefsky, E., & Petscher, Y. (2009). Exploring the
relative effectiveness of reading interventions for high school students. Journal of Research
on Educational Effectiveness, 2(2), 149-175. doi: 10.1080/19345740802641535
Ludtke, O., Trautwein, U., Kunter, M., & Baumert, J. (2006). Reliability and agreement of
student ratings of the classroom environment – A reanalysis of TIMSS data. Learning
Environments Research, 9, 215-230. doi:10.1007/s10984-006-9014-8
McClelland, G. H. & Judd, C. M. (1993). Statistical difficulties of detecting interactions and
moderator effects. Psychological Bulletin, 114(2), 376-390. doi: 10.1037/0033-2909.114.2.376
Muthén, L. K. & Muthén, B. O. (1998-2009). Mplus. (Version 6.12) [Computer software]. Los
Angeles, CA: Muthén & Muthén
National Research Council and the Institute of Medicine. (2004). Engaging Schools: Fostering
High School Students’ Motivation to Learn. Committee on Increasing High School Students’
Engagement and Motivation to Learn. Board on Children, Youth, and Families, Division of
Behavioral and Social Sciences and Education. Washington, DC: The National Academies
Press.
Quint, J. (2006). Meeting five critical challenges of high school reform. New York City: MDRC.
Retrieved from: http://www.mdrc.org/sites/default/files/full_440.pdf
Quint, J., Bloom, H. S., Black, A. R., Stephens, L., & Akey, T. M. (2005). The challenge of
scaling up educational reform, Findings and lessons from First Things First, Final report.
New York City: MDRC. Retrieved from:
http://www.mdrc.org/sites/default/files/full_531.pdf
Rakes, C. R., Valentine, J. C., McGatha, M. B., & Ronau, R. N. (2010). Methods of instructional
improvement in algebra: A systematic review and meta-analysis. Review of Educational
Research, 80(3), 372-400. doi: 10.3102/0034654310374880
Raudenbush, S. W. & Bryk, A. S. (2002). Hierarchical Linear Models (Second Edition).
Thousand Oaks: Sage Publications.
Robertson, J. (2013, February 8). Benchmarking is a big boost for math. The Kansas City Star.
Retrieved from: http://www.kansascity.com/2013/02/08/4056702/benchmarking-is-a-big-boost-for.html
Roediger, H. L., Agarwal, P. K., McDaniel, M. A., & McDermott, K. B. (2011). Test-enhanced
learning in the classroom: Long-term improvements from quizzing. Journal of Experimental
Psychology: Applied, 17(4), 382-395. doi: 10.1037/a0026252
Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York: John Wiley &
Sons.
Schulz, K. F., Altman, D. G., & Moher, D. (2010). CONSORT 2010 Statement: updated
guidelines for reporting parallel group randomised trials. Annals of Internal Medicine, 152,
1-7.
Seidman, E., Allen, L., Aber, J. L., Mitchell, C., & Feinman, J. (1994). The impact of school
transitions in early adolescence on the self-system and perceived social context of poor urban
youth. Child Development, 65(2), 507-522. doi: 10.2307/1131399
Si, Y. & Reiter, J. P. (2013). Nonparametric Bayesian multiple imputation for incomplete
categorical variables in large-scale assessment surveys. Journal of Educational and
Behavioral Statistics, 38(5), 499-521. doi: 10.3102/1076998613480394
Slavin, R. E., Cheung, A., Groff, C., & Lake, C. (2008). Effective reading programs for middle
and high schools: A best evidence synthesis. Reading Research Quarterly, 43(3), 290-322.
doi: 10.1598/RRQ.43.3.4
Slavin, R. E., Lake, C., & Groff, C. (2009). Effective programs in middle and high school
mathematics: A best-evidence synthesis. Review of Educational Research, 79(2), 839-911.
doi: 10.3102/0034654308330968
Tseng, V., & Seidman, E. (2007). A systems framework for understanding social
settings. American Journal of Community Psychology, 39(3-4), 217-228.
doi: 10.1007/s10464-007-9101-8
US News and World Report. (2013, April). Best high schools. Retrieved from:
http://www.usnews.com/education/best-high-schools
The White House. (n.d.). Educate to innovate. Retrieved from: http://www.whitehouse.gov/issues/education/k-12/educate-innovate
Appendix 1: Change in Project Focus
This project was originally entitled Scaling Up the First Things First Reform Approach
and was funded through a grant from the Institute of Education Sciences (IES), U.S. Department
of Education. Its original aim was to conduct a randomized field trial (RFT) of the effectiveness
of scaling up the First Things First (FTF) approach to school reform in high schools that serve large
percentages of disadvantaged students. FTF, designed and implemented by the Institute for
Research and Reform in Education (IRRE), creates multiple theme-oriented small learning
communities (SLCs) made up of 300-350 students and 15-18 teachers within a larger school.
Within the SLCs are family and student advocacy groups, in which a teacher (the advocate) has
15 to 20 students across the four grades with whom he or she meets weekly and with whose
family he or she is the liaison. The third and final key FTF strategy is instructional improvement
(II), which involves creating enriched learning opportunities that are rigorous and engaging for
all students and are aligned with district, state, and federal standards.
During the first year of the grant, we encountered two major difficulties in relation to the
project: (1) each of four sites that had committed to be part of the trial at the time we submitted
the application withdrew their commitment shortly before the grant was to begin; and (2) the
funding that IRRE had expected to receive to support the FTF intervention that our project would
have evaluated failed to materialize at the end of the first year of our grant. Accordingly, we had
spent the first year of the grant recruiting sites for the initial project, but at the end of that year
we had to tell these sites that we could not begin the project as planned. Further, this meant that
it was impossible for us to do the proposed school-randomized field trial (S-RFT) because there would not be enough years to
complete the trial if we did not select and randomize the first cohort until the Spring of 2009
(i.e., late in Year 2 of the grant). We therefore worked with the program division of the National
Center for Education Research (NCER), within IES, to reformulate the project.
We designed two studies in the summer of 2008, which was early in the second year of
the grant: (1) a validation study of the Engagement, Alignment, and Rigor (EAR) Classroom
Visit Protocol designed by IRRE and used to assess the quality of classroom instruction; and (2)
an S-RFT to examine the efficacy of the instructional improvement component of FTF. We refer
to the intervention as Every Classroom, Every Day (ECED). We submitted a request to NCER in
the summer of 2008 (early in Year 2 of the grant) to change the grant’s scope of work to these
two studies. We received informal approval from NCER in 2008 to do the validity study and
began recruitment for the efficacy trial of ECED. In the Spring of 2009 we submitted a “mini-application” to proceed with the efficacy trial and to have these two studies replace the initial
effectiveness trial. We were granted informal approval shortly thereafter in 2009 to begin the
efficacy trial in the summer of 2009. In May 2010 we received formal approval from NCER for
all the requested changes. Thus, the original title of the grant is misleading as the project is not
an effectiveness trial of First Things First but is instead primarily an efficacy trial of the Every
Classroom, Every Day component of FTF, evaluated as a free-standing intervention rather than a
part of FTF.
Appendix 2: Findings From the First Component of Revised Project:
Validation of the EAR Classroom Visit Protocol
(Manuscript Under Review)
Engagement, Alignment, and Rigor as Vital Signs of High-Quality Instruction:
A Classroom Visit Protocol for Instructional Improvement and Research
Diane M. Early, Ronald D. Rogge, and Edward L. Deci
University of Rochester
Diane M. Early, Department of Clinical and Social Sciences in Psychology, University of
Rochester; Ronald D. Rogge, Department of Clinical and Social Sciences in Psychology,
University of Rochester; Edward L. Deci, Department of Clinical and Social Sciences in
Psychology, University of Rochester.
The research reported here was supported by the Institute of Education Sciences, U.S.
Department of Education, through Grant R305R070025 to the University of Rochester. The
opinions expressed are those of the authors and do not represent views of the Institute or the U.S.
Department of Education. The authors wish to thank the Institute for Research and Reform in
Education for their support of this work and the participating students, teachers, schools and
districts for their support of the data collection efforts.
Correspondence concerning this article should be addressed to Diane M. Early,
Department of Clinical and Social Sciences in Psychology, University of Rochester, P.O. Box
270266, Rochester, NY, 14627, E-mail: [email protected]
Engagement, Alignment, and Rigor as Vital Signs of High-Quality Instruction:
A Classroom Visit Protocol for Instructional Improvement and Research
Abstract
This paper investigates Engagement (E), Alignment (A), and Rigor (R) as vital signs of high-quality teacher instruction and examines the reliability and predictive validity of the EAR
Classroom Visit Protocol, designed by the Institute for Research and Reform in Education
(IRRE). In Study 1, we examined observations of 33 English/Language Arts (ELA) teachers and
25 mathematics teachers from four high schools. Study 2 included 63 math and 64 ELA teachers
from eight high schools. Engagement was a consistent predictor of math and ELA test scores,
when controlling for the previous year’s score. Further, under some circumstances, alignment
and rigor also served as indicators of high quality instruction. Students’ self-report of their
engagement in school was also generally predictive of test scores in models that also included
perceived academic competence and observed engagement, alignment, or rigor. We discuss the
importance of classroom engagement as a marker of instructional quality and a predictor of
student achievement.
Keywords: instructional quality, engagement, alignment, rigor, high school, standardized
test scores
Engagement, Alignment, and Rigor as Vital Signs of High-Quality Instruction:
A Classroom Visit Protocol for Instructional Improvement and Research
In recent years there has been substantial discussion about the quality of our nation’s
educational system, with both policy makers and education experts maintaining that, on average,
the quality of U.S. education is lower than optimal. To support this claim they often point to
international test-score results. For example, results from the Program for International Student
Assessments (PISA) for reading literacy for 15-year-olds indicated that the U.S. ranked only in
the top 26 out of 65 participating countries, with nine of the countries having scores that were
significantly higher than those of the U.S. (National Center for Education Statistics, Institute of
Education Sciences, 2009). In math, the Trends in International Mathematics and Science Study
(TIMSS) results for eighth-grade students indicated that the U.S. scores were only among the top
24 out of the 56 educational systems involved, with 11 systems having significantly better scores
than the U.S. (National Center for Education Statistics, Institute of Education Sciences, 2011).
Earlier similar test results were part of the justification for the No Child Left Behind
legislation enacted in 2002, which both mandated standardized achievement tests in all states
seeking federal funds and required schools and school districts to improve student test scores
(Rothman, 2012). Subsequently, the Race-to-the-Top program has added to the press for
improved test scores by increasingly holding individual teachers accountable for improving the
scores of students in their classes (Klein, 2012). The National Research Council and the Institute
of Medicine (2004) argued that the quality of teachers’ instruction is the most proximal and
powerful predictor of students’ learning. Accordingly, considerable interest has been directed
toward methods for assessing and improving the quality of teacher instruction in our schools.
High-Quality Classroom Instruction
Senior staff members from the Institute for Research and Reform in Education (IRRE)
were interested in developing a tool for measuring ‘vital signs’ of instructional quality. Similar to
a physician measuring blood pressure or pulse to obtain a quick picture of a person’s health,
IRRE sought to identify vital signs of instructional quality that could be measured quickly and
often as a way of tracking variation in the quality of instruction. To this end, they began with the
question, “what characteristics of classroom instruction would make it excellent?” and turned to
the existing research literature, as well as to their own experiences working to improve schools,
to formulate an answer. They noted that student engagement was consistently linked to high
quality instruction and learning. Numerous studies published in the past 30 years have confirmed
that when students are high in intrinsic motivation, they become more engaged in learning,
which leads to deeper and more conceptual understanding (e.g., Benware & Deci, 1984;
Grolnick & Ryan, 1987), particularly when the learning tasks are heuristic rather than
algorithmic (McGraw, 1978). As well, there is evidence that, when students have fully
internalized the regulation of learning particular topics, even ones they do not find interesting,
they tend to be more engaged in learning and to perform better than when learning is controlled
by external or internal contingencies (e.g., Black & Deci, 2000; Grolnick & Ryan, 1989). Thus,
intrinsic motivation and fully internalized motivation predict engagement and positive
educational outcomes; together they have been referred to as autonomous motivation for learning
(Ryan & Deci, 2000). Still other research has shown that when teachers are supportive of
students, are interested in the material, and are enthusiastic about teaching, students tend to be
more autonomously motivated and engaged (e.g., Deci, Schwartz, Sheinman, & Ryan, 1981;
Patrick, 1995). Evidence such as this, combined with their experiences in schools, led the IRRE
staff to conclude that engaging classroom instruction, encouraged by teachers’ support of
students, enthusiasm about teaching, and interest in the content they are teaching, matters for
student learning and should be measured as a vital sign of high-quality instruction.
The IRRE team then considered whether, in light of the circumstances in our schools,
engaging instruction alone would be sufficient to index teaching quality. What if students were
engaged in learning that was not aligned with curricula and the types of assessments being used
by the district? What if the level of instruction was too low to yield the grade-level mastery being
assessed by state and national achievement tests? These discussions within the IRRE team led
them to postulate two other potentially important vital signs of excellent instruction, namely
alignment and rigor.
Because the federal government had mandated that states administer standardized
achievement tests as the primary indicator of student learning, the IRRE staff reasoned that
classroom instruction would need to be aligned with state standards in order to be high quality
and affect student performance. This was not intended to be interpreted as teaching to the tests,
but rather as being sure that the material—that is, content and curriculum—widely agreed upon
by educators as being important for students at particular grade levels was covered in the
relevant classes. Alignment, as conceptualized here, is a vital sign of instructional quality
because it measures the extent to which the teacher is providing content that is on-time and on-target with what students need to learn. Other researchers have developed systems for
quantifying the extent to which state-mandated assessment systems are aligned with state
educational standards (Herman, Webb, & Zuniga, 2007; Webb, Herman, & Webb, 2007). That
type of alignment is also important, but differs from the alignment measured by the EAR
Protocol because it is a characteristic of the testing system, rather than a component of
instructional quality.
Finally, rigor was selected for two reasons: (1) because the literature showed a strong
connection between challenge and students’ intrinsic motivation (e.g., Csikszentmihalyi, 1975;
Danner & Lonky, 1981; Deci, 1975; Harter, 1978); and (2) to ensure that all students would be
expected and supported—by learning materials, the work being asked of them, and evaluations
of their work—to master material at levels sufficient to yield grade-level or better learning of the
subject matter embodied by the standards. Although rigor is often interpreted to mean making
schoolwork extremely hard, it is here intended to convey that expectations for all students are
consistently high and that instructional strategies deployed by teachers ensure that the work
presented optimally challenges all students to move from where they are toward high standards.
Having postulated that engagement, along with alignment and rigor, would represent vital
signs of excellent instruction, the IRRE staff developed an observational protocol to assess these
dimensions. The goal was to be able to gather reliable information on a regular basis (1) to
provide ongoing feedback to teachers so they could reflect on their own teaching, (2) to aid
administrators in selecting professional development activities, and (3) to assess whether the
self-reflection and professional development is making a difference in what the teachers actually
do in the classroom. Change in student test scores is certainly an important long-term outcome of
assessing a professional development strategy, but schools need more immediate feedback to
gauge whether their efforts to improve instruction are working. The resulting tool, named the
Engagement, Alignment, and Rigor (EAR) Classroom Visit Protocol, was intended for use by
school staff, as well as outside consultants and researchers.
Classroom Observations
Currently, there are a few tools available that have been found reliable and valid for
assessing instruction. A recent investigation by the Bill & Melinda Gates Foundation (2012) of
five such observational tools concluded that each was a valid predictor of student achievement
gains when used by trained observers with a background in education but without special
knowledge of instructional assessment or ties to the specific tool. The EAR Protocol, if found to
be reliable and valid, would expand the list of useful tools because it has several desirable
characteristics and a somewhat different focus that may be preferable for specific schools or
districts depending on their interests and needs. For example, the EAR Protocol was designed for
use in all subject areas, including core subjects such as math and English/language arts (ELA)
and electives such as art and physical education. Its focus is a set of specific instruction-related
experiences for students that result from what teachers do, rather than measuring teacher
behaviors or attributes. That is, E, A, and R are postulated to be vital signs of quality that lend
themselves to improvement through multiple modalities and foci of professional development,
but they do not focus on implementation of specific instructional strategies. Further, if used
widely in a school or district, the protocol provides a common language and set of descriptors to
promote conversations about high-quality instruction across grade levels and subject areas.
The EAR Protocol requires a 20-minute observation, providing enough time to obtain a
clear picture of what is happening in the classroom while still being feasible for school
administrators to use on a regular basis. Multiple observations of a single teacher, grade, or
department are necessary for the results to be meaningful, but having this short observation
period makes it usable for administrators who generally have very full schedules. The 20-minute
observation stands in contrast to the three- to five-minute “walk-throughs” that are popular with
school personnel, some of which are imported from external sources and others of which are
“home grown” by the school district or individual schools. Although those very brief visits may
help administrators gain a picture of the general state of instruction in their schools, they are
highly subjective and are too brief to provide a meaningful understanding of what is really taking
place in an individual teacher’s classroom (Downey, Steffy, English, Frase, & Poston, 2004;
Protheroe, 2009). In contrast, the 20-minute EAR Protocol allows for a richer sampling and
more quantitative representation of instructional quality.
Additionally, data from the EAR Protocol are collected using an electronic data
collection system (via smartphone or tablet computer) and are uploaded immediately. Reports
and graphs can be generated through an on-line system that aggregates observations across an
individual teacher, entire department, grade, small learning community, school, or district. This
on-line system provides school and district administrators with immediate feedback and the
ability to quickly identify trends and changes over time.
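To make the kind of aggregation such a system performs concrete, the sketch below rolls visit records up to teacher- and department-level means using Python's pandas library; the data and column names are hypothetical stand-ins, not the schema of IRRE's actual system.

```python
import pandas as pd

# Hypothetical visit records; the real system's schema is not published here.
visits = pd.DataFrame({
    "teacher":    ["T1", "T1", "T2", "T2", "T3"],
    "department": ["Math", "Math", "Math", "ELA", "ELA"],
    "engagement": [0.80, 0.70, 0.55, 0.90, 0.65],
    "alignment":  [0.75, 1.00, 0.50, 0.75, 0.50],
    "rigor":      [0.40, 0.60, 0.30, 0.80, 0.55],
})

# The same records can be rolled up at any level of the hierarchy
# (teacher, department, grade, school, or district).
by_teacher = visits.groupby("teacher")[["engagement", "alignment", "rigor"]].mean()
by_department = visits.groupby("department")[["engagement", "alignment", "rigor"]].mean()
print(by_teacher)
print(by_department)
```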
The Current Research
The EAR Classroom Visit Protocol was developed in 2004, and IRRE began field testing
it immediately. To date, it has been used in more than 100 elementary, middle, and high schools
across the country for more than 27,000 visits (Broom, 2012). Those data, and feedback from the
schools that use the tool, provide a preliminary indication of its utility. The current study was
designed to be a more rigorous test of the reliability and validity of the tool, conducted by an
independent research team. Thus, the current paper (1) describes the EAR Protocol, (2)
investigates the tool’s inter-rater reliability, both when used by trained observers from outside
the school and by trained school and district personnel, and (3) examines the tool’s predictive
validity, both by itself and in conjunction with students’ questionnaire responses, using
standardized test scores as the outcome of interest.
The EAR Classroom Visit Protocol
In the EAR Protocol, engagement is defined as students being actively involved—
emotionally, behaviorally, and cognitively—in their academic work. When students are engaged,
they are actively processing information (listening, watching, reading, thinking) or
communicating information (speaking, performing, writing) in ways that indicate they are
focused on the task and interested in it (Connell & Broom, 2004). Engagement is a prerequisite
for school success. It leads to effort and persistence, both of which allow students to profit from
challenging curricula (National Research Council and the Institute of Medicine, 2004).
The EAR Protocol defines alignment as students (1) being asked to do and actually doing
schoolwork that reflects academic standards and (2) having opportunities to master the methods
used on high stakes assessments such as their state’s standardized tests and college entrance
exams. It can be assessed in relation to district, state, or national standards and assessments such
as the Common Core. In aligned classrooms, what is being taught and what students are being
asked to do: are in line with the standards and curriculum; are “on time” and “on target” with the
scope and sequence of the course of study; and provide students opportunities to experience high
stakes assessment methodologies among other assessment approaches (Connell & Broom, 2004).
Rigor, as defined in the EAR Protocol, reflects the common sense notion that students
will only achieve at high levels if that level of work is expected and inspected for all students. In
rigorous classrooms, the learning materials and instructional strategies being used challenge and
encourage all students to produce work or respond at or above grade level. All students are
required to demonstrate mastery at these levels and have the opportunity for re-teaching as
needed (Connell & Broom, 2004).
Students’ Self-Reported Motivation Variables
As noted above, one aim of the current studies was to evaluate the predictive validity of
the EAR Classroom Visit Protocol. In addition to testing whether the tool would predict
standardized achievement test scores, when the previous year’s scores were controlled, we
examined whether student reports of their academic engagement and perceived academic
competence across their school experiences would add to this prediction. In earlier studies, both
perceived competence and engagement were shown to predict change in students’ academic
performance (Connell, Spencer, & Aber, 1994; Gambone, Klem, Summers, Akey, & Sipe,
2004). In the current study, the EAR Protocol was used to assess vital signs of high-quality
instruction and to predict performance on standardized achievement tests. We further examined
whether students’ own perceptions of their academic engagement and perceived competence
would contribute to change in these test scores. This is in line with the Bill & Melinda Gates
Foundation (2012) report, which encouraged schools to use both classroom observation and
student reports of their learning experiences to get the fullest picture of instruction in their
school, arguing that neither source of information alone is sufficient.
Study 1 Method
Description of the EAR Classroom Visit Protocol
The EAR Classroom Visit Protocol is an observational tool completed by trained
observers during and after a 20-minute observation. The original tool includes 15 items, but only
ten are used to calculate final scores. Those ten items appear in Table 1. Typically, teachers
receive multiple 20-minute observations across the school year to gain a full picture of
instruction in their classroom(s). The observers must be experienced educators, such as school
administrators, teachers, technical-assistance providers, or researchers with past classroom
experience. All observers must be trained in use of the protocol. Data are uploaded to a central
server that provides reports at different levels (e.g., teacher, department, grade, school) for use in
professional development and reflective conversations with teachers, as well as performance
management around instructional leadership and support at the district and school levels.
Engagement. Classroom visitors use two items to assess engagement: one measures the
percentage of students who are on-task and the second measures the percentage of on-task
students who are actively and intellectually engaged in the work. For both items, trained
observers walk around the classroom, inspecting student work, watching students' facial
expressions, and listening to student conversations and student responses to teacher questions.
Additionally, classroom visitors have brief conversations with students. The conversations,
which take place only if they will not disrupt the class, include questions like “What does your
teacher expect you to learn by doing this work?” and “Why do you think the work you are doing
is important?” The questions are open-ended and require students to explain what they are
learning, thus preventing students from simply providing socially desirable answers. Student
responses are used along with the observations to estimate the percentage who are actively
engaged.
Alignment. Observers make four binary judgments about whether the learning materials,
learning activities, expectations for student work, and students’ class work reflect relevant
federal, state, and local standards, designated curricula, and high stakes assessments. When
available, observers are provided with the pacing guide for the course being observed to aid in
determining the extent to which the course is covering the required material and whether instruction is
“on-time” and “on-target” for their district.
Rigor. This construct is assessed with four judgments (three binary, one percentage) that
relate to the cognitive level of the material, the student work expected, and the extent to which
students are required and supported to demonstrate mastery of the content. Items concern
whether learning materials and student work products are appropriately challenging, whether
students are expected to meet/surpass relevant standards, and whether they have an opportunity
to demonstrate proficiency and are supported to do so. Observers are instructed to consider the
level of thinking and performing required by the learning activities, as defined in Bloom’s
taxonomy (Bloom, Engelhart, Furst, Hill, & Krathwohl, 1956). Credit is given only when the
activities are predominantly intermediate and include some advanced-level work.
EAR Data Collection
Study 1 took place in four high schools from a single district in a southwestern state
during the 2008-09 school year. The schools in this study were relatively large, with an average
student enrollment over 1,500. Over 40% of the students enrolled in these schools were
Latino/Hispanic and a roughly equal percentage was non-Hispanic White. About one-third of
the students were from low-income families, as evidenced by their eligibility for free/reduced
lunch. The EAR Protocol data were collected for multiple purposes, including to support
professional development in the district, to establish a scoring system with continuous variables
that could be used for research, to investigate the tool's inter-rater reliability, and to assess the
tool's validity for predicting standardized test scores in math and English/Language Arts (ELA).1
Data were collected by three groups of individuals: (1) IRRE consultants (n = 9) who had
used the tool extensively over several years and were also providing instructional support in
these schools, (2) former educators hired expressly for this project who had deep knowledge of
high school classroom practices but no direct connection with these schools (n = 3), and (3)
school leaders such as principals, assistant principals, and instructional coaches from the
participating schools (n = 21).2 The former educators and school leaders were trained by IRRE,
using their standard training procedures that consist of (1) two full days of group instruction,
including several classroom visits followed by scoring discussions, (2) a two-to-three-week
window during which those participating in the training make practice visits as teams to calibrate
their scoring, and (3) two additional full days of group instruction focusing on calibration and
use of the data for instructional improvement.
In all, 2,171 EAR Protocols were collected during the 2008-09 school year; 416 were
collected by IRRE consultants, 347 by the former educators, and 1,408 by school leaders. Table
1 presents descriptive statistics for the 10 individual indicators across the 2,171 observations.
These observations, which were made in all types of courses, including math, ELA, science,
history, art, and special education, were used for the Confirmatory Factor Analysis (CFA)
discussed below. Only data from 10th-grade math and ELA classes were used in the predictive
validity analyses because the state only administers standardized tests in high school in those two
subject areas. Because the math and ELA exams were administered early in the spring term, we
used only data from fall observations of math and ELA classes for these validity analyses. Thus,
the validity analyses included 125 observations of 33 different math teachers and 102
observations of 25 different ELA teachers.
Student Questionnaires
In Study 1, in the fall of 2008, all 10th-grade students at the four high schools were asked
to respond to an on-line questionnaire, administered during the school day. Items on the
questionnaire had been used extensively by IRRE in their past work with schools. Two scales
from that questionnaire are of particular interest in this study: self-reported engagement in school
and perceived academic competence.
The measure of self-reported engagement in school asked students to respond to six
items, using a four-point scale ranging from ‘not at all true’ to ‘very true.’ Sample items include:
‘It is important to me to do the best I can in school’ and ‘I pay attention in class.’ Cronbach’s
alpha on this scale in the current sample was .70 (n = 1,144, mean = 3.04, SD = 0.48). The
measure of perceived academic competence includes six items, using the same four-point
response scale. Sample items include: ‘I feel confident in my ability to learn at school’ and ‘I am
capable of learning the material we are being taught at school.’ Cronbach’s alpha on this scale in
this sample was .76 (n = 1,144, mean = 3.23, SD = 0.51).
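For reference, the internal consistencies reported here can be computed with a few lines of Python; the function below is a minimal sketch of Cronbach's alpha for an n-students-by-k-items response matrix, with illustrative data in place of the actual questionnaire responses.

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Cronbach's alpha for an (n_respondents, n_items) array of item scores."""
    item_scores = np.asarray(item_scores, dtype=float)
    k = item_scores.shape[1]                         # number of items
    item_vars = item_scores.var(axis=0, ddof=1)      # variance of each item
    total_var = item_scores.sum(axis=1).var(ddof=1)  # variance of scale totals
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Illustrative data: 8 students answering a 6-item scale on a 1-4 format.
rng = np.random.default_rng(0)
demo = rng.integers(1, 5, size=(8, 6))
print(round(cronbach_alpha(demo), 2))
```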
Analytic Plan
Inter-rater agreement. In the past, IRRE encouraged school districts to work as teams
to establish a common understanding of the tool that could be used within their district to
improve instruction. For research purposes, however, it was important to establish that different
types of classroom visitors were using a common understanding of the tool and that this common
understanding could be used to predict changes in student test scores. For this reason, the first
goal of the Study 1 analyses was to establish the tool’s inter-rater agreement across the different
types of users, applying the continuous scoring system.
Predictive validity. The second goal of the Study 1 analyses was to investigate the
relationship between classroom instruction, as measured using the EAR Classroom Visit
Protocol, and standardized test scores, above and beyond previous test scores. For these tests, we
used Hierarchical Linear Modeling (HLM; Raudenbush & Bryk, 2002) to appropriately model
these data in which students were nested within sections (i.e., specific period of a specific
teacher), and sections were nested within teachers.
Study 1 Results and Discussion
Scoring the EAR Classroom Visit Protocol
IRRE has used the EAR Protocol extensively in its instructional improvement efforts. In
order to give straightforward feedback to educators, IRRE has used thresholds to indicate
whether a classroom, department, grade or school is at an acceptable level on each EAR vital
sign. These thresholds were developed through extensive deliberation by the IRRE instructional
experts, with input from districts and the existing literature on instruction. IRRE has found this
threshold approach effective for communicating easily with schools and districts about the
quality of their classroom instruction.
In order to use the EAR Protocol for research purposes, however, it is preferable to have
continuous variables that express the full range of variance on these constructs and maximize
power when analyzing associations between the quality of instruction and student outcomes. To
that end, we conducted a series of Confirmatory Factor Analyses (CFAs) using the 2,171
observations in Study 1. Based on those CFAs, three scores were created. Engagement was the
mean of the proportion of students on task and the proportion of students actively engaged in the
work. Alignment was the proportion of positive answers on the four dichotomous alignment
indicators. Rigor was the mean of the four rigor indicators, after standardizing them using
population estimates derived from 1,551 observations conducted by the IRRE intervention team
in 19 high schools in six school districts across the country between 2004 and 2010.
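A minimal sketch of that scoring logic, in Python, is shown below; the item labels follow Table 1, and the population means and standard deviations used to standardize the rigor items are placeholders rather than the actual estimates from the 1,551-observation norming sample.

```python
import numpy as np

# Placeholder population estimates (mean, SD) for the rigor items; the
# actual norming values from the 1,551 IRRE observations are not shown here.
RIGOR_NORMS = {"R1": (0.85, 0.36), "R2": (0.55, 0.50),
               "R3": (0.40, 0.49), "R4": (0.35, 0.38)}

def score_ear(obs):
    """Continuous E, A, R scores for one EAR observation (items as in Table 1)."""
    # E2 is recorded as a share of on-task students, so the share of all
    # students actively engaged is the product E1 * E2 (see Table 1 note).
    engagement = np.mean([obs["E1"], obs["E1"] * obs["E2"]])
    # Alignment: proportion of positive answers on the four binary items.
    alignment = np.mean([obs["A1"], obs["A2"], obs["A3"], obs["A4"]])
    # Rigor: mean of the four indicators after standardizing each one.
    rigor = np.mean([(obs[k] - m) / sd for k, (m, sd) in RIGOR_NORMS.items()])
    return {"E": engagement, "A": alignment, "R": rigor}

print(score_ear({"E1": 0.80, "E2": 0.70, "A1": 1, "A2": 1, "A3": 0, "A4": 1,
                 "R1": 1, "R2": 0, "R3": 1, "R4": 0.40}))
```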
Across the Study 1 observations, the three variables were correlated, but not so highly as
to be measuring the same construct (E and A r = .32, p < .001; E and R r = .44, p < .001; A and R
r = .63, p < .001). Further, we tested whether a model with a single underlying construct would
have a better fit with the data than the model with the three latent constructs, but found that the
fit from that model was unacceptable. Thus, a single variable would not satisfactorily represent
instructional quality, so we proceeded using the model with the three latent constructs.
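Readers who wish to reproduce this style of model comparison could do so with any SEM software; the sketch below uses the Python package semopy with lavaan-style model descriptions and randomly generated stand-in data, so the fit values it prints are not those reported here.

```python
import numpy as np
import pandas as pd
import semopy

# Stand-in data; the real analysis used the 2,171 EAR observations.
rng = np.random.default_rng(0)
cols = ["E1", "E2", "A1", "A2", "A3", "A4", "R1", "R2", "R3", "R4"]
df = pd.DataFrame(rng.normal(size=(500, 10)), columns=cols)

# Three correlated latent factors, matching the E, A, R scoring structure.
three_factor = """
E =~ E1 + E2
A =~ A1 + A2 + A3 + A4
R =~ R1 + R2 + R3 + R4
"""
# Rival model: a single latent instructional-quality factor.
one_factor = "G =~ " + " + ".join(cols)

for desc in (three_factor, one_factor):
    model = semopy.Model(desc)
    model.fit(df)
    print(semopy.calc_stats(model)[["CFI", "RMSEA"]])  # compare fit indices
```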
Inter-Rater Agreement
In Study 1, across the 2008-2009 school year, there were 388 cases coded simultaneously
by a pair of observers. Inter-rater reliability was calculated as the intraclass correlation (one-way
random, absolute agreement) between pairs of scores. After calculating continuous scores on
Engagement, Alignment, and Rigor using the scoring method described above, the single
measures intraclass correlation was .76 for engagement, .71 for alignment, and .65 for rigor. Of
the 388 pairs, there were 238 where the pair was made up of an IRRE consultant and a school
leader. Looking just at this sub-set, the single measures intraclass correlations remained
unchanged: .76 for engagement, .71 for alignment, and .65 for rigor. There were 107
observations where the pair was made up of an IRRE consultant and one of the external
observers from the research team (i.e., the former educators). Looking just at this subset, the
correlation was .72 for engagement, .62 for alignment, and .67 for rigor. Thus, all ICCs fall
within the “good” (.60 to .74) or “excellent” (.75 to 1.0) range (Cicchetti, 1994).
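As a sketch of that computation, the Python package pingouin reports ICC1, the one-way random, absolute-agreement, single-measures coefficient used here; the long-format data below are illustrative, not the actual double-coded visits.

```python
import pandas as pd
import pingouin as pg

# Illustrative long-format data: each double-coded visit appears once per rater.
pairs = pd.DataFrame({
    "visit": [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6],
    "rater": ["a", "b"] * 6,
    "E":     [0.80, 0.75, 0.55, 0.60, 0.90, 0.85,
              0.40, 0.50, 0.70, 0.65, 0.30, 0.35],
})

icc = pg.intraclass_corr(data=pairs, targets="visit", raters="rater", ratings="E")
# ICC1 is the one-way random, absolute-agreement, single-measures coefficient.
print(icc.loc[icc["Type"] == "ICC1", ["Type", "ICC"]])
```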
Predictive Validity
Cases available for validity analyses. The subset of observations that were collected for
Study 1 in the fall of 2008 in 10th-grade math and ELA classes was used to test the predictive
validity of the EAR Protocol. We focused on math and ELA because those are the subjects in
which standardized test scores were available. As noted, we used fall observations because the
tests were administered fairly early in the spring term. In math classes, 125 observations were
conducted of 33 teachers, teaching 57 sections (i.e., specific period of a specific teacher). On
average, each math teacher was observed 3.68 times (range = 1 to 8, SD = 2.18). In ELA classes,
102 observations were conducted of 25 teachers teaching 50 sections. On average, each ELA
teacher was observed 4.08 times (range = 1 to 7, SD = 1.89). After calculating continuous E, A,
and R scores for each observation using the scoring described above, a mean E, A, and R score
was calculated for each section and each teacher when there was more than one observation for a
particular section or teacher.
Math teachers were relatively diverse (43% female; 75% White, 10% Latino, 10% Multi-Racial). They had an average of 5.09 years of teaching experience (SD = 1.23) and had been in
their current positions 2.50 years (SD = 1.06) on average. The ELA teachers were less diverse:
81% female and 93% White. ELA teachers had an average of 4.94 years of teaching experience
(SD = 1.18) and had been in their current positions 2.38 years (SD = 1.29) on average.
The standardized test serving as the 10th-grade outcome for this study was the state’s high
school exit exam. Students in this state began taking a high school exit exam in the spring of the
10th-grade year in ELA and math, repeating it each semester until they passed. For this study
only scores from the first administration of this exam were used. Additionally, this district
administers a nationally normed standardized assessment called the Terra Nova in math and ELA
to all 9th-graders. The district provided 9th-grade Terra Nova and 10th-grade high school exit
exam scores for students who were in 10th-grade in the 2008-2009 school year.
There were 634 students available for the math analyses, meaning that they had both 9th- and 10th-grade math scores and their math section had been observed. There were 993 students
available for the ELA analyses, meaning that they had both 9th-grade and 10th-grade ELA scores
and their ELA section had been observed. The sample size dropped slightly when student
questionnaires were added to the models (n = 621 for math; 975 for ELA) due to some missing
student questionnaires. Table 2 presents demographic information describing the student sample.
Multi-level model description. We used Hierarchical Linear Modeling (HLM;
Raudenbush & Bryk, 2002) to predict 10th-grade exit exam scores in math or ELA, controlling
for the previous year’s score in the same subject, as a function of observed math or ELA
classroom engagement (E), alignment (A), or rigor (R). Specifically, we built 3-level models in
which between-student differences within sections, as measured by students’ previous year’s test
scores, were modeled at Level 1; within teacher (between section) variation was modeled at
Level 2; and between teacher variation was modeled at Level 3. Math and ELA outcomes were
modeled separately, using observed E, A, or R in fall math classes in the math models and
observed E, A, or R in fall ELA classes in the ELA models.
The association between 10th- and 9th-grade test scores was allowed to vary across
teachers (as a Level 3 random effect), and average 10th-grade test scores were allowed to vary
across classes (as a Level 2 random effect) and to vary across teachers (as a Level 3 random
effect). Both the predictor and outcome variables were standardized prior to running these
analyses, essentially converting the HLM coefficients into standardized coefficients. Secondary
models including students’ self-reported engagement in school and perceived academic
competence were run to investigate the predictive role of these individual student characteristics
beyond that of classroom-level observations.
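A rough Python analogue of this three-level specification is sketched below using statsmodels' MixedLM, with teachers as the grouping factor, section intercepts as a variance component, and a random slope on the prior-year score; the simulated data and column names are stand-ins for the study's actual variables, and the section-level EAR predictor is omitted for brevity.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated stand-in data: students nested in sections nested in teachers.
rng = np.random.default_rng(1)
n = 300
students = pd.DataFrame({"teacher": rng.integers(0, 10, n).astype(str)})
students["section"] = (students["teacher"] + "-"
                       + pd.Series(rng.integers(0, 2, n)).astype(str))
students["score_g9"] = rng.normal(size=n)  # standardized prior-year score
ear = {t: 0.5 * rng.normal() for t in students["teacher"].unique()}
students["ear_teacher"] = students["teacher"].map(ear)  # teacher-level E, A, or R
students["score_g10"] = (0.6 * students["score_g9"]
                         + 0.15 * students["ear_teacher"]
                         + rng.normal(scale=0.7, size=n))

model = smf.mixedlm(
    "score_g10 ~ score_g9 + ear_teacher",        # fixed effects
    data=students,
    groups="teacher",                            # Level 3: teachers
    re_formula="1 + score_g9",                   # teacher intercepts and slopes vary
    vc_formula={"section": "0 + C(section)"},    # Level 2: sections within teachers
)
print(model.fit().summary())
```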
Predicting math scores from EAR observations. As seen in Table 3, 9th-grade test
scores served as a strong predictor of 10th-grade test scores in all of the models. A 9th-grade test
score one standard deviation above the mean predicted a 10th-grade test score .61-.62 standard
deviations higher on math and .73-.74 standard deviations higher on ELA, suggesting a strong
component of student ability and/or past instruction in 10th-grade scores. After controlling for
these effects, when observations of Engagement, Alignment, or Rigor were separately allowed to
predict residual change in standardized test scores, the results offered support for E, A, and R as
predictors of student achievement. As seen in the first set of columns for predicting 10th-grade
math scores (on the three rows labeled “Differences Across Teachers in…”), for every standard
deviation higher than average a fall math teacher was rated on engagement, the model predicted
his or her students scoring an average of .17 standard deviations higher on their 10th-grade math
tests, even after controlling for their 9th-grade math scores and variations of engagement among
the teacher’s different sections. Similarly, for observed alignment, a +1 standard deviation
difference for a teacher predicted a statistically significant +.16 standard deviation difference in
students’ scores, and for rigor a +1 standard deviation difference for a teacher predicted a
marginally significant +.14 standard deviation difference in students’ scores. Thus, under the
stringent conditions of predicting standardized math test scores in 10th-grade after controlling for
standardized scores one-year earlier, we found evidence that each of the teaching variables of
observed engagement, alignment, and rigor explained some variance. Teachers whose instruction
was more engaging, aligned, and rigorous had students who showed greater gains on
standardized tests. These results further suggest that the dimensions assessed by the EAR tool
capture aspects of classroom dynamics and effective instruction that lead to measurable real-world gains in learning, underscoring the utility of this instrument.
Predicting ELA scores from EAR classroom observations. Looking at the models
predicting 10th-grade ELA scores (the right portion of Table 3), we see there was again evidence
that the three dimensions of instructional quality were linked to student outcomes, but the results
were somewhat weaker than they were for math. For every standard deviation higher than
average a fall ELA teacher was rated on engagement or on alignment, the model predicted his or
her students scoring an average of .06 standard deviations higher on their 10th-grade standardized
ELA tests (both effects marginally significant), even after controlling for 9th-grade ELA score
and variations among the teacher’s different sections. For observed rigor, a +1 standard deviation
difference in teachers predicted a significant +.10 standard deviation difference in ELA scores.
Thus, we see evidence that observations of teachers’ instructional quality predict students’
improvement in ELA, although the size of the effects was not as large for ELA as for math.
In the six models without student questionnaires presented in Table 3, the between-section variation on E, A, and R within a teacher (Level 2) was non-significant (see the three
lines on the table labeled "Within Teacher Variation in…"). This indicates that the engagement,
alignment, and rigor of a particular section was not predictive of student outcomes, above and
beyond that section’s teacher’s overall level, suggesting that the common experiences teachers
are creating across different sections of their courses predict student growth in learning more
than differences they create between these different sections.
Predicting math scores from EAR observations and student motivation variables.
The second and fourth set of columns in Table 3 summarize the HLM results after the student
reports of engagement in school and perceived competence were included to assess their unique
contribution to student learning beyond the quality of observed instruction. The sample for the
math analyses includes 621 10th-graders who responded to the questionnaire, were enrolled in a
math section where EAR protocols were collected, and had scores on both the 9th- and 10th-grade
standardized achievement tests in math. In the three models predicting 10th-grade math scores
(second set of columns for math), students’ self-reported engagement in school was a significant
predictor of their math test scores, while perceived competence was not. Thus, higher student
self-report of academic engagement was associated with slightly higher 10th-grade math scores.
After controlling for student reports of engagement and competence, observed E, A, and R in
math classes all remained significant predictors of 10th-grade math scores. Specifically, higher
observed levels of a teacher’s E, A, or R predicted significantly higher average 10th-grade math
scores in his or her students ranging from +.15 to +.21 standard deviations. These results indicate
that both the observed quality of students’ instructional experience and their generalized sense of
how engaging they find their work in school uniquely contribute to their performance on
standardized assessments.
Predicting ELA scores from EAR observations and student motivation variables.
The final set of columns in Table 3 includes the 975 students with questionnaire and ELA data.
These data suggest that higher student reports of their own engagement in school were also
associated with slightly higher 10th-grade ELA scores even after controlling for 9th-grade test
scores. Additionally, perceived academic competence was a marginal predictor of 10th-grade
ELA scores. Of the three observed vital signs, only observed engagement in the ELA classrooms
was marginally predictive of student ELA achievement in these models, suggesting that the
students’ experience of engagement may have more pervasive effects across subject matter areas.
Study 2 Method
The second study sought to replicate the predictive validity findings of Study 1 in a larger
sample of teachers and students, this one taken from more economically disadvantaged schools.
Ensuring that the findings hold across different settings, with different student demographics and
different standardized testing systems, is important for conclusions about the tool’s validity.
EAR Data Collection
Study 2 took place in eight high schools in two districts in a single western state. The
schools in Study 2 were also large, with an average student enrollment of 1,880. Over half the
students in these schools were Latino/Hispanic, with much smaller groups of White, African
American, and Asian students (less than 20% per group). Over two-thirds of the students across
these eight schools were eligible for free/reduced price lunch.
Data for these analyses were collected as part of an evaluation of an intervention
designed to improve instruction in 9th- and 10th-grade ELA and math. Four of the participating
high schools had been randomly assigned to receive the intervention and four were serving as
controls; however, the two conditions were combined for the current analyses because
the EAR Protocol observations came from the baseline wave of data collection, just as the
intervention was beginning. The observers (n = 3) in this study were former educators or former
IRRE consultants with no links to the participating teachers, schools, or districts. All three had
also collected data as part of Study 1. These three individuals used the EAR Protocol for 261
observations in the fall of 2009 (229 alone, and 17 in pairs); all observations were made in 9th- and 10th-grade ELA, Algebra 1, and Geometry classes.
Student Questionnaires
In the fall of 2009, 9th- and 10th-graders in Study 2 were asked to respond to the same
questionnaire as had been used in Study 1. The questionnaire was administered on-line, during
the school day. Cronbach’s alpha for the six-item self-reported student engagement scale in
Study 2 was .68 (n = 2,601, M = 3.10, SD = .65). Cronbach's alpha for the six-item measure of
perceived academic competence in Study 2 was .72 (n = 2,553, M = 3.31, SD = .22).
Study 2 Results and Discussion
Inter-Rater Agreement
The three individuals who collected data for Study 2 worked on the larger project for its
full three years of data collection. Across that project, they participated in 249 observations for
which two individuals were present. Inter-rater reliability was calculated as the intraclass
correlation (one-way random, absolute agreement) between pairs of scores across the entire
project, in order to have enough cases for meaningful analysis of reliability. After calculating
continuous scores, the single measures intraclass correlation was .72 for engagement, .65 for
alignment, and .67 for rigor.
Cases Available for Analyses
In math, observations were conducted of 63 teachers teaching 111 sections. On average,
each math teacher was observed 1.95 times (range = 1 to 3, SD = 0.63). In ELA, observations
were conducted of 64 teachers teaching 104 sections. On average, each ELA teacher was
observed 1.86 times (range = 1 to 3, SD = 0.53). As in Study 1, after calculating continuous E,
A, and R scores for each observation, section- and teacher-level means were created.
The state in which Study 2 took place administers standardized tests in math and ELA in
8th-, 9th-, and 10th-grade. Students eligible for this study were those who were in either 9th- or
10th-grade in 2009-10. Test scores from one year previous (8th- or 9th-grade) were used as control
variables. The ELA test that each student took depended on his or her grade level, with 8th-graders
taking the 8th-grade ELA test, 9th-graders taking the 9th-grade ELA test, etc. Scale scores were
provided by the districts and were standardized within grade and district for the current analyses.
The math test taken each year depended on the courses in which the students were enrolled. For
instance, students enrolled in Algebra 1 took the Algebra 1 test, regardless of their grade.
However, the scores provided by the districts were scaled similarly for all math tests, as
evidenced by the equivalence of cut-points used across tests to determine different levels of
proficiency. Therefore, we elected to standardize the math test scores within grade (8th, 9th, or
10th) and school district, but not within test.
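That within-group standardization amounts to a z-score computed per grade-by-district cell; a pandas sketch, with hypothetical scale scores and column names, is shown below.

```python
import pandas as pd

# Hypothetical scale scores; each grade-by-district cell is standardized separately.
scores = pd.DataFrame({
    "grade":    [9, 9, 10, 10, 9, 9, 10, 10],
    "district": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "math_ss":  [310.0, 342.0, 355.0, 371.0, 298.0, 325.0, 361.0, 340.0],
})

scores["math_z"] = (scores.groupby(["grade", "district"])["math_ss"]
                    .transform(lambda s: (s - s.mean()) / s.std()))
print(scores)
```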
There were 1,644 students available for the math analyses, meaning that they had two
years of math scores and their math section was observed. There were 2,262 students available
for the ELA analyses, meaning that they had two years of ELA scores and their ELA section was
observed. As seen in Table 2, the student sample for Study 2 was highly racially diverse, with a
high level of eligibility for free or reduced price lunch. As with Study 1, due to missing
questionnaire data, the sample sizes dropped slightly when student questionnaires were added to
the models. The math analyses that included student questionnaires had sample sizes of 61
teachers, 109 sections, and 1,360 students. For the ELA models that included student questionnaire
responses, the sample sizes were 62 teachers, 100 sections, and 1,956 students.
Multi-Level Model Results
HLM analyses parallel to those conducted for Study 1 were conducted using the Study 2
data, with 9th- or 10th-grade test scores as the outcome, controlling for test scores from one year
earlier. As seen in Table 4, previous year’s test scores served as the strongest predictors of
current math and ELA scores. After controlling for those effects, higher levels of observed
engagement in both math and ELA (modeled at level 3 as differences between teachers) were
predictive of higher test scores in both math and ELA. These results continued to offer support
for the utility of the EAR tool. However, in contrast to the findings of Study 1, between-teacher
differences in alignment and rigor did not serve as significant predictors of test scores in Study 2.
The second and fourth sets of columns in Table 4 present the models that also include
self-reported engagement and perceived academic competence. Consistent with Study 1, student-reported engagement in school was a significant, positive predictor of math achievement, again
suggesting that engaged students tend to show greater gains in math knowledge. In contrast to
Study 1, perceived academic competence was marginally and negatively associated with math
achievement in Study 2, predicting slightly lower gains in math knowledge for students reporting
high levels of competence in this sample. After controlling for those effects, observed
engagement continued to predict gains in student achievement for math and ELA, but alignment
and rigor did not. Taken as a set, these results suggest that classroom engagement is the most
critical element of success.
General Discussion
Summary of Findings
Findings from these two studies indicate that school and district personnel, as well as
educators from outside the district, can learn to reliably use the Engagement, Alignment, and
Rigor Classroom Visit Protocol. Further, across the two studies, observed engagement was
significantly associated with higher test scores in six out of eight analyses and marginally (p <
.10) associated in the other two, after controlling for the previous year’s test scores. Alignment
and rigor were significantly or marginally associated with higher test scores, again after
controlling the previous year’s test scores, in both math and ELA in Study 1, but were not
predictive of test scores in Study 2.
When student self-reports of their engagement in school and perceived academic
competence were added to the models, both observed student engagement and students’ reports
of their generalized engagement in school predicted academic performance in math in both
studies. In ELA, observed engagement predicted achievement test scores in both studies, but
students' reports of school engagement were a significant predictor of achievement only in Study
1. For the most part, students’ perceived academic competence was unrelated to their test scores.
Central Role of Engagement
These studies provide evidence for the importance of student engagement in the
classroom, as well as for the validity of the EAR Classroom Visit Protocol for measuring the extent
to which teachers are engaging students during class. The importance of student engagement for
academic success is well accepted in the education and psychological literature (Appleton,
Christenson, Kim, & Reschly, 2006; Finn & Rock, 1997; Klem & Connell, 2004); however, an
agreed upon way to measure the construct has been lacking. The EAR Protocol focuses on
observable engagement in the classroom, across all students in the classroom. It assesses the
extent to which students are paying attention, doing the work requested, and showing signs of
cognitive involvement in the task at hand. It is measured through watching student behavior,
facial expressions, and, when possible, conversations with the students. Clearly, if students are
not engaged in the content in this way, it would be difficult for them to profit fully from the
curriculum. It is important that the EAR Protocol’s assessment of classroom-level engagement
significantly predicts student achievement, above and beyond past achievement, because it is a
relatively quick and simple measure of a complex and fundamental construct, thus adding an
important tool for educators and researchers working to assess instructional quality.
The fact that observed classroom-level engagement continues to predict students’ test
scores when controlling for the individual student’s self-reported engagement in school shows
that these are two somewhat distinct experiences of engagement. Engagement as assessed in the
EAR Protocol captures students' behavior, affect, and cognition in a particular classroom, which
together signify that the teacher is teaching in an engaging way. It is thus an indicator of instructional
quality in the classroom rather than a characteristic of a student. The student’s self-report of
general academic engagement, on the other hand, is a malleable characteristic of the individual
student (Appleton et al., 2006) and has been shown—using measures similar to the one used for
these studies—to predict outcomes such as attendance and school drop-out, as well as
standardized test scores (Appleton et al., 2006; Finn & Rock, 1997; Klem & Connell, 2004). Of
course, these two types of engagement are related but perhaps in somewhat complex ways. For
example, consistent experiences of engaging instruction should contribute to student reports of
being generally engaged in school; but high school students who generally find school highly
engaging might disengage in classes where instruction is not engaging. The current studies, as
well as the work by the Bill & Melinda Gates Foundation (2012), indicate that measuring both
types of engagement is important as they account for different variance in student outcomes.
Alignment and Rigor
Clearly, it is important for the instruction provided to map onto the standardized tests if
we expect that instruction to make meaningful differences in students’ scores on those tests.
Likewise, rigor, defined as a combination of appropriate difficulty and continuous checking to
ensure that students are mastering the content, is a common sense requirement for improving
student outcomes. The evidence for the importance of these two predictors, however, is
inconsistent. The differences in findings regarding alignment and rigor across the two studies
may be due to differences in the districts in which these studies took place.
The district in which Study 1 took place had very well defined pacing guides for every
course. District administrators provided copies of the pacing guides to all individuals collecting
EAR Protocol data, so they could quickly and easily determine if the course was on pace and
teaching the district-supported content. Further, the district had been careful to base those pacing
guides on the state grade level performance standards and the standardized tests the students
were expected to pass, ensuring that the correct material was covered prior to the exam, at the
appropriate level of rigor. Neither of the two districts that participated in Study 2 had such clear
and specific pacing guides. The data collection team sought to obtain pacing guides for each
course that was to be observed and those were provided to the Study 2 data collectors. However,
not all courses had guides and sometimes the guides were fairly vague. Further, those districts
had not focused as specifically on linking the pacing and curriculum guides to the state tests or
state grade level performance standards.
Additionally, a difference in the data collectors’ familiarity with the states might partially
explain the differences in the two studies. For Study 1, all of the school leaders and former
educators hired for the project were from within that state. The only individuals from out of the
state were the IRRE consultants, who conducted less than 20% of the total observations. For
Study 2, the three data collectors had never been educators in the state in which the data were
collected. Thus, it is possible that less knowledge of what alignment and rigor would actually
entail in that state led to lower accuracy in the Study 2 ratings.
These data provide evidence that when the expected curriculum and pacing is clearly
articulated and maps well onto the tests, and when the individuals making the ratings are highly
familiar with the state’s expectations, alignment and rigor can predict student test scores. This
may be less true when the pacing guides either lack detail or are not specifically mapped onto the
tests or when the raters are less familiar with the state standards.
Finally, it may just be that having students engaged in classroom activities is most
important for students’ learning and achievement, as various motivational theorists have
suggested or implied (e.g., Pintrich & Schunk, 1996; Ryan & Deci, 2009; Wigfield & Eccles,
2002). There was evidence that engagement shared variance with alignment and rigor, indicating
that teachers who are engaging in their instruction tend also to instruct in ways that are aligned
and rigorous, so those factors may be present to support achievement, although there was no
evidence of the importance of alignment and rigor in Study 2.
Conclusions
The EAR Protocol provides a straightforward, quick means for school administrators,
instructional support personnel, and researchers to assess the quality of classroom instruction in a
way that is reliable and linked to student test scores. A single EAR Protocol visit takes only 20
minutes, and in Study 2 teachers were observed, on average, fewer than two times during the
term. Even with that limited information, Engagement was a valid predictor of students'
standardized test scores, controlling for the previous year’s score, in both math and ELA. This
demonstrates both the importance of engagement as a construct and the ability of this tool to
measure it in a way that is meaningful. In addition, there is evidence that if classes have defined
pacing guides and curricula that are explicit and clearly linked to the state tests, then alignment
and rigor may also be useful constructs for measuring instructional quality.
School districts and researchers need reliable and valid ways to assess the quality of
instruction in the classroom in order to appropriately target professional development and
monitor change. For such systems to be useful, they need to be feasible within the workdays of
school personnel, provide immediate and actionable feedback, and give schools a common
language with which to discuss high-quality instruction. The EAR Protocol meets these criteria
and should be considered as a means of quickly and reliably gauging instructional vital signs.
References
Appleton, J. J., Christenson, S. L., Kim, D., & Reschly, A. L. (2006). Measuring cognitive and
psychological engagement: Validation of the Student Engagement Instrument. Journal of
School Psychology, 44, 427– 445. doi:10.1016/j.jsp.2006.04.002
Benware, C. & Deci, E. L. (1984). Quality of learning with an active versus passive motivational
set. American Educational Research Journal, 21, 755-765. doi: 10.2307/1162999
Bill & Melinda Gates Foundation (2012). Gathering feedback for teaching: Combining high-quality observations with student surveys and achievement gains. Retrieved from:
http://www.metproject.org/downloads/MET_Gathering_Feedback_Research_Paper.pdf
Black, A. E., & Deci, E. L. (2000). The effects of student self-regulation and instructor
autonomy support on learning in a college-level natural science course: A self-determination
theory perspective. Science Education, 84, 740-756. doi: 10.1002/1098-237X
Bloom, B. S., Engelhart, M. D., Furst, E. J., Hill, W. H., & Krathwohl, D. R. (1956). Taxonomy
of educational objectives: Handbook I: Cognitive domain. New York, NY: David McKay.
Broom, J. (2012, November). Building system capacity to evaluate and improve teaching quality:
A technical assistance provider's perspective. In J. P. Connell (Chair), A developmental
approach to improving teaching quality: Integrating teacher evaluation and instructional
improvement. Symposium conducted at the meeting of the Association for Public Policy
Analysis & Management, Baltimore, MD.
Cicchetti, D. V. (1994). Guidelines, criteria, and rules of thumb for evaluating normed and
standardized assessment instruments in psychology. Psychological Assessment, 6(4), 284-290. doi: 10.1037/1040-3590.6.4.284
Connell, J. P. & Broom, J. (2004). The toughest nut to crack: First Things First’s (FTF)
Approach to improving teaching and learning. Retrieved from IRRE website:
http://www.irre.org/sites/default/files/publication_pdfs/The%20Toughest%20Nut%20to%20
Crack.pdf
Connell, J. P., Spencer, M. B., & Aber, J. L. (1994). Educational risk and resilience in African-American youth: Context, self, action, and outcomes in school. Child Development, 65(2),
493-506. doi: 10.2307/1131398
Csikszentmihalyi, M. (1975). Beyond boredom and anxiety. San Francisco: Jossey-Bass.
Danner, F. W., & Lonky, E. (1981). A cognitive-developmental approach to the effects of
rewards on intrinsic motivation. Child Development, 52, 1043-1052.
Deci, E. L. (1975). Intrinsic motivation. New York: Plenum.
Deci, E. L., Schwartz, A. J., Sheinman, L., & Ryan, R. M. (1981). An instrument to assess adults'
orientations toward control versus autonomy with children: Reflections on intrinsic motivation
and perceived competence. Journal of Educational Psychology, 73, 642- 650.
doi: 10.1037/0022-0663.73.5.642
Downey, C. J., Steffy, B. E., English, F. W., Frase, L. E., & Poston, W. K. (2004). The three-minute classroom walkthrough: Changing school supervisory practice one teacher at a time.
Thousand Oaks, CA: Corwin Press.
Finn, J. D., & Rock, D. A. (1997). Academic success among students at risk for school failure.
Journal of Applied Psychology, 82, 221–234. doi: 10.2307/1170412
Gambone, M.A., Klem, A. M., Summers, J. A., Akey, T. A., & Sipe, C. L. (2004). Turning the
tide: The achievements of the First Things First education reform in the Kansas City, Kansas
Public School District. Philadelphia: Youth Development Strategies, Inc.
Grolnick, W. S., & Ryan, R. M. (1987). Autonomy in children's learning: An experimental and
individual difference investigation. Journal of Personality and Social Psychology, 52, 890-898. doi: 10.1037/0022-3514.52.5.890
Grolnick, W. S., & Ryan, R. M. (1989). Parent styles associated with children's self-regulation
and competence in school. Journal of Educational Psychology, 81, 143-154.
doi: 10.1037/0022-0663.81.2.143
Harter, S. (1978). Pleasure derived from optimal challenge and the effects of extrinsic rewards
on children's difficulty level choices. Child Development, 49, 788-799.
Herman, J. L., Webb, N. M., & Zuniga, S. A. (2007). Measurement issues in the alignment of
standards and assessments: A case study. Applied Measurement in Education, 20, 101-126.
doi: 10.1207/s15324818ame2001_6
Klein, A. (2012). Obama uses funding, executive muscle to make often-divisive agenda a reality.
Education Week, 31(35), 1-28.
Klem, A. M., & Connell, J. P. (2004). Relationships matter: Linking teacher support to student
engagement and achievement. Journal of School Health, 74(7), 262– 273.
doi: 10.1111/j.1746-1561.2004.tb08283.x
McGraw, K. O. (1978). The detrimental effects of reward on performance: A literature review
and a prediction model. In M. R. Lepper & D. Greene (Eds.), The hidden costs of reward (pp.
33-60). Hillsdale, NJ: Erlbaum.
National Center for Education Statistics, Institute of Education Sciences. (2009). Highlights from
PISA 2009: Performance of U.S. 15-year-old students in reading, mathematics, and science
literacy in an international context. Retrieved from:
http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2011004
National Center for Education Statistics, Institute of Education Sciences. (2011). Highlights
from TIMSS 2011: Mathematics and science achievement of U.S. fourth- and eighth-grade
students in an international context. Retrieved from:
http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2013009
National Research Council and the Institute of Medicine. (2004). Engaging Schools: Fostering
High School Students’ Motivation to Learn. Committee on Increasing High School Students’
Engagement and Motivation to Learn. Board on Children, Youth, and Families, Division of
Behavioral and Social Sciences and Education. Washington, DC: The National Academies
Press.
Patrick, B. C. (1995). College students' intrinsic motivation as a function of instructor
enthusiasm. Unpublished doctoral dissertation, University of Rochester.
Pintrich, P. R., & Schunk, D. H. (1996). Motivation in education: Theory, research, and
applications. Englewood Cliffs, NJ: Prentice-Hall.
Protheroe, N. (2009). Using classroom walkthroughs to improve instruction. Principal,
March/April, 2009.
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data
analysis methods (2nd ed.). Thousand Oaks, CA: Sage.
Rothman, R. (2012). Laying a common foundation for success. Phi Delta Kappan, 94(3), 57-61.
Ryan, R. M., & Deci, E. L. (2000). Self-determination theory and the facilitation of intrinsic
motivation, social development, and well-being. American Psychologist, 55, 68-78.
doi:10.1037/0003-066X.55.1.68
Ryan, R. M., & Deci, E. L. (2009). Promoting self-determined school engagement: Motivation,
learning, and well-being. In K. R. Wentzel & A. Wigfield (Eds.), Handbook on motivation at
school, (pp. 171-196). New York: Routledge.
Webb, N. M., Herman, J. L., & Webb, N. L. (2007). Alignment of mathematics state-level
standards and assessments: The role of reviewer agreement. Educational Measurement:
Issues and Practice, 26, 17-29. doi: 10.1111/j.1745-3992.2007.00091.x
Wigfield, A., & Eccles, J. S. (2002). Students’ motivation during the middle school years. In J.
Aronson (Ed.), Improving academic achievement: Impact of psychological factors on
education (pp. 159-184). New York: Academic Press.
Notes
1 Throughout this manuscript, we use the term English/Language Arts (ELA) to describe courses
and exams focused on comprehension, reading, and writing in English. The courses included
typical high-school English courses, as well as literacy courses focused on improving expository
reading and writing. The exam names were ‘Reading’ in Study 1 and ‘English-Language Arts’ in
Study 2.
2 School district employees were trained by IRRE to collect EAR Classroom Visit Protocol data for
their own instructional improvement purposes, as well as for this study. The district decided
which individuals would collect data for their purposes. Eight individuals (in addition to these
21) who worked for the district conducted EAR Visits during the year but either did not
participate in any inter-rater reliability visits (n = 2) or did not appear to understand the tool
based on preliminary analyses using thresholds set by IRRE (n = 6). Thus, their data were used
for the districts’ internal purposes only, and have been excluded entirely from this research.
Table 1
Study 1: EAR Protocol Descriptive Statistics (n = 2,171)

Item                                                                      Mean   SD
Engagement
  E1  % of students on task                                                77%   21
  E2  % of students actively engaged in the work requested                 63%   28
      Product of E1 * E2 1                                                 53%   30
Alignment
  A1  The learning materials were(1) / were not(0) aligned with the
      pacing guide of this course or grade level curriculum                .89   .31
  A2  The learning activities were(1) / were not(0) aligned with the
      scope and sequence of the course according to the course syllabus    .88   .33
  A3  The student work expected was(1) / was not(0) aligned with the
      types of work products expected in state grade level performance
      standards                                                            .72   .45
  A4  Student work did(1) / did not(0) provide exposure to and practice
      on high stakes assessment methodologies                              .56   .50
Rigor
  R1  The learning materials did(1) / did not(0) present content at an
      appropriate difficulty level                                         .89   .32
  R2  The student work expected did(1) / did not(0) allow students to
      demonstrate proficient or higher levels of learning according to
      state grade level performance standards                              .59   .49
  R3  Evaluations/grading of student work did(1) / did not(0) reflect
      state grade level performance standards                              .37   .48
  R4  % of students required to demonstrate whether or not they had
      mastered content being taught                                        35%   38

1 E2 refers to the proportion of those students who were on task (in E1) who were actively
engaged, so E1 and E2 must be multiplied together to be meaningful.
Table 2
Demographic Characteristics of Participating Students

                                          Study 1    Study 2
Total n                                     1,144      3,144
  In both math and ELA analyses               483        778
  In math analyses only                       151      1,484
  In ELA analyses only                        510        882
Female                                      51.0%      49.4%
Race/Ethnicity
  African American                          11.9%      15.6%
  Asian/Pacific Islander                     4.4%      12.8%
  Latino/Hispanic                           40.9%      58.6%
  Native American                            1.3%       3.3%
  White                                     41.5%       9.7%
Eligible for free/reduced price lunch       32.6%      75.6%

Note: All data on Table 2 come from student records provided by the school districts, except
free/reduced price lunch in Study 1. That district was unable to provide that information due to
confidentiality policies. Instead, this information comes from the Common Core of Data and
reflects the four schools in their entirety, rather than just the study sample.
Table 3
Observed Engagement, Alignment, or Rigor Predicting Standardized Test Scores in Study 1

Entries are Coeff. (SE, df) from four models per panel: predicting 10th-grade MATH scores
without and including student motivation variables, and predicting 10th-grade ELA scores
without and including student motivation variables. "n/a" indicates the predictor was not
included in that model.

Observed Engagement
  Predictor                                    Math, without        Math, including      ELA, without         ELA, including
  Intercept                                    -0.09 (0.07, 31)     -0.07 (0.06, 30)     0.02 (0.03, 23)      0.02 (0.03, 23)
  Student Self-Reported Engagement in School   n/a                  0.06* (0.03, 615)    n/a                  0.06* (0.02, 969)
  Student Perceived Academic Competence        n/a                  0.03 (0.03, 615)     n/a                  0.04+ (0.02, 969)
  Previous Year's Test Score                   0.62*** (0.04, 32)   0.62*** (0.04, 31)   0.74*** (0.03, 24)   0.72*** (0.03, 24)
  Within Teacher Variation in Engagement       -0.03 (0.08, 55)     -0.03 (0.07, 54)     0.04 (0.04, 48)      0.04 (0.04, 48)
  Differences across Teachers in Engagement    0.17* (0.07, 31)     0.21*** (0.05, 30)   0.06+ (0.03, 23)     0.06+ (0.03, 23)

Observed Alignment
  Intercept                                    -0.11 (0.07, 31)     -0.09 (0.07, 30)     0.01 (0.03, 23)      0.02 (0.03, 23)
  Student Self-Reported Engagement in School   n/a                  0.06* (0.03, 615)    n/a                  0.05* (0.02, 969)
  Student Perceived Academic Competence        n/a                  0.03 (0.03, 615)     n/a                  0.05+ (0.02, 969)
  Previous Year's Test Score                   0.62*** (0.04, 32)   0.63*** (0.04, 31)   0.73*** (0.03, 24)   0.72*** (0.03, 24)
  Within Teacher Variation in Alignment        -0.01 (0.07, 55)     -0.01 (0.06, 54)     0.02 (0.03, 48)      0.02 (0.03, 48)
  Differences across Teachers in Alignment     0.16* (0.08, 31)     0.18* (0.07, 30)     0.06+ (0.03, 23)     0.04 (0.03, 23)

Observed Rigor
  Intercept                                    -0.12 (0.07, 31)     -0.09 (0.07, 30)     0.02 (0.03, 23)      0.02 (0.03, 23)
  Student Self-Reported Engagement in School   n/a                  0.06* (0.03, 615)    n/a                  0.06* (0.02, 969)
  Student Perceived Academic Competence        n/a                  0.02 (0.03, 615)     n/a                  0.04+ (0.02, 969)
  Previous Year's Test Score                   0.61*** (0.04, 32)   0.62*** (0.04, 31)   0.74*** (0.03, 24)   0.72*** (0.03, 24)
  Within Teacher Variation in Rigor            -0.05 (0.07, 55)     -0.05 (0.06, 54)     0.04 (0.04, 48)      0.03 (0.04, 48)
  Differences across Teachers in Rigor         0.14+ (0.07, 31)     0.15* (0.07, 30)     0.10* (0.04, 23)     0.07 (0.05, 23)

Note: This table reflects twelve separate analyses: E, A, and R, for math and ELA, with and without student motivational variables from the questionnaires. Both the
predictor and outcome variables are standardized, essentially converting the HLM coefficients into standardized coefficients. + p<.10, * p < .05, ** p<.01, ***p<.001
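Because both predictors and outcomes are z-scored, the slopes above can be read directly in
standard-deviation units; this is the standardization logic the note refers to (stated generally
here, not drawn from the report):

\[
\beta_{\mathrm{std}} \;=\; b \cdot \frac{SD_x}{SD_y} \;=\; b
\qquad \text{when } SD_x = SD_y = 1 .
\]

Read this way, the 0.17 coefficient for differences across teachers in engagement means that
classrooms one standard deviation apart in observed engagement are predicted to differ by 0.17
standard deviations in 10th-grade math scores, holding the previous year's score constant.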
Table 4
Observed Engagement, Alignment, or Rigor Predicting Standardized Test Scores in Study 2

Entries are Coeff. (SE, df) from four models per panel: predicting 9th/10th-grade MATH scores
without and with student questionnaires, and predicting 9th/10th-grade ELA scores without and
with student questionnaires. "n/a" indicates the predictor was not included in that model.

Observed Engagement
  Predictor                                    Math, without        Math, with            ELA, without         ELA, with
  Intercept                                    -0.12** (0.04, 61)   -0.11** (0.04, 59)    0.07** (0.02, 62)    0.06** (0.02, 60)
  Student Self-Reported Engagement in School   n/a                  0.07** (0.02, 1354)   n/a                  0.02 (0.01, 1950)
  Student Perceived Academic Competence        n/a                  -0.04+ (0.02, 1354)   n/a                  -0.01 (0.01, 1950)
  Previous Year's Test Score                   0.44*** (0.03, 62)   0.42*** (0.03, 60)    0.79*** (0.01, 63)   0.79*** (0.02, 61)
  Within Teacher Variation in Engagement       0.05 (0.04, 109)     0.03 (0.04, 107)      0.02 (0.02, 102)     0.01 (0.03, 98)
  Differences across Teachers in Engagement    0.10** (0.03, 61)    0.10** (0.04, 59)     0.07** (0.02, 62)    0.07** (0.02, 60)

Observed Alignment
  Intercept                                    -0.12** (0.04, 61)   -0.11* (0.04, 59)     0.07** (0.02, 62)    0.06** (0.02, 60)
  Student Self-Reported Engagement in School   n/a                  0.07** (0.02, 1354)   n/a                  0.02 (0.01, 1950)
  Student Perceived Academic Competence        n/a                  -0.04+ (0.02, 1354)   n/a                  -0.01 (0.01, 1950)
  Previous Year's Test Score                   0.45*** (0.03, 62)   0.43*** (0.03, 60)    0.78*** (0.01, 63)   0.79*** (0.02, 61)
  Within Teacher Variation in Alignment        0.01 (0.04, 109)     0 (0.04, 107)         0.02 (0.02, 102)     0.01 (0.02, 98)
  Differences across Teachers in Alignment     -0.02 (0.03, 61)     0.01 (0.04, 59)       -0.01 (0.02, 62)     -0.01 (0.02, 60)

Observed Rigor
  Intercept                                    -0.12** (0.04, 61)   -0.11* (0.04, 59)     0.07** (0.02, 62)    0.06** (0.02, 60)
  Student Self-Reported Engagement in School   n/a                  0.07** (0.02, 1354)   n/a                  0.02 (0.01, 1950)
  Student Perceived Academic Competence        n/a                  -0.03+ (0.02, 1354)   n/a                  -0.01 (0.01, 1950)
  Previous Year's Test Score                   0.45*** (0.03, 62)   0.43*** (0.03, 60)    0.78*** (0.01, 63)   0.79*** (0.02, 61)
  Within Teacher Variation in Rigor            0.02 (0.03, 109)     0.01 (0.04, 107)      0.03+ (0.02, 102)    0.02 (0.02, 98)
  Differences across Teachers in Rigor         0.02 (0.04, 61)      0.05 (0.04, 59)       0.02 (0.02, 62)      0.03 (0.02, 60)

Note: This table reflects twelve separate analyses: E, A, and R, for math and ELA, with and without student motivational variables from the questionnaires. Both the
predictor and outcome variables are standardized, essentially converting the HLM coefficients into standardized coefficients. + p<.10, * p < .05, ** p<.01, ***p<.001
Appendix 3: Sample Memorandum of Understanding
EVERY CLASSROOM, EVERY DAY
(DATE)
(District’s name) commits to the participation of four comprehensive high schools
in the Institute of Education Sciences-supported evaluation of Every Classroom,
Every Day from Summer 20XX through Spring 20XX. Every Classroom, Every Day is
an instructional reform initiative based on instructional improvement supports provided
by the Institute for Research and Reform in Education (IRRE).
Evaluation Activities and Commitments
(District name) understands that participation in Every Classroom, Every Day
(ECED) will involve the following components for the evaluation:
• Random assignment: Of the four comprehensive high schools participating in
the project, two schools will be randomly selected to receive the instructional
improvement supports from IRRE and participate in the research efforts. The
other two schools will participate only in the research efforts. The two schools in
the research-only group will receive a $10,000 stipend given directly to the
school and will be eligible to receive supports two years later from district
personnel trained during the first two years of the project and/or by contracting
directly with IRRE. It is understood that the process of deciding which schools
receive the instructional supports and which are in the research-only group will
be entirely random, and the district will have no input in deciding which of the
participating schools are in which group.
• Full participation: Both groups of schools (those receiving the supports and
those in the research-only group) will participate in data collection each year from
Summer 2009 to Summer 2011, as part of this project. The District will actively
encourage all teachers, personnel, and students to participate in the data
collection.
• Data collection: The data collection will include:
o Individual Student Records, (e.g., demographic characteristics, course
grades, scores on state mandated tests, attendance, disciplinary actions)
collected annually, including 8th grade standardized test scores and
attendance
o Teacher Questionnaire, taking roughly 30-minutes, collected annually
o Student Questionnaire, taking roughly 30-minutes, collected twice each year
o Master Schedule and Class Rosters, allowing researchers to link students,
teachers and courses, using code numbers
o Classroom Observations, conducted 3 to 10 times annually in each 9th and
10th grade language arts and math class
o Research Site-visit, conducted annually
(District name) commits to full participation in the ECED evaluation activities
described above as well as to the more detailed data requirements delineated in
Attachment 1.
Project Implementation Activities and Commitments
(District name) understands that participation in Every Classroom, Every Day
(ECED) requires commitment to implementing the following components of the project
in the two schools receiving instructional supports each year (except where noted):
1. At minimum, a half-time math coach and a half-time English/literacy coach will
be devoted to the 9th and 10th grade Language Arts and Math classes in each
of the two schools receiving the instructional supports.1 These ECED coaches
will participate in two days of training during the first year, in addition to
participating in the activities described in paragraphs 2, 3, 4, and 5.

1 The district agrees that the English/literacy and math coaches who work with the schools receiving
instructional supports will not be coaches in, or otherwise work with, the two research-only schools during
the two-year duration of the ECED project.
2. A minimum of three full days of professional development time for 9th and 10th
grade English/Language Arts and Math teachers, ECED coaches, and
district/campus instructional leaders, with at least 75% of these target
participants attending each professional development activity.
3. A minimum of four additional full days of leadership trainings during the first
year for instructional leaders (e.g., principals, assistant principals, curriculum
specialists, teacher leaders) and ECED coaches to build shared
understanding of Engagement, Alignment and Rigor.
4. Participation in four three-day instructional site-visits. District and school
leadership and ECED coaches participate in all three days of each visit.
During one day of the site visit, teachers of 9th and 10th grade Math and
Language Arts will be out of their classrooms for one half-day (e.g., Math
teachers in the morning and Language Arts teachers in the afternoon).
During the other two days of the site visit, these teachers will also participate
in classroom observations, classroom coaching, and individual consultations.
5. Twice monthly, two-hour conference calls with ECED coaches and at least
one instructional leader from each school to support their real-time coaching,
talk through emerging issues, and help maintain project momentum.
6. Use of the ECED literacy curriculum with all 9th and 10th graders – including
ELL and special education students – who are expected to take the state
mandated achievement tests.2
7. A double English/Language Arts period for all students in 9th and 10th grades
(e.g., one 60- to 100-minute period every day all year or two 45- to 59-minute
periods every day all year) with half of this instructional time guided by the
ECED literacy curriculum described in paragraph 6.
8. Use of the ECED math benchmarking activities in 9th and 10th grade math
classes with all 9th and 10th graders – including ELL and special education
students – who are expected to take the state mandated achievement tests.
9. Creation and staffing of benchmark café by the math coach.
10. Use of the Measuring What Matters classroom observation protocol
throughout the participating schools. Training in the use of the protocol
requires that instructional leaders conduct a minimum of ten classroom visits
of approximately twenty minutes each to build a shared understanding of
Engagement, Alignment, and Rigor. Use of the protocol also requires PDAs,3
to be purchased by the district, for a minimum of one district leader and five
campus leaders per school. Once training is completed, each trained user
will conduct at least five 20-minute classroom observations per week using
the protocol for the duration of the project.

2 The ECED literacy curriculum will not replace the current 9th and 10th grade English/Language Arts
(ELA) curriculum. Instead, it will be used in conjunction with the existing ELA curriculum, which is why a
double block is needed for 9th and 10th grade ELA.
Furthermore, (district name) commits to making the Every Classroom, Every Day
activities the focus of the professional development and instructional improvement
activities for 9th and 10th grade English/Language Arts and Math teachers and
instructional leaders in the schools receiving instructional supports. Specifically, these
teachers will not be asked to participate in any additional instructional improvement
activities beyond their work with ECED.
(District name) also commits to only offering the ECED programs and activities in
the schools that are randomly selected to receive the supports through Spring 2011. It
is understood that over the two-year course of the project (district name) cannot attempt
to replicate or import the ECED activities from the schools receiving the supports to the
research-only schools. It is also understood that after the project ends, (district name)
can choose to offer these supports to other (district name) schools. Additionally,
(district name) agrees that none of the participating schools will be part of any other
research studies.
Financial Considerations and Commitments
(DISTRICT NAME) understands that two participating schools will receive two
years of training, on-site supports and technical assistance from the IRRE instructional
team, along with curricular materials and technology supports, at no cost to (district
name). All research costs – including the $10,000 honorarium given to each of the
two research-only schools – and the costs of IRRE staff providing technical assistance
to participating schools and districts will be covered by a grant to the University of
Rochester from the US Department of Education.
(District’s name) commits to participating in ECED for the entire two years of the
project (Summer 2009 – Summer 2011) and to covering the following costs associated
with implementation of ECED for both years of the project implementation:
3 PDAs cost between $350 and $450 each and generally must be purchased new because of the
requirements of the Measuring What Matters software. Specifications are available upon request.
• Salary/stipends/substitutes, as needed, for the participating teachers for the three
professional development days and the four half-day trainings during the site-visits,
as well as for the participation of any teacher leaders in leadership
trainings and site visits.
• PDA devices for at least one district-level and five school-level instructional
leaders at each of the two schools receiving the instructional supports.
• A district-level point person (e.g., assistant superintendent) and one point person
at each school receiving the instruction supports to coordinate all Every
Classroom, Every Day activities. It is anticipated that these responsibilities will
require roughly 15% time for the district-level person, plus 15% time for each
school-level person.
• The ECED coaches (at least one-half FTE for math and one-half FTE for
English/literacy at each school receiving instructional supports) for each of the
two years of the ECED project.
• Facilities and food for professional development days, instructional leader and
coach trainings, and instructional site-visits.
• If needed, reallocation of staff to support the additional period of English required
for the literacy curriculum in 9th and 10th grades for both years of the project.
• In-kind time from district and school staff for overseeing teacher and student
questionnaire administration, fulfilling data requests associated with data
collection activities, and supporting technology associated with Measuring What
Matters. (Note: as outlined in the Data Requirements Addendum, some financial
assistance will be provided to offset these costs.)
Data Confidentiality
(District name) understands that the District’s participation in the ECED initiative
will be public information. Results of the ECED research will be included in public
research presentations and reports; however, results will be reported in a manner that
masks the identity of individual schools, teachers, and students in order to maintain
confidentiality. Indeed, the University of Rochester research team will have no way to
link student and teacher names to the information they provide. For the schools
receiving the ECED supports, school-level data, disaggregated student outcome reports
(when available), and teacher-level data from the EAR Classroom Visit protocols will be
provided by IRRE as part of the ECED program supports. These data will be made
available to school and district administrators and IRRE's technical assistance staff to
monitor student progress, plan professional development and other supports, and give
ongoing feedback to teachers, administrators, and IRRE staff on their own practices.
For schools in the research-only condition, no individual school-, teacher-, or
student-level data will be provided by IRRE. No student- or teacher-level survey responses will be
released to anyone, at any school.
(name)
Superintendent, (district name)
(name)
Assistant Superintendent for Curriculum & Instruction, (district name)
(name)
Principal, (school name)
(name)
Principal, (school name)
(name)
Principal, (school name)
(name)
Principal, (school name)
James P. Connell, Ph.D.
IRRE President
Edward Deci, Ph.D.
Professor of Psychology and Gowen Professor in the Social Sciences
University of Rochester
Attachment to Attachment 1
Every Classroom Every Day Memorandum of Understanding
Data Requirements Addendum
This document outlines the data collection activities that are a required
component of Every Classroom, Every Day.
Design Features
• Random assignment: Within each participating district, all of the interested and
qualified schools will enter a lottery where half of the schools will be selected to
begin the Every Classroom, Every Day instructional supports in Summer 2009
and continue through the 2010-2011 academic year. Schools not selected in the
lottery will participate in the research activities, continue to receive the
instructional supports currently provided by the district, and receive an
honorarium of $10,000 for their participation in the research. No one from the
school district, the implementation team, or the research team will have any
control over which schools are in which group.
• Data collection in all participating schools: Research activities will begin the
Summer of 2009 in all schools that are participating in the project (both those that
are and are not receiving the Every Classroom, Every Day supports). Research
activities will continue for two years, through the 2010-2011 school year.
• Ability to link student, teacher, classroom, and school information: A primary
purpose of Every Classroom, Every Day’s research is to understand how
different aspects of this instructional improvement model affect teachers and their
students. To understand these effects, code numbers will be assigned to each
student and teacher that will protect the confidentiality of participants from the
external research team, while also allowing researchers to link information from
students and teachers to their classes and schools.
Descriptions of Research Activities
All research activities described below will take place on the same schedule in all
schools that are part of the project, regardless of whether or not they are receiving the
Every Classroom, Every Day supports.
• Individual Student Records: Student records will be provided at the end of each
school year (total of two times), in an electronic format for each 9th and 10th grade
student enrolled in participating schools. In addition, districts will provide 8th
grade records for incoming 9th graders.
The specific pieces of information needed for each student, each year, are:
o demographic characteristics, including birth date, race/ethnicity, gender, and
free or reduced-price lunch eligibility;
o special education and English language learner status;
o course grade for each course enrolled;
o state-mandated standardized test scores and proficiency levels in all state-mandated subject areas;
o scores on ACT Plan or PSAT, if any;
o date enrolled in school;
o days absent from/present in school;
o drop-out status or other reason for withdrawal, and date of last attendance;
o promotion or retention status (including whether the student has graduated);
o progress towards graduation (credits earned and credits and courses that are
still needed for graduation); and
o number of suspension incidents and days missed due to suspension (both in- and out-of-school).
• Teacher Questionnaires: The Teacher Questionnaire measures attitudes,
beliefs, and perceptions and takes about 30 minutes to complete. The research
team will work with the school/district staff to administer the questionnaires once
each year (total of two times) via a secure website1 to all teachers of 9th and 10th
grade math and language arts classes.
• Student Questionnaires: The Student Questionnaire measures attitudes, beliefs,
and perceptions and takes about 30 minutes to complete. The research team will
work with the school/district to administer the questionnaires to all 9th and 10th grade
students in the fall and spring of each year (total of four times) via a secure
website.
• Master Schedule: Master schedules will be used to facilitate linkages in the data
among teachers and classes. The Master Schedule will be provided each
semester electronically (total of four times) and should include teacher names,
teacher human resource ID numbers, departmental affiliation, course ID
numbers, and course names. A code number will be created for each teacher
allowing his/her course, questionnaire, and student information to be linked
confidentially.
• Class Rosters: Class rosters will be used to facilitate linkages in the data
among students, teachers, and classes. They will be provided electronically
each semester (total of four times) and can include either the student code
numbers that have been created for the research project or the student names
and school IDs. In the latter case, IRRE will convert the names or school IDs to
code numbers prior to sharing the information with the research team.
1 If a school or district does not have sufficient computer capability to conduct on-line questionnaires,
the research team will work with the school/district to make arrangements for paper and pencil
administration.
Alternatively, schools can provide each student’s course schedule electronically,
including course ID numbers that correspond to the course ID numbers on the
Master Schedule.
• EAR Classroom Visit Protocol: Observers who work for the research team and
have been trained by a senior IRRE instructional staff member will conduct
instructional ratings, using the Engagement, Alignment, and Rigor (EAR)
Protocol. Each year, between six and ten observations will be conducted in all 9th
and 10th grade math and language arts classes at all four schools in the project
(both those who are and are not receiving the Every Classroom, Every Day
supports). The EAR protocol data will be collected with a PDA and will go
directly to IRRE’s secure server. IRRE will provide the data to the research team
only after it has been made anonymous, using the research code numbers.
Additionally, the EAR Protocol data collected by school and district leaders as
part of the Every Classroom, Every Day supports will be used not only by the
district and schools but also by the research team to investigate differences in
ratings made by individuals with different roles.
• Research Site-Visit: At least once per year, the research team will conduct site-visits
to all four participating schools to determine the degree to which teachers
and students are or are not engaged in Every Classroom, Every Day activities.
During these visits, the researchers will interview the principal or assistant
principal and teachers in each of the four schools, as well as key district
personnel.
Parental Notification about Research Activities
IRRE will work with the schools to create and distribute a summary of the
research activities (including student surveys, classroom observations, and student
records) for parents. The summary sheet will include a way for parents to notify the
school and research team if a parent does not want the student to participate or does
not want the student’s records released. All 9th and 10th grade students will be included
unless the parent notifies the school that s/he should be excluded.
Financial Assistance
One person from each of the four schools will be designated as the study
coordinator and will work with the research team to coordinate the EAR Classroom Visit
Protocol observations and the administration of the teacher and student surveys. Each
school will receive a one-time honorarium of $500 to cover the costs of the coordinator’s
work. Further, each district will be provided a one-time honorarium of $1,500 to cover
the costs of the staff member who provides the data from the district database to the
researchers.
Participant Confidentiality
The names of the districts that participate in the ECED project will be considered
public information; however, information regarding which schools are in which condition
will be confidential, and all research results will be reported to the public in a way that
masks the identity of the schools, teachers, and students. Student and teacher ID
numbers will be created for purposes of this study, and all research data will be linked
using study-specific IDs only.
For schools receiving the ECED supports, the following data will be made
available to the schools, district-administrators, and IRRE’s technical assistance
providers: 1) EAR Classroom Observation protocol data at the teacher-level, and 2)
student- and teacher-survey responses, aggregated to the school-level.
For schools in the research-only condition, no data will be released to the
schools or district during the course of the study.
Appendix 4: Recruitment and Participation Diagram a

Enrollment
  Assessed for eligibility (n = ~21,100)
  Excluded (n = ~21,080)
    Not meeting inclusion criteria:
      9th grade enrollment < 220 (n = ~14,800)
      FRPL < 30% (n = ~3,280)
      Fewer than 4 eligible schools in district (n = ~1,800)
    Initial contact made, but received no response (n = ~900)
    Declined after some additional contact (phone calls, site visit) (n = ~300)
  Randomized (n = 20)

Allocation
  Allocated to intervention (n = 10)
    Received 2 years of intervention (n = 8)
    Received 1 year of intervention (n = 1)
    Received <1 year of intervention (n = 1)
  Allocated to control (n = 10)

Follow-Up
  Intervention: Lost to follow-up (n = 0); Discontinued intervention (n = 2)
  Control: Lost to follow-up (n = 0)

Analysis
  Intervention: Analysed (n = 10); Excluded from analysis (n = 0)
  Control: Analysed (n = 10); Excluded from analysis (n = 0)

a This flow chart reflects the number of schools at each step, because randomization took place at the school level.
However, recruitment took place primarily at the district level and the randomization was blocked at the district level. All
values in the enrollment boxes are approximate. The recruitment process was iterative and took place over several
years. During that time, changes in school enrollments, etc. changed schools' eligibility.
Appendix 5: Teacher Survey Items

The list below shows all teacher questionnaire items, along with the construct each was
intended to measure and the waves at which it was administered. The fall (Wave 1 and 3)
teacher questionnaires were not part of the original data collection plan and were not part of the
data collection that was described to the schools prior to their agreeing to participate. For that
reason, we felt that it was important that the fall questionnaire be very short (one page); it
included only the 12 most important items in the Fall of 2009 and Fall of 2010. For the Fall of
2011 we decided it could be two pages in length and added individual teacher morale and
demographic information.

Three constructs were included at all four waves because we believed they were the most
likely moderators of changed instruction, and it was critical to have baseline information on
those constructs. They were: support from school administration (3 items), support from district
administration (3 items), and support and collective engagement (6 items). Most of the other
constructs were measured only during the spring waves (Waves 2 and 4). They were:
district/school commitment to change (4 items), confidence in change (3 items), perceived
competence as a teacher (3 items), relative autonomy (8 items), amount and type of professional
development received (8 items), perceived value of professional development (7 items), and
implementation of ECED (12 items). The final construct -- individual teacher morale (6 items) --
was included only in Waves 2 and 4 for schools in the first recruitment group and in Waves 2, 3
and 4 for schools in the second recruitment group. Additionally, there were 15 demographic
items, but not all were asked at each wave. We also requested demographic information about
teachers from the five districts, but only three (Districts 2, 3 and 4) provided the information.
The questionnaire and district information were combined, resulting in 26% missing data.
Items are grouped below by original construct; the number before each item is its order in the
questionnaire, and the waves and response options for each construct or item are given in
parentheses.

Support from School Administrators (all waves; response options: 1 = Not At All True, 2 = Not Very True, 3 = Sort of True, 4 = Very True)
4. School administrators understand and respond to the teachers' needs.
7. School administrators support teachers making their own decisions about their students.
13. School administrators help teachers and staff get what they need from the district office.

Support from District Administrators (all waves; same response options)
3. District administrators are attentive to school personnel and provide the encouragement and support they need for working with students.
10. District administrators allow school staff to try educational innovations that the teachers believe would be helpful for the students.
16. District administrators are responsive to the needs of teachers and staff for professional development.

Support and Collective Engagement (all waves; same response options)
2. Teachers at this school do what is necessary to get the job done right.
6. Teachers at this school don't give up when difficulties arise.
9. Teachers at this school go beyond the call of duty to do the best job they can.
12. Teachers in this school go out of their way to help each other.
15. Teachers in this school encourage each other to do well.
18. Teachers in this school share resources with each other.

District/School Commitment to Change (Waves 2 & 4 only; response options: 1 = Not at all Committed, 4 = Somewhat Committed, 7 = Very Committed)
22. How committed is your superintendent to strengthening the quality of instruction within your district?
23. How committed are the staff in the District Office to improving teaching and learning in the district?
24. How committed do you think the School Board is to making changes that will improve instruction and achievement in your district?
25. How committed is your principal to supporting changes in the school that will improve the quality of teaching in all classrooms?

Confidence in Change (Waves 2 & 4 only; response options: 1 = Not at all Confident, 4 = Somewhat Confident, 7 = Very Confident)
19. How confident are you that instruction can be improved in your school to ensure that all students experience high quality teaching and learning every day?
20. How confident are you that your school is making changes that will improve the performance of your students?
29. How confident are you that your school is improving instruction in ways that can be sustained over time?

Perceived Competence (Waves 2 & 4 only; response options: 1 = Not At All True, 2 = Not Very True, 3 = Sort of True, 4 = Very True)
33. I am very confident in my abilities as a teacher.
35. I think I am a very skilled teacher.
36. I feel very competent as a teacher.

Relative Autonomy Index (Waves 2 & 4 only; same response options)
21. The reason I am a teacher is that it's interesting and enjoyable to work with students. (intrinsic)
26. I teach because it is my job and I need the salary to live. (external)
27. I teach because it is personally important to me to help students learn and develop. (identified)
28. I teach because I think I should and would feel guilty if I didn't. (introjected)
30. I teach because it is meaningful to me to understand students and encourage their growth. (identified)
31. I am a teacher because I would feel bad about myself if I did not stick with this career. (introjected)
32. The reason I teach is that it is exciting to watch students learn. (intrinsic)
34. I teach because I feel like I have to. (external)

Amount/Type of Professional Development (Waves 2 & 4 only; response options: 0 = Was not involved, 1 = 1-5 hours, 2 = 6-10 hours, 3 = 11-15 hours, 4 = More than 15 hours)
37. How many hours have you been involved in: Workshops?
38. During this school year, how many hours have you been involved in: College Courses (face to face)?
39. During this school year, how many hours have you been involved in: Online courses/modules?
40. During this school year, how many hours have you been involved in: Conferences?
41. During this school year, how many hours have you been involved in: Coaching or mentoring by another teacher?
42. During this school year, how many hours have you been involved in: Coaching or mentoring by a specialist, administrator, or expert (not a peer)?
43. During this school year, how many hours have you been involved in: Observation of other teachers' classes?
44. During this school year, how many hours have you been involved in: Involvement in teacher study groups?

Perceived Value of Professional Development (Waves 2 & 4 only; response options: 1 = Not At All True, 2 = Not Very True, 3 = Sort of True, 4 = Very True)
45. My professional development activities: Helped me to increase student engagement in my classes.
46. My professional development activities: Helped me to better understand the subjects I teach.
47. My professional development activities: Enhanced my classroom management skills.
48. My professional development activities: Helped me to challenge and encourage all students to work at or above grade level.
49. My professional development activities: Increased my use of effective instructional strategies for improving academic achievement.
50. My professional development activities: Increased the extent to which my instruction is aligned with the course standards and curriculum.
51. My professional development activities: Are likely to have a lasting impact on my instructional practices.

Individual Teacher Morale (referred to as Individual Engagement by IRRE) (Waves 2 & 4 for RG1; Waves 2, 3 & 4 for RG2; response options: 1 = Not At All True, 2 = Not Very True, 3 = Sort of True, 4 = Very True)
1. I look forward to going to work in the morning.
5. My job has become just a matter of putting in time. (reverse)
8. When I am teaching, I feel happy.
11. Time goes by very slowly when I'm at work. (reverse)
14. When I am teaching, I feel bored. (reverse)
17. When I am teaching, I feel discouraged. (reverse)

Implementation of ECED (Waves 2 & 4 only)
52. Last semester, how often did you meet with the instructional coach and other ECED teachers for discussions about improving instruction? ('ECED' was omitted from the version used in control schools.) Response options: 1 = Never (I was part of ECED but never met with the coach to discuss instruction); 2 = Once or twice during the semester; 3 = About once per month; 4 = About once per week; 5 = More than once per week; 6 = N/A (I was not part of ECED this/last semester).
53. This semester, how often do you meet with the instructional coach and other ECED teachers for discussions about improving instruction? ('ECED' was omitted from the version used in control schools.) Same response options as item 52.
54. If you taught Algebra I or Geometry last semester, was there a chart in your classroom that displayed and tracked student progress toward mastery of all benchmarks? Response options: 1 = Yes; 2 = No; 3 = N/A I do not teach math or did not teach Algebra I or Geometry (this/last) semester.
55. If you are teaching Algebra I or Geometry this semester, is there a chart in your classroom that displays and tracks student progress toward mastery of all benchmarks? Same response options as item 54.
56. How often do school administrators or instructional coaches visit your classroom to watch student learning? (Note: do not count observations done as part of your formal performance evaluation.) Response options: 1 = Never; 2 = Once or twice during the semester; 3 = About once per month; 4 = About once per week; 5 = More than once per week.
57. If school administrators or instructional coaches visit your classroom, do you receive feedback or have conversations about your instruction after the visits? Response options: 1 = Yes, always; 2 = Yes, sometimes; 3 = No, never; 4 = N/A School administrators or instructional coaches do not visit.
58. If you taught 9th grade ECED Literacy last semester, how many of the 51 lessons did you cover last semester? (Note: do not count observations done as part of your formal performance evaluation.) (Omitted from the version for the control schools.) Response options: 0 = 0; 1 = 1-5; 2 = 6-10; 3 = 11-15; 4 = 16-20; 5 = 21-25; 6 = 26-30; 7 = 31-35; 8 = 36-40; 9 = 41-45; 10 = 46-51; 11 = N/A I did not teach ECED Literacy (this/last) semester.
59. If you are teaching 9th grade ECED Literacy this semester, how many of the 51 lessons have you covered this semester, as of today? Same response options as item 58.
59a. If you taught 10th grade ECED Literacy last semester, how many of the 122 lessons did you cover last semester? (Omitted from the version for the control schools.) Response options: 0 = 0; 1 = 1-10; 2 = 11-20; 3 = 21-30; 4 = 31-40; 5 = 41-50; 6 = 51-60; 7 = 61-70; 8 = 71-80; 9 = 81-90; 10 = 91-100; 11 = 101-110; 12 = 111-122; 13 = N/A I did not teach ECED Literacy last/this semester.
59b. If you are teaching 10th grade ECED Literacy this semester, how many of the 122 lessons have you covered this semester, as of today? (Note: do not count observations done as part of your formal performance evaluation.) (Omitted from the version for the control schools.) Same response options as item 59a.
60. Just prior to the start of school, ECED and your district offered three days of professional development about Every Classroom, Every Day (ECED). Did you attend? (Omitted from the version for the control schools.) Response options: 1 = Yes, I attended all 3 days; 2 = Yes, I attended 2 days; 3 = Yes, I attended one day; 4 = No, I was busy or not interested; 5 = No, I had not yet been hired or did not yet know I would be part of ECED.
61. During the school year how many times did you participate in an ECED site visit, that involved working with the ECED staff and your instructional coach, while a substitute covered your class? (Omitted from the version for the control schools.) Response options: 0 = 0 times; 1 = 1 time; 2 = 2 times; 3 = 3 times; 4 = 4 times.

Demographics
62. What is your current position? (Waves 2 & 4 only) Response options: 1 = Math Teacher; 2 = English or Literacy Teacher; 3 = Other classroom teacher (not Math or English/Literacy); 4 = Counselor; 5 = Administrator; 6 = Librarian/Media Specialist; 7 = School Psychologist/Speech Pathologist; 8 = Long Term Substitute - Math; 9 = Long Term Substitute - English/Literacy.
63. Counting this year, how many academic years have you worked in your current position? (Waves 2 & 4 only) Response options: 0 = No Answer; 1 = Less than 1 year; 2 = 1-2 years; 3 = 3-5 years; 4 = 6-10 years; 5 = 11-20 years; 6 = More than 20 years.
64. Counting this year, how many years in total have you taught at this or other schools/districts? (Waves 2 & 4 for RG1; 2, 3 & 4 for RG2) Response options: 0 = No Answer; 1 = N/A I'm not a teacher; 2 = Less than 1 year; 3 = 1-2 years; 4 = 3-5 years; 5 = 6-10 years; 6 = 11-20 years; 7 = More than 20 years.
65. Counting this year, how many years in total have you worked in this school? (Waves 2 & 4 only) Response options: 0 = No Answer; 1 = Less than 1 year; 2 = 1-2 years; 3 = 3-5 years; 4 = 6-10 years; 5 = 11-20 years; 6 = More than 20 years.
66. Are you...(male/female)? (Waves 2 & 4 for RG1; 2, 3 & 4 for RG2) Responses recoded – see dataset.
67. What grade (or grades) are you currently teaching? (Check all that apply) (Waves 2 & 4 only)
68. Are you Hispanic/Latino? (Waves 2 & 4 for RG1; 2, 3 & 4 for RG2)
69. What is your race? (Waves 2 & 4 for RG1; 2, 3 & 4 for RG2)
70. What is the highest level of education that you have completed? (Wave 4 only for RG1; 2, 3 & 4 for RG2) Response options: 1 = Associates or other 2-year college degree; 2 = Bachelor's or other 4-year college degree; 3 = Teaching certification program requiring at least one year of study beyond Bachelor's; 4 = Master's degree; 5 = Education specialist or other professional diploma beyond a Master's; 6 = PhD, EdD, or other doctorate.
71. Are you certified in THIS state to teach Secondary English/Language Arts? (Wave 4 only for RG1; 2, 3 & 4 for RG2) Response options: 1 = Yes, regular certification; 2 = Yes, provisional, probationary, temporary, or emergency certificate; 3 = No.
72. Are you certified in THIS state to teach Secondary Mathematics? (Wave 4 only for RG1; 2, 3 & 4 for RG2) Same response options as item 71.
73. Are you certified in THIS state to teach any other content areas? (Wave 4 only for RG1; 2 & 4 for RG2) Same response options as item 71.
74. Are you certified by the National Board for Professional Teaching Standards in at least one content area? (Waves 2 & 4 for RG1; 2, 3 & 4 for RG2) Response options: 1 = Yes, fully certified; 2 = Working toward National Board Certification; 3 = No, not certified.
75. Are/were you part of a Teach for America program? (Wave 4 only for RG1; 2 & 4 for RG2) Response options: 1 = Yes, I am currently teaching in this school through a Teach for America program; 2 = Yes, I was part of Teach for America when I began teaching; 3 = No, I have not been part of Teach for America.
76. Are/were you part of a program that trains non-educators to teach? (Wave 4 only for RG1; 2 & 4 for RG2) Response options: 1 = Yes, I am currently part of such a program; 2 = Yes, I was part of such a program when I started teaching; 3 = No, I have not been part of such a program.
Appendix 6: EAR Protocol Training for ECED Efficacy Trial Data Collectors

Of the seven individuals who collected EAR Protocol data for ECED, six participated in
one of IRRE's typical training events. Of these six, two 75 participated in an IRRE training to learn
the EAR protocol when they were employees of a school district that had elected to take
part. Those two used the tool in their own district for several years and then later became
consultants for IRRE, using the EAR Protocol in school districts across the country. After
working for IRRE, they began work for ECED. Two of the six 76 went through the typical IRRE
training as they began working as consultants for IRRE. Again, they used the tool for several
years as part of their IRRE work prior to becoming data collectors for ECED. The remaining
two 77 took part as they were beginning to work on an earlier project, funded by this same grant,
regarding the tool's psychometric properties. They participated in the four-day training along
with a school district that was not part of ECED, but they were not and had never been
employees of that school district. The seventh individual 78 who collected EAR Protocol data for
ECED learned the protocol specifically to collect data for this project. She worked directly with
an IRRE consultant who had led many of the four-day trainings. Her training consisted of one
and one-half days focused solely on understanding the terms and scoring of the protocol.
Immediately following this training, this individual took part in the “refresher training”
described below.

Following the first year of data collection, five of the seven data collectors participated in
a one-and-one-half-day refresher training with an IRRE consultant. Of the two who did not
participate, one was no longer working for the project 79 when the refresher took place and the
other had not yet been hired. 80 Both were individuals who had learned the tool as part of prior
employment and had used it extensively for their work.
Appendix 7: Student Questionnaire Administration Procedures
Several weeks prior to each planned administration, the district or school provided a list
of all enrolled 9th- and 10th-graders to the research team, as well as information about what
sections (e.g., teacher and period) would be used for survey administration. The research team
created a ‘ticket’ for each student, which was simply a half sheet of paper that indicated the
student’s name, his/her survey administration section (e.g., name of English teacher and period),
the survey URL, and a unique seven character survey access code. The survey access codes were
used to link student identifiers to their survey responses. Each survey access code was used just
one time and they were not the same as the study IDs or district-assigned IDs. Survey tickets
were organized by section and distributed to the teachers who would be administering the
surveys. Schools were given a set of ‘extra’ tickets that contained survey access codes, but no
student or teacher names, for use with students who enrolled after the lists were created or if
tickets were lost. Schools were asked to tell the research team the name and district-assigned ID
of any student who used an extra ticket. Data were discarded for any cases where an extra ticket
was used without the research team learning who used it or if the research team learned that the
student who used it was not eligible (e.g., 11th-grader).
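To make the linking scheme concrete, the sketch below shows one way one-time access codes
could be generated and mapped to study IDs. This is a minimal illustration, not IRRE's or the
research team's actual system; the alphabet, the collision handling, and all names in it are
assumptions.

    import secrets
    import string

    # Illustrative code alphabet (dropping easily confused characters);
    # an assumption, not the study's actual scheme.
    ALPHABET = "".join(c for c in string.ascii_uppercase + string.digits
                       if c not in "0O1I")

    def generate_access_codes(study_ids, code_length=7):
        """Return a dict mapping a unique, one-time access code to each study ID."""
        codes = {}
        for sid in study_ids:
            code = "".join(secrets.choice(ALPHABET) for _ in range(code_length))
            while code in codes:  # regenerate on the (rare) collision
                code = "".join(secrets.choice(ALPHABET) for _ in range(code_length))
            codes[code] = sid
        return codes

    # One ticket per enrolled student; 'extra' tickets carry a code but no student,
    # so the school must report who used them before the data can be kept.
    tickets = generate_access_codes(["S0001", "S0002", "S0003"])
    extras = generate_access_codes([None] * 25)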
Teachers were instructed to take their class to a computer lab during the assigned period,
read a very brief set of instructions to the students, and ensure that students were able to log on to
the survey. Teachers were asked to give absent students multiple opportunities to take the survey.
Students who did not want to participate were asked simply to click through to the end without
answering any questions. Students whose parents had returned the opt-out form were given a
ticket without a survey access code and a note indicating that the parent had requested that the
student not participate.
A few schools were not able to administer the surveys on-line due to a shortage of
computer space and/or limited internet connections. In those cases, the surveys were
administered on paper and double keyed later. When paper surveys were used, schools were
provided with one survey for each target student. Each survey had a cover sheet on it that
indicated the student’s name and administration teacher and period. Students were asked to
remove the cover after completing the survey. Once the cover was removed, the survey itself had
only the student’s study ID on it. (Note: The schools and waves in which the student
questionnaires were administered on paper were: School 1, Waves 3 and 4; School 11, Waves 2
through 4; School 17, Waves 3 and 4; School 19, Waves 3 and 4.)
Appendix 8: Student Questionnaire Items

The list below shows all items in the student questionnaire and the construct each was
intended to measure. As noted in the text, factor analyses conducted for this study resulted in a
slightly different set of scales. In addition to gathering demographic information (9 items), the
questions were intended to measure seven constructs: teacher support (8 items), student engagement
(8 items), peer support (3 items), teacher expectations (4 items), rigor (4 items), relative
autonomy (8 items), and perceived competence (4 items). (Note that the demographic questions
were not included in the student questionnaires in District 5. That district's policies only
permitted questions that were directly about school. The district did, however, provide
demographic information about the study students from the student records.) Students responded
to all items except the demographic and rigor items using the following scale: 1 = Not at all true,
2 = Not very true, 3 = Sort of true, 4 = Very true. The response scale for the Rigor items was: 1 =
Almost Never, 2 = Not Very Often, 3 = Most of the Time, 4 = Almost Always.
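The items marked '(reverse)' below are negatively worded. The report does not spell out the
recoding arithmetic, but the standard reversal for a 1-4 scale is recoded = 5 - response; a
minimal sketch under that assumption (the function name and example values are illustrative,
and the item numbers follow the list below):

    def score_scale(responses, reverse_items, scale_max=4):
        """Average raw 1..scale_max responses into a scale score, flipping
        reverse-worded items (recoded value = scale_max + 1 - response)."""
        recoded = [(scale_max + 1 - r) if item in reverse_items else r
                   for item, r in responses.items()]
        return sum(recoded) / len(recoded)

    # Example: the eight Student Engagement items; per the list below, items
    # 10, 13, 17, and 24 are reverse-worded.
    raw = {5: 4, 7: 3, 17: 2, 20: 4, 10: 1, 13: 2, 35: 3, 24: 2}
    engagement = score_scale(raw, reverse_items={10, 13, 17, 24})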
Student Questionnaire Items

Items are grouped below by original construct; the number before each item is its question
number in the survey.

Teacher Support
3. My teachers care about how I do in school.
8. My teachers like to be with me.
14. My teachers like the other kids in my class better than me. (reverse)
18. My teachers interrupt me when I have something to say. (reverse)
23. My teachers are fair with me.
26. My teachers don't make clear what they expect of me in school. (reverse)
31. My teachers are not fair with me. (reverse)
34. My teachers' expectations for me are not realistic. (reverse)

Student Engagement
5. It is important to me to do the best I can in school.
7. I work very hard on my schoolwork.
17. I don't try very hard in school. (reverse)
20. I pay attention in class.
10. I often come to class unprepared. (reverse)
13. When I'm doing a class assignment or homework, it's not clear to me what I'm supposed to be learning. (reverse)
35. When I'm doing a class assignment or homework, I understand why I'm doing it.
24. A lot of the time I am bored in class. (reverse)

Peer Support
21. Students in my school get to know each other well.
28. Students in my school show respect for each other.
32. In my school, the students push each other to do well.

Teacher Expectations
1. My teachers show us examples of the kinds of work that can earn us good grades.
6. My teachers make it clear what kind of work is expected from students to get a good grade.
12. My teachers expect all students to do their best work all the time.
16. My teachers expect all students to come to class prepared.

Rigor
36. I am asked to fully explain my answers to my teachers' questions.
37. Our classroom assignments and homework make me think hard about what I'm learning.
38. I know what kind of work it takes to get an A in my classes.
39. My teachers make sure I understand before we move on to the next topic.

Relative Autonomy
2. I do my homework because I would get in trouble if I didn't. (external)
19. I do my schoolwork because that's the rule. (external)
4. I do my homework because I want the teachers to think I'm a good student. (introjected)
27. I do my schoolwork because I would feel bad about myself if I didn't do it. (introjected)
15. I do my homework because it's important for me to do my homework. (identified)
22. I do my schoolwork because I really want to understand the subjects we are studying. (identified)
11. I do my homework because it's fun. (intrinsic)
29. I do my schoolwork because I enjoy doing it. (intrinsic)

Perceived Competence
30. I feel confident in my ability to learn at school.
9. I am capable of learning the material we are being taught at school.
25. I feel able to do my schoolwork.
33. I feel good about how well I do at school.

Demographics
40. How old are you?
41. Are you male or female?
42. Are you Hispanic or Latino?
43. What is your race?
44. What grade are you in this year?
45. What language are you most comfortable speaking?
46. Are you enrolled in special education classes this year?
47. Do you get free or reduced price lunches this year?
48. During this school year, did you have any suspensions, in-school or out of school? If so, how many total days have you been suspended this school year? (only asked in Wave 4 for Recruitment Group 1 and Waves 2, 3 and 4 for Recruitment Group 2)
Appendix 9: Restructuring the Course Files for Use in Analysis

Because the main purpose of these files was to link students to math and ELA teachers
and periods/blocks, the data sets were restructured so that for each student, at each wave, up to
eight courses were described: (1) regular (9th- or 10th-grade) English, (2) ECED Literacy, (3) first
other English, (4) second other English, (5) Algebra 1, (6) Geometry, (7) first other math, and (8)
second other math. First and second 'other English' were any classes other than regular (9th-/10th-grade)
English or ECED Literacy for which a student could receive English credit. Typically
these were support classes for English language learners and remedial English courses, but they
also included English electives such as creative writing. In math, first and second 'other math'
included any math class other than the two courses targeted by ECED, Algebra 1 or Geometry.
Therefore, other math included both lower-level remedial math courses and more advanced math
courses like Algebra 2. For each of the eight courses, six variables were created: (1) teacher ID,
(2) period (e.g., 1st, 2nd, 3rd), (3) period ID (which uniquely identifies each combination of
teacher and period), (4) course name (as provided by the district), (5) course ID (as provided by
the district), and (6) final grade. When a student did not take a particular course, the variables
describing that course were given a specific missing code to indicate that the student was not
enrolled (995).
Additionally, for each wave and year, variables indicating the total number of math and
English courses each student had were calculated. In order that these variables could be directly
compared across schools, they were weighted by the length of the course. A course that met for a
single period each day in a school on a traditional schedule was weighted as 1. These courses
met between 45 and 55 minutes daily. Likewise, a course that met every other day in a school on
an AB Block was also weighted 1. Blocks are typically 90 minutes in length, so this every-other-day
block schedule is roughly equivalent to 45-55 minutes daily. A course that met every day for
a block (e.g., 90 minutes) was weighted as 2 because that is roughly twice as much time as a
course meeting daily on a traditional schedule. Courses that met less frequently (e.g., one 45-minute
period every other day) were weighted accordingly (e.g., 0.5). 81 Note that occasionally a
student had more than two 'other' English or math courses. Those 'others' were included in the
count variables, but are not described in the data set.
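The weighting rules above amount to a small lookup from a course's schedule to its weight. A
minimal sketch of the calculation (the schedule labels and dictionary structure are illustrative
assumptions, not the study's actual code):

    # Course weights as described in Appendix 9.
    COURSE_WEIGHTS = {
        ("daily", "period"): 1.0,            # one 45-55 minute period every day
        ("every_other_day", "block"): 1.0,   # ~90-minute block every other day
        ("daily", "block"): 2.0,             # ~90-minute block every day
        ("every_other_day", "period"): 0.5,  # one 45-minute period every other day
    }

    def weighted_course_count(courses):
        """Sum schedule-adjusted weights for a student's English (or math) courses."""
        return sum(COURSE_WEIGHTS[(c["frequency"], c["length"])] for c in courses)

    student_english = [
        {"frequency": "daily", "length": "block"},   # e.g., the double ELA period
        {"frequency": "daily", "length": "period"},  # e.g., an ELL support class
    ]
    print(weighted_course_count(student_english))  # prints 3.0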
Appendix 10: Test Scores Received for Grade Cohort 1

This table shows the types of tests for which we received scores in each district, as well as the number and percentage of scores received for students in Grade Cohort 1. The last entry in each cell presents the total number and percentage of Grade Cohort 1 students with a usable test score in each district and year, prior to imputation, after combining using the rules described in the text.

District 1 (CA), n = 2628
  Baseline Math: GL 7th ’08: 1928 (73%); Gen. Math ’08: 2 (<1%); Gen. Math ’09: 819 (31%); GL 8th ’09: 8 (<1%); Alg 1 ’08: 7 (<1%); Alg 1 ’09: 1240 (47%); Geo ’09: 10 (<1%); CBL_math: 2159 (82%)
  Baseline ELA: 7th CST ’08: 1937 (73%); 8th CST ’09: 2101 (79%); CBL_ELA: 2165 (82%)
  Y1 Math: Gen Math ’10: 16 (<1%); Alg 1 ’10: 1580 (60%); Geo ’10: 437 (16%); Alg 2 ’10: 15 (<1%); Int Math 1 ’10: 1 (1%); CY1_math: 2039 (78%)
  Y1 ELA: 9th CST ’10: 2148 (82%); CY1_ELA: 2148 (82%)
  Y2 Math: Alg 1 ’11: 634 (24%); Geo ’11: 877 (34%); Alg 2 ’11: 339 (13%); HS Math ’11: 23 (1%); CAHSEE ’11: 2059 (78%); CY2_math: 1863 (71%)
  Y2 ELA: 10th CST ’11: 1985 (75%); CAHSEE ’11: 2079 (78%); CY2_ELA: 1974 (75%)

District 2 (CA), n = 1590
  Baseline Math: GL 7th ’08: 1032 (65%); Gen. Math ’08: 2 (<1%); Gen. Math ’09: 598 (38%); GL 8th ’09: 2 (<1%); Alg 1 ’08: 94 (6%); Alg 1 ’09: 486 (31%); Geo ’09: 75 (5%); Alg 2 ’09: 1 (<1%); CBL_math: 1225 (77%)
  Baseline ELA: 7th CST ’08: 1147 (72%); 8th CST ’09: 1150 (72%); CBL_ELA: 1230 (77%)
  Y1 Math: Gen Math ’10: 8 (<1%); Alg 1 ’10: 863 (54%); Geo ’10: 278 (17%); Alg 2 ’10: 64 (4%); HS Math ’10: 1 (<1%); CY1_math: 1213 (76%)
  Y1 ELA: 9th CST ’10: 1214 (76%); CY1_ELA: 1214 (76%)
  Y2 Math: Alg 1 ’11: 229 (14%); Geo ’11: 502 (32%); Alg 2 ’11: 257 (16%); HS Math ’11: 56 (3%); Int Math 1 ’11: 118 (7%); Int Math 2 ’11: 1 (<1%); CAHSEE ’11: 1295 (81%); CY2_math: 1160 (73%)
  Y2 ELA: 10th CST ’11: 1207 (76%); CAHSEE ’11: 1300 (82%); CY2_ELA: 1207 (76%)

District 3 (TN), n = 1264
  Baseline Math: 7th ’09: 713 (56%); 7th ’10: 10 (<1%); 8th ’09: 96 (8%); 8th ’10: 778 (62%); Alg 1 ’10: 54 (4%); CBL_math: 905 (72%)
  Baseline ELA: 7th Rdg ’09: 714 (56%); 7th Rdg ’10: 10 (<1%); 8th Rdg ’09: 96 (8%); 8th Rdg ’10: 774 (61%); 8th Wrt ’09: 103 (8%); 8th Wrt ’10: 753 (60%); 9th ELA ’10: 72 (6%); 10th ELA ’10: 7 (<1%); CBL_ELA: 913 (72%)
  Y1 Math: Alg 1 ’11: 641 (51%); Geo ’11: 78 (6%); Alg 2 ’11: 12 (1%); CY1_math: 720 (57%)
  Y1 ELA: 9th ELA ’11: 804 (64%); 10th ELA ’11: 53 (4%); CY1_ELA: 848 (67%)
  Y2 Math: Alg 1 ’12: 157 (12%); Geo ’12: 504 (40%); Alg 2 ’12: 154 (12%); CY2_math: 766 (61%)
  Y2 ELA: 9th ELA ’12: 84 (7%); 10th ELA ’12: 676 (53%); CY2_ELA: 736 (58%)

District 4 (AZ), n = 2222
  Baseline Math: 7th ’08: 1 (<1%); 7th ’09: 605 (27%); 8th ’09: 1 (<1%); 8th ’10: 735 (33%); CBL_math: 1329 (60%)
  Baseline ELA: 7th Rdg ’08: 1 (<1%); 7th Rdg ’09: 602 (27%); 7th Wrt ’08: 1 (<1%); 7th Wrt ’09: 604 (27%); 8th Rdg ’10: 735 (33%); CBL_ELA: 1329 (60%)
  Y1 Math: Stan ’11: 1530 (69%); CY1_math: 1530 (69%)
  Y1 ELA: Stan Lang ’11: 1490 (67%); Stan Rdg ’11: 1528 (69%); 8th Rdg ’11: 1 (<1%); CY1_ELA: 1530 (69%)
  Y2 Math: Exit ’12: 1582 (71%); CY2_math: 1582 (71%)
  Y2 ELA: Exit Rdg ’12: 1610 (72%); Exit Wrt ’12: 1607 (72%); CY2_ELA: 1613 (73%)

District 5 (NY), n = 729
  Baseline Math: 7th ’09: 361 (50%); 8th ’09: 54 (7%); 8th ’10: 421 (58%); Alg1 ’10: 105 (14%); Geo ’10: 2 (<1%); CBL_math: 519 (71%)
  Baseline ELA: 7th ’09: 364 (50%); 8th ’09: 54 (7%); 8th ’10: 425 (58%); CBL_ELA: 507 (70%)
  Y1 Math: Alg1 ’11: 370 (51%); Geo ’11: 29 (4%); Alg2 ’11: 1 (<1%); CY1_math: 392 (54%)
  Y1 ELA: Rgnt ELA ’11: 5 (<1%); GM ’11: 3 (<1%); CY1_ELA: 8 (1%)
  Y2 Math: Alg1 ’12: 226 (31%); Geo ’12: 131 (18%); Alg2 ’12: 17 (2%); CY2_math: 359 (49%)
  Y2 ELA: Rgnt ELA ’12: 35 (5%); GM: 199 (27%); CY2_ELA: 232 (32%)
Appendix 11: State-Specific Decisions Regarding Combining Test Scores

In addition to the general rules we formulated for handling data at different grade levels across districts, we also had to make state-specific decisions in order to combine achievement scores.

Districts 1 and 2 (CA). Starting in 10th-grade, students in Districts 1 and 2 took the California High School Exit Exam (CAHSEE) in English and math, in addition to the CST. If they did not pass the CAHSEE in 10th-grade, they continued to take the test (up to eight additional times) until they passed. Passing the CAHSEE is a requirement for graduation in California. For ECED, we requested scores only for the first time each student took the tests. However, in the end we decided to omit CAHSEE scores altogether. Almost all students with a CAHSEE score also had a CST score, so omitting those did not appreciably increase the amount of missing data. The CAHSEE is a very different testing system from the CST, with a different purpose and content; including it would have been the only case in which we combined two entirely separate testing systems to create a student's score.

District 3 (TN). This state entirely revamped its testing system between 2009 and 2010, so 2009 test scores were standardized separately from 2010 and later, even when the tests had the same name. Additionally, for ELA at baseline, students in this state took a reading test in the 7th-grade and separate reading and writing tests in the 8th-grade. In order to avoid giving twice as much weight to the 8th-grade tests as to the 7th-grade tests, we first averaged the 8th-grade reading and writing tests together and then averaged those scores with whatever other tests the student had (typically 7th-grade reading, but occasionally 9th- or even 10th-grade ELA).
District 4 (AZ). For ELA at baseline, students in this state took separate reading and writing tests in the 7th-grade and only a reading test in the 8th-grade. In order to avoid giving twice as much weight to the 7th-grade tests as to the 8th-grade tests, we first averaged the 7th-grade reading and writing tests together and then averaged those scores with the 8th-grade reading score.
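The same two-step averaging applies in both of these states: tests are first averaged within grade, and the per-grade means are then averaged together. A minimal sketch (hypothetical helper; standardized scores assumed):

```python
# Hypothetical sketch of the two-step averaging described above: average the
# tests within each grade first, then average the per-grade means, so that no
# grade level counts twice. Inputs are standardized test scores.

def combine_baseline_ela(scores_by_grade: dict) -> float:
    """Average tests within each grade, then average across grades."""
    grade_means = [sum(tests) / len(tests)
                   for tests in scores_by_grade.values() if tests]
    return sum(grade_means) / len(grade_means)

# District 4 pattern: reading and writing in 7th grade, reading only in 8th.
print(combine_baseline_ela({"grade7": [1.0, 0.5], "grade8": [-0.25]}))  # 0.25
```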
District 5 (NY). For Grade Cohort 1, only 8 out of 729 students had an ELA score. We
imputed the missing data and included this district in the impact analyses. However, given the
large amount of error produced when imputing such a large amount of missing data, we ran
sensitivity analyses in which we excluded the district.
Appendix 12: Indicators of Variation in Implementation

Each indicator is listed below with its data source and weight in brackets.

English/Language Arts

E1. All 9th- and 10th-graders are enrolled in ECED Literacy, including English language learners, special education, and honors students (excluding only new immigrants ["newcomers"] and students with profound disabilities). [Data source: District Records; Weight: 7]

E2. Each student is enrolled in ECED Literacy for a minimum of 135 clock hours per year (45 minutes per day, 180 days per year). [Data source: District Records; Weight: 7]

E3. In the first year of implementation, all 41 lessons in the first 3 units from a single strand of the ECED Literacy Curriculum are covered for both 9th- and 10th-graders. (Note: The 10 lessons of unit 4 are optional.) In the 2nd year of implementation, 9th-graders again receive all 41 lessons of the Year 1 curriculum, and 10th-graders receive the 97 lessons of the first 3 units of the Year 2 curriculum. (Note: The 25 lessons of the Year 2 unit 4 are optional.) In both years in both grades, three mid-unit and three end-of-unit assessments are administered. [Data source: Teacher Questionnaires & ELA Coach/Chair Interviews; Weight: 8]

E4. All 9th- and 10th-graders are enrolled in a regular ELA course, including ELL, special education, and honors students. [Data source: District Records; Weight: 4]

E5. Each student is enrolled for a minimum of 135 clock hours per year (45 minutes per day, 180 days per year). [Data source: District Records; Weight: 4]

E6. All ECED Literacy teachers participate in 3 days of professional development prior to the start of school each year, focused on using the ECED Literacy Curriculum. [Data source: Teacher Questionnaires & ELA Coach/Chair Interviews; Weight: 11]

E7. All ECED Literacy teachers participate in four one-half days of PD across the school year with ECED staff, during which they practice instructional strategies and discuss challenges. [Data source: Teacher Questionnaires & ELA Coach/Chair Interviews; Weight: 11]

E8. There are regularly scheduled weekly meetings between the literacy coach and all ECED Literacy teachers, focused on instruction. Most teachers attend most meetings. The time is used to discuss emerging issues around use of the curriculum, including but not limited to reflection on lessons taught, preview and modeling of upcoming lessons, and discussion of modifications for struggling students. [Data source: ELA Coach/Chair Interviews; Weight: 8]

E9. There is a literacy coach for whom a minimum of .50 FTE is dedicated to helping ECED Literacy teachers use the curriculum, conducting model lessons, and increasing Engagement, Alignment, and Rigor (EAR) in ECED Literacy courses. The position lasts for the entire school year. [Data source: ELA Coach/Chair Interviews; Weight: 10]

E10. The literacy coach participates in the 3 days of professional development focused on using the ECED Literacy Curriculum, along with the ECED Literacy teachers, prior to the start of school. Literacy coaches also participate in 6 hours of additional PD embedded in those 3 days (i.e., starting earlier, staying later) focused on learning the expectations of the role, the skills necessary to carry out the role, and how to engage others to build capacity and further develop collaborative instructional improvement. [Data source: ELA Coach/Chair Interviews; Weight: 10]

E11. The literacy coach participates in four 3-day site visits from the ECED team, focused on conducting EAR classroom visits, working directly with ECED Literacy teachers, and one-on-one coaching with ECED teachers. [Data source: ELA Coach/Chair Interviews; Weight: 10]

E12. The literacy coach participates in conference calls with ECED staff focused on emerging issues: a total of 14 calls per year (2 per month during the 5 months when there is no site visit; 1 per month during the 4 months when there is a site visit). [Data source: ELA Coach/Chair Interviews; Weight: 10]

Math (Algebra I and Geometry)

M1. Throughout the school year, lessons/units in Algebra I and Geometry classes are organized around benchmarks based on the state standards in the form of "I Can" statements. [Data source: Math Coach/Chair Interviews; Weight: 7.14]

M2. In Algebra I and Geometry classes, short, focused benchmark assessments, created by the teachers as a content team and based on state standards, are given at the end of each lesson/benchmark completion. Students must successfully answer 80% of the questions on two benchmark assessments to be given credit for mastery of that benchmark. (As noted below, course grades are based on the number of benchmarks mastered.) [Data source: Math Coach/Chair Interviews; Weight: 7.14]

M3. Capstone assessments, which test application of a group of related concepts, are given at the end of a series of benchmarks. These assessments are created together by the content team and are based on state standards. The format of the questions on the capstone assessments is similar to that on high-stakes tests. [Data source: Math Coach/Chair Interviews; Weight: 7.14]

M4. Student grades are assigned based solely on the proportion of benchmarks mastered during the grading period (cut-points selected by school teams/district). Students receive either a grade or an 'incomplete' for each grading period. Fs are not assigned until all opportunities for re-learning have been exhausted (i.e., end of summer school). Ds are never assigned. School teams/districts determine what proportion of all benchmarks must be mastered in order to receive course credit. [Data source: Math Coach/Chair Interviews; Weight: 7.14]

M5. Each Algebra I and Geometry class has a chart that publicly displays and tracks which benchmarks each student has mastered to date (80% correct on two occasions). Students have a record of which benchmarks they have mastered. Parents have been told about the benchmarks and how to read their child's benchmark record. [Data source: Math Coach/Chair Interviews; Weight: 7.14]

M6. The Benchmark Café is open for students to receive extra help and re-take benchmark assessments daily before, during, and after school, throughout the school year. It is organized by the math coach and staffed by teachers, students, volunteers, and others deemed qualified by the coach. [Data source: Math Coach/Chair Interviews; Weight: 7.14]

M7. During the summer, there are opportunities for students to re-learn and re-test benchmarks that they did not master during the school year. Benchmarks mastered during the summer count towards the students' final grade. [Data source: Math Coach/Chair Interviews; Weight: 7.14]

M8. All Algebra I and Geometry teachers participate in 3 days of professional development prior to the start of school each year, focused on creating "I Can" statements, pacing guides, and benchmark assessments. [Data source: Teacher Questionnaires & Math Coach/Chair Interviews; Weight: 7.14]

M9. All Algebra I and Geometry teachers participate in four one-half days of PD across the school year with ECED staff, during which they practice instructional strategies and discuss challenges. [Data source: Teacher Questionnaires & Math Coach/Chair Interviews; Weight: 7.14]

M10. There are regularly scheduled weekly meetings between the math coach and all Algebra I/Geometry teachers, focused on instruction. Most teachers attend most meetings. The time is used to discuss emerging issues around use of the curriculum, including but not limited to reflection on lessons taught, preview and modeling of upcoming lessons, and discussion of modifications for struggling students. [Data source: Math Coach/Chair Interviews; Weight: 7.14]

M11. There is a math coach for whom a minimum of .50 FTE is dedicated to helping Algebra 1 and Geometry teachers use ECED math strategies (e.g., benchmarking, "I Can" statements), increasing EAR in Algebra 1 and Geometry classes, conducting model lessons, and working in and organizing the Benchmark Café. The position lasts for the entire school year. [Data source: Math Coach/Chair Interviews; Weight: 7.14]

M12. The math coach participates in the 3 days of professional development focused on creating "I Can" statements, pacing guides, and benchmark assessments, along with the Algebra 1 and Geometry teachers, prior to the start of school. Math coaches also participate in 6 hours of additional PD embedded in those 3 days (i.e., starting earlier, staying later) focused on learning the expectations of the role, the skills necessary to carry out the role, and how to engage others to build capacity and further develop collaborative instructional improvement. [Data source: Math Coach/Chair Interviews; Weight: 7.14]

M13. The math coach participates in four 3-day site visits from ECED staff, focused on conducting EAR classroom visits, working directly with Algebra 1 and Geometry teachers, and one-on-one coaching with teachers. [Data source: Math Coach/Chair Interviews; Weight: 7.14]

M14. The math coach participates in conference calls with ECED staff focused on emerging issues: a total of 14 calls per year (2 per month during the 5 months when there is no site visit; 1 per month during the 4 months when there is a site visit). [Data source: Math Coach/Chair Interviews; Weight: 7.14]

EAR Protocol

P1. The math and literacy coaches, plus three other instructional leaders, participate in four days of training on use of the EAR Protocol. [Data source: Math & ELA Coach/Chair Interviews; Weight: 25]

P2. The math and literacy coaches, plus three other instructional leaders, conduct 10 practice visits in groups, between EAR Protocol training days 1-2 and days 3-4, followed by debriefing, to build shared understanding. [Data source: EAR Protocol Database; Weight: 25]

P3. Each trained EAR observer conducts 5 EAR visits per week once training is complete, and uploads data to the server (a total of 140 per observer per year, assuming 28 weeks). [Data source: EAR Protocol Database; Weight: 25]

P4. EAR classroom visits are used as a non-evaluative tool to give data to coaches and instructional leaders for reflective coaching conversations with teachers; to allow coaches and instructional leaders to see trends; and to make professional development decisions specific to individuals and groups of teachers around EAR. [Data source: Math Coach/Chair, ELA Coach/Chair & Principal/AP Interviews; Weight: 25]
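The weights within each domain sum to approximately 100 (the twelve E indicators sum to 100, the fourteen M indicators to 14 × 7.14 ≈ 100, and the four P indicators to 100), which is consistent with a weighted-average score per domain. As a rough illustration only — how individual indicators were scored is described elsewhere in the report, and the 0-1 scoring below is an assumption — a domain-level index might be computed like this:

```python
# Illustrative sketch only: a weighted implementation index for the ELA domain,
# assuming (an assumption, not stated in this appendix) that each indicator is
# scored on a 0-1 scale. Weights come from the table above and sum to 100.

ELA_WEIGHTS = {"E1": 7, "E2": 7, "E3": 8, "E4": 4, "E5": 4, "E6": 11,
               "E7": 11, "E8": 8, "E9": 10, "E10": 10, "E11": 10, "E12": 10}

def implementation_index(scores: dict, weights: dict) -> float:
    """Weighted average of indicator scores for one domain."""
    total = sum(weights[k] for k in scores)
    return sum(scores[k] * weights[k] for k in scores) / total

# Example: a school fully meets every ELA indicator except E3 (half met).
scores = {k: 1.0 for k in ELA_WEIGHTS}
scores["E3"] = 0.5
print(implementation_index(scores, ELA_WEIGHTS))  # 0.96
```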
Appendix 13: Teacher-Level Correlations Among Outcome Variables

This appendix presents the pairwise correlations among 43 teacher-level outcomes, numbered and ordered as follows:

Fall Year 1: (1) teacher collective commitment, (2) teacher mutual support, (3) support from district, (4) support from administration, (5) engagement, (6) alignment, (7) rigor.

Spring Year 1: (8) perception of administrative support for instructional innovation and improvement, (9) teacher collective commitment, (10) teacher mutual support, (11) support from district, (12) support from administration, (13) commitment to change, (14) confidence in change, (15) individual teacher morale, (16) professional development, (17) perceived competence, (18) RAI, (19) engagement, (20) alignment, (21) rigor.

Fall Year 2: (22) teacher collective commitment, (23) teacher mutual support, (24) support from district, (25) support from administration, (26) individual teacher morale, (27) engagement, (28) alignment, (29) rigor.

Spring Year 2: (30) perception of administrative support for instructional innovation and improvement, (31) teacher collective commitment, (32) teacher mutual support, (33) support from district, (34) support from administration, (35) commitment to change, (36) confidence in change, (37) individual teacher morale, (38) professional development, (39) perceived competence, (40) RAI, (41) engagement, (42) alignment, (43) rigor.

[The full 43 x 43 lower-triangular correlation matrix spanned two pages in the original report; its individual cell values could not be recovered intact from the source layout and are not reproduced here.]

Note. **p < .01, *p < .05. Correlations between EAR observation outcomes (i.e., engagement, alignment, and rigor) and all variables were conducted on non-imputed data. All other correlations are pooled correlations using 5 multiply imputed datasets.
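The report does not spell out the pooling formula for these correlations, so as an assumption-labeled illustration only, one standard way to pool a correlation across the 5 imputed datasets is to average on the Fisher-z scale and back-transform:

```python
import math

# Illustrative only: pooling a correlation across multiply imputed datasets by
# averaging Fisher-z transforms and back-transforming. The report states that
# correlations were pooled over 5 imputations but not the exact method, so
# treat this as one common convention, not the authors' documented procedure.

def pool_correlations(rs: list) -> float:
    """Average correlations on the Fisher-z scale, then back-transform."""
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))

print(round(pool_correlations([0.41, 0.44, 0.39, 0.43, 0.42]), 2))  # 0.42
```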
Appendix 14: Student-Level Correlations Among Outcome Variables

This appendix presents the pairwise correlations among 36 student-level outcomes, numbered and ordered as follows:

Fall Year 1/pre-baseline: (1) attitudes towards school, (2) positive teacher support, (3) lack of teacher support, (4) engagement, (5) perceived competence, (6) RAI, (7) ELA achievement, (8) math achievement.

Spring Year 1: (9) attitudes towards school, (10) positive teacher support, (11) lack of teacher support, (12) engagement, (13) perceived competence, (14) RAI, (15) ELA achievement, (16) math achievement, (17) grade point average, (18) credits earned, (19) attendance.

Fall Year 2: (20) attitudes towards school, (21) positive teacher support, (22) lack of teacher support, (23) engagement, (24) perceived competence, (25) RAI.

Spring Year 2: (26) attitudes towards school, (27) positive teacher support, (28) lack of teacher support, (29) engagement, (30) perceived competence, (31) RAI, (32) ELA achievement, (33) math achievement, (34) grade point average, (35) credits earned, (36) attendance.

[The full 36 x 36 lower-triangular correlation matrix spanned two pages in the original report; its individual cell values could not be recovered intact from the source layout and are not reproduced here.]

Note. **p < .01, *p < .05.
Appendix 15: School-Level Correlations Among Outcome Variables

This appendix presents the pairwise correlations, across Years 1-2, among treatment status and 79 school-level outcomes, numbered and ordered as follows:

(1) Treatment status.

Fall Year 1/pre-baseline: (2) attitudes towards school, (3) positive teacher support, (4) lack of teacher support, (5) engagement, (6) perceived competence, (7) RAI, (8) ELA achievement, (9) math achievement, (10) teacher collective commitment, (11) teacher mutual support, (12) support from district, (13) support from administration, (14) engagement, (15) alignment, (16) rigor.

Spring Year 1: (17) attitudes towards school, (18) positive teacher support, (19) lack of teacher support, (20) engagement, (21) perceived competence, (22) RAI, (23) ELA achievement, (24) math achievement, (25) grade point average, (26) credits earned, (27) attendance, (28) perception of administrative support for instructional innovation and improvement, (29) teacher collective commitment, (30) teacher mutual support, (31) support from district, (32) support from administration, (33) commitment to change, (34) confidence in change, (35) individual teacher morale, (36) professional development, (37) perceived competence, (38) RAI, (39) engagement, (40) alignment, (41) rigor.

Fall Year 2: (42) attitudes towards school, (43) positive teacher support, (44) lack of teacher support, (45) engagement, (46) perceived competence, (47) RAI, (48) teacher collective commitment, (49) teacher mutual support, (50) support from district, (51) support from administration, (52) individual teacher morale, (53) engagement, (54) alignment, (55) rigor.

Spring Year 2: (56) attitudes towards school, (57) positive teacher support, (58) lack of teacher support, (59) engagement, (60) perceived competence, (61) RAI, (62) ELA achievement, (63) math achievement, (64) grade point average, (65) credits earned, (66) attendance, (67) perception of administrative support for instructional innovation and improvement, (68) teacher collective commitment, (69) teacher mutual support, (70) support from district, (71) support from administration, (72) commitment to change, (73) confidence in change, (74) individual teacher morale, (75) professional development, (76) perceived competence, (77) RAI, (78) engagement, (79) alignment, (80) rigor.

[The full 80 x 80 lower-triangular correlation matrix was presented in six table parts across seven pages in the original report; its individual cell values could not be recovered intact from the source layout and are not reproduced here.]

Note. **p < .01, *p < .05; N = 20. Correlations are based on school-aggregated student and teacher data. Correlations between EAR observation outcomes (i.e., engagement, alignment, and rigor) and all variables were conducted on non-imputed data. All other correlations are based on school aggregates of multiply imputed data.
Appendix 16: Child-Level Interactions

There were 12 main student outcomes tested: (1) Y1 students’ attitudes, (2) Y2 students’ attitudes, (3) Y1 math achievement, (4) Y2 math achievement, (5) Y1 ELA achievement, (6) Y2 ELA achievement, (7) Y1 GPA, (8) Y2 GPA, (9) Y1 credits earned, (10) Y2 credits earned, (11) Y1 attendance, and (12) Y2 attendance. For each, the sixth model added eleven interactions: (1) treatment X baseline, (2) treatment X gender, (3) treatment X Hispanic, (4) treatment X Black, (5) treatment X Asian/Pacific Islander, (6) treatment X American Indian/Other, (7) treatment X free/reduced price lunch, (8) treatment X special education, (9) treatment X ELL, (10) treatment X math test type where applicable, and (11) treatment X semesters enrolled. For the first ten interactions, no interpretable pattern appeared. The table below shows all interactions that were marginal or significant; those not tabled had p values higher than .10. When the treatment X semesters enrolled interaction was significant, it is presented in the main body of the text.

Note: These are fixed effects models; random effects for these interactions were not included.
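Schematically — the notation below is illustrative, not taken from the report — each of these models adds the moderator and its product with the treatment indicator to the fixed part of the impact model, and the tabled coefficient is the interaction term:

```latex
% Illustrative notation for the interaction (moderation) models described
% above; not the report's own specification.
\[
Y_{ij} = \beta_0 + \beta_1\,\mathrm{Tx}_j + \beta_2\,X_{ij}
       + \beta_3\,(\mathrm{Tx}_j \times X_{ij})
       + \textstyle\sum_k \gamma_k C_{kij} + \varepsilon_{ij},
\]
where $Y_{ij}$ is the outcome for student $i$ in school $j$, $\mathrm{Tx}_j$ is
the treatment indicator, $X_{ij}$ is the candidate moderator (e.g., baseline
score or ELL status), the $C_{kij}$ are the remaining covariates, and
$\beta_3$ is the tabled interaction coefficient.
```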
The interactions reaching at least marginal significance fell into three sets of models. Point-in-time models: treatment X baseline, treatment X Hispanic, treatment X Asian/Pacific Islander, treatment X special education, and treatment X math test type. Growth models: treatment X ELL. Implementation models: treatment X baseline and treatment X Hispanic. The outcomes with tabled effects were Y1 and Y2 students’ attitudes, Y1 and Y2 math achievement, Y1 and Y2 GPA, and Y1 credits earned; some interactions were not tested for some outcomes. All tabled coefficients were small in absolute value (no larger than 0.09, with standard errors of 0.05 or less).

[The cell-by-cell coefficient table could not be recovered intact from the source layout and is not reproduced here. Significance key: t p < .10, * p < .05, ** p < .01, *** p < .001.]
Endnotes
1
There are growing literatures on instructional improvement in preschools and elementary
schools that are not covered here.
2
The name “Literacy Matters” was adopted by IRRE after the ECED Efficacy Trial was
underway. Schools participating in the ECED Efficacy Trial knew it as ECED Literacy.
3
As with “Literacy Matters”, the name “Reading Matters” was adopted after the ECED Efficacy
Trial was underway. The schools participating in this project called it 9th-Grade ECED
Literacy.
4
As with the other names, “Writing Matters” was adopted after the ECED Efficacy Trial was
underway. The schools participating in this project called it 10th-Grade ECED Literacy.
5
This blocking at the district level was necessary for recruitment and effective use of project
resources. We did not believe that districts would agree to participate in the study unless they
were guaranteed that some of their schools would receive the supports. Further, because
IRRE consultants had to travel to provide the supports, it would have been inefficient to have
fewer than two treatment schools in the same geographic region.
6
Although schools did not pay IRRE for the supports, there were some costs associated with
participation. The treatment schools had to cover the costs of: a .50 FTE literacy coach and a
.50 FTE math coach, three professional development days for each ECED teacher during two
summers, substitute teachers so that each ECED teacher could participate in four half-day PD
sessions per year, PDAs for at least 3 district leaders and 5 school leaders, and photocopying
of the student materials for the FTF Literacy curriculum. Additionally, staff time had to be
devoted to coordinating the ECED supports in treatment schools, coordinating the research
activities in all schools, and providing the research team with student records.
7
In spring of 2009, District 4 agreed to participate, signed the MOU, and participated in the
random assignment process with the intention that they would begin participation in the
summer of 2009 as part of Recruitment Group 1. However, very shortly after signing the
MOU and prior to starting any supports, several members of the district’s leadership –
including the superintendent – left the district. After negotiating with the new interim
superintendent, that district’s participation was delayed one year and they participated in the
second recruitment group, which began in the summer of 2010.
8
As noted below, two schools served only 9th-graders during the first year of ECED and
therefore had very small enrollments.
9
Schools 4 and 5.
10
Schools 6, 7, and 19.
11
Schools 18 and 20.
12
School 2, in District 1, part of Recruitment Group 1.
13
School 17, in District 5, part of Recruitment Group 2.
14
Our definition of ‘target teacher’ changed several times during data collection. This
document summarizes only our final decision about which teachers and teacher data to
include.
15
Three individuals taught both math and English/literacy and are counted in both groups.
16
Teachers who changed schools during the course of the study and taught a target math class
at more than one school are treated as two separate individuals in these data. These 238 math
teachers represent 232 different individuals.
17
Because different districts use different names for 9th- and 10th-grade English, we counted
any class for which a student got a regular, required English credit as 9th- or 10th-grade
English. So, for example, English as a Second Language (ESL) and remedial English classes
counted as regular English if they replaced the required regular English course and the
enrolled students did not have to ‘make-up’ the English credit later. If, on the other hand, the
student did have to take 9th- or 10th-grade English upon successful completion of ESL or
remedial, those courses were not counted as 9th- or 10th- grade English.
18
Three of these individuals taught math and English/literacy and are counted in both groups.
19
Teachers who changed schools during the course of the study and taught a target ELA class
at more than one school are treated as two separate individuals in these data. These 298 cases
actually represent 295 different individuals.
20
Some treatment schools offered ECED Literacy in the fall only and 9th- and 10th-grade
English in the spring only, taught by the same teachers. In those cases, the teachers who
taught ECED Literacy in the fall continued to receive ECED supports in the spring.
21
This section refers to any course at the school, whether or not it was targeted by ECED. The
next section refers only to the ECED target courses of ECED Literacy, 9th/10th-grade English,
Algebra 1, or Geometry.
22
Two schools (schools 4 and 14) included accelerated or advanced programs for high
achieving students from throughout the district. Students in those programs were excluded
from the study altogether and are excluded from all statistics in this report. The decision to
exclude the students in these programs was made prior to the districts’ agreeing to participate
and prior to the random assignment. Students in those programs have very full course
schedules and it would not have been possible to add the ECED Literacy course to their
schedules, so those schools could not have participated if the project had not agreed to
exclude the students in those programs. School 4 ended up being in the control condition and
school 14 in the treatment condition. The students at school 4 in the accelerated program are
technically enrolled in school 4. The students in the accelerated program at school 14 are
technically in a separate school with its own school administration and federal identification
number; however, the program is physically housed on the same campus as school 14.
23
This variable refers to the student’s age when s/he took the Wave 1 survey. For students who
did not take the Wave 1 survey (including all students in Grade Cohort 3), it is the student’s
age on the average date when students in his/her district took the Wave 1 survey.
24
The name ‘Literacy Matters’ was adopted by IRRE after ECED was underway. Participating
schools knew these courses simply as ECED Literacy.
25
These students were also supposed to have an equivalent amount of 9th-/10th-grade English,
but those courses and their teachers were not specifically targeted by the intervention.
26
These two schools did not offer Literacy Matters in Year 2 because they had stopped
participating. So, no student in these schools took the prescribed amount.
27
Reminders were sent each wave except for Wave 1 for teachers in schools in the first
recruitment group. The decision to add a fall teacher survey was made at the last minute and
we had not obtained contact information for the teachers. For that reason we were unable to
follow up with teachers who did not return the survey.
28
The different procedures for the fall and spring administration were due to budget
constraints. The fall surveys were not part of the original study design and the cost of their
administration was not included in the subcontract with IRRE, who manages and maintains
the on-line survey system.
29
Number of teachers who responded divided by n.
30
These teachers were not asked to participate because they were not teaching a course that
was targeted that term and had not taught a target course in the past so were not yet part of
the sample.
31
Schools 2 and 17 dropped out of the study. School 2 did not allow teacher surveys in Waves
2 or 3; school 17 did not allow teacher surveys in Waves 3 or 4.
32
Includes teachers who were asked to participate but did not, and those who should have been
asked to participate, but were not due to an oversight or because they did not appear on the
schedules provided by the schools. Most likely this last group was hired after the surveys
were conducted.
33
Number of teachers who responded divided by n minus those not at the school at the time of
administration.
34
The three dropped items were: (1) My job had become just a matter of putting in time; (2)
Time goes by very slowly when I’m at work; and (3) When I’m teaching, I feel bored.
35
The RAI score was calculated using the scoring method established by the authors: the mean
of the two external items was weighted -2, the mean of the two introjected items was
weighted -1, the mean of the two identified items was weighted +1, and the mean of the two
intrinsic items was weighted +2. In other words, the controlled subscales are weighted
negatively, and the autonomous subscales are weighted positively. The more controlled the
regulatory style represented by a subscale, the larger its negative weight; the more
autonomous the regulatory style represented by a subscale, the larger its positive weight.
(http://www.selfdeterminationtheory.org/questionnaires/10-questionnaires/48).
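For concreteness, the weighting can be sketched in a few lines of Python. This is a minimal
illustration assuming hypothetical item scores; the argument names and values are for
illustration only, not the study's actual item labels or data:

    from statistics import mean

    def rai(external, introjected, identified, intrinsic):
        # Each argument holds the two item scores for that subscale.
        # Controlled subscales are weighted negatively and autonomous
        # subscales positively, as described above.
        return (-2 * mean(external)
                - 1 * mean(introjected)
                + 1 * mean(identified)
                + 2 * mean(intrinsic))

    # Hypothetical respondent; a higher RAI indicates more autonomous regulation.
    print(rai(external=[2, 3], introjected=[3, 3],
              identified=[5, 4], intrinsic=[5, 5]))  # prints 6.5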
36
E1 and E2 must be multiplied together to be meaningful because E2 refers to the proportion
of the on-task students (counted in E1) who were actively engaged.
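To illustrate with hypothetical values: if E1 = .80 of students were on task and E2 = .50 of
those on-task students were actively engaged, then .80 × .50 = .40 of all students were
actively engaged.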
37
Observer 3 had been the interim superintendent of District 3 the year before she began
working for ECED, but she did not collect any data in that district at any point in the project.
38
The only exception was District 2, Wave 2, when there was a one-week break between the
two weeks of data collection.
39
Number of teachers observed divided by n.
40
Schools 2 and 17 dropped out of the study. School 2 did not allow observations in Waves 2 or
3; school 17 did not allow observations in Waves 3 or 4.
41
These individuals should have been observed but were not. Reasons include changes in
teacher assignments that the research team learned about after the observations, oversight,
miscommunication with data collectors, changes in the definition of target teachers, and
teacher refusals.
42
Number of teachers observed divided by (n minus the number not teaching a target course and
the number no longer/not yet at the school).
43
These scores were calculated by computing the ICC between each of the two observers'
scores and the consensus score, then averaging those two values. Of the 281 visits in
which two observers were present, we had consensus scores for 277 cases; the others were
missing due to technical difficulties or observer error.
44
Includes students who should have been asked to complete the survey but were not, due to
some type of error or misunderstanding with the school. It includes students who the school
indicated were in self-contained special education or 'newcomer' programs but whose district
records did not concur. Also, at Wave 1, one treatment school mistakenly excluded students
who were not enrolled in Literacy Matters.
45
Includes students who were chronically absent, truant, suspended, in the process of being
expelled, incarcerated, or home-bound due to illness/pregnancy.
46
These students disenrolled between the time the roster was created and the administration of
the survey. This category includes 'no shows' (students who enrolled but never attended) if
they had not been removed from the rosters by the time the rosters were produced for the ECED
Efficacy Trial.
47
School 2 stopped participating in ECED supports after Wave 1. They administered student
surveys to all 9th- and 10th-graders in Wave 1 and to 10th-graders only in Wave 4. They did
not administer student surveys at all in Waves 2 and 3.
48
As with teacher questionnaires, the items comprising the Relative Autonomy Index (RAI)
were left out of the exploratory analyses. RAI scores for students were calculated using the
same formula described for teacher RAI scores.
49
The five dropped items were: (1) My teachers are fair with me; (2) When I’m doing a class
assignment or homework, it’s not clear to me what I’m supposed to be learning; (3) Students
in my school get to know each other well; (4) Students in my school show respect for each
other; and (5) In my school, the students push each other to do well.
50
As noted above, parents were given a way to exempt their student’s records from being
released.
51
Some districts provided three response categories for free/reduced-price lunch (free,
reduced, paid), whereas others provided only a yes/no indicator. When a district
provided three response categories, the categories of free and reduced were collapsed into a
single category, resulting in a dichotomous variable that was comparable across districts.
Similarly, some districts were able to provide more detail regarding special education and
English-language-learner status than others. To make these comparable across districts, a
dichotomous variable for each student for each year was created. The special education
variables indicate whether or not the student received special education services that year.
Students in self-contained special education are not in the study, so those in special
education were all in inclusive settings. The English fluency variables refer to whether or
not a student received services for English language learners (ELL) that year. Students who
did not receive ELL services, either because they were native English speakers or because
they had achieved a level of fluency in English that no longer required special supports
according to the district, were labeled 'no EL' (0). Those who received services are labeled
'EL' (1). As with special education, students whose English skills were so limited that they
were excluded from the regular curriculum ('newcomers') are not in the study sample.
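The recoding logic described above can be illustrated with a short Python/pandas sketch;
the column names and category labels here are hypothetical, not the districts' actual codes:

    import pandas as pd

    # Hypothetical district extract; real files varied in detail by district.
    df = pd.DataFrame({
        'lunch_status': ['free', 'reduced', 'paid'],
        'ell_services': ['yes', 'no', 'no'],
    })

    # Collapse free and reduced into one category -> dichotomous indicator.
    df['frl'] = df['lunch_status'].isin(['free', 'reduced']).astype(int)

    # EL = 1 if the student received ELL services that year, else 0.
    df['el'] = (df['ell_services'] == 'yes').astype(int)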
52
10th-grade test scores were not requested for students who were in 9th grade in the second
year of the study (Grade Cohort 3) because those tests were administered the year after data
collection ended.
53
These general rules were constructed by the ECED research team, in conjunction with an
expert in missing data analysis, Dr. Jennifer Hill, Associate Professor of Applied Statistics at
New York University.
54
Prior to setting this rule, we checked with each district to ensure that there had been no
major changes to their testing system that would make combining across years inadvisable. The
only such change was in District 3 between 2009 and 2010. See the specific information
about how District 3 tests were handled for more detail.
55
Note that the tests are ordered from lowest to highest level in Appendix 10.
56
District 4 did ultimately provide the needed data on August 26, 2013, four days before the
end of this grant. By that time, those data had been imputed and the analyses had been
conducted using the imputed data. We will be able to incorporate that district's attendance
data into future analyses, but they are not included in the analyses in this report.
57
As noted earlier, one district sent the attendance information after these analyses were
complete. We will be able to incorporate that district's attendance data into future analyses,
but they are not included in the analyses in this report.
58
At a later time, we will test the impact of a single year of exposure to ECED on the students
in Grade Cohorts 2 and 3.
59
ICC(2) = [k × ICC(1)] / [1 + (k − 1) × ICC(1)], where k is the average group sample size.
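To illustrate with hypothetical values: with ICC(1) = .10 and an average group size of
k = 20, ICC(2) = (20 × .10) / (1 + 19 × .10) = 2.0 / 2.9 ≈ .69.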
60
The EAR protocol was not available to control schools, so there was no possibility of control
schools using it.
61
Two treatment schools discontinued their participation in ECED supports prior to the second
year. Those two schools had Year 2 implementation scores that were similar to those seen in
the control schools.
62
We have been working with a post-doctoral researcher at Columbia University with expertise
in imputation to create the multiply imputed data sets. She has not yet had time to complete
the imputation of the EAR Protocol data. For that reason, we are presenting non-imputed data
here. We will conduct these same analyses with imputed data when they become available.
63
The deviance statistic is not provided in HLM 6 when multiply imputed data are used. Instead,
we checked the deviance statistics using one of the five imputed datasets.
64
As noted above, we do not yet have the needed imputed data for the EAR Protocol analyses.
For that reason, we are not presenting EAR Protocol data for the ELA teachers at this time.
65
The deviance statistic is not provided in HLM 6 when multiply imputed data are used. Instead,
we checked the deviance statistics using one of the five imputed datasets.
66
The four that left during the study were in Districts 1, 3, 4, and 5. The one that left just as the
project was ending was in District 2.
67
Schools 1, 4, 6, 7, 10, 12, 14, 15, 17, 18, and 20
68
District 1
69
In Year 2, one of the two treatment schools in that district did provide the entire ECED
Literacy curriculum to lower-achieving 9th-graders.
70
District 2, School 2.
71
Syntax for this comparison is in the file called 'Fs in Algebra 1 and Geometry.'
72
An 'I' (incomplete) was counted the same as an F if it was never raised to a passing grade.
Theoretically, students could have changed an I to a passing grade after we received the data
files from the district. However, we typically received the files well after the term had
ended, often as much as a year later, so such changes were likely rare.
73
‘RG’ refers to recruitment group.
74
Questions 59a and 59b were asked only at Wave 4 because no teacher was using the Year 2
curriculum during Year 1.
75
Observers 5 and 7
76
Observers 1 and 6
77
Observers 2 and 4
78
Observer 3
79
Observer 7
80
Observer 6
81
The schools in District 5 had very complicated schedules, in which many courses met less often
than daily. These courses were all weighted accordingly, using one period of a regular day as
the standard.