Summary of State Proposed Growth Models Undergoing Peer Review in 2005-2006 In November 2005, The United States Department of Education requested state proposals for accountability models that incorporate measures of student growth. States were encouraged to submit proposals to the Department for using growth models to demonstrate accountability under the federal No Child Left Behind (NCLB) Act. States submitting proposals were required to show how their growth-based accountability models satisfy NCLB alignment elements and foundational elements explained in the November letter. Those elements included the following: NCLB Alignment Elements 1. The accountability model must ensure that all students are proficient by 2013-14 and set annual goals to ensure that the achievement gap is closing for all groups of students. 2. The accountability model must not set expectations for annual achievement based upon student background and school characteristics. 3. The accountability model must hold schools accountable for student achievement in reading/language arts and mathematics. Foundational Elements 4. The accountability model must ensure that all students in the tested grades are included in the assessment and accountability system. Schools and districts must be held accountable for the performance of student subgroups. The accountability model includes all schools and districts. 5. The State’s assessment system, the basis for the accountability model, must receive approval through the NCLB peer review process for the 2005-06 school year. In addition, the full NCLB assessment system in each of grades 3-8 and in high school in reading/language arts and math must have been in place for two testing cycles. 6. The accountability model and related State data system must track student progress. 7. The accountability model must include student participation rates in the state assessment system and student achievement on an additional academic indicator. The Department received 20 proposals in February 2006. Seven (Hawaii, Maryland, Nevada, New Hampshire, Ohio, Pennsylvania, and South Dakota) of the twenty proposals applied to evaluate growth in the 2006-2007 school year, so they were not evaluated in the first round of reviews. Of the 20 submissions, eight (Alaska, Arizona, Arkansas, Delaware, Florida, North Carolina, Oregon, and Tennessee) were sent to be evaluated by a peer review group. The five states that submitted but were not sent by USDE for peer review included Colorado, Indiana, Iowa, South Carolina, and Utah. During the review of the eight proposals, the peer review group identified additional elements on which they evaluated proposals. Based on their Crosswalks paper, the peer review group determined that states should 1. incorporate available years of existing achievement data, instead of relying on only two years of data, 2. align growth timeframe with school grade configuration and district enrollment, 3. make growth projections for all students, not just those below proficient, and 4. hold schools accountable for same subgroups as they did under status model. They determined that states should not 11 1. use wide confidence intervals, 2. reset growth targets each year, and 3. average scores between proficient and non-proficient students. The goal of this paper is to identify distinguishing practical and psychometric features of the submitted proposals. Since states function within unique political contexts and have unique data structures for their assessments, no one growth model best meets all states’ needs. By comparing and contrasting growth model proposals on specific features, states looking for a growth model can use this information to find a model that best fits their particular circumstances. States with existing growth models might use this information to find similar models for which they can compare their model. The tables in Appendix A summarize the eight proposals that were reviewed by the committee on practical and psychometric features. Growth Model Features Pilot Approved in 2006 – State growth model proposals approved by the USDE in 2006. These states will be able to include growth measures in accountability decisions for the 2005-2006 school year and are expected to present data that show how the model works compared with the current AYP model. Resubmitting in 2007 – This field indicates if the state plans to resubmit their growth model proposal in 2007. Name of Growth Measure – For states that have named their measure of growth, the name is listed. All Students at Proficiency or On Track by 2013-2014 – This field indicates if states’ growth model proposals will have all students at proficiency or on track to be proficient by 2013-2014. Scores on Vertical Scale – This field indicates whether the state has a vertical or developmental scale underlying its assessment. Vertically Aligned Standards – This field indicates whether the state vertically aligned its performance standards. First Year All Grades Tested – The years listed in this field indicate the first time that the state assessed students in the NCLB grades in reading and mathematics. Includes Grade 3 Students – States that calculate growth for grade 3 students are noted in this field. Includes Students Without Prior Year Score – States that measure growth for students who are missing the prior year test score are noted in this field. As examples, states may calculate growth for students without the prior year score by using scores from more than one year prior, the mean of the grade-level scores the prior year, or a pretest score to compute growth. 21 Includes SWD Taking Alternate Assessment – For states that have an alternate assessment for students with disabilities, this field indicates whether the proposed growth model can be applied to scores on that alternate assessment. Includes ELL Taking Alternate Assessment – For states that have an alternate assessment for English Language Learners, this field indicates whether the proposed growth model can be applied to scores on that alternate assessment. Grades for which Growth is Calculated – Some states calculate growth for grades 3-12, whereas other states only calculate growth for a subset of grades. This field indicates the grades in which states are calculating student growth. Number of Years for Students to Reach Proficiency – States use different numbers of years for students to reach proficiency. This field indicates the number of years that states use in their growth-based accountability models. Growth Tracked Only for Below Proficient Students – This field indicates if state growth models are only applied to students who are below proficient. Use Confidence Interval – This field indicates if the state will use a confidence interval in any way in the growth-based accountability model. Some states propose using a confidence interval around the percent of students reaching proficiency or meeting growth targets and others propose using a confidence interval around students’ growth estimates. Averaging of Calculations – This field identifies states that average calculations over more than one year or over students within subgroups. Incorporates Available Years of Achievement Data – This field indicates whether states use all years of previous achievement data or if the state uses only two years of available data. Growth Target Timeline Starts Over Each Year – This field indicates if states identify a starting year for student growth calculations and define time to reach proficiency by this starting year or whether states recalculate growth each year. Growth Target Timeline Aligns with Grade Configuration – This field indicates if the number of years in which students are expected to reach proficiency matches with the grade configurations in the state. For example, if students below proficiency are expected to grow at a speed so that they will reach proficiency in three years, does the state set the three years in a way that maps the three years onto the grades that each school unit serves, such as the three middle school grades. Accounts for Students Falling Off Track – This field identifies states that use growth to identify students who may be above proficiency but whose growth will likely result in their falling below proficiency at a later date. Minimum N Same as for AYP Status Model – This field identifies states that apply a minimum sample size rule to the growth model. For example, many states designate a minimum N (or sample size), such that the state does not include a subgroup in AYP calculations or growth model calculations if that subgroup has fewer students than the minimum sample size. 31 Growth Applied after Status and Safe Harbor Provisions – This field indicates states that use growth as an additional method for meeting AYP after applying the status model and the safe harbor provisions. Growth Reported at Individual Level – This field notes states that report growth on individual student reports or at the individual student level. STATE Variable Alaska AK Arizona AZ Arkansas AR Delaware DE Pilot approved in 2006 Resubmitting in 2007 Name of Growth Measure No No No No N/A N/A N/A Yes Measure of Academic Progress (MAP) Yes Yes Yes, on target only No Yes 2001-2002 No No Yes Yes 2004-2005 No No Yes, Grades 3-8 Yes 2005-2006 No No No Yes by changing performance levels No information found For all but 1.7% SWD For all but 0.9% LEP students 4-8 3-10 All students at proficiency or on track by 2013-2014 Scores on vertical scale Vertically aligned standards First year all grades tested Includes grade 3 students Includes students without prior year score Includes SWD taking alternate assessment Includes ELL taking alternate assessment Grades for which growth calculated Number of years for students to reach proficiency Growth tracked only for below proficient students Uses confidence interval No information found 4-9; 10 3-8 Yes 2001-2002 Yes No Yes Yes 4 grades 4-9; 3 grade 10 Yes Min (3 years or Grade 8) Yes 4 N/A Yes No 99% 99% around subgroup percents Yes for overall AYP only, 98%, one-tailed 41 unknown level Yes, up to 3 years starting 2007-2008 No Averaging of calculations Yes, uniform No Incorporates available years of achievement data Growth target timeline starts over each year No No Yes Yes Yes No No, unless discontinuously enrolled Yes No No No No No No Yes, n=20 for subgroups, n=40 for SWD or LEP Yes, n=40 for subgroups Yes, n=40 for subgroups Yes No, report growth for n=10; need max (90% of students in status model or 40 students) for subgroups Yes Yes Yes Yes Yes Yes No Growth target timeline aligns with grade configuration Accounts for students falling off track Minimum n same as for AYP status model Growth applied after status and safe harbor provisions Growth reported at individual student level Yes No STATE Variable Florida FL North Carolina NC Oregon OR Tennessee TN Pilot approved in 2006 Resubmitting in 2007 Name of Growth Measure All students at proficiency or on track by 2013-2014 Scores on vertical scale Vertically aligned standards First year all grades tested No Yes No Yes N/A Yes ABCs Growth Model No N/A Yes N/A Yes Yes Yes 2000-2001, grades 3-10 No Yes Yes Mid 1980s, new edition in 2005-2006 Yes, using a pretest score Yes Yes 2004-2005 Yes Yes 1992 (3-8), 2001 (high school) Yes, in terms of status only Includes students without prior year score No Yes, if pretest score on change scale Includes SWD taking alternate assessment No Yes, in terms of status only Includes ELL taking alternate assessment No Grades for which growth calculated Number of years defining on 3-10 Yes, if meet Full Academic Year and have previous scores on change scale 3-8 3 years 4 years Includes grade 3 students 51 Yes, in terms of current year scores Yes, in terms of current year scores Yes, in terms of status changes only Yes, in terms of status changes only Yes, in terms of status only Yes, in terms of status only Unclear 3-8, 10 3-8 4 years 3 years track Growth tracked only for below proficient students Uses confidence interval No No No No No No Yes, 95% CI Averaging of calculations No Yes, across students Incorporates available years of achievement data Growth target timeline starts over each year Growth target timeline aligns with grade configuration Accounts for students falling off track Minimum n same as for AYP status model Yes No No, includes previous or 2 previous scores No, reset only when move to new LEA No Yes, around school average growth Yes, across students No, planned for future Yes No Yes No Yes Possibly Yes Yes, n=30 or 15% of tested population Yes Yes, n=40, yet growth calculated when n < 40 Yes No, n=21 scores over 2 years for growth Yes Yes, greatest of (n=45 or 1%) Yes Not to parents Yes Yes Growth applied after status and safe harbor provisions Growth reported at individual student level Yes No Yes No Yes Although there are a large number of possible ways to measure growth and design accountability systems, there are a limited number of methods that underlie those possibilities. The next section describes six model types and eight characteristics that differentiate the model types. A table is provided to summarize the information. Model Descriptions Improvement: The change between different groups of students is measured from one year to the next. For example, the percent of fourth graders meeting standard in 2005 maybe compared to the percent of fourth graders meeting standard in 2006. This is the only growth model described here that does not track individual student's growth. The current NCLB “safe harbor” provision is an example of Improvement. Difference Gain Scores: This is a straightforward method of calculating growth. A student's score at a starting point is subtracted from the same student's score at an ending point. The difference or gain is the measure of an individual's growth. The difference scores can be aggregated to the school or district level to obtain a group growth measure. Growth relative to performance standards can be measured by determining the difference between a student's current score and the score that would meet standard in a set number of years (usually one to three). Dividing the difference by the number of years gives the annual gain needed. A student's actual gain can be compared to the target growth to see if the student is on track to meet standard. Residual Gain Scores: In this model, students' current scores are adjusted by their prior scores using simple linear regression. Each student has a predicted score based on their prior score(s). The difference between predicted and actual scores is the residual gain score and it is an indication of the student's growth compared with others in the group. Residual gains near zero indicate average growth, positive scores indicate greater than average growth and negative sores indicate less than average growth. Residual gain scores can be averaged to obtain a group growth 61 measure. Residual gain scores can be more reliable than difference gain scores, but they are not as easily integrated with performance standards in accountability systems such as NCLB because they focus on relative gain. Linear Equating: Equating methods set the first two or four moments of the distributions of consecutive years equal. A student’s growth is defined as the student’s score in Year 2 minus the student’s predicted score for Year 2. A student’s predicted score for Year 2 is the score in the distribution at Year 2 that corresponds to the student’s Year 1 score. The linear equating method results in a function that can be applied year to year. If the student’s score is above the expected score, the student is considered to have grown. If the student’s score is below the expected (predicted) score, the student is considered to have regressed. Expected growth is defined as maintaining location in the distribution year to year. Transition Matrix: This model tracks students’ growth at the performance standard level. A transition matrix is set up with the performance levels (e.g., Does not meet, Meets, Exceeds) for a given year as rows and the performance levels for a later year as columns. Each cell indicates the number or percent of students that moved from year 1 levels to year 2 levels. The diagonal cells indicate students that stayed at the same level, cells below the diagonal show the students that went down one or more levels and the cells above the diagonal show the students that moved to higher performance levels. Transition matrices an be combined to show the progress of students across all tested grades. Transition matrices are a clear presentation of a school's success (or lack thereof) in getting all students to meet standard. Multi-level: This model simultaneously estimates student-level and group-level (e.g. school or district) growth. There is evidence that multi-level models can be more accurate than difference or residual gain score models. However, even though the statistics have been around for many years, only recently has the computing power, software and expertise been widely available. Therefore the results of this model appear to be more complex because the methods are still unfamiliar to many people. Characteristics of Growth Models Database of matched student records over time (Student ID)- Most methods of measuring growth require analysis of individual student's results from two or more years. This means that student records from two different test administrations have to be combined or matched. Until recently, most systems lacked a student ID system that assigned each student a unique identification number that is recorded with any test that student takes as long as he or she is in the system. Without such an ID number, record matching must be based on some combination of name, birthdate or other demographic information. Because of changes in that information over time, combining students' test records is usually time consuming and prone to non-matches and mis-matches. The preferred solution is to develop a student ID system in which the ID number is part of the students' records system wide. This usually means integrating the ID into each school's student information system and maintaining a central database to assign and report the ID numbers. These changes require a significant investment or resources to develop and implement the new procedures. However, in the long run there should be a reduction in the work needed to match student records and an improvement in the quality of the information available. Requires common scale- Some growth methods require student scores to be reported on a common scale. Ideally this would mean that all the tests were written with measuring growth in mind and based on content standards that are aligned across grades. However it is possible to create a common scale for existing tests that were designed separately across grades. There are 71 technical issues and controversies about how to do this equating. Psychometric advice from experts should be sought before determining that a set of tests can be combined for measuring growth. Confidence Interval- A confidence interval (CI) is used to take into account the uncertainty in measuring growth. Sources for uncertainty include the normal measurement error of the test and sampling error. There are well established statistical techniques for estimating uncertainty and growth models use different techniques due to the differences in the way growth is calculated. Implementing a confidence interval is not simply a matter of applying a statistical technique. A decision must be made about the width of the confidence interval. A typical narrow CI is 68% (or 1 standard error) while a wider CI would be 95% or 99%. If the confidence interval is implemented around the target for growth, choosing a wider instead of a narrow CI will decrease the chances of incorrectly identifying a student or school as failing to meet the growth target. However, choosing a wide CI also increases the chances of incorrectly stating that adequate growth has been made when in fact it hasn't. Choosing the width of the CI always involves a compromise between those two types of errors. The policy-maker must weigh the consequences of each type of error and choose a CI that best serves the intended purpose of implementing a growth model. Includes students with missing scores- Student mobility is a potential problem in any model of growth that measures student achievement over time. If large numbers of students (i.e., more than 15%) do not stay in the same school long enough to take the test each time it is administered, then the sample of students whose scores are included in the model may not represent the whole school's enrollment. A problem would arise if the students with missing scores showed significantly higher or lower performance on the test. In the improvement model, all students' scores are included. However since individual students are not tracked over time, it is possible that the differences in performance of students who are moving in and out of the school contribute to the observed improvement. This could lead to over- or under- estimation of the school's effectiveness. Multi-level models use all the students' scores to estimate growth for both individuals and groups. However, students with only one score are estimated to make growth that was the average their group. A secondary problem with missing scores occurs when some groups have more missing scores than other groups. In that case the lack of data may mean that growth estimates for those groups are less reliable and may have to be excluded from reports. For all models, the effects of missing scores on growth estimates can be determined and should be examined. Includes Results From Alternate Tests- Since some models require measurements on a common scale, if alternative tests (e.g., for students with disabilities, English language learners, or high school end-of-course tests) do not produce scores on that scale, it may not be possible to include those students in the growth calculations. The Transition Matrix model is based on student progress as indicated by changes in the performance levels attained by students. If common performance levels have been set across different tests, the results can be combined. However, meaningful results depend on the assumption that the performance standard were set such that it is reasonable to assume that the performance levels on both tests indicate that students have the same knowledge and skills. Growth Question Answered- Growth models may be distinguished by the questions they answer. Determining the question you want to answer by using a growth model will make it easier to choose a growth model and to interpret the results of that model. Student Performance Standards Explicitly Included in Definition of Growth- For two growth models (Linear Equating and Transition Matrix ), the performance standard is built into the 81 model. Therefore there is no need to so through a separate process to set standards for adequate growth after the estimate of student growth are obtained. For the other models, users often conduct a standard setting process similar to the ones used to determine the individual performance standards for students at each grade level. Handles Non-linear Growth- Some growth models assume that each student's growth in achievement follows a straight line. This is generally a reasonable assumption. However, there is evidence that growth over many years is curved with elementary grade achievement growing at a greater rate than high school achievement. If growth is measured more frequently than once a year, there may differences in the rate of growth at different times. If you believe that students' growth is nonlinear, it maybe necessary to choose a growth model that can statistically model that type of growth.Table of Growth Model Characteristics Improvement Difference Residual Linear Gain Scores Gain Scores Equating Transition Matrix Multi-level Data Requirements Database of matched student records over time (Student ID) N Y Y Y Y Y Requires common scale N Y N N N Y Independent Groups t-Test Model Error Variance Model Error Variance NA Model Error Variance Includes students with missing scores Y N N N N Y Includes Results From Alternate Tests (Different scales) N N N N Y N Psychometric Issues Confidence Interval Growth Question Answered Did this year's Is the gain for students do a group higher better than last or lower than year's students? average? How much growth was produced by a group? Did students stay at the same percentile? Are students in a group How much of a making group's growth adequate is the result of progress across group-level performance effects? levels? Student Performance Standards Explicitly Included in Definition of Growth Y N N N Y N Handles non-linear growth N N Y N Y Y 91
© Copyright 2026 Paperzz