Frequently Asked Questions

This document addresses a number of questions that educators have asked about STAR Reading™ tests and score interpretations.

What is STAR Reading?

STAR Reading is a computer-adaptive reading assessment that provides accurate, norm-referenced reading scores on demand for students in grades 1-12. (Kindergarten students with a 100-word reading vocabulary can take STAR Reading; however, it is not normed for that grade.)

What is computer-adaptive technology?

STAR Reading uses computer-adaptive technology, which means the test adapts to the student's level of proficiency. If a student answers a question correctly, the difficulty level is increased. If a student misses a question, the difficulty level is decreased.

How can STAR Reading determine a child's reading level in less than 10 minutes?

Short test times are possible because the STAR Reading test is computer-adaptive: it adapts to test a student at his or her level of proficiency. Because the test can adjust to the student with virtually every question, it is more efficient than conventional pencil-and-paper tests and acquires more information about a student's reading ability in less time. This means the STAR Reading test can achieve measurement precision comparable to a conventional test that takes two or more times as long to administer.

What is Item Response Theory?

Item Response Theory (IRT) is an approach to psychometric test design and analysis that uses mathematical models to describe what happens when a student is administered a particular test question. IRT models give the probability of answering an item correctly as a function of the item's difficulty and the student's ability.

What are cloze and maze procedures?

These are terms for different kinds of fill-in-the-blank exercises that test a student's ability to create meaning from contextual information, and as such they have elements in common with the STAR Reading test design.
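The two ideas above, an IRT model and adaptive item selection, can be illustrated with a short sketch. Renaissance Learning does not publish the exact model or selection rule used by STAR Reading, so the one-parameter (Rasch) model and the simple step update below are generic illustrations rather than the product's actual algorithm; real adaptive tests typically use maximum-likelihood ability estimation and information-based item selection.

```python
import math
import random

def p_correct(theta, b):
    """Rasch (one-parameter IRT) model: probability that a student with
    ability theta answers an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def adaptive_test(true_theta, item_bank, num_items=25, step=0.5):
    """Toy adaptive loop: pick the item closest to the current ability
    estimate, then move the estimate up or down based on the response."""
    estimate = 0.0
    for _ in range(num_items):
        difficulty = min(item_bank, key=lambda b: abs(b - estimate))
        answered_correctly = random.random() < p_correct(true_theta, difficulty)
        estimate += step if answered_correctly else -step
    return estimate

bank = [i / 10 for i in range(-30, 31)]   # difficulties from -3.0 to +3.0
print(adaptive_test(true_theta=1.2, item_bank=bank))
```

Because each item is chosen near the student's current estimate, almost every question is informative, which is why an adaptive test can match the precision of a much longer fixed-form test.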
How does STAR Reading measure comprehension? Isn't it a vocabulary test?

The vocabulary-in-context test items, while using a common format for assessing reading, require reading comprehension. Each test item is a complete, contextual sentence with a tightly controlled vocabulary level. The semantics and syntax of each context sentence are arranged to provide clues to the correct cloze word. The student must actually interpret the meaning of (in other words, comprehend) the sentence in order to choose the correct answer, because all of the answer choices "fit" the context sentence either semantically or syntactically. In effect, each sentence provides a mini-selection, and the student demonstrates the ability to interpret its correct meaning. This is, after all, what most reading theorists believe reading comprehension to be: the ability to draw meaning from text.

In the course of taking the vocabulary-in-context section of STAR Reading tests, students read and respond to a significant amount of text. The STAR Reading test typically asks the student to demonstrate comprehension of material that ranges over several grade levels. Students will read, use context clues from, interpret the meaning of, and attempt to answer 20 to 25 cloze sentences across these levels, generally totaling more than 300 words. The student must select the correct word from sets of words that are all at the same reading level and that at least partially fit the sentence context. Students clearly must demonstrate reading comprehension to correctly respond to these 20 to 25 questions.

A child's level of vocabulary development is a major factor, perhaps the major factor, in determining his or her ability to comprehend written material. Decades of reading research have consistently demonstrated that a student's level of vocabulary knowledge is the most important single element in determining the child's ability to read with comprehension. Tests of vocabulary knowledge typically correlate better with valid assessments of reading comprehension than do any other components of reading. In fact, vocabulary tests often relate more closely to sound measures of reading comprehension than various measures of comprehension do to each other. Knowledge of word meaning is simply a fundamental component of reading comprehension.

The student's performance on the vocabulary-in-context section is used to determine the initial difficulty level of the subsequent authentic text passage items. Although the authentic text passage section consists of just five items, the accurate entry level and the continuing adaptive selection process mean that all of the authentic text passage items are closely matched to the student's reading ability level. This results in unusually high measurement efficiency. For these reasons, the STAR Reading test design and item format provide a valid procedure for assessing a student's reading comprehension.

For which grades can STAR Reading be used?

STAR Reading was designed and normed for students in grades 1-12. Although it is not normed for kindergarten, the software can be used with kindergarten students who have a 100-sight-word reading vocabulary.

How do I know if a student is ready to take a STAR Reading assessment?

A student should have a reading vocabulary of at least 100 words; in other words, the student should have at least beginning reading skills. Although there is no "list" for determining this, some teachers use the STAR Reading practice items as a gauge. Students could also be identified as a Probable Reader by STAR Early Literacy™. Practically speaking, if the student can work through the practice questions unassisted, that student should be testable in STAR Reading. If the student has a lot of trouble getting through the practice, he or she probably does not have the basic skills necessary to be measured by STAR Reading.

How many items are on a STAR Reading test?

STAR Reading administers 25 items to all students. Students tested at grades 3-12 receive 20 vocabulary-in-context (short comprehension) items and five extended comprehension items. For students tested at grades K, 1, and 2, all 25 items are short comprehension items.
How will students with a fear of taking tests do with STAR Reading tests?

Students who have a fear of tests should be less disadvantaged by the STAR Reading test than they are by conventional tests. The STAR Reading test purposely starts out at a level that most students will find very easy, in order to give almost all students immediate success. Once the student has had an opportunity to gain some confidence with the relatively easy material, STAR Reading moves into more challenging material in order to assess the student's level of reading proficiency. In addition, most students find it fun to take STAR Reading tests on the computer, which helps relieve some test anxiety.

Does STAR Reading provide accessibility to special education students or English language learners with accommodations and/or special forms?

The STAR Reading test has time-out limits for individual items based on a student's grade level. Students in grades K-2 have up to 60 seconds to answer each item during test sessions. Students in grades 3-12 are allowed 45 seconds to answer each vocabulary-in-context (short comprehension) item (the first 20 items) and 90 seconds to answer each extended comprehension item (the last five test items). STAR Reading provides the option of extended time limits for selected students who, in the judgment of the test administrator, require more than the standard amount of time to read and answer the test questions. Extended time may be a valuable accommodation for English language learners as well as for some students with disabilities. When the extended time limit accommodation is elected, students have three times the standard time limits to answer each question.
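The time-limit rules just described can be summarized in a small sketch. The function name, structure, and the encoding of kindergarten as grade 0 are ours for illustration; they are not part of the STAR Reading software.

```python
def time_limit_seconds(grade, item_number, extended_time=False):
    """Per-item time limits as described above (item numbers are 1-based;
    kindergarten is encoded as grade 0)."""
    if grade <= 2:              # grades K-2: all items are short items
        limit = 60
    elif item_number <= 20:     # grades 3-12: vocabulary-in-context items
        limit = 45
    else:                       # grades 3-12: extended comprehension items
        limit = 90
    # The extended-time accommodation triples the standard limit.
    return limit * 3 if extended_time else limit

print(time_limit_seconds(grade=4, item_number=23))                      # 90
print(time_limit_seconds(grade=4, item_number=23, extended_time=True))  # 270
```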
Is there any way for a teacher to see exactly which items a student answered correctly and incorrectly?

No, for two reasons. First, in computer-adaptive testing, the student's performance on individual items is not as meaningful as the pattern of responses to the entire test. The student's pattern of performance on all items taken together forms the basis of the scores STAR Reading reports. Second, for purposes of test security, we decided to do everything possible to protect our items from compromise and overexposure.

How often can students take STAR Reading assessments?

Schools often use the STAR assessments two to five times a year for purposes including screening, progress monitoring, placement, and benchmark assessment. STAR Reading may be used monthly or weekly in progress monitoring programs, and it has been found to meet the standards of the National Center on Response to Intervention.

What is the difference between criterion-referenced and norm-referenced testing?

Criterion-referenced scores measure a student's performance by comparing it to a standard criterion: what the student knows or can do. For example: Has the student attained mastery of specific curriculum objectives? Does the student meet or exceed a specific performance standard? What proportion of the questions that measure knowledge of a specific content domain can the student answer correctly? The criterion-referenced score reported by STAR Reading software is the Instructional Reading Level (IRL), which compares a student's test performance to vocabulary lists based on the Educational Development Laboratory (EDL) core vocabulary.

Norm-referenced scores express the student's standing in relation to his or her peers across the country in the same grade. A norm-referenced score is related to a changing criterion: changes in the reference group will be accompanied by changes in the norm-referenced scores. For example, when new norms are developed for a test, the distribution of test scores in the new reference group typically differs to some extent from that of the previous reference group, because of change over time in both the reference population and the attribute the test measures. Such changes usually affect norm-referenced scores, including percentile rank (PR), grade equivalent (GE), and normal curve equivalent (NCE) scores.

Is STAR Reading a criterion-referenced or a norm-referenced test?

STAR Reading was developed within a criterion-referenced framework, by writing test items designed to reflect reading ability that is characteristic of specific grade levels. In addition, the resulting tests have been subjected to a nationally representative norms development process. As a result, STAR Reading provides both a criterion-referenced score (IRL) and norm-referenced scores (PR, GE, and NCE). Teachers can use STAR Reading's criterion-referenced score to estimate the student's level of functioning in reading. They can also use its norm-referenced scores to assess students' standing in relation to other students in the same grade across the country (based on PR) and to students in other grades (based on GE scores).

What is a Scaled Score (SS)?

Because of computer-adaptive technology, STAR Reading creates a virtually unlimited number of test forms. In order to make the results of all tests comparable, and to provide a basis for deriving norm-referenced scores, it is necessary to convert the results of all STAR Reading tests to scores on a common scale. STAR Reading scaled scores range from 0-1400.

What is a Grade Equivalent (GE) score?

A GE indicates the normal grade placement of students for whom a particular score is typical. The GE is a norm-referenced score: it provides a comparison of a student's performance with that of other students around the nation. If a student receives a GE of 4.0, this means that the student scored as well on the STAR Reading test as the typical student at the beginning of grade 4. It does not mean that the student can read books that are written at a fourth-grade level; it means only that he or she reads as well as fourth-grade students in the norm group. For example, the median (typical) Scaled Score obtained by third-graders in the seventh month of the school year (April) during STAR Reading 2.x norming was 432. Thus, the Grade Equivalent score for anyone receiving a Scaled Score of 432 is 3.7.
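To make the Scaled Score-to-GE conversion concrete, here is a sketch that linearly interpolates a GE from a table of norming medians. The 432-to-3.7 pair comes from the example above; every other table row, and the use of simple linear interpolation, are assumptions for illustration only. The real conversion uses the full norming tables rather than this toy table.

```python
# Hypothetical norming medians: (grade_equivalent, median_scaled_score).
# Only the (3.7, 432) row is taken from the example above.
NORM_TABLE = [(3.0, 375), (3.7, 432), (4.0, 460), (5.0, 540)]

def scaled_to_ge(scaled_score):
    """Linearly interpolate a GE from a scaled score using the table."""
    if scaled_score <= NORM_TABLE[0][1]:
        return NORM_TABLE[0][0]
    for (ge_lo, ss_lo), (ge_hi, ss_hi) in zip(NORM_TABLE, NORM_TABLE[1:]):
        if scaled_score <= ss_hi:
            frac = (scaled_score - ss_lo) / (ss_hi - ss_lo)
            return round(ge_lo + frac * (ge_hi - ge_lo), 1)
    return NORM_TABLE[-1][0]

print(scaled_to_ge(432))   # 3.7, matching the example in the text
```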
What is an Instructional Reading Level (IRL) score?

IRL is a criterion-referenced score that estimates the grade level of written material at which the student can most effectively be taught. For example, if a student (regardless of current grade placement) receives a STAR Reading IRL of 4.0, this indicates that the student can most likely learn without experiencing too many difficulties when using materials written at a fourth-grade level. When a student completes a STAR Reading test, the software analyzes the student's performance and calculates his or her IRL: the highest reading level at which the student is 80% proficient (or higher) at comprehending material with assistance (Gickling & Thompson, 2001). Research has found that this level of comprehension corresponds to being 90-98% proficient at recognizing words (Gickling & Havertape, 1981; Johnson, Kress, & Pikulski, 1987; McCormick, 1999). STAR Reading does not directly assess word recognition.

What does an Instructional Reading Level (IRL) score of pre-primer mean?

The assignment of a pre-primer IRL means that the student taking the test is a non-reader.

How are GE and IRL scores related?

One obvious similarity is that both are expressed in grade equivalents. Although GE and IRL scores are highly correlated, they do not mean the same thing. While they may coincide in some cases, they can differ markedly, particularly for students who are significantly above or below average, where the functional reading level is likely to differ substantially from the grade level. It is also important to note that GE is a norm-referenced score while IRL is a criterion-referenced score.

References:
Gickling, E. E., & Havertape, S. (1981). Curriculum-based assessment (CBA). Minneapolis, MN: School Psychology Inservice Training Network.
Gickling, E. E., & Thompson, V. E. (2001). Putting the learning needs of children first. In B. Sornson (Ed.), Preventing early learning failure. Alexandria, VA: ASCD.
Johnson, M. S., Kress, R. A., & Pikulski, J. J. (1987). Informal reading inventories. Newark, DE: International Reading Association.
McCormick, S. (1999). Instructing students who have literacy problems (3rd ed.). Englewood Cliffs, NJ: Prentice-Hall.

Why is it that GE and IRL scores sometimes differ?

These two scores are both expressed in terms of grade levels, but the similarity largely ends there. The GE is a norm-referenced score that indicates the grade level at which the performance of students in the normative population was most similar to that of the student in question. In contrast, the IRL is a criterion-referenced score that estimates the grade level of written material at which the student can most effectively be taught. GE and IRL scores are highly correlated, but they do not connote the same thing, and there is no reason to expect them to be identical. While they may coincide in some cases, they can differ markedly, particularly for students who are significantly above or below average, where the functional reading level is likely to differ substantially from the grade level.

What is a Normal Curve Equivalent (NCE) score?

NCE scores are often used for research purposes. The NCE is an equal-interval scale ranging from 1-99. When we see an NCE gain of 4.3 points, we know that, because the scale has equal intervals, the student has gained 4.3 points since the last test. An analogy helps here: runners in a race finish first, second, and third. We know that the second-place runner finished one place behind the first-place runner, but we do not know how many seconds behind. This is similar to how percentile rank scores work in STAR Reading: we know that a student with a percentile rank of 85 scored better than 85% of the students in the norming group, but not by how much. The runners' finish times, however, can also be recorded in seconds, and because all seconds are equal intervals, we know precisely how far ahead one runner finished compared to another. This is the way the NCE score works.
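The relationship between PR and NCE can be written down exactly. NCE is defined so that values 1, 50, and 99 coincide with the same percentile ranks, with equal intervals in between; the 21.06 multiplier below is the standard NCE scaling constant. This sketch uses only the Python standard library.

```python
from statistics import NormalDist

def pr_to_nce(percentile_rank):
    """Convert a percentile rank (1-99) to a normal curve equivalent:
    NCE = 50 + 21.06 * z, where z is the normal deviate of the PR."""
    z = NormalDist().inv_cdf(percentile_rank / 100)
    return 50 + 21.06 * z

print(round(pr_to_nce(50)))  # 50 (the two scales agree at 1, 50, and 99)
print(round(pr_to_nce(85)))  # about 72
```

Note how a PR of 85 maps to an NCE of about 72: percentile ranks bunch up in the middle of the distribution, and the NCE transformation stretches them back onto an equal-interval scale.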
What is an Estimated Oral Reading Fluency score?

An Estimated Oral Reading Fluency (Est. ORF) score is an estimate of a student's ability to read words quickly and accurately in order to comprehend text. Students with high oral reading fluency demonstrate accurate decoding, automatic word recognition, and appropriate use of the rhythmic aspects of language (e.g., intonation, phrasing, pitch, and emphasis). Est. ORF is reported in correct words per minute, and it is based on an equating study of the relationship between STAR Reading performance and oral reading fluency.

How do Zone of Proximal Development (ZPD) Ranges fit in?

The ZPD defines the reading level range from which the student should be selecting books in order to achieve optimal growth in reading skills without experiencing frustration in the Accelerated Reader™ program. The ZPD is derived from a student's demonstrated grade equivalent score. Renaissance Learning™ developed the ZPD ranges according to Vygotskian theory, based on an analysis of Accelerated Reader book reading data from 80,000 students in the 1996-1997 school year.

As a teacher, which score do I use when choosing materials for my students?

The IRL is the best score to consider when choosing instructional materials for student use. The IRL can serve as a starting point, but it does not replace a teacher's professional judgment based on his or her knowledge of the child.

How is the STAR Reading Diagnostic Report constructed from the test results?

The content of the Diagnostic Report is based on diagnostic codes (though the codes themselves are not printed on the reports). Diagnostic codes are based on two factors: the Grade Equivalent (GE) score that the student achieved on the STAR Reading test, and the Percentile Rank (PR) that the student achieved on the same test. The resulting diagnostic code determines which descriptive text and prescriptive recommendations appear on the Diagnostic Report for a student. Diagnostic Reports for students performing at or below the 25th percentile include additional prescriptive information helpful for assisting those students.

Why would a student's Normal Curve Equivalent (NCE) and Percentile Rank (PR) scores go down while the Scaled Score and Grade Equivalent score went up?

NCE and PR scores can go down, even though GE and Scaled Scores went up, any time the student's performance relative to the norms group has slipped. For example, if a student tested at the beginning of the school year and again at the end, his or her Scaled Score and GE score will almost certainly go up as a result of maturation and instruction. However, if the student's scores have gone up at a slower-than-normal rate, his or her PR and NCE scores would probably go down. (Remember, PR and NCE scores are norm-referenced and compare the student to a peer group.)

Why would a student's Scaled Score and Grade Equivalent score go up while his Percentile Rank and Normal Curve Equivalent went down?

Different scores measure different types of growth. Absolute growth reflects any and all growth that has occurred over a period of time and is reported in terms of SS and GE. Relative growth reflects growth that is above and beyond "normal" growth relative to a peer group (based on a normative group) and is reported in terms of PR and NCE. Because they measure different types of growth, GE and PR scores may not increase and decrease together. For example, when a student's GE increases less than that of the student's peers, she will lose ground in terms of PR, as the sketch below illustrates.
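Here is the sketch referred to above. All of the peer scores are invented for illustration; the point is only that a scaled score can rise (absolute growth) while the percentile rank falls (relative growth) because the peer group grew faster.

```python
# Hypothetical peer scaled scores at fall and spring testing windows.
peers_fall   = [280, 300, 310, 330, 350, 370, 390, 410, 430, 450]
peers_spring = [340, 365, 380, 400, 425, 450, 470, 495, 515, 540]

def percentile_rank(score, peer_scores):
    """Percent of the peer group scoring below the given score."""
    return 100 * sum(p < score for p in peer_scores) / len(peer_scores)

student_fall, student_spring = 350, 390   # scaled score rose 40 points
print(percentile_rank(student_fall, peers_fall))      # 40.0
print(percentile_rank(student_spring, peers_spring))  # 30.0
```

The student's scaled score improved by 40 points, yet the percentile rank dropped from 40 to 30 because the typical peer improved by more.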
Will I see growth over time in my gifted and talented students' STAR Reading scores?

Any student has the potential for growth, especially those who consistently practice their reading skills. When looking at STAR Reading scores, keep in mind that the test is normed, and we are not comparing gifted and talented students to other gifted and talented students. Rather, we are comparing them to all students in the norming sample.

What is Student Growth Percentile (SGP)?

A Student Growth Percentile compares a student's growth to that of his or her academic peers nationwide. Academic peers are students in the same grade with a similar scaled score on a STAR assessment at the beginning of the time period you are examining. SGP is reported on a 1-99 scale. For example, if a student has an SGP of 90, it means his growth from one test to another was better than that of 90 percent of students at a similar achievement level in the same grade. Adopted by 12 states, SGP is a widely accepted growth measure. Until now, SGP has been reported only for summative state tests; Renaissance Learning has adapted the SGP model to the STAR interim tests. As a result, STAR assessments are the first interim assessments to report SGP.
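A simplified sketch of the SGP idea: treat students with similar pretest scores as academic peers and rank the student's gain among the peers' gains. This is not Renaissance's actual methodology, which rests on a more sophisticated statistical model; the data, the peer window of 25 scaled-score points, and the function name below are all invented for illustration.

```python
# Hypothetical (pretest, posttest) scaled-score pairs for same-grade students.
cohort = [(300, 340), (305, 360), (310, 330), (315, 380),
          (320, 350), (295, 345), (308, 372), (312, 336)]

def student_growth_percentile(pre, post, cohort, window=25):
    """Percent of academic peers (students with a pretest score within
    `window` points) whose gain the student exceeded."""
    peers = [(p, q) for p, q in cohort if abs(p - pre) <= window]
    gains = [q - p for p, q in peers]
    return 100 * sum(g < (post - pre) for g in gains) / len(gains)

print(student_growth_percentile(310, 365, cohort))  # 62.5 in this toy cohort
```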
Why do some students' STAR Reading test scores vary widely from the results of other standardized tests?

The simple answer is that this is more than likely the result of the standard error of measurement (SEM) of both testing instruments. Such a simple answer, however, hides the complexity of the many factors that contribute to the measurement error inherent in psychometric instruments. You will find that the results of STAR Reading agree very well with almost all other standardized reading test results. All standardized test scores have measurement error, and the STAR Reading measurement error is comparable to that of most other standardized tests. When one compares results from different tests taken at different times, it is not unusual to see differences in test scores ranging from two to five grade levels. This is true when comparing results from other test instruments as well. Standardized tests provide approximate measurements. The STAR Reading test is no different in this regard, but its adaptive nature makes its scores more reliable than conventional test scores near the minimum and maximum scores on a given form. A common shortcoming of conventional tests involves "floor" and "ceiling" effects at each test level. The STAR Reading test is not subject to this shortcoming because of its adaptive branching and large item bank. Other factors, such as student motivation and the testing environment, also differ between STAR Reading and high-stakes tests.

Why do we see some students performing at a lower level now than they were nine weeks ago?

This is a result of measurement error. As mentioned above, all psychometric instruments, including the STAR Reading test, have some level of measurement error associated with them. The STAR Reading Technical Manual discusses the standard error of measurement (SEM) in depth and should be consulted to better understand this issue. The standard error of the average result for a group is substantially lower than that of an individual test score. Therefore, frequent testing to measure the progress of classes, grades, or school populations is less susceptible to measurement error than looking at the results of an individual student.

I see improvement in a child's reading skills. Why did her STAR Reading score drop?

Some students' scores may decline from one month to the next; progress is best measured over a longer time span. The Annual Progress Report and the Student Progress Monitoring Report provide graphic representations of a student's progress, including the trend over time. Scores on any test may decline from one administration to the next for a variety of reasons. Two of the most common are:

Measurement error. Educational tests seek to estimate a student's true ability level, but none do so with perfect precision; some degree of error is always present. As a result, if your class were to take a test twice, many students' scores would go up and some might go down, even if students' skills improved during the interval between tests. This is called measurement error, and it is very similar to public opinion polls that present results with a plus-or-minus margin of error.

Fluctuations in performance. A test score is a measure of a child's performance on a given day, and human performance is not always consistent from one occasion to the next. A child may perform at her best during one administration of a test and somewhat below her best on another occasion because of illness, distraction, anxiety, motivation, and so on. We are not surprised by such fluctuations when they affect physical performance such as running speed or jumping distance: if a student ran a mile in 10 minutes one day and then two weeks later ran that same mile in 12 minutes, you likely would not be concerned. Similarly, we should not be surprised when test performance varies on occasion. Remember, STAR Reading scores are just one measure of a student's academic performance.

A student's GE score can decrease as the year progresses if the student is not making progress. The GE score is closely related to the STAR Reading Scaled Score; if the Scaled Score declines over a period of time, the GE score will decline as well. It is not uncommon for some students' Scaled Scores and GE scores to decline from one occasion to the next, but it is highly unusual for these scores to be lower at the end of the school year than they were at the beginning. Such instances should be investigated to try to identify the reason. (See Understanding STAR Test Score Declines for more information.)
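As a rough illustration of how measurement error is quantified: an observed score is best read as a band of plus or minus one or two SEMs, and the standard error of a group average shrinks as the group gets larger. The SEM value of 25 scaled-score points below is hypothetical; actual SEM values are discussed in the STAR Reading Technical Manual.

```python
import math

def score_band(observed_score, sem, z=1.0):
    """Range that likely contains the student's true score. With z = 1.0
    the band covers roughly 68% of cases; z = 1.96 covers roughly 95%."""
    return observed_score - z * sem, observed_score + z * sem

def group_mean_se(sem, n):
    """Standard error of a class average of n scores, under the usual
    independence assumptions: it shrinks by the square root of n."""
    return sem / math.sqrt(n)

# Hypothetical: observed scaled score 420 with an SEM of 25 points.
print(score_band(420, 25))         # (395.0, 445.0)
print(score_band(420, 25, 1.96))   # (371.0, 469.0)
print(group_mean_se(25, 25))       # 5.0 for a class of 25 students
```

This is why two administrations nine weeks apart can cross in either direction for an individual student, while a class average of the same tests moves far more steadily.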
Can incorrect grade placements be compensated for?

Teachers cannot make retroactive corrections to a student's grade placement by editing the grade assignments in a student's record or by adjusting the increments for the summer months after students have tested. In other words, STAR Reading software cannot go back in time and correct scores resulting from erroneous grade placement information. Thus, it is extremely important for the test administrator to make sure that proper grade placement procedures are followed. If you discover that a student has tested with an incorrect grade placement assignment, the procedures outlined in the STAR Reading Technical Manual can be used to arrive at corrected estimates of the student's percentile rank and normal curve equivalent scores.

What evidence do you have that STAR Reading is valid and reliable?

This evidence comes in two forms. First, we have demonstrated test-retest reliability estimates that are very good. Second, the correlation of STAR Reading results with those of other standardized tests is also quite strong. (See the STAR Reading Technical Manual.) In addition, the federally funded National Center on Response to Intervention (NCRTI) found STAR Reading to meet the highest scientific standards of quality: STAR Reading earned the highest scores among screening tools and the highest possible ratings in eight of nine categories for the progress monitoring tools reviewed by NCRTI.

Do all students receive longer reading comprehension items for the last five questions of the STAR Reading test?

No. Only third through twelfth graders receive the longer reading comprehension items, which were extracted from passages of authentic fiction and nonfiction text. Passages at the third-grade level are about 30 words in length, while passages at the high school level are about 100 words in length. Kindergarten, first-grade, and second-grade students receive the shorter vocabulary-in-context items throughout the STAR Reading test.

How does the STAR Reading test compare to other standardized tests?

Very well. The STAR Reading test has a standard error and reliability that are very comparable to those of other standardized norm-referenced tests, and STAR Reading test results correlate well with results from these other test instruments. When performing our national norming of the STAR Reading 2.0 test, we also gathered student performance data from several other commonly used reading tests. These data comprised more than 12,000 student test results from test instruments including the CAT, ITBS, MAT, Stanford, TAKS, CTBS, and others. We computed correlation coefficients between STAR Reading 2.0 results and the results of each of these test instruments for which we had sufficient data; these correlation coefficients are included in the STAR Reading Technical Manual. Using IRT computer-adaptive technology, the STAR Reading test achieves its results with fewer test items and shorter test times than other standardized norm-referenced tests.

How is growth measured in STAR Reading?

There are two types of academic growth (or gains) that may be evidenced in test results: absolute and relative growth. Absolute growth reflects any and all growth that has occurred. Relative growth reflects only growth that is above and beyond "normal" growth (i.e., beyond typical growth in a reference or norming group). In general, norm-referenced scores such as the student growth percentile and normal curve equivalent indicate relative growth, while scaled scores and the IRL reflect absolute growth.

What if I am seeing negative growth in PR and NCE scores (the scores needed for Model/Master Classroom)?

First, be sure that two grades are not grouped together on reports.
Grouping grades together would not be comparing apples to apples, because each grade should be compared to its own norms. Second, determine whether just a few students are bringing down the scores for the entire class. If so, consider retesting those students.

How did you choose schools to participate in the norming of STAR Reading 2.0?

Schools were chosen to be representative of the nation as a whole. The sample was balanced by geographic region, district size, socioeconomic status, and public-versus-private funding characteristics. The renorming sample included 269 schools and 29,627 students in grades 1-12. The final norms tables were weighted to account for deviations between the characteristics of the sample and the national population (see the weighting sketch at the end of this document).

Can or should STAR Reading replace a school's current standardized tests?

This is up to the school system to decide, although it is not what STAR Reading was primarily designed to do. The primary purpose of STAR Reading is to provide teachers with a tool to improve the match between instruction and each student's reading level. Every school system has to consider its needs in the area of reading assessment and decide which instruments will meet those needs. We are happy to provide as much information as we can to help schools make these decisions, but we cannot make the decisions for them.
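Returning to the norming question above: weighting a norms table means counting each sampled student so that the sample's mix matches the national population. A minimal sketch of that kind of post-stratification weighting follows; the regional shares are invented for illustration and are not the actual norming data.

```python
# Hypothetical post-stratification by region: weight each sampled student
# so the sample's regional mix matches the national population.
population_share = {"Northeast": 0.18, "Midwest": 0.22,
                    "South": 0.37, "West": 0.23}
sample_share     = {"Northeast": 0.25, "Midwest": 0.25,
                    "South": 0.30, "West": 0.20}

# A student counts more than 1 if the region is underrepresented in the
# sample, and less than 1 if it is overrepresented.
weights = {region: population_share[region] / sample_share[region]
           for region in population_share}
print(weights)  # e.g. a South student counts for about 1.23 (0.37 / 0.30)
```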