Some Key Points for Test Evaluators and Developers
Scott Marion, Center for Assessment
Eighth Annual MARCES Conference, University of Maryland
October 11-12, 2007

Key Points
- Evaluating the technical quality of AA-AAS must focus on the validity of the test-based inferences
- Specifying the interpretative argument will help create the validity evaluation plan
  – Consequences are essential
- Prioritize and get started!

It's About Validation
- The purpose of technical documentation is to provide data for critically evaluating the inferences drawn from AA-AAS scores and the logic of the interpretative argument

What about alignment, reliability, comparability, etc.?
- Reliability and comparability might be overrated, especially given the narrow way reliability has been applied in AA-AAS evaluations (see the sketch at the end of this deck)
- Alignment, if pursued at a deep level, can provide useful evidence about the nature of the construct

Specifying the Argument
- Simply saying that the AA-AAS is being designed to fulfill NCLB and IDEA requirements is inadequate
- Like a theory of action, the interpretative argument makes the values explicit and specifies the logical connections among the various components of the system, for example:
  – observed scores will increase for students in response to high-quality instruction
  – the quality of school-level instruction will increase as a result of appropriate use of test scores to target professional development
  – as a result of decisions based on AA-AAS scores, the educational opportunities of students will improve
- These all lead to important validity inquiries

Consequences are Central
- With AA-AAS, we are trying to balance technical rigor with social justice
- There is a fundamental belief that educational opportunities for students with significant cognitive disabilities will improve when they are included in test-based accountability systems
  – If this is a value of the system (and I believe it is for most state AA-AAS systems), then the evaluation of technical quality must include a critical examination of the intended positive and unintended negative effects of the uses of the test scores in this system

Prioritizing and Focusing
- Several authors, including Kane (2006), Haertel (1999), Lane (2003), Ryan (2002), and Shepard (1993), have offered suggestions for prioritizing and focusing validity evaluations
  – There is no "right" way, but there are many wrong ways
- States need to develop a validity studies plan, based on the interpretative argument (or another legitimate organizing framework), to organize both the studies and the synthesis of evidence
  – Many of these studies, particularly consequential studies, need early planning and some initial data collection

A Heuristic to Help Organize and Focus the Validity Evaluation (Marion, Quenemoen, & Kearns, 2006)
[Diagram: an assessment-triangle-style heuristic with the validity evaluation at its center]
- COGNITION: Student Population; Academic Content; Theory of Learning
- OBSERVATION: Assessment System; Test Development; Administration; Scoring
- INTERPRETATION: Reporting; Alignment; Item Analysis/DIF/Bias; Measurement Error; Scaling and Equating; Standard Setting
- VALIDITY EVALUATION: Empirical Evidence; Theory and Logic (argument); Consequential Features
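
A note on the reliability point above: the "narrow way" reliability has often been applied to AA-AAS is, for example, an internal-consistency index computed on a single set of item or task scores. As a minimal sketch of what that narrow view actually computes (illustrative only; the function and data below are hypothetical and are not part of Marion's presentation), here is Cronbach's alpha in Python:

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for a (students x items) matrix of scores."""
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)      # sample variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of students' total scores
    return (n_items / (n_items - 1)) * (1.0 - item_vars.sum() / total_var)

# Hypothetical data: 6 students x 4 tasks, each scored 0-3
scores = np.array([
    [3, 2, 3, 2],
    [1, 1, 0, 1],
    [2, 2, 2, 3],
    [0, 1, 1, 0],
    [3, 3, 2, 3],
    [2, 1, 2, 2],
])
print(f"alpha = {cronbach_alpha(scores):.3f}")
```

An index like this captures only the consistency of scores across items on one occasion with one set of raters; it says nothing about rater, occasion, or task-sampling variance. That silence is one reason a reliability case built solely on such indices is a narrow basis for evaluating AA-AAS, where rated performance tasks and portfolios are common.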