I'm a Psychometrician and I'm Here to Help (and Learn)

Some Key Points for Test Evaluators and Developers
Scott Marion
Center for Assessment
Eighth Annual MARCES Conference
University of Maryland
October 11-12, 2007
Key Points
Evaluating the technical quality of AA-AAS must focus on the validity of the test-based inferences
Specifying the interpretative argument will help create the validity evaluation plan
– Consequences are essential
Prioritize and get started!
It’s About Validation
The purpose of technical documentation is to provide data for critically evaluating the inferences from the AA-AAS scores and the logic of the interpretative argument
What about alignment, reliability, comparability, etc.?
Reliability and comparability might be overrated, especially in the narrow way that reliability has been applied in AA-AAS evaluations (see the sketch below)
Alignment, if pursued at a deep level, can provide useful evidence about the nature of the construct
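To make the "narrow" point concrete, here is a minimal sketch (my illustration, not from the presentation) in Python with simulated dichotomous responses and a hypothetical cut score, contrasting coefficient alpha with decision consistency at the cut, the classification-level question that matters for AA-AAS:

import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data: 200 students by 10 dichotomous items, simulated here.
n_students, n_items = 200, 10
ability = rng.normal(size=(n_students, 1))
responses = (rng.normal(size=(n_students, n_items)) < ability).astype(int)

# Coefficient alpha: the narrow internal-consistency view of reliability.
item_vars = responses.var(axis=0, ddof=1)
total_var = responses.sum(axis=1).var(ddof=1)
alpha = (n_items / (n_items - 1)) * (1 - item_vars.sum() / total_var)

# Decision consistency: would the same proficient/not-proficient call be
# made from two parallel halves of the test? (cut score is hypothetical)
half1 = responses[:, ::2].sum(axis=1)   # odd-numbered items
half2 = responses[:, 1::2].sum(axis=1)  # even-numbered items
cut = 3
consistency = ((half1 >= cut) == (half2 >= cut)).mean()

print(f"coefficient alpha:    {alpha:.2f}")
print(f"decision consistency: {consistency:.2f}")

A test can show a middling alpha yet classify students quite consistently (or the reverse), which is why a narrow reliability index alone says little about the dependability of AA-AAS proficiency decisions.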
Specifying the argument
Simply saying that the AA-AAS is being designed to fulfill NCLB and IDEA requirements is inadequate
Like a theory of action, the interpretative argument makes the values explicit and specifies the logical connections among the various components of the system, for example:
– observed scores will increase for students in response to high-quality instruction
– the quality of school-level instruction will increase as a result of appropriate use of test scores to target professional development
– as a result of decisions based on AA-AAS scores, the educational opportunities of students will improve
These all lead to important validity inquiries, as sketched below
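Purely as an illustration (the study names below are hypothetical, not drawn from any state plan), the propositions above can be recorded as explicit claims mapped to the evidence that would support or refute them, which is the skeleton of a validity evaluation plan:

# A minimal sketch: the interpretative argument as explicit claims,
# each mapped to validity studies that could support or refute it.
# Claims paraphrase the bullets above; study names are hypothetical.
interpretative_argument = {
    "observed scores increase with high-quality instruction": [
        "longitudinal analysis of score gains",
        "instructional quality observation study",
    ],
    "score use improves school-level instruction": [
        "survey of professional development targeting",
    ],
    "decisions based on AA-AAS scores improve student opportunities": [
        "opportunity-to-learn study",
        "interviews with IEP teams on consequences",
    ],
}

for claim, studies in interpretative_argument.items():
    print(f"CLAIM: {claim}")
    for study in studies:
        print(f"  evidence: {study}")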
Consequences are Central
With AA-AAS we are trying to balance technical rigor with social justice
There is a fundamental belief that the educational opportunities of students with significant cognitive disabilities will improve when they are included in test-based accountability systems
– If this is a value of the system (and I believe it is for most state AA-AAS systems), then the evaluation of technical quality must include a critical examination of the intended positive and unintended negative effects of the uses of the test scores in this system
Prioritizing and Focusing
Several authors, including Kane (2006), Haertel (1999), Lane (2003), Ryan (2002), and Shepard (1993), have offered suggestions for prioritizing and focusing validity evaluations
– There is no “right” way, but many wrong ways
States need to develop a validity studies plan, based upon the interpretative argument (or other legitimate organizing framework), to organize both the studies and the synthesis of evidence
– Many of these studies, particularly consequential studies, need early planning and some initial data collection, as sketched below
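One minimal way to operationalize such a plan, again with hypothetical studies and priorities, is to rank candidate studies against the interpretative argument and flag those, like consequential studies, that need baseline data collected early:

from dataclasses import dataclass

@dataclass
class ValidityStudy:
    name: str
    claim: str            # proposition of the interpretative argument it tests
    priority: int         # 1 = highest; set by the state's evaluators
    needs_baseline: bool  # consequential studies often need early data

# Hypothetical entries for illustration only.
plan = [
    ValidityStudy("alignment review",
                  "scores reflect the academic content", 2, False),
    ValidityStudy("opportunity-to-learn survey",
                  "score use improves student opportunities", 1, True),
    ValidityStudy("decision consistency analysis",
                  "proficiency classifications are dependable", 3, False),
]

# Work highest-priority studies first, surfacing those that require
# baseline data collection before results can be interpreted.
for study in sorted(plan, key=lambda s: s.priority):
    flag = " [collect baseline data now]" if study.needs_baseline else ""
    print(f"[{study.priority}] {study.name}: {study.claim}{flag}")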
A heuristic to help organize and focus the validity evaluation (Marion, Quenemoen, & Kearns, 2006)
[Figure: the assessment triangle, with the validity evaluation at its center]
– COGNITION: Student Population; Academic Content; Theory of Learning
– OBSERVATION: Assessment System; Test Development; Administration; Scoring
– INTERPRETATION: Reporting; Alignment; Item Analysis/DIF/Bias; Measurement Error; Scaling and Equating; Standard Setting
– VALIDITY EVALUATION (center): Empirical Evidence; Theory and Logic (argument); Consequential Features