For: AERA-D Rasch Measurement SIG. New Orleans, USA, 3rd April, 2002
Symposium entitled: Is Educational Measurement really possible?
Chair and Organizer: Assoc. Prof. Trevor Bond
Paul Barrett
email: [email protected], [email protected]
http://www.liv.ac.uk/~pbarrett/paulhome.htm
Affiliations:
Mariner7 Ltd., Auckland NZ
Dept. of Psychology, Univ. of Auckland
Dept. of Clinical Psychology, Univ. of Liverpool
Paul Barrett: Mariner7 - University of Auckland
AERA-D SIG-RM-108 – April 2002
• Consider creating a measure of something
or somebody. What comes first?
• Some notion of the specific feature or
attribute [variable] for which you would like
to differentially identify magnitudes, or?
• The operation of constructing
measurement?
• We construct measurement for a “purpose”.
• That purpose requires that we have a reason
for such construction.
• This reason implies that we have an
“understanding” about why the measurement
of something will be worth constructing.
• This understanding requires some
meaning-laden statements about the
something, otherwise we would never have
thought of the purpose in the first place.
Now you have to decide what kind of
measurement you wish to make – “kind of”
is conveniently delimited by whether or not
you wish to invoke a concatenation operation
using a “standard unit” - with which all
measures of magnitude of your
variable will be constructed.
So, a “standard unit” is not for you.
This means you will at best be able to make
measurement using only ordinal relations
between measured magnitudes, i.e.
magnitudes expressed as ranks and order-relations,
with no additive operations. This
is still of utility, but it will limit the
understanding of causes of phenomena to
explanations couched in terms of qualitative
magnitudes.
Otherwise known as the siren call of social
scientists! The distinguished speakers before
me have clearly explained the errors of logic
involved with using the operations of additive
concatenation, without ever considering
whether the variable so measured was
capable of sustaining such operations. You
now know it’s not “all just numbers”.
Maybe – but, hope persists because there is
Rasch scaling. Using this, we can create
“probabilistic” equal-interval measurement of
latent variables using order-relation (or even
categorical relation) comparisons between
“levels” or “categories” on two variables to
imply the equal-interval properties of the 3rd
“derived” latent variable. Voila! Quantitative
scientific measurement, with the standard unit
constructed via the scaling operation.
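As a minimal sketch of what "equal-interval" means here, the dichotomous Rasch model gives the probability of a positive response as a logistic function of the difference between a person parameter and an item parameter. The function names and the numeric values below are illustrative assumptions, not taken from the talk. The sketch shows that a fixed difference between two persons yields the same log-odds difference on any item, which is the sense in which the logit acts as a constructed unit.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a positive response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def log_odds(p: float) -> float:
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))


# Two hypothetical persons one logit apart, compared on an easy and a hard item.
person_low, person_high = 0.5, 1.5
difficulties = [-1.0, 2.0]

gaps = []
for d in difficulties:
    gap = log_odds(rasch_probability(person_high, d)) - log_odds(
        rasch_probability(person_low, d)
    )
    gaps.append(gap)

# The log-odds gap between the two persons does not depend on the item.
print(gaps)
```

Note that this only illustrates the form of the model. Whether real data fit it, and whether the resulting variable means anything, are separate questions.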
Can any two variables be so “conjoined” to
produce a 3rd?
If the axioms of conjoint measurement are
met, or the statistical fit indices of the
modelling procedure deem it so, then YES.
Barrett and Kline (1981) showed this with a
single test constructed of Extraversion and
Neuroticism items from the Eysenck
Personality Questionnaire.
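One of the conjoint measurement conditions can be checked directly on a two-way table. The sketch below tests only the independence (single cancellation) condition: the ordering of any two levels of one variable must not reverse across levels of the other. It is a necessary condition only; double cancellation and the remaining axioms are not checked. The function name and the example tables are hypothetical, and treating ties as compatible with either ordering is a simplifying assumption.

```python
from itertools import combinations


def satisfies_independence(table):
    """True if no pair of rows, and no pair of columns, reverses its ordering."""

    def rows_consistent(t):
        for r1, r2 in combinations(t, 2):
            # Sign of each cellwise comparison: +1, 0, or -1.
            signs = {(a > b) - (a < b) for a, b in zip(r1, r2)}
            signs.discard(0)  # ties treated as compatible with either order (assumption)
            if len(signs) > 1:
                return False
        return True

    transposed = [list(col) for col in zip(*table)]
    return rows_consistent(table) and rows_consistent(transposed)


# A table built by adding a row effect and a column effect passes the check.
additive = [[a + b for b in (0.0, 1.0, 2.5)] for a in (0.0, 0.7, 3.0)]

# A table whose two rows cross over fails it.
crossover = [[1, 2, 3], [3, 2, 1]]

print(satisfies_independence(additive), satisfies_independence(crossover))
```

Passing such a check says nothing about whether the two variables belong together conceptually, which is the point of the example that follows.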
Didn’t Robert Wood (1978) fit random
“coin-toss” data with a Rasch model?
YES, easily in fact. What was created was
an equal-interval latent variable of “coin-tossing” ability! This is the result of
measurement construction which is literally
meaningless.
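The flavour of this result can be sketched in a few lines. This is not Wood's procedure and not a full Rasch estimation: it simulates coin-toss "responses" and converts each raw score to a log-odds value, which is only a crude stand-in for an ability estimate. The sample sizes, the seed, and all names are arbitrary assumptions. The point is that the arithmetic happily returns a spread of logit values for data that contain nothing but noise.

```python
import math
import random

random.seed(2002)  # arbitrary seed so the sketch is repeatable

N_PERSONS, N_ITEMS = 200, 20

# Every "response" is a fair coin toss: 1 = heads, 0 = tails.
responses = [
    [random.randint(0, 1) for _ in range(N_ITEMS)] for _ in range(N_PERSONS)
]


def logit(p: float) -> float:
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


# Crude "ability" value per person: log-odds of the raw score proportion.
# Zero and perfect scores are skipped because their log-odds are infinite.
abilities = []
for row in responses:
    score = sum(row)
    if 0 < score < N_ITEMS:
        abilities.append(logit(score / N_ITEMS))

print(len(abilities), min(abilities), max(abilities))
```

Nothing in the numbers themselves signals that "coin-tossing ability" is an empty construct. That judgement has to come from outside the scaling.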
Yes, definitely. The fault lies not with the
methodology in the two examples just
mentioned, but with the “meaning-laden”
conditions under which the scaling was
initiated.
And you’d still have exactly the same
problems. My abstract mentioned
Rozeboom’s paradox … a simple example of
the failure of additive concatenation with the
physical measurement of the volume of two
liquids.
“Suppose that my garage contains exactly a
pint of brake fluid, exactly two quarts of
alcohol, exactly a gallon of distilled water,
and not a trace of any other fluids. Do you
agree that this implies that my garage contains
13 pints of liquid -- not just approximately
but EXACTLY? If so, how do you reach that
conclusion? The proximal argument, of
course, is that 1 pint + 4 pints + 8 pints equals
13 pints”
“But how does additivity apply here? Does it
follow by what we have learned about the
physical concatenation of liquids that if the
fluids in my hypothetical garage were to be
poured together into a suitably calibrated
container of sufficient size, the mixture would
measure exactly 13 pints or differ from that
only by what can be explained by evaporation
and some adhesion to the original containers?”
“Unfortunately for the concatenation
argument, this is known (or at least alleged
by second-hand information I have
encountered) to be untrue: Distilled
water absorbs enough alcohol to reduce the
combined volumes to something less than
the expected 13 pints.”
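The arithmetic in the quoted example can be laid out explicitly. The unit conversions (2 pints per quart, 8 pints per gallon) are the ones the quotation relies on. The contraction fraction is a purely hypothetical placeholder, not a measured value, and applying it to the whole mixture is a deliberate simplification. It is there only to show how the poured total can depart from the nominal sum.

```python
PINTS_PER_QUART = 2
PINTS_PER_GALLON = 8

# Quantities from the quoted example, converted to pints.
brake_fluid = 1
alcohol = 2 * PINTS_PER_QUART
water = 1 * PINTS_PER_GALLON

# Numerical addition of the three measured volumes.
nominal_total = brake_fluid + alcohol + water

# Hypothetical shrinkage on mixing; an illustrative placeholder only.
HYPOTHETICAL_CONTRACTION = 0.03


def mixed_volume(volumes, contraction):
    """Volume after pouring together, assuming a single overall contraction fraction."""
    return sum(volumes) * (1.0 - contraction)


poured_total = mixed_volume([brake_fluid, alcohol, water], HYPOTHETICAL_CONTRACTION)

print(nominal_total, poured_total)
```

The sum of the numbers is exact. Whether physical concatenation reproduces that sum depends on what the liquids do to one another.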
Because it shows again that manipulating
quantitative “objects” without regard to the
meaning of the units of those objects, can lead
to unexpected errors – as shown above. This
is not to argue for a one-to-one isomorphism
of a unit with some physical property of an
object (as per Campbell’s thesis), but to stress
that we do need to understand why a measure
works as it does.
In the Rozeboom paradox, this knowing why
is crucial to explaining the “apparent” failure
of simple additive concatenation. The
explanation in this particular case is found in
the consideration of the constituent properties
of volume, measured as pints of liquid, and
understood in terms of molecular density,
composition, and interaction.
Can we really make measurement like this in
the social sciences … Well, what do we mean
by “this”? If we mean can we produce equal-interval
measurement scales with certain
properties of measurement, for meaningful
variables, then YES. The literature abounds
with examples from many domains. BUT …
We now realise that constructing
measurement scales without deep regard to
“what it is” that we are attempting to measure
is likely to end in a morass of competing and
virtually arbitrary scales with practically no
coherent means of choosing one over any
other. What’s worse is that none of them are
likely to make very accurate measurement,
except by chance alone.
Michael Maraun (1998) …
“Measurement practice in psychology
misdiagnoses the nature of measurement,
since it is uniformly formulated under the
assumption that measurement claims are
justified in large part through empirical case-building
[aka construct validity]” (p. 436)
“The problem is that in construct validation
theory, knowing about something is confused
with an understanding of the meaning of the
concept that denotes that something …”
But, if we look at Cronbach and Meehl …
“Scientifically speaking, to ‘make clear what
something is’ means to set forth the laws in
which it occurs.” (Cronbach and Meehl, 1955)
“This is mistaken. One may know more or less
about it, build a correct or incorrect case about
it, articulate to a greater or lesser extent the
laws into which it enters, discover much, or
very little about it. However, these activities
all presuppose rules for the application of the
concept that denotes it (e.g. intelligence,
dominance)…”
“Furthermore, one must be prepared to cite
these standards as justification for the claim
that these empirical facts are about it.”
(Maraun, 1998, p. 448)
Let us also note Maraun (1998) again…
“The relative lack of success of
measurement in the social sciences as
compared to the physical sciences is
attributable to their sharply different
conceptual foundations….
“In particular, the physical sciences rest
on a bedrock of technical concepts, whilst
psychology rests on a web of common-or-garden
psychological concepts. These
concepts have notoriously complicated
grammars [of meaning].” (p. 436)
Whatever measurement is to be created, if at
all possible, will need to be created within a
normative frame of meaning. That is, it is
impossible to create measures of
“intelligence” or “depression” unless these
constructs/phenomena have a normative
meaning such that all investigators can work
within this common semantic framework.
Without this normative agreement, as is the
case today, chaos reigns as measure after
measure is produced but with no common
units or unambiguous common/shared
meaning.
Step 1: Define a normative meaning for your
technical construct. It will be of narrow focus,
capable of sustaining precise measurement.
Step 2: Construct appropriate normative
measurement for this construct.
Step 3: Test the hypothesis that the
measurement does indeed imply the normative
meaning of the construct as defined.
Step 4: Maintain this measurement via
metrology.
Of course, it is already a reality. We classify
and make ordinal statements as a matter of
course. However, the real question is:
Is it possible to make measurement that
accords with the properties required by the
instantiation of a concatenation
function using a standard measurement
unit?
To answer this question, steps 1 to 3 above
are required. So, put away the Hierarchical
Linear Modelling, Structural Equation
Models, and all those over-powered statistical
modelling and scaling techniques that demand
an additive concatenation unit. Sit down, and
first THINK about Step 1 – and how you aim
to define the technical, normative, meaning of
the constructs you propose.
Then, if you wish to construct measurement
using an additive concatenation unit, use the
Rasch model as your means of
operationalising this. Begin to construct
your measurement with that very specific,
proposed normative meaning in mind. Then,
if you are successful in producing
measurement with the properties you
specified, maintain it via metrology.
Barrett, P. T., & Kline, P. (1981) A
comparison between Rasch analysis and
factor analysis of items in the EPQ.
Personality Study and Group Behaviour, 1, 2,
11-28
Cronbach, L.J., & Meehl, P. (1955) Construct
validity in Psychological Tests. Psychological
Bulletin, 52, 281-302.
Maraun, M. (1998) Measurement as a
Normative Practice. Theory and Psychology,
8, 4, 435-461.
Rozeboom, W. W. (1966) Scaling Theory and the
Nature of Measurement. Synthese, 16, 170-233.
Wood, R. (1978) Fitting the Rasch Model: a
heady tale. British Journal of Mathematical
and Statistical Psychology, 31, 27-32.