Methods of Psychological Research Online 1998, Vol.3, No.2
Internet: http://www.pabst-publishers.de/mpr/
© 1999 Pabst Science Publishers
Latent Change in Discrete Data:
Unidimensional, Multidimensional, and
Mixture Distribution Rasch Models for
the Analysis of Repeated Observations
Thorsten Meiser
Psychologisches Institut der Universität Bonn
Elsbeth Stern
Max-Planck-Institut für Bildungsforschung Berlin
Rolf Langeheine
Institut für die Pädagogik der Naturwissenschaften an der Universität Kiel
Abstract
A survey of unidimensional, multidimensional, and mixture distribution Rasch
models is presented with a particular focus on model applications for the analysis of change in repeated measures designs. A mover-stayer mixed Rasch
model is specified for modeling global change in one of two latent subpopulations and for modeling stability in the other latent subpopulation. The application of unidimensional, multidimensional, and mixture distribution Rasch
models for the analysis of change is illustrated using data on the development
of understanding and solving arithmetic word problems in elementary school
children.
Keywords: Rasch model, measurement of change, finite mixture distributions
Summary (Zusammenfassung)
A survey of unidimensional, multidimensional, and mixture distribution Rasch models is presented, with special consideration of their application to the analysis of change in repeated measures designs. A mover-stayer mixture distribution Rasch model is formulated for modeling global change in one of two latent subpopulations and for modeling stability in the other latent subpopulation. The application of unidimensional, multidimensional, and mixture distribution Rasch models to the analysis of change is illustrated using data on the development of understanding and solving mathematical word problems in elementary school children.
Keywords (Schlüsselwörter): Rasch model, measurement of change, finite mixture distributions
The class of psychometric models presented by Georg Rasch (1960/1980, 1968)
has gained considerable interest and has stimulated an impressive amount of research on statistical models in the social and behavioral sciences. The impact of
Rasch's work on modern test theory is documented in several monographs which
summarize current developments in the mathematical modeling of test data as well
as new perspectives in the application of test models to social science issues (e.g.,
Fischer & Molenaar, 1995; Langeheine & Rost, 1988; Rost & Langeheine, 1997; van
der Linden & Hambleton, 1997).
The present article is devoted to the application of Rasch models to longitudinal
test data which comprise repeated observations of the same items and the same
sample of individuals at different occasions. For this purpose, the unidimensional
Rasch model and some recent extensions, such as multidimensional Rasch models,
mixture distribution Rasch models, and submodels thereof, are outlined in the next
three sections with a particular focus on the analysis of change. With respect to
the loglinear representation of unidimensional and multidimensional Rasch models, a model hierarchy is pointed out which facilitates testing the assumption of
homogeneity of change across individuals. In the context of mixture distribution
Rasch models, a mover-stayer model is presented which allows an a priori specification of different patterns of change for different latent subpopulations. Throughout
the text, the models are presented in terms of polytomous item response models
which include models for dichotomous items as special cases. In regard to polytomous Rasch models, two contradictory views concerning the ordering of threshold
difficulties are described and a mediating view is suggested.
In the fourth section, the analysis of change by means of unidimensional, multidimensional, and mixture distribution Rasch models is illustrated using longitudinal
data on the development of understanding and solving arithmetic word problems in
elementary school children. Thereby the fourth section extends a previous analysis
of these data which was based on a latent class state-mastery model embedded in
latent Markov chain models (Langeheine, Stern, & van de Pol, 1994).
1 Unidimensional Rasch Models
Unidimensional Rasch models for polytomous items with ordered response categories $x = 0, \ldots, m_i$ can be derived by the appropriate parameterization of the threshold probabilities. The threshold probability $\pi_{vix}$ of item $i$, $i \in \{1, \ldots, I\}$, threshold $x$, $x \in \{1, \ldots, m_i\}$, and person $v$, $v \in \{1, \ldots, N\}$, is defined as the conditional probability of person $v$ responding with response category $x$ in item $i$, given that person $v$ responds with either category $x - 1$ or $x$:
$$\pi_{vix} = \frac{p(X_{vi} = x)}{p(X_{vi} = x - 1) + p(X_{vi} = x)}. \tag{1}$$
For unidimensional Rasch models, the threshold probabilities are parameterized in terms of the logistic function, where the argument of the function is the difference between a person parameter $\theta_v$ and a threshold parameter $\tau_{ix}$:
$$\pi_{vix} = \frac{\exp(\theta_v - \tau_{ix})}{1 + \exp(\theta_v - \tau_{ix})}. \tag{2}$$
The parameter $\theta_v$ reflects the latent ability or attitude of person $v$, whereas the parameter $\tau_{ix}$ reflects the difficulty of threshold $x$ of item $i$. From Equation (2), the probability of response category $x$ for item $i$ and individual $v$ can be derived by a recursive formula:
$$p(X_{vi} = x) = p(X_{vi} = x - 1)\,\frac{\pi_{vix}}{1 - \pi_{vix}} = \ldots = \frac{\exp\left(x\,\theta_v - \sum_{s=1}^{x} \tau_{is}\right)}{\sum_{y=0}^{m_i} \exp\left(y\,\theta_v - \sum_{s=1}^{y} \tau_{is}\right)} \tag{3}$$
(Andrich, 1978; Masters, 1982; Rost, 1988) with $\sum_{s=1}^{0} \tau_{is} := 0$. The probability of response vector $\mathbf{X}_v$ containing the responses to a given set of $I$ items, $\mathbf{X}_v = (X_{v1}, \ldots, X_{vI})$, results from multiplying the probabilities of the single item responses, that is, multiplying Equation (3) over $i$:
$$p(\mathbf{X}_v = (x_1, \ldots, x_I)) = \frac{\exp\left(t\,\theta_v - \sum_{i=1}^{I} \sum_{s=1}^{x_i} \tau_{is}\right)}{\prod_{i=1}^{I} \sum_{y=0}^{m_i} \exp\left(y\,\theta_v - \sum_{s=1}^{y} \tau_{is}\right)}, \tag{4}$$
where $t = \sum_{i=1}^{I} x_i$ denotes the total score of the item responses. Equation (4) rests on the assumption of local independence, that is, stochastic independence of responses conditional on the person and threshold parameters.
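The category probabilities of the partial credit parameterization and the probability of a response vector under local independence can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: the function names `pcm_category_probs` and `pattern_prob` and all numerical values are hypothetical, and the person parameter is treated as known.

```python
import math
from itertools import product


def pcm_category_probs(theta, thresholds):
    """Category probabilities p(X = 0), ..., p(X = m) for one item.

    The exponent for category x is x * theta minus the sum of the first x
    threshold parameters; the empty sum for x = 0 is defined as 0.
    """
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + theta - tau)
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(value - peak) for value in logits]
    total = sum(weights)
    return [w / total for w in weights]


def pattern_prob(theta, item_thresholds, responses):
    """Probability of a response vector, assuming local independence."""
    prob = 1.0
    for thresholds, x in zip(item_thresholds, responses):
        prob *= pcm_category_probs(theta, thresholds)[x]
    return prob


# Hypothetical values: one four-category item and one dichotomous item.
theta = 0.5
items = [[-1.0, 0.0, 1.0], [0.3]]

probs = pcm_category_probs(theta, items[0])
# Threshold probability of threshold 2, recovered from adjacent categories.
pi_2 = probs[2] / (probs[1] + probs[2])
# The probabilities of all possible response vectors sum to one.
total = sum(pattern_prob(theta, items, (x1, x2))
            for x1, x2 in product(range(4), range(2)))
```

In this sketch, `pi_2` coincides with the logistic function of the difference between the person parameter and the second threshold parameter, which is the defining property of the threshold probabilities.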
Several special cases of the general unidimensional Rasch model for polytomous items, which is also called the partial credit model (PCM; Masters, 1982; Masters & Wright, 1997), can be specified by restrictions on the threshold parameters $\tau_{ix}$. Equality constraints on the differences between threshold parameters across items,
$$\tau_{ix} = \sigma_i + \tau_x \quad \text{with} \quad \sum_{s=1}^{m} \tau_s = 0, \tag{5}$$
result in the well-known rating scale model (RSM; Andrich, 1978). Equality constraints on the differences between adjacent threshold parameters within each item,
$$\tau_{ix} = \sigma_i + \left(x - \frac{m_i + 1}{2}\right) \delta_i, \tag{6}$$
result in the dispersion model (Andrich, 1982).
1.1 Interpretation of Threshold Parameters
Since the parameterization of the threshold probabilities in Equation (2) is equivalent to the unidimensional Rasch model for dichotomous items, the threshold parameters $\tau_{ix}$ have the same interpretation as the item parameters in the dichotomous Rasch model. In particular, the following relations hold:
$\theta_v = \tau_{ix} \Leftrightarrow \pi_{vix} = .5$,
$\tau_{ix}$ is the turning point of the threshold characteristic curve, and
$\theta_v = \tau_{ix} \Leftrightarrow p(X_{vi} = x - 1) = p(X_{vi} = x)$.
A controversial issue concerns the question whether the claim of ordered response
categories implies a corresponding order on the threshold parameters within each
item, that is: does the assumption that response category $x$ of item $i$ reflects a higher
amount of latent ability or attitude than response category $x - 1$, for $x = 1, \ldots, m_i$,
necessarily imply the order relation $\tau_{i1} < \tau_{i2} < \ldots < \tau_{i m_i}$? In recent publications,
Masters and Wright (1997) and Andrich, de Jong, and Sheridan (1997) elaborated
their contradictory views on this issue.
Masters and Wright (1997) pointed out that each threshold parameter refers to
the comparison of a single pair of adjacent categories (see Equation (2)) and that
the parameters can therefore have any order:
Because each item parameter [i.e., threshold parameter in the present
terminology] in the PCM is defined locally with respect to just two
adjacent categories (rather than taking into account all categories simultaneously), the item parameters in the model can take any order.
(Masters & Wright, 1997, p. 105)
An example for an achievement item with four different response categories, one for entirely incorrect responses, two for partial solutions, and one for the correct solution, is the math item
$$\frac{\sqrt{5 - 1}}{4} = \; ?$$
The solution of the item requires three consecutive steps:
1. $5 - 1 = 4$,
2. $\sqrt{4} = 2$,
3. $2/4 = 0.5$.
If no step is taken successfully, the response is entirely incorrect (ignoring the possibility of guessing the solution) and scored as $x = 0$. Finishing only step 1 or finishing steps 1 and 2 leads to the interim results "4" or "2" respectively, scored as categories $x = 1$ and $x = 2$. Finishing all steps successfully leads to the correct solution "0.5" which is scored as category $x = 3$. Since threshold 2 (i.e., scoring 2, rather than 1) requires taking a square root and therefore may be regarded as more difficult than threshold 3 (i.e., scoring 3, rather than 2) which involves solving a simple fraction, the natural order of threshold parameters for this task would be $\tau_{i1} < \tau_{i3} < \tau_{i2}$. Thus, the order of threshold parameters does not correspond to the order of response categories $x$. However, this does not violate the basic assumption that response category $x = 3$ reflects more effort or ability than does category $x = 2$, unless the division required in step 3 is considered to be trivial.
In contrast to the view of Masters and Wright (1997), Andrich et al. (1997) emphasized that the effects of the threshold parameters are not confined to adjacent
categories and that each category probability of item i depends on the entire set
of threshold parameters of that item (see Equation (3)). In particular, the authors
regard the ascending order of threshold parameters as an implication of the hypothesis of ordered response categories and, as a consequence, as a criterion for model
evaluation:
The ordering of the thresholds which divide the latent unidimensional
continuum into categories itself is a hypothesis about the data which is
embedded in the model. Although formal statistical tests of fit could be
constructed to test that the empirical ordering is not consistent with the
intended ordering, reversed threshold estimates is sufficient evidence to
conclude that the empirical ordering is not consistent with the intended
ordering. (Andrich et al., 1997, p. 68)
The rationale behind this view is based on the presumed process of generating a response to an item in terms of integrating single responses to each of the thresholds (Andrich, 1978, 1985): considering the possible patterns of responses to the two thresholds of a trichotomous item, for instance, yields the sample space $\Omega = \{(0, 0), (1, 0), (0, 1), (1, 1)\}$, where (0,0) means that neither of the thresholds is passed, (1,0) means that only the first threshold is passed, etc. The requirement to integrate the threshold responses to a unique response on the item reduces the sample space of the item to $\Omega' = \{0 = (0, 0), 1 = (1, 0), 2 = (1, 1)\}$ so that the pattern $(0, 1)$ in $\Omega$ has to be dropped from the set of available response pairs. As a consequence, the assumption of ordered response categories implies a Guttman structure on the threshold responses within each item reflecting the order $\tau_{i1} < \tau_{i2} < \ldots < \tau_{i m_i}$.
This brief reflection of the contradictory views reveals that the hypotheses about
the response generating processes are crucial to the application of a psychometric
model and should be made explicit and underpinned by substantial theory. As a
mediating view on the issue, we suggest that violations of the Guttman structure
in the ordering of threshold parameters may be accepted for achievement items, if
the cognitive processes involved in the single steps of a task are well specified and if
an additional cognitive operation, resulting in a higher response category, indicates
some degree of effort or ability on the latent continuum to be measured (cf. Masters
& Wright, 1997). In contrast, applications of Rasch models to attitude items may
well include the assumption of ascending threshold difficulties, because individuals
have to select the appropriate category by taking into consideration the whole set of
available categories at once (cf. Andrich et al., 1997). Hence, for attitude items the
ordering of the response categories is a hypothesis which can be tested by means of
the ordering of threshold difficulties.
1.2 Linear Logistic Test Models for Measuring Change
Fischer and Parzer (1991) and Fischer and Ponocny (1994) introduced polytomous
Rasch models with linear constraints on the threshold parameters which allow measuring change and (quasi-) experimental treatment effects in repeated observations
(see Fischer & Ponocny, 1995, for an overview). These linearly restricted polytomous Rasch models are generalizations of the linear logistic test model (LLTM) for
the assessment of change with dichotomous items (Fischer, 1983, 1995; Fischer &
Formann, 1982). As before, only the polytomous case is considered here, because
it encompasses the application to dichotomous items as a special case.
If a set of $I$ items is assessed at two measurement occasions $T_1$ and $T_2$, the resulting set of $2I$ items can be divided into the subset of $I$ items $1, \ldots, I$ observed at $T_1$ and a further subset of $I$ virtually new items $I + 1, \ldots, 2I$ observed at $T_2$. In the linearly restricted PCM (Fischer & Ponocny, 1994), the thresholds of the first $I$ items are parameterized in terms of Equation (2), while the threshold parameters of the virtually new items $I + 1, \ldots, 2I$ are decomposed into the initial threshold parameters and a set of $J$ treatment effects:
$$\tau_{(I+i)x} = \tau_{ix} - \sum_{j=1}^{J} q_j\,\eta_j. \tag{7}$$
In Equation (7), $q_j$ denotes the dosage of treatment $j$, which may differ between experimental groups, and $\eta_j$ denotes the effect of treatment $j$. By means of replacing the effect parameter $\eta_j$ by an item-specific effect parameter $\eta_{ji}$, the assumption of constant treatment effects across items is dropped. Analogous linear decompositions have been proposed for the parameters of the RSM (Fischer & Parzer, 1991).
Methods of conditional maximum-likelihood (CML) estimation of the parameters
in the linearly restricted PCM and RSM are presented by Fischer and Parzer (1991)
and Fischer and Ponocny (1994, 1995).
If no experimental treatments are applied, development occurring in the interval
from $T_1$ to $T_2$ can be measured by setting $J = 1$ and $q_1 = 1$. Then $\eta_1$ reflects the
individuals' global amount of change on the latent continuum (cf. Langeheine, 1993;
Rost & Spada, 1983; Spada & McGaw, 1985). Generalizations to more than two
measurement occasions are straightforward.
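The decomposition of the thresholds of the virtually new items can be sketched as follows. This is a minimal illustration under stated assumptions: the sign convention (a positive effect lowers the thresholds at the second occasion), the function names `category_probs` and `virtual_thresholds`, and all numerical values are hypothetical choices made for this example.

```python
import math


def category_probs(theta, thresholds):
    """Partial credit category probabilities for one item."""
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + theta - tau)
    peak = max(logits)
    weights = [math.exp(value - peak) for value in logits]
    total = sum(weights)
    return [w / total for w in weights]


def virtual_thresholds(thresholds, dosages, effects):
    """Thresholds of a virtually new item at the second occasion.

    Assumption of this sketch: each threshold is lowered by the weighted
    sum of treatment effects, sum_j q_j * eta_j.
    """
    shift = sum(q * eta for q, eta in zip(dosages, effects))
    return [tau - shift for tau in thresholds]


# Hypothetical item with two thresholds; global change: J = 1, q_1 = 1.
tau_t1 = [-0.5, 0.8]
tau_t2 = virtual_thresholds(tau_t1, dosages=[1.0], effects=[0.6])

# Lowering all thresholds by eta is equivalent to raising the person
# parameter by eta while keeping the initial thresholds.
probs_t2 = category_probs(0.0, tau_t2)
probs_shifted = category_probs(0.6, tau_t1)
```

The equality of `probs_t2` and `probs_shifted` shows why a single change parameter can be read as a global shift of all individuals on the latent continuum.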
The linear logistic test models for dichotomous and polytomous items maintain
the assumption of unidimensionality across items. Another family of models, called
linear logistic test models with relaxed assumptions (LLRA), abandons this condition (Fischer, 1983, 1995; Fischer & Formann, 1982). In LLRA models, each of
the items 1; :::; I may measure a trait of its own, although all items are presumed
to assess the same kind and amount of change. The relaxation is accomplished
by replacing the difference of the person and threshold parameter, $\theta_v - \tau_{ix}$, by a joint parameter which denotes the interaction of both, $\mu_{vix}$. For the virtual items $I + 1, \ldots, 2I$, the joint term $\mu_{v(I+i)x}$ is decomposed into the initial parameter $\mu_{vix}$ plus a linear combination of change parameters as in the LLTM (see Equation (7)).
While LLRA models are multidimensional in nature, they do not contain explicit
hypotheses about the latent dimensions underlying the responses to any particular
item. Alternative multidimensional Rasch models which allow specifying the latent
structure of each threshold are presented in the next section.
2 Multidimensional Rasch Models
If it is assumed that the observed responses are affected by more than one latent trait, the threshold probabilities can be parameterized in terms of the logistic function where the argument involves the sum over several latent dimensions $d$, $d = 1, \ldots, D$:
$$\pi_{vix} = \frac{\exp\left(\sum_{d=1}^{D} w_{ixd}\,(\theta_{vd} - \tau_{ixd})\right)}{1 + \exp\left(\sum_{d=1}^{D} w_{ixd}\,(\theta_{vd} - \tau_{ixd})\right)}. \tag{8}$$
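The weights $w_{ixd} \in \{0, 1\}$ specify whether passing threshold $x$ of item $i$ involves latent trait $d$ or not (for details, see Meiser, 1996). The category probabilities can be derived from Equation (8) by the recursive formula used in Equation (3):
$$p(X_{vi} = x) = \frac{\exp\left(\sum_{d=1}^{D} \sum_{s=1}^{x} w_{isd}\,\theta_{vd} - \sum_{d=1}^{D} \sum_{s=1}^{x} w_{isd}\,\tau_{isd}\right)}{\sum_{y=0}^{m_i} \exp\left(\sum_{d=1}^{D} \sum_{s=1}^{y} w_{isd}\,\theta_{vd} - \sum_{d=1}^{D} \sum_{s=1}^{y} w_{isd}\,\tau_{isd}\right)} \tag{9}$$
with $\tau_{isd} = 0$ for $w_{isd} = 0$. As in the unidimensional case, the distribution of response vector $\mathbf{X}_v$ results from multiplication over $i$:
$$p(\mathbf{X}_v = (x_1, \ldots, x_I)) = \frac{\exp\left(\sum_{d=1}^{D} t_d\,\theta_{vd} - \sum_{d=1}^{D} \sum_{i=1}^{I} \sum_{s=1}^{x_i} w_{isd}\,\tau_{isd}\right)}{\prod_{i=1}^{I} \sum_{y=0}^{m_i} \exp\left(\sum_{d=1}^{D} \sum_{s=1}^{y} w_{isd}\,\theta_{vd} - \sum_{d=1}^{D} \sum_{s=1}^{y} w_{isd}\,\tau_{isd}\right)}, \tag{10}$$
where $t_d = \sum_{i=1}^{I} \sum_{s=1}^{x_i} w_{isd}$ denotes the total score related to latent dimension $d$, that is the number of thresholds passed involving this dimension.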
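The role of the weights in the multidimensional category probabilities can be made concrete with a small numerical sketch. The function name `md_category_probs`, the nested-list layout of weights and thresholds, and all values below are assumptions made for illustration only.

```python
import math


def md_category_probs(theta, weights, taus):
    """Category probabilities of one item under a multidimensional model.

    theta:   list of D person parameters
    weights: weights[s][d] in {0, 1} for threshold s + 1 and dimension d
    taus:    taus[s][d], threshold parameter (0 where the weight is 0)
    """
    logits = [0.0]
    for w_row, tau_row in zip(weights, taus):
        # Contribution of passing one further threshold, summed over dimensions.
        step = sum(w * (th - tau) for w, th, tau in zip(w_row, theta, tau_row))
        logits.append(logits[-1] + step)
    peak = max(logits)
    expd = [math.exp(value - peak) for value in logits]
    total = sum(expd)
    return [e / total for e in expd]


# Hypothetical trichotomous item in which threshold 1 involves only
# dimension 1 and threshold 2 involves only dimension 2.
theta = [0.5, -0.3]
weights = [[1, 0], [0, 1]]
taus = [[0.2, 0.0], [0.0, -0.4]]
probs = md_category_probs(theta, weights, taus)
```

With this weight pattern every threshold refers to a latent dimension of its own, so the exponents accumulate one person parameter and one threshold parameter per category step.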
Special cases can be derived from the general multidimensional Rasch model by appropriate specifications of the weights $w_{ixd}$. The so-called multidimensional partial credit model in which each threshold $x$ of items 1 to $I$ refers to a latent dimension of its own (Kelderman, 1993, 1996; Kelderman & Rijkes, 1994) results from
$$w_{ixd} = \begin{cases} 1 & \text{for } x = d, \\ 0 & \text{for } x \neq d. \end{cases} \tag{11}$$
By this specification, Equation (9) is reduced to:
$$p(X_{vi} = x) = \frac{\exp\left(\sum_{d=1}^{x} \theta_{vd} - \sum_{d=1}^{x} \tau_{id}\right)}{\sum_{y=0}^{m_i} \exp\left(\sum_{d=1}^{y} \theta_{vd} - \sum_{d=1}^{y} \tau_{id}\right)}. \tag{12}$$
Recently it has been shown that the multidimensional partial credit model for ordered response categories is equivalent to Rasch's traditional multidimensional model for categorical responses with category probabilities
$$p(X_{vi} = x) = \frac{\exp\left(\theta^{*}_{vx} - \tau^{*}_{ix}\right)}{\sum_{y=0}^{m_i} \exp\left(\theta^{*}_{vy} - \tau^{*}_{iy}\right)} \tag{13}$$
(cf. Andersen, 1995; Roskam, 1996). Actually, reparameterizing Model (12) in terms of
$$\theta^{*}_{vx} = \sum_{d=1}^{x} \theta_{vd} \quad \text{and} \quad \tau^{*}_{ix} = \sum_{d=1}^{x} \tau_{id} \tag{14}$$
yields Model (13) (Kelderman, 1997).
2.1 Loglinear Representation of Multidimensional Rasch Models
Loglinear representations of unidimensional Rasch models (e.g. Cressie & Holland,
1983; Kelderman, 1984; Thissen & Mooney, 1989; Tjur, 1982) and of multidimensional Rasch models (e. g. Kelderman & Rijkes, 1994; Meiser, 1996) provide a
convenient framework for specifying and testing hypotheses about the latent space
underlying the observed responses and about the structure of threshold parameters. The general multidimensional Rasch model of Equation (10) can be rewritten
in terms of the loglinear model
$$\ln p(\mathbf{X} = (x_1, \ldots, x_I)) = u - \sum_{d=1}^{D} \sum_{i=1}^{I} \sum_{s=1}^{x_i} w_{isd}\,\tau_{isd} + u_{(t_1, \ldots, t_D)}, \tag{15}$$
where $(t_1, \ldots, t_D)$ is the vector of total scores referring to the latent dimensions 1 to $D$. To achieve identifiability, several constraints have to be imposed on the parameters of Equation (15) (cf. Kelderman & Rijkes, 1994; Meiser, 1996). A common set of constraints is $\tau_{isd} = 0$ for $w_{isd} = 0$, $\sum_{t_1} \ldots \sum_{t_D} u_{(t_1, \ldots, t_D)} = 0$, and the restriction of centered scales for the latent dimensions, that is $\sum_{i=1}^{I} \sum_{s=1}^{m_i} \tau_{isd} = 0$ for $d = 1, \ldots, D$. Thereby, the model for the distribution of the total score vector $(t_1, \ldots, t_D)$ is saturated in Equation (15), and maximum-likelihood estimation of the parameters in the loglinear model yields CML estimates of the threshold parameters. The unidimensional Rasch model, the multidimensional partial credit model, and models with several unidimensional scales can be derived from Model (15) by appropriate selections of the design matrix in the nonstandard loglinear modeling approach (cf. Langeheine, 1983; Rindskopf, 1990, 1992).
2.2 Loglinear Rasch Models for the Analysis of Change
Consider again the case of two measurement occasions $T_1$ and $T_2$ with a set of items $1, \ldots, I$ observed at $T_1$ and with the corresponding set of virtually new items $I + 1, \ldots, 2I$ observed at $T_2$. The above-mentioned special case of the linearly restricted PCM (Fischer & Ponocny, 1994), namely the unidimensional Rasch model of global change which is independent from the thresholds, the items, and the person parameters, results from Model (15) by the specifications $D = 1$, $w_{is1} = 1$ for all thresholds, and $\tau_{(I+i)s} = \tau_{is} - \eta$:
$$\ln p(\mathbf{X} = (x_1, \ldots, x_I, x_{I+1}, \ldots, x_{2I})) = u - \sum_{i=1}^{I} \sum_{s=1}^{x_i} \tau_{is} - \sum_{i=1}^{I} \sum_{s=1}^{x_{(I+i)}} \tau_{is} + \sum_{i=1}^{I} \sum_{s=1}^{x_{(I+i)}} \eta + u_t \tag{16}$$
with $t = \sum_{i=1}^{I} x_i + \sum_{i=1}^{I} x_{(I+i)}$. The statistical comparison of Model (16) to the model of perfect stability which results from the restriction $\eta = 0$ allows testing for the occurrence of change from $T_1$ to $T_2$. Model (16) can easily be extended to measuring global change in several latent dimensions or to designs with more than two measurement occasions (see Meiser, 1996).
While the unidimensional Rasch model for repeated observations is based on the assumption of homogeneity of change across individuals, the two-dimensional Rasch model with latent trait $d = 1$ affecting only responses at $T_1$, that is responses to the items $1, \ldots, I$, and latent trait $d = 2$ affecting only responses at $T_2$, that is responses to the virtually new items $I + 1, \ldots, 2I$,
$$\ln p(\mathbf{X} = (x_1, \ldots, x_I, x_{I+1}, \ldots, x_{2I})) = u - \sum_{i=1}^{I} \sum_{s=1}^{x_i} \tau_{is} - \sum_{i=1}^{I} \sum_{s=1}^{x_{(I+i)}} \tau_{is} + u_{(t_1, t_2)} \tag{17}$$
with $t_1 = \sum_{i=1}^{I} x_i$ and $t_2 = \sum_{i=1}^{I} x_{(I+i)}$, permits person-specific change (Meiser, 1996; see also Duncan, 1985a, 1985b). In Equation (17), the threshold parameters are specified to be invariant over time. This invariance mirrors the assumptions that latent change is confined to the individuals' total scores, which are the sufficient statistics of the latent person parameters $\theta_{vd}$, and that the structure of the items and response categories remains unchanged.
Note that the unidimensional Rasch model of global change is a submodel of the two-dimensional Rasch model of person-specific change: Equation (16) results from Equation (17) by the restrictions $u_{(t_1, t_2)} = u_t + t_2\,\eta$ with $t = t_1 + t_2$. Hence, the two models can be compared by means of the conditional log-likelihood ratio statistic $G^2$ in order to test for homogeneity of change.
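The comparison of nested loglinear models by means of the likelihood ratio statistic can be sketched as follows. This is an illustration only: the function names `g_squared` and `g_squared_difference` are hypothetical, the frequencies are invented, and the expected frequencies would in practice come from fitted models.

```python
import math


def g_squared(observed, expected):
    """Likelihood ratio statistic G^2 = 2 * sum n * ln(n / m).

    Cells with an observed frequency of zero contribute nothing to the sum.
    """
    return 2.0 * sum(n * math.log(n / m)
                     for n, m in zip(observed, expected) if n > 0)


def g_squared_difference(g2_restricted, df_restricted, g2_general, df_general):
    """Difference statistic and degrees of freedom for two nested models."""
    return g2_restricted - g2_general, df_restricted - df_general


# Invented frequencies for four response patterns (not data from the study).
observed = [30, 10, 10, 50]
expected_restricted = [25.0, 15.0, 15.0, 45.0]

g2_perfect = g_squared(observed, observed)
g2_restricted = g_squared(observed, expected_restricted)
```

For nested models estimated by maximum likelihood, the difference between the two statistics is referred to a chi-square distribution with degrees of freedom equal to the difference in the numbers of free parameters; here the restricted model would be the model of global change and the general model the model of person-specific change.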
In Equation (17), each of the item sets $1, \ldots, I$ and $I + 1, \ldots, 2I$ is specified to be unidimensional. Therefore, multidimensionality comes into play only by admitting person-specific change between the two measurement occasions which both refer to a unidimensional Rasch model. This contrasts with the LLRA which abandons the assumption of unidimensionality per measurement occasion as discussed above.
3 Mixture Distribution Rasch Models
The essential assumption of the Rasch model that the threshold parameters are homogeneous across individuals can be relaxed by use of finite mixture distribution models (Everitt & Hand, 1981; Rost & Erdfelder, 1996; Titterington, Smith, & Makov, 1985). In finite mixture distribution models, the probability of response vector $\mathbf{X}$ is characterized by a set of component probability functions $p(\mathbf{X} \mid c)$ which are conditional on latent subpopulations $c$, $c = 1, \ldots, C$, and by the distribution of the latent subpopulations:
$$p(\mathbf{X}) = \sum_{c=1}^{C} \pi_c\, p(\mathbf{X} \mid c), \tag{18}$$
where $\pi_c$ denotes the probability of subpopulation $c$.
If the component distributions of a mixture multinomial distribution are specified in terms of the unidimensional Rasch model (see Equation (4)), the mixed Rasch model results:
$$p(\mathbf{X}_v = (x_1, \ldots, x_I)) = \sum_{c=1}^{C} \pi_c \left( \frac{\exp\left(t\,\theta_{v|c} - \sum_{i=1}^{I} \sum_{s=1}^{x_i} \tau_{is|c}\right)}{\prod_{i=1}^{I} \sum_{y=0}^{m_i} \exp\left(y\,\theta_{v|c} - \sum_{s=1}^{y} \tau_{is|c}\right)} \right) \tag{19}$$
(Rost, 1990, 1991; von Davier & Rost, 1995). In the mixed Rasch model, homogeneity of threshold parameters is specified within each of the latent subpopulations, while the parameters may differ between latent subpopulations. Parameter estimation for mixture distribution Rasch models via the EM algorithm is described by Rost (1990, 1991) and von Davier and Rost (1995).
3.1 A Mover-Stayer Mixed Rasch Model for Repeated Observations
Recently, mixed Rasch models were applied to longitudinal data in order to separate
different patterns of change in an exploratory manner (Glück & Spiel, 1997; Meiser,
Hein-Eggers, Rompe, & Rudinger, 1995; Meiser & Rudinger, 1997). Here we want
to focus on a special case of mixed Rasch models for a more confirmatory analysis of
longitudinal data, namely on a mover-stayer model encompassing a subpopulation
c = 1 of global change and another subpopulation c = 2 in which no change occurs.
In the case of two measurement occasions with items $i = 1, \ldots, I$ observed at $T_1$ and virtually new items $i = I + 1, \ldots, 2I$ observed at $T_2$, the model of global change in subpopulation $c = 1$ and of no change in subpopulation $c = 2$ is specified by the restrictions
$$\tau_{(I+i)s|1} = \tau_{is|1} - \eta \quad \text{and} \quad \tau_{(I+i)s|2} = \tau_{is|2} \tag{20}$$
on the threshold parameters of the component distributions in Model (19). By the additional restriction
$$\tau_{is|1} = \tau_{is|2} = \tau_{is} \tag{21}$$
identical initial threshold parameters are specified for both subpopulations so that differences between the latent subpopulations are confined to potential differences in the distribution of the person parameter and to the a priori specified difference in the pattern of change, that is global change in subpopulation $c = 1$ versus no change in subpopulation $c = 2$. Formally, the mover-stayer mixed Rasch model with restriction (21) is related to the Saltus model (Wilson, 1989) for the analysis of discontinuous development by cross-sectional data. The generalization of the mover-stayer mixed Rasch model to more than two measurement occasions is straightforward.
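The structure of the mover-stayer mixture can be illustrated for dichotomous items. The sketch below is a simplification under stated assumptions: each subpopulation is represented by a single known person parameter, whereas in the model the person parameters vary within subpopulations, and all function names and numerical values are hypothetical.

```python
import math
from itertools import product


def rasch_prob(theta, tau, x):
    """Dichotomous Rasch probability of response x (0 or 1)."""
    p_correct = 1.0 / (1.0 + math.exp(-(theta - tau)))
    return p_correct if x == 1 else 1.0 - p_correct


def component_prob(pattern, theta, taus, change):
    """Pattern probability within one subpopulation.

    The first half of `pattern` holds the responses at the first occasion,
    the second half those at the second occasion; `change` lowers the
    thresholds of the virtually new items (assumed sign convention).
    """
    n_items = len(taus)
    first, second = pattern[:n_items], pattern[n_items:]
    prob = 1.0
    for tau, x1, x2 in zip(taus, first, second):
        prob *= rasch_prob(theta, tau, x1) * rasch_prob(theta, tau - change, x2)
    return prob


def mover_stayer_prob(pattern, taus, eta, class_probs, thetas):
    """Mixture of a mover class (change eta) and a stayer class (no change)."""
    pi_mover, pi_stayer = class_probs
    theta_mover, theta_stayer = thetas
    return (pi_mover * component_prob(pattern, theta_mover, taus, eta)
            + pi_stayer * component_prob(pattern, theta_stayer, taus, 0.0))


# Hypothetical values for three items observed at two occasions.
taus = [-0.4, 0.1, 0.3]
total = sum(mover_stayer_prob(p, taus, 0.8, (0.6, 0.4), (0.2, -0.5))
            for p in product([0, 1], repeat=6))
```

Both components share the same initial thresholds, so the classes differ only in the person parameter and in whether the change parameter is applied at the second occasion.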
4 Development of Understanding and Solving Arithmetic Word Problems in Elementary School Children
The data of the present analysis were gathered in the longitudinal study
SCHOLASTIK by the Max-Planck-Institute of Psychological Research in Munich,
Germany (Weinert & Helmke, 1997). In this study a sample of 1,453 children from
54 elementary school classes were repeatedly presented with achievement measures
on mathematics, science, and the mother tongue from first grade to fourth grade.
In this paper we only focus on a small part of the collected data: certain arithmetic
word problems presented in the second and third grade are discussed.
Solving particular arithmetic word problems can be considered as a good indicator of mathematical understanding. While children can easily solve simple problems
that describe the exchange of sets (e.g., "At the beginning, John had 5 marbles.
Then he gave 2 of these marbles to Peter. How many marbles does John have
now?"), word problems dealing with the quantitative comparison (e.g., "John has
5 marbles. He has 3 marbles more than Peter has. How many marbles does Peter have?") or dealing with certain kinds of combination (e.g., "Peter and John
have 8 marbles altogether. Peter has 3 marbles. How many marbles does John
have?") provide particular difficulties. Solving comparison and combination problems requires an advanced mathematical understanding which is based on abstract
part-whole relations rather than on the counting function of numbers (Stern, 1993,
1998; Stern & Lehrndorfer, 1992). Handling part-whole relations allows flexibility
in formulating equations and partitioning sets. Particularly complex word problems which require the inference of information can only be solved on the basis of
part-whole representations. In this article we concentrate on three complex word
problems dealing with the comparison and the combination of sets presented in the
second and third grade of elementary school.
4.1 Items and Sample
The arithmetic word problems selected for the present analysis were:
1. Jack and Beth have 6 apples altogether.
Jack has 2 apples.
Ken and Ina have 9 apples altogether.
Ken has 5 apples.
How many apples do Beth and Ina have altogether?
2. John has 7 rabbits.
He has 4 rabbits more than Tom.
How many rabbits do John and Tom have altogether?
3. Joyce has 7 marbles.
She has 2 marbles more than Tom has.
Oliver has 3 marbles more than Tom.
How many marbles does Oliver have?
Responses were scored 0 for incorrect responses and 1 for correct responses. A sample of N = 1030 children participated in the two measurement occasions considered
here, that is in second and third grade. The empirical frequencies of the response
vectors comprising the responses to the three arithmetic word problems at second
and third grade are displayed in Table 1.
Since the kind of word problems used for the present analysis is only rarely
presented during elementary school, children cannot simply retrieve complete solution strategies from memory; they have to develop them on their own. Therefore,
solving each of the three word problems requires what is called far-transfer: a deep
reconstruction of the existing knowledge base in arithmetic is necessary. The mathematical requirements of the three problems go far beyond what is usually dealt
with in the first two years of elementary school mathematics. Therefore, only second graders who have already developed outstanding mathematical competencies
on their own can be expected to solve at least some of the problems at this age level.
In the third grade, part-whole understanding of arithmetic equations is particularly
emphasized in school, for instance by frequently presenting children with ll-in-theblank problems. Children who benetted from the instruction at school should be
able to develop solution strategies for the three word problems and thereby improve
MPR{online 1998, Vol.3, No.2
c
1999
Pabst Science Publishers
85
Meiser et. al.: Latent Change in Discrete Data
Table 1: Observed and Expected Frequencies of Response Vectors for the Three Arithmetic Word Items at Second and Third Grade
Grade 2
Grade 3
Frequencies Grade 2 Grade 3
Frequencies
It.1 It.2 It.3 It.1 It.2 It.3 Obs. Exp. It.1 It.2 It.3 It.1 It.2 It.3 Obs. Exp.
0 0 0 0 0 0 186 186.00 1 0 0 0 0 0
32 28.74
0 0 0 0 0 1
35 35.06 1 0 0 0 0 1
7 9.50
0 0 0 0 1 0
21 23.88 1 0 0 0 1 0
4 6.47
10 8.70
0 0 0 0 1 1
13 12.86 1 0 0 0 1 1
0 0 0 1 0 0
45 42.06 1 0 0 1 0 0
15 11.40
0 0 0 1 0 1
23 22.66 1 0 0 1 0 1
17 15.32
11 10.43
0 0 0 1 1 0
12 15.43 1 0 0 1 1 0
0 0 0 1 1 1
26 22.54 1 0 0 1 1 1
19 25.54
0 0 1 0 0 0
22 23.95 1 0 1 0 0 0
4 5.49
0 0 1 0 0 1
12 7.92 1 0 1 0 0 1
5 6.57
3 4.48
0 0 1 0 1 0
4 5.39 1 0 1 0 1 0
0 0 1 0 1 1
5 7.25 1 0 1 0 1 1
13 9.82
0 0 1 1 0 0
12 9.50 1 0 1 1 0 0
12 7.88
0 0 1 1 0 1
9 12.77 1 0 1 1 0 1
22 17.29
0 0 1 1 1 0
6 8.70 1 0 1 1 1 0
9 11.78
0 0 1 1 1 1
23 21.29 1 0 1 1 1 1
42 39.14
0 1 0 0 0 0
15 16.31 1 1 0 0 0 0
2 3.74
0 1 0 0 0 1
4 5.39 1 1 0 0 0 1
0 4.48
0 1 0 0 1 0
6 3.67 1 1 0 0 1 0
5 3.05
0 1 0 0 1 1
6 4.94 1 1 0 0 1 1
9 6.69
1 5.37
0 1 0 1 0 0
7 6.47 1 1 0 1 0 0
0 1 0 1 0 1
9 8.70 1 1 0 1 0 1
10 11.78
0 1 0 1 1 0
8 5.92 1 1 0 1 1 0
7 8.02
0 1 0 1 1 1
15 14.50 1 1 0 1 1 1
28 26.66
0 1 1 0 0 0
4 3.12 1 1 1 0 0 0
8 3.19
0 1 1 0 0 1
2 3.73 1 1 1 0 0 1
1 6.32
3 4.31
0 1 1 0 1 0
6 2.54 1 1 1 0 1 0
0 1 1 0 1 1
7 5.57 1 1 1 0 1 1
15 10.60
0 1 1 1 0 0
2 4.48 1 1 1 1 0 0
10 7.59
0 1 1 1 0 1
12 9.82 1 1 1 1 0 1
13 18.67
0 1 1 1 1 0
7 6.69 1 1 1 1 1 0
14 12.72
0 1 1 1 1 1
18 22.22 1 1 1 1 1 1
97 97.00
Expected frequencies according to the mover-stayer mixed Rasch model.
a
a
MPR{online 1998, Vol.3, No.2
a
c
1999
Pabst Science Publishers
86
Meiser et. al.: Latent Change in Discrete Data
performance. There may be, however, children who do not benefit much from the
learning opportunities in the third grade and therefore will not be able to construct
advanced solution strategies for the word problems. These children will remain on
their initial performance level.
In their previous analysis of the data, Langeheine et al. (1994), using two-class
state-mastery models, found that a mover-stayer model including two latent
subpopulations, one with transitions from a state of low competency to a state of
high competency and one without transitions between states of competency, was
superior to a set of other models which did not allow for heterogeneity of
change. Although this result fits nicely with the theoretical expectation of differential
gains in mathematical reasoning performance during third grade, none of the
state-mastery models considered in this earlier analysis showed a satisfactory goodness of
fit at conventional levels of statistical significance (Langeheine et al., 1994, p. 285).
The poor fit may be due to the rather restrictive assumption that there are only
two levels of competency, which are reflected by the class of "nonmasters" and the
class of "masters". In the present analysis, we drop this restrictive assumption by
using Rasch models, rather than two-class state-mastery models, thereby allowing
for more than two values of the latent person variable (see Footnote 1).
4.2 Results
Conditional maximum-likelihood estimates for the parameters of loglinear Rasch
models were obtained using the program LEM (Vermunt, 1997), which enables
specifying the design matrices of nonstandard loglinear models. The criterion of
statistical significance for model rejection was set at $\alpha = .05$. A power analysis
using the program GPOWER (Faul & Erdfelder, 1992; see also Erdfelder, Faul,
& Buchner, 1996) with specifications $\alpha = .05$, N = 1030, and medium effect size
w = .3 (Cohen, 1988) revealed that the statistical power of detecting effects of at
least medium effect size is larger than $1 - \beta = .99$ for all of the goodness-of-fit tests
and model comparisons reported in the following. Therefore, models and model
restrictions which do not yield a significant misfit can be accepted with a small
Type II error probability.
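The reported power can be retraced with a short computation. The following is a minimal sketch, assuming the usual noncentral chi-square approach for Cohen's w with noncentrality parameter N * w^2; whether GPOWER used exactly this computation is an assumption. The helper names chi2_cdf, chi2_critical, and power_chi2 are introduced here for illustration only, and the degrees of freedom are those of the tests reported in this section.

```python
import math

def chi2_cdf(x, df):
    """Central chi-square CDF via the power series of the regularized
    lower incomplete gamma function."""
    if x <= 0.0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = total = 1.0 / a
    k = 1
    while term > 1e-17 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return min(1.0, total * math.exp(a * math.log(z) - z - math.lgamma(a)))

def chi2_critical(alpha, df):
    """Upper-alpha critical value of the central chi-square, by bisection."""
    lo, hi = 0.0, 10.0 * df + 100.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def power_chi2(w, n, df, alpha=0.05):
    """Power of a chi-square test for effect size w (Cohen, 1988),
    assuming the noncentrality parameter n * w**2."""
    half = n * w ** 2 / 2.0
    crit = chi2_critical(alpha, df)
    # Noncentral chi-square CDF written as a Poisson(nc / 2) mixture of
    # central chi-square CDFs with df + 2j degrees of freedom.
    # 500 mixture terms are ample for the noncentrality used here.
    cdf = 0.0
    for j in range(500):
        weight = math.exp(-half + j * math.log(half) - math.lgamma(j + 1))
        cdf += weight * chi2_cdf(crit, df + 2 * j)
    return 1.0 - cdf

# Degrees of freedom of the tests and model comparisons reported in the text.
for df in (55, 54, 49, 46, 8, 2):
    print(df, round(power_chi2(0.3, 1030, df), 4))
```

The test with the most degrees of freedom has the lowest power, so it is the one to check against the stated bound of .99.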
The most restrictive models of change discussed above are the unidimensional
Rasch model of global change specified in Equation (16) and its submodel of perfect
stability, which results from fixing the change parameter at zero. Note that both of these models
are special cases of the LLTM as described earlier. The estimates of the threshold
parameters (see Footnote 2) and of the change parameter in the model of global change are
listed in Table 2, as is the log-likelihood ratio test statistic $G^2$. As can be seen in the
table, the model of global change yields a significant $G^2$ statistic, so that it has to
be rejected. Accordingly, the even more restrictive model of perfect stability must
also be rejected for the present data, $G^2 = 184.87$, 55 df, $p < .05$.
Footnote 1: Although Langeheine et al. (1994, pp. 284 ff., Model X) selected the same items, the data
set used in the previous analysis differs from the data of the present analysis: Langeheine et al.
considered responses from N = 965 children who participated in three adjacent measurement
occasions of the longitudinal study, whereas the present analysis is based on a sample of N = 1030
children with complete data for two measurement occasions.
Footnote 2: Since the responses to the arithmetic word problems were scored in dichotomous categories 0
and 1 and, as a consequence, the items have only one threshold, the index indicating the threshold
is omitted.
Since the unidimensional Rasch model of global change does not fit the data,
we turn to the two-dimensional Rasch model permitting person-specific change.
The estimates of the threshold parameters and the log-likelihood ratio statistic of
the two-dimensional model specified in Equation (17) are displayed in Table 2. In
contrast to the unidimensional Rasch model of global change, the two-dimensional
Rasch model shows an acceptable goodness of fit. As shown above, the unidimensional
Rasch Model (16) is a submodel of the two-dimensional Model (17), so that
the two models can be compared statistically, thereby testing for homogeneity of
change across individuals. The conditional test statistic of the model comparison
is significant, $G^2 = 17.69$, 8 df, $p < .05$, indicating that the assumption of
homogeneity of change does not hold for the present data.
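The conditional test of homogeneity of change is the difference between the $G^2$ statistics of the two nested models, referred to a chi-square distribution with the difference in degrees of freedom. The sketch below retraces this arithmetic from the reported values. It also applies the same step to the pair of perfect stability and global change, a comparison that follows from the nesting described in the text but is not itself reported there. The function names chi2_sf and lr_difference are illustrative.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a central chi-square (integer df, x > 0)."""
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)              # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))   # closed form for df = 1
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while 2.0 * a < df:
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return q

# (G2, df) as reported in the text and in Table 2.
fit = {
    "perfect stability": (184.87, 55),
    "global change": (72.53, 54),
    "person-specific change": (54.84, 46),
}

def lr_difference(restricted, general):
    """Conditional likelihood ratio test of a restricted model against a
    more general model in which it is nested."""
    g2_r, df_r = fit[restricted]
    g2_g, df_g = fit[general]
    d_g2, d_df = g2_r - g2_g, df_r - df_g
    return d_g2, d_df, chi2_sf(d_g2, d_df)

for pair in (("global change", "person-specific change"),
             ("perfect stability", "global change")):
    d_g2, d_df, p = lr_difference(*pair)
    print(f"{pair[0]} vs. {pair[1]}: G2 = {d_g2:.2f}, df = {d_df}, p = {p:.4g}")
```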
Considering the result that individuals differ with respect to their course of development
from second to third grade, it is interesting to see whether the observed
heterogeneity of change can be traced back to two simple processes occurring in
different latent subpopulations: global change in subpopulation c = 1 and perfect
stability in subpopulation c = 2. This hypothesis is reflected by the mover-stayer
mixed Rasch model, that is, by the mixed Rasch model specified in Equation (19)
with the additional restrictions displayed in Equations (20) and (21). Since the
program LEM allows the specification of user-defined design matrices for log-linear
models in finite mixture distributions, the parameters of the mover-stayer mixed
Rasch model can be estimated by parameterizing the component distributions in
terms of the unidimensional Rasch model of global change (Equation (16)), where
the threshold parameters are constrained to be equal across subpopulations (Equation (21))
and where the change parameter is restricted to zero for latent subpopulation c = 2
(Equation (20)). Note that the response vectors (0,0,0,0,0,0) and (1,1,1,1,1,1) do not
contribute to the CML estimation of the threshold parameters and therefore cannot
be assigned to latent subpopulations c = 1 and c = 2 (Rost, 1991). Therefore, the
loglinear parameters $u_{0|c}$ and $u_{6|c}$ are not identifiable unless additional restrictions
are imposed, such as equality constraints on $u_{0|c}$ and $u_{6|c}$ across c or fixations at
zero in one of the two subpopulations. In the present analysis, we fitted the cells
(0,0,0,0,0,0) and (1,1,1,1,1,1) of the contingency table by estimating the parameters
$u_{0|2}$ and $u_{6|2}$ in the subpopulation of stayers and at the same time restricting $u_{0|1}$
and $u_{6|1}$ to zero for the subpopulation of movers. These restrictions affect the
distribution of subpopulations, that is, $\pi_1$ and $\pi_2$, but do not influence the estimates
of the threshold and change parameters of the component Rasch models.
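The two component distributions can be illustrated with a small sketch of the conditional response-vector probabilities given the raw score, which is the quantity CML estimation works with. The sketch rests on assumptions not taken from the article: the logit is written as theta - tau[i] at grade 2 and theta + delta - tau[i] at grade 3 (the sign convention of Equation (16) may differ), and all parameter values are hypothetical. The function name conditional_pattern_probs is illustrative.

```python
import math
from itertools import product

def conditional_pattern_probs(tau, delta):
    """P(response vector | raw score) for three items at two occasions.

    Assumed logits (an assumption of this sketch): theta - tau[i] at
    grade 2 and theta + delta - tau[i] at grade 3. The person parameter
    theta cancels once the raw score is conditioned on.
    """
    # One "virtual item" per item-by-occasion combination.
    easiness = [-t for t in tau] + [delta - t for t in tau]
    weights = {
        x: math.exp(sum(e for e, xi in zip(easiness, x) if xi))
        for x in product((0, 1), repeat=6)
    }
    # Normalizing constant per score group (elementary symmetric function).
    gamma = {}
    for x, wgt in weights.items():
        gamma[sum(x)] = gamma.get(sum(x), 0.0) + wgt
    return {x: wgt / gamma[sum(x)] for x, wgt in weights.items()}

tau = [-0.25, 0.30, -0.05]                            # hypothetical thresholds, equal across classes
movers = conditional_pattern_probs(tau, delta=1.0)    # hypothetical change parameter
stayers = conditional_pattern_probs(tau, delta=0.0)   # change parameter fixed at zero

only_item1_grade2 = (1, 0, 0, 0, 0, 0)
only_item1_grade3 = (0, 0, 0, 1, 0, 0)
print(stayers[only_item1_grade2], stayers[only_item1_grade3])
print(movers[only_item1_grade2], movers[only_item1_grade3])
```

Under these assumptions, the two vectors with a single correct response to Item 1 are equally likely among stayers, whereas among movers the odds between them equal exp(delta). The vectors (0,0,0,0,0,0) and (1,1,1,1,1,1) have conditional probability 1 in both classes, which is why they carry no information for assigning children to a subpopulation.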
As shown in Table 2, the mover-stayer mixed Rasch model does not yield a
significant misfit. Therefore, we can maintain the hypothesis that the existence of a
mover and a stayer subpopulation accounts for the heterogeneity of change revealed
in the previous steps of analysis. The estimated size of the mover subpopulation
amounts to $\hat{\pi}_1 = 0.43$ and that of the stayer subpopulation to $\hat{\pi}_2 = 0.57$. The mean
expected probability of a correct response is depicted in Figure 1 as a function of
item, measurement occasion, and latent subpopulation. The expected frequencies
of response vectors are displayed in Table 1. A relaxation of the model by permitting
different threshold parameters for the latent subpopulations, that is, by dropping the
restriction displayed in Equation (21), does not result in a significant improvement
in goodness of fit, $G^2 = 2.46$, 2 df, $p > .05$.
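The goodness-of-fit statistic of the mover-stayer model can be recomputed from the observed and expected frequencies in Table 1. In the sketch below, the frequencies are listed in the order of the response vectors (Items 1 to 3 at grade 2, then Items 1 to 3 at grade 3, read as a binary number), and empty cells are taken to contribute zero to the statistic, which is the usual convention and an assumption about how LEM treats them. Because the tabled expected frequencies are rounded to two decimals, the result should only come close to the value of 64.79 reported in Table 2.

```python
import math

# Observed frequencies from Table 1, vectors (0,0,0,0,0,0) ... (1,1,1,1,1,1).
observed = [
    186, 35, 21, 13, 45, 23, 12, 26,   # grade 2: 0 0 0
    22, 12, 4, 5, 12, 9, 6, 23,        # grade 2: 0 0 1
    15, 4, 6, 6, 7, 9, 8, 15,          # grade 2: 0 1 0
    4, 2, 6, 7, 2, 12, 7, 18,          # grade 2: 0 1 1
    32, 7, 4, 10, 15, 17, 11, 19,      # grade 2: 1 0 0
    4, 5, 3, 13, 12, 22, 9, 42,        # grade 2: 1 0 1
    2, 0, 5, 9, 1, 10, 7, 28,          # grade 2: 1 1 0
    8, 1, 3, 15, 10, 13, 14, 97,       # grade 2: 1 1 1
]

# Expected frequencies under the mover-stayer mixed Rasch model (Table 1).
expected = [
    186.00, 35.06, 23.88, 12.86, 42.06, 22.66, 15.43, 22.54,
    23.95, 7.92, 5.39, 7.25, 9.50, 12.77, 8.70, 21.29,
    16.31, 5.39, 3.67, 4.94, 6.47, 8.70, 5.92, 14.50,
    3.12, 3.73, 2.54, 5.57, 4.48, 9.82, 6.69, 22.22,
    28.74, 9.50, 6.47, 8.70, 11.40, 15.32, 10.43, 25.54,
    5.49, 6.57, 4.48, 9.82, 7.88, 17.29, 11.78, 39.14,
    3.74, 4.48, 3.05, 6.69, 5.37, 11.78, 8.02, 26.66,
    3.19, 6.32, 4.31, 10.60, 7.59, 18.67, 12.72, 97.00,
]

def g_squared(obs, exp):
    """Likelihood ratio statistic 2 * sum(o * ln(o / e)); cells with an
    observed frequency of zero contribute nothing."""
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(obs, exp) if o > 0)

n_total = sum(observed)
g2 = g_squared(observed, expected)
print(n_total, round(sum(expected), 2), round(g2, 2))
```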
Since a statistical comparison of the two-dimensional Rasch model and the
finite-mixture model with two subpopulations is not possible, we cannot directly
test whether the more parsimonious mover-stayer model differs from the model of
person-specific change. Therefore, the final model selection is based on the information
criterion CAIC, which emphasizes the avoidance of overfitting a model (cf.
Bozdogan, 1987). The mover-stayer mixed Rasch model shows a CAIC value of
7440.27, which is slightly lower than that of the two-dimensional Rasch model with
a CAIC value of 7442.36. Therefore, the more parsimonious mover-stayer model
may be preferred over the model of person-specific change.
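For orientation, the criterion can be written as a one-line function. The sketch below uses the general definition of CAIC given by Bozdogan (1987), -2 ln L + k (ln N + 1). How LEM counts parameters and evaluates the log-likelihood for the two models is not described in the text, so the function is shown with hypothetical inputs only, and the model choice is read off the two reported CAIC values.

```python
import math

def caic(log_likelihood, n_params, n_obs):
    """Consistent AIC (Bozdogan, 1987): -2 ln L + k * (ln N + 1)."""
    return -2.0 * log_likelihood + n_params * (math.log(n_obs) + 1.0)

# Penalty per additional parameter at the sample size of the present study.
penalty_per_parameter = math.log(1030) + 1.0

# Hypothetical log-likelihood and parameter count, for illustration only.
example = caic(log_likelihood=-100.0, n_params=3, n_obs=1030)

# CAIC values as reported in the text; the smaller value is preferred.
reported = {"mover-stayer": 7440.27, "person-specific change": 7442.36}
preferred = min(reported, key=reported.get)
print(round(penalty_per_parameter, 3), round(example, 3), preferred)
```

With N = 1030, each additional parameter has to improve -2 ln L by about 7.9 points to pay for itself, which is considerably stricter than the 2 points required by AIC.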
Table 2: Parameter Estimates of the Threshold and Change Parameters and Goodness
of Fit for the Unidimensional Rasch Model of Global Change, the Two-Dimensional Rasch
Model of Person-Specific Change, and the Mover-Stayer Mixed Rasch Model
                    Threshold parameter        Change
Model of change     Item 1   Item 2   Item 3   parameter   G^2      df
Global              0.244    0.310    0.066    0.669       72.53*   54
Person-specific     0.250    0.317    0.067                54.84    46
Mover-stayer        0.249    0.317    0.067    1.188       64.79    49
* p < .05
Figure 1: Mean expected probabilities of correct responses to the arithmetic word items
at second and third grade for the latent subpopulations of movers and stayers.
[Figure: probability of a correct response (vertical axis, 0 to 1) for Items 1 to 3 at Grade 2 and Grade 3 (horizontal axis), shown separately for movers and stayers.]
5 Conclusions
The Rasch model and its extensions form a class of psychometric models which
can be used for modeling various aspects of change, such as global change,
person-specific change, and different patterns of change in different subpopulations.
The loglinear representation of unidimensional and multidimensional Rasch models
and the loglinear parameterization of component distributions in mixture distribution
Rasch models allow specifying and testing hypotheses about change in a flexible
and straightforward way.
In the present article, we pointed out that the unidimensional Rasch model of
global change is a submodel of the multidimensional Rasch model of person-specific
change and that a mover-stayer mixed Rasch model can be specified by appropriate
restrictions on the threshold parameters of the mixed Rasch model. Thereby, we
addressed two issues raised by Stelzl (1997) in her reply to Glück and Spiel (1997).
First, Stelzl questioned the assumption of homogeneity of change underlying the
LLTM and other item response models for measuring change in repeated measures
designs. The above-mentioned hierarchical relation of the loglinear unidimensional
Rasch model of global change and the loglinear multidimensional Rasch model of
person-specific change allows explicitly testing this assumption by means of a conditional
likelihood ratio test. Second, Stelzl emphasized the need to specify treatment
or change parameters in mixed Rasch models if these models are to be used for
analyzing change as suggested by Glück and Spiel (1997). In the context of the
mover-stayer mixed Rasch model, we showed how to specify a change parameter for
one subpopulation and to fix the parameter at zero for another subpopulation.
If a research design comprises several experimental or quasi-experimental groups,
straightforward extensions of the mover-stayer model by the extraneous group variable
are available. Those extended mover-stayer models may include, for instance,
equality constraints on the threshold and change parameters across groups while
investigating differences in the sizes of the mover and stayer subpopulations, or they
may impose equality constraints on the threshold parameters across groups while
testing for differences in the change parameters of movers. Thus, fitting a mover-stayer
mixed Rasch model may be the basis for further analyses to investigate which
groups of individuals will be affected by certain interventions and which groups of
individuals will not be affected or will be affected in a different way.
As illustrated by the above analysis of mathematical reasoning performance in
elementary school children, the restrictive assumption of homogeneity of change
may well be inappropriate for a given data set. Furthermore, we agree with Stelzl
(1997) that the assumption may even be implausible as far as natural (i.e., not
experimentally induced) change is concerned: in the context of solving arithmetic
word problems, children are expected to differ both in their level of competency at
one measurement occasion and in their growth in competency from one measurement
occasion to another. The unidimensional Rasch model of global change
reflects differences in initial competency, but precludes interindividual differences
in growth. In contrast, the multidimensional Rasch model of person-specific change
is a rather unrestrictive model, inasmuch as it incorporates differences in reasoning
competency at a given occasion as well as unconstrained interindividual differences
in the amount of change from one occasion to another. The mover-stayer mixed
Rasch model falls in between these two extremes by allowing for interindividual
differences in change in terms of two well-defined developmental processes: global
change for one subpopulation and stability for another subpopulation. For the
present data, the mover-stayer model turned out to be superior to the alternative
Rasch models considered, because it is not rejected and provides a parsimonious
description of the data generating processes.
The two subpopulations separated by the mover-stayer mixed Rasch model differ
with respect to the impact school instruction has on the development of mathematical
competencies. In the third grade, some children gained from practicing problems
that require part-whole modeling by developing solution strategies for complex word
problems, while others did not. The subpopulation of stayers is composed of children
who did not profit from elementary school instruction either because they
had already developed the competencies by themselves before they were taught at
school, or because the cognitive or motivational preconditions for profiting from
the instruction were not available. In the subpopulation of movers, the children
extended their arithmetical competencies by exploiting what had been taught at
school, thereby improving their performance in word problem solving.
References
[1] Andersen, E. B. (1995). Polytomous Rasch models and their estimation. In G. H.
Fischer & I. W. Molenaar (Eds.), Rasch models. Foundations, recent developments,
and applications (pp. 271-291). New York: Springer.
[2] Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43, 561-573.
[3] Andrich, D. (1982). An extension of the Rasch model for ratings providing both
location and dispersion parameters. Psychometrika, 47, 105-113.
[4] Andrich, D. (1985). An elaboration of Guttman scaling with Rasch models for measurement. In N. B. Tuma (Ed.), Sociological Methodology 1985 (pp. 33-80). San Francisco: Jossey-Bass.
[5] Andrich, D., de Jong, J. H. A. L. & Sheridan, B. E. (1997). Diagnostic opportunities
with the Rasch model for ordered response categories. In J. Rost & R. Langeheine
(Eds.), Applications of latent trait and latent class models in the social sciences (pp.
59-70). Münster: Waxmann.
[6] Bozdogan, H. (1987). Model selection and Akaike's information criterion (AIC): The
general theory and its analytical extensions. Psychometrika, 52, 345-370.
[7] Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale, NJ:
Lawrence Erlbaum.
[8] Cressie, N. & Holland, P. W. (1983). Characterizing the manifest probabilities of
latent trait models. Psychometrika, 48, 129-141.
[9] Duncan, O. D. (1985a). New light on the 16-fold table. American Journal of Sociology,
91, 88-128.
[10] Duncan, O. D. (1985b). Some models of response uncertainty for panel analysis. Social
Science Research, 14, 126-141.
[11] Erdfelder, E., Faul, F., & Buchner, A. (1996). GPOWER: A general power analysis
program. Behavior Research Methods, Instruments, & Computers, 28, 1-11.
[12] Everitt, B. S. & Hand, D. J. (1981). Finite mixture distributions. London: Chapman
and Hall.
[13] Faul, F. & Erdfelder, E. (1992). GPOWER: A priori, post-hoc, and compromise power
analyses for MS-DOS. University of Bonn: Department of Psychology.
[14] Fischer, G. H. (1983). Logistic latent trait models with linear constraints. Psychometrika, 48, 3-26.
[15] Fischer, G. H. (1995). Linear logistic models for change. In G. H. Fischer & I. W.
Molenaar (Eds.), Rasch models. Foundations, recent developments, and applications
(pp. 157-180). New York: Springer.
[16] Fischer, G. H. & Formann, A. K. (1982). Some applications of logistic latent trait
models with linear constraints on the parameters. Applied Psychological Measurement,
6, 397-416.
[17] Fischer, G. H. & Molenaar, I. W. (Eds.) (1995). Rasch models. Foundations, recent
developments, and applications. New York: Springer.
[18] Fischer, G. H. & Parzer, P. (1991). An extension of the rating scale model with an
application to the measurement of change. Psychometrika, 56, 637-651.
[19] Fischer, G. H. & Ponocny, I. (1994). An extension of the partial credit model with an
application to the measurement of change. Psychometrika, 59, 177-192.
[20] Fischer, G. H. & Ponocny, I. (1995). Extended rating scale and partial credit models
for assessing change. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch models. Foundations, recent developments, and applications (pp. 353-370). New York: Springer.
[21] Glück, J. & Spiel, C. (1997). Item Response-Modelle für Meßwiederholungsdesigns:
Anwendung und Grenzen verschiedener Ansätze [Item response models for repeated
measures designs: Application and limitations of different approaches]. Methods of
Psychological Research Online, 2. Internet: http://www.pabst-publishers.de/mpr/
[22] Kelderman, H. (1984). Loglinear Rasch model tests. Psychometrika, 49, 223-245.
[23] Kelderman, H. (1993). Estimating and testing a multidimensional Rasch model for
partial credit scoring of children's application of size concepts. In R. Steyer, K. F.
Wender & K. F. Widaman (Eds.), Psychometric methodology. Proceedings of the 7th
European Meeting of the Psychometric Society in Trier (pp. 209-212). Stuttgart: Fischer.
[24] Kelderman, H. (1996). Multidimensional Rasch models for partial-credit scoring. Applied Psychological Measurement, 20, 155-168.
[25] Kelderman, H. (1997). Loglinear multidimensional item response models for polytomously scored items. In W. J. van der Linden & R. K. Hambleton (Eds.), Handbook
of modern item response theory (pp. 287-304). New York: Springer.
[26] Kelderman, H. & Rijkes, C. P. M. (1994). Loglinear multidimensional IRT models for
polytomously scored items. Psychometrika, 59, 149-176.
[27] Langeheine, R. (1983). Nonstandard log-lineare Modelle [Nonstandard log-linear models]. Zeitschrift für Sozialpsychologie, 14, 312-321.
[28] Langeheine, R. (1993). Diagnosing incremental learning: Some probabilistic models
for measuring change and testing hypotheses about growth. Studies in Educational
Evaluation, 19, 349-362.
[29] Langeheine, R. & Rost, J. (1988). Latent trait and latent class models. New York:
Plenum Press.
[30] Langeheine, R., Stern, E., & van de Pol, F. (1994). State mastery learning. Dynamic
models for longitudinal data. Applied Psychological Measurement, 18, 277-291.
[31] Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47,
149-174.
[32] Masters, G. N. & Wright, B. D. (1997). The partial credit model. In W. J. van der
Linden & R. K. Hambleton (Eds.), Handbook of modern item response theory (pp.
101-121). New York: Springer.
[33] Meiser, T. (1996). Loglinear Rasch models for the analysis of stability and change.
Psychometrika, 61, 629-645.
[34] Meiser, T., Hein-Eggers, M., Rompe, P. & Rudinger, G. (1995). Analyzing homogeneity and heterogeneity of change using Rasch and latent class models: A comparative
and integrative approach. Applied Psychological Measurement, 19, 377-391.
[35] Meiser, T. & Rudinger, G. (1997). Modeling stability and regularity of change: Latent
structure analysis of longitudinal discrete data. In J. Rost & R. Langeheine (Eds.),
Applications of latent trait and latent class models in the social sciences (pp. 389-397).
Münster: Waxmann.
[36] Rasch, G. (1968). An individualistic approach to item analysis. In P. F. Lazarsfeld
& N. W. Henry (Eds.), Readings in mathematical social science. Cambridge: MIT
Press.
[37] Rasch, G. (1980). Probabilistic models for some intelligence and attainment tests.
Chicago: The University of Chicago Press. (Original published 1960, Copenhagen:
The Danish Institute of Educational Research)
[38] Rindskopf, D. (1990). Nonstandard log-linear models. Psychological Bulletin, 108,
150-162.
[39] Rindskopf, D. (1992). A general approach to categorical data analysis with missing
data, using generalized linear models with composite links. Psychometrika, 57, 29-42.
[40] Roskam, E. E. (1996). Latent-Trait-Modelle [Latent trait models]. In E. Erdfelder,
R. Mausfeld, T. Meiser & G. Rudinger (Eds.), Handbuch Quantitative Methoden (pp.
431-458). Weinheim: Psychologie Verlags Union.
[41] Rost, J. (1988). Quantitative und qualitative probabilistische Testtheorie [Quantitative
and qualitative probabilistic test theory]. Bern: Huber.
[42] Rost, J. (1990). Rasch models in latent classes: An integration of two approaches in
item analysis. Applied Psychological Measurement, 14, 271-282.
[43] Rost, J. (1991). A logistic mixture distribution model for polychotomous item responses. British Journal of Mathematical and Statistical Psychology, 44, 75-92.
[44] Rost, J. & Erdfelder, E. (1996). Mischverteilungsmodelle [Mixture distribution models]. In E. Erdfelder, R. Mausfeld, T. Meiser & G. Rudinger (Eds.), Handbuch Quantitative Methoden (pp. 333-348). Weinheim: Psychologie Verlags Union.
[45] Rost, J. & Langeheine, R. (Eds.) (1997). Applications of latent trait and latent class
models in the social sciences. Münster: Waxmann.
[46] Rost, J. & Spada, H. (1983). Die Quantifizierung von Lerneffekten anhand von Testdaten [Quantifying learning effects using test data]. Zeitschrift für Differentielle und
Diagnostische Psychologie, 4, 29-49.
[47] Spada, H. & McGaw, B. (1985). The assessment of learning effects with linear logistic
test models. In S. Embretson (Ed.), Test design. Developments in psychology and
psychometrics (pp. 169-194). Orlando: Academic Press.
[48] Stelzl, I. (1997). Wie realistisch sind die Voraussetzungen von Item-Response-Modellen bei der Prüfung experimenteller Hypothesen in Meßwiederholungsdesigns?
- Eine Auseinandersetzung mit Glück, J. & Spiel, Ch. (1997) [How realistic are the assumptions underlying item response models for the analysis of experimental hypotheses in repeated measures designs? - A reply to Glück, J. & Spiel, Ch. (1997)]. Methods
of Psychological Research - Online, Discussion Section. Internet: http://www.pabst-publishers.de/mpr/forum_e.html
[49] Stern, E. (1993). What makes certain arithmetic word problems involving the comparison of sets so hard for children? Journal of Educational Psychology, 85, 7-23.
[50] Stern, E. (1998). Die Entwicklung des mathematischen Verständnisses im Kindesalter [The development of mathematical understanding during childhood]. Lengerich:
Pabst Publisher.
[51] Stern, E., & Lehrndorfer, A. (1992). The role of situational context in solving word
problems. Cognitive Development, 7, 259-268.
[52] Thissen, D. & Mooney, J. A. (1989). Loglinear item response models, with applications
to data from social surveys. In C. C. Clogg (Ed.), Sociological methodology (Vol. 19)
(pp. 299-330). Oxford: Basil Blackwell.
[53] Titterington, D. M., Smith, A. F. M., & Makov, U. E. (1985). Statistical analysis of
finite mixture distributions. Chichester: Wiley.
[54] Tjur, T. (1982). A connection between Rasch's item analysis model and a multiplicative Poisson model. Scandinavian Journal of Statistics, 9, 23-30.
[55] van der Linden, W. J. & Hambleton, R. K. (Eds.) (1997). Handbook of modern item
response theory. New York: Springer.
[56] Vermunt, J. K. (1997). LEM: A general program for the analysis of categorical data. Tilburg University: Department of Methodology. Internet:
http://cwis.kub.nl/~fsw_1/mto/
[57] von Davier, M. & Rost, J. (1995). Polytomous mixed Rasch models. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch models. Foundations, recent developments, and
applications (pp. 371-379). New York: Springer.
[58] Weinert, F. E. & Helmke, A. (1997). Entwicklung im Grundschulalter [Development
in elementary school]. Weinheim: Psychologie Verlags Union.
[59] Wilson, M. (1989). Saltus: A psychometric model of discontinuity in cognitive development. Psychological Bulletin, 105, 276-289.