Longitudinal Models

Reference: Duncan, T. E., Duncan, S. C., Strycker, L. A., Li, F., & Alpert, A. (2006). An Introduction to
Latent Variable Growth Curve Modeling: Concepts, Issues, and Applications (2nd ed.). Erlbaum.
See also: Singer, J. D., & Willett, J. B. (2003). Applied Longitudinal Data Analysis: Modeling Change and
Event Occurrence. Oxford University Press.
Longitudinal Model
A model for data having observations at two or more time points.
Example: Repeated measures data
Simplest Example
Correlated Groups t-test data
Correlated groups t compares mean of “b” column with mean of “a” column. (“b” for before, “a” for
after)
Paired Samples Statistics

              Mean    N     Std. Deviation   Std. Error Mean
Pair 1   b    4.87    15    2.232            .576
         a    6.60    15    1.352            .349

Paired Samples Correlations

                  N     Correlation   Sig.
Pair 1   b & a    15    .880          .000

Paired Samples Test

Paired Differences (95% Confidence Interval of the Difference: Lower, Upper)

                  Mean      Std. Deviation   Std. Error Mean   Lower     Upper     t        df   Sig. (2-tailed)
Pair 1   b - a    -1.733    1.223            .316              -2.410    -1.056    -5.490   14   .000

Longitudinal Models - 1    7/13/2017
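As a check on the tables above, the paired t statistic can be recomputed from the summary statistics alone. The following is a minimal sketch, not SPSS's own computation; the variable names are introduced here for illustration, and the inputs are the rounded values printed in the tables, so the results agree with the output only to about two decimal places.

```python
import math

# Summary statistics copied from the SPSS tables (rounded as printed).
n = 15
mean_b, sd_b = 4.87, 2.232   # "before" scores
mean_a, sd_a = 6.60, 1.352   # "after" scores
r_ba = 0.880                 # correlation between b and a

# Mean and standard deviation of the difference scores d = b - a.
# The variance of a difference is var(b) + var(a) - 2*cov(b, a).
mean_d = mean_b - mean_a
sd_d = math.sqrt(sd_b**2 + sd_a**2 - 2 * r_ba * sd_b * sd_a)

# Standard error of the mean difference and the paired t statistic.
se_d = sd_d / math.sqrt(n)
t = mean_d / se_d
df = n - 1

print(f"mean difference  = {mean_d:.3f}")
print(f"SD of difference = {sd_d:.3f}")
print(f"SE of difference = {se_d:.3f}")
print(f"t({df}) = {t:.2f}")
```

Note how the large positive correlation (.880) shrinks the standard deviation of the differences, which is what makes the correlated groups test sensitive.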
Paired samples t using Amos
This involves comparing a general model with a special model to which restrictions have been
applied.
As we have seen previously, Amos allows certain types of restrictions of parameter
values, including equality restrictions, through use of the Manage Models capability.
Comparing means of two correlated samples - Method 1: Constraining Separate Variable Means
Overview . . .
0) View/Set -> Analysis Properties. . . -> Estimation -> Estimate Means and Intercepts
1) First, estimate a model in which means are allowed to be different.
2) Second, estimate a model in which means are constrained to be equal.
3) Use the Chi-square difference between the two models to assess the significance of the difference in
fit and use that assessment to decide whether the means are equal or not.
Specifically . . .
The first step is to create a model. In this case, it’s a simple two-correlated-variables
model, with means estimated.
The second step in the procedure is to give NAMES, rather than values to the parameters
which will eventually be restricted. In this case, the means of variables B and A were
named uB and uA respectively. (u for µ, get it?)
Next, the Manage Models dialog box is opened by double-clicking on “Default Model”
In the Manage Models dialog box, nothing is entered into the Parameter Constraints field
because both means, uB and uA, are estimated freely. No constraints apply.
Then, the [New] button is clicked, creating a new model.
In the second model, the means are to be estimated as being equal by entering the
constraint, uB = uA, into the Parameter Constraints dialog box. Next, a name, in this
case, “EqualMeans” is given to this model. Finally, the “Close” button is used to
complete the specification.
The result of this is the creation of TWO models – one with means allowed to be unequal,
the other with means constrained to be equal.
When the AMOS Abacus button is clicked, parameter estimates from both models are
computed automatically. Two path diagrams, one for each model, are also created. You
can select between the path diagrams by clicking on the name of the model whose
diagram you want to view.
Note that the variables in a longitudinal model are in principle no different than any other
pair of variables, except that we know that they are really the same characteristic
measured at two different times.
If they had been two different characteristics, our interest would likely have been in the
correlation between them. In this case, as is the case with most repeated measures data,
we assume that they are correlated, and our real interest is in the equality of the means of
the two variables.
Comparing means of two correlated samples - Method 2: Constraining the Mean of a Difference Variable
The comparison of means can be done in a different way using difference scores. For this example, the
difference between each pair of scores was created and put in a column of the SPSS data editor. Then a
very simple one-variable model with mean estimated was created. Two versions of the model were
applied: one in which the mean of the difference scores was estimated freely and the other in which the
mean of the difference scores was constrained to be equal to 0.
The result is below. Note that this technique is completely equivalent to the technique above. The
results are exactly the same.
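One way to see why the two methods agree: for normally distributed data with freely estimated variances and covariance, the likelihood-ratio chi-square for the equal-means restriction depends on the data only through the paired t statistic. The sketch below illustrates that relationship. It assumes the fit function is multiplied by n (some programs use n - 1 instead, so a printed chi-square may differ slightly), and the function names are hypothetical.

```python
import math

def lr_chi_square_from_paired_t(t, n, multiplier=None):
    """Likelihood-ratio chi-square (df = 1) for H0: mean difference = 0.

    For normal data with unrestricted variances and covariance:
        chi2 = m * ln(1 + t**2 / (n - 1))
    where m defaults to n (an assumption; some software uses n - 1).
    """
    m = n if multiplier is None else multiplier
    return m * math.log(1.0 + t**2 / (n - 1))

def chi_square_p_df1(x):
    """Upper-tail p-value for a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

t, n = -5.490, 15   # values from the Paired Samples Test table
chi2 = lr_chi_square_from_paired_t(t, n)
p = chi_square_p_df1(chi2)
print(f"chi-square difference = {chi2:.2f}, df = 1, p = {p:.5f}")
```

Because the same t statistic underlies the two-variable model with equal means and the one-variable model with the difference mean fixed at 0, both restrictions lead to the same chi-square difference.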
More than Two time periods.
The following model is of data presented in Duncan et al. (1999). The data concern use of alcohol at
four equally spaced time periods. Although we sometimes simply want to know if there are any
differences between means, in most instances the interest is in the form of increase or decrease in mean
usage across time periods. We’ll deal with both types of question here.
First, the “are there any differences?” question. An initial examination involves comparing a model
which allows all four means to “be all you can be” vs. a model which restricts the means to be equal.
This is essentially equivalent to the overall test of equality of means given by the one-way repeated
measures analysis of variance.
As above, this involves comparison of two models: one with means unequal (the general model) and
one with means constrained equal (the special model).
[Figure: General Model, means allowed to be unequal. Duncan et al. (1999), p. 64. Alcohol Use across 4 time periods.]
Chi-square = .000, DF = 0 (p and RMSEA are not available for this model; the figure shows the unfilled \p and \rmsea placeholders)

                    Mean    Variance
Alc Use Time 1      1.34    1.58
Alc Use Time 2      1.59    1.77
Alc Use Time 3      2.02    2.07
Alc Use Time 4      2.26    1.89

Covariances shown in the figure (the extracted text does not show which pair each belongs to): .98, 1.14, 1.08, 1.35, 1.22, 1.39

[Figure: Special Model, means required to be equal. Duncan et al. (1999), p. 64. Alcohol Use across 4 time periods.]
Chi-square = 167.196, DF = 3, p = .000, RMSEA = .392

                    Mean    Variance
Alc Use Time 1      1.71    1.72
Alc Use Time 2      1.71    1.79
Alc Use Time 3      1.71    2.17
Alc Use Time 4      1.71    2.20

Covariances shown in the figure (pairing not recoverable): .77, 1.08, .96, 1.57, 1.31, 1.26
In the general model, the means across time periods were 1.34, 1.59, 2.02, and 2.26. (Note the general
increase in value across time periods.)
In the special model, the means were all estimated at 1.71. But the chi-square statistic for this special
model was significant, indicating that the model which constrained the means to be equal fit
significantly worse than the model which allowed them to be estimated freely. The bottom line to
this logic is that there are significant differences between the means.
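The comparison just described is a chi-square difference test between nested models. The sketch below shows the arithmetic using the two chi-square values reported for the general and special models; the helper names are hypothetical, and the p-value routine is a small standard-library implementation for integer degrees of freedom rather than anything taken from Amos.

```python
import math

def chi_square_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)**k / k!.
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc term plus a finite sum involving half-integer gammas.
    total = math.erfc(math.sqrt(half))
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (k - 0.5) / math.gamma(k + 0.5)
    return total

def chi_square_difference(chi2_special, df_special, chi2_general, df_general):
    """Difference test for nested models (special = general plus constraints)."""
    diff = chi2_special - chi2_general
    ddf = df_special - df_general
    return diff, ddf, chi_square_sf(diff, ddf)

# Special model (equal means) vs. general model (free means).
diff, ddf, p = chi_square_difference(167.196, 3, 0.000, 0)
print(f"chi-square difference = {diff:.3f}, df = {ddf}, p = {p:.3g}")
```

Here the general model is saturated (chi-square 0 with 0 df), so the difference test is simply the special model's own chi-square on 3 df: one degree of freedom for each equality constraint needed to make four means equal.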
Another example of the “are there any differences?” question – Comparing mean
conscientiousness scores across honest, incentive, and instructed faking conditions.
This is not a true “longitudinal” study, but it does involve measurement of the same construct over
three time periods.
Regular repeated measures analysis
General Linear Model
[DataSet1] G:\MdbR\Clark\ClarkDataFiles\ClarkAndNewDataCombined090223.sav

Descriptive Statistics

        Mean      Std. Deviation   N
chc     4.4029    .90710           249
cdc     4.7979    1.05095          249
cic     5.4779    .96713           249

Multivariate Tests(c)
Effect: instruction

                      Value   F            Hypothesis df   Error df   Sig.   Partial Eta Squared   Noncent. Parameter   Observed Power(b)
Pillai's Trace        .480    114.023(a)   2.000           247.000    .000   .480                  228.046              1.000
Wilks' Lambda         .520    114.023(a)   2.000           247.000    .000   .480                  228.046              1.000
Hotelling's Trace     .923    114.023(a)   2.000           247.000    .000   .480                  228.046              1.000
Roy's Largest Root    .923    114.023(a)   2.000           247.000    .000   .480                  228.046              1.000

a. Exact statistic
b. Computed using alpha = .05
c. Design: Intercept
   Within Subjects Design: instruction

Mauchly's Test of Sphericity(b)
Measure: MEASURE_1

                                                                         Epsilon(a)
Within Subjects Effect   Mauchly's W   Approx. Chi-Square   df   Sig.    Greenhouse-Geisser   Huynh-Feldt   Lower-bound
instruction              .909          23.680               2    .000    .916                 .923          .500

Tests the null hypothesis that the error covariance matrix of the orthonormalized transformed
dependent variables is proportional to an identity matrix.
a. May be used to adjust the degrees of freedom for the averaged tests of significance.
   Corrected tests are displayed in the Tests of Within-Subjects Effects table.
b. Design: Intercept
   Within Subjects Design: instruction
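The four multivariate statistics reported for the instruction effect are not independent pieces of evidence. With a single within-subjects effect and no between-subjects factors there is only one nonzero eigenvalue, so each statistic is a simple function of the others, which is why all four rows share the same F. The sketch below recomputes them from Wilks' Lambda as printed (.520); because that value is rounded, F matches the table only approximately. Variable names are illustrative.

```python
wilks = 0.520             # Wilks' Lambda from the Multivariate Tests table
df_hyp, df_err = 2, 247   # hypothesis and error df from the same table

# With one nonzero eigenvalue, Lambda = 1 / (1 + eigenvalue).
eigenvalue = (1 - wilks) / wilks

pillai = eigenvalue / (1 + eigenvalue)   # equals 1 - Lambda in this case
hotelling = eigenvalue                   # Hotelling's trace
roy = eigenvalue                         # Roy's largest root
f_stat = eigenvalue * df_err / df_hyp    # exact F when there is one eigenvalue
partial_eta_sq = 1 - wilks
noncentrality = f_stat * df_hyp

print(f"Pillai = {pillai:.3f}, Hotelling = {hotelling:.3f}, Roy = {roy:.3f}")
print(f"F({df_hyp}, {df_err}) = {f_stat:.1f}")
print(f"partial eta squared = {partial_eta_sq:.3f}, noncentrality = {noncentrality:.1f}")
```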
Tests of Within-Subjects Effects
Measure: MEASURE_1

Source                                    Type III Sum of Squares   df        Mean Square   F         Sig.   Partial Eta Squared   Noncent. Parameter   Observed Power(a)
instruction          Sphericity Assumed   147.244                   2         73.622        124.082   .000   .333                  248.163              1.000
                     Greenhouse-Geisser   147.244                   1.832     80.353        124.082   .000   .333                  227.376              1.000
                     Huynh-Feldt          147.244                   1.845     79.788        124.082   .000   .333                  228.985              1.000
                     Lower-bound          147.244                   1.000     147.244       124.082   .000   .333                  124.082              1.000
Error(instruction)   Sphericity Assumed   294.295                   496       .593
                     Greenhouse-Geisser   294.295                   454.454   .648
                     Huynh-Feldt          294.295                   457.668   .643
                     Lower-bound          294.295                   248.000   1.187

a. Computed using alpha = .05
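The entries in the Tests of Within-Subjects Effects output can be reproduced from the two sums of squares. The sketch below is a minimal illustration with names chosen here; n and k are taken from the Descriptive Statistics table (249 participants, 3 conditions), and the Greenhouse-Geisser epsilon is the rounded value printed by SPSS, so the corrected error df differs slightly from the printed 454.454.

```python
ss_effect, ss_error = 147.244, 294.295   # Type III sums of squares
n, k = 249, 3                            # participants, repeated conditions
gg_epsilon = 0.916                       # Greenhouse-Geisser epsilon (rounded)

# Sphericity-assumed degrees of freedom.
df_effect = k - 1
df_error = (n - 1) * (k - 1)

# F is the ratio of mean squares; partial eta squared is the share of
# effect-plus-error variation attributable to the effect.
f_stat = (ss_effect / df_effect) / (ss_error / df_error)
partial_eta_sq = ss_effect / (ss_effect + ss_error)

# The epsilon correction multiplies both df by epsilon. F itself is unchanged;
# only the reference distribution used for the p-value changes.
gg_df_effect = gg_epsilon * df_effect
gg_df_error = gg_epsilon * df_error

print(f"F({df_effect}, {df_error}) = {f_stat:.3f}")
print(f"partial eta squared = {partial_eta_sq:.3f}")
print(f"Greenhouse-Geisser df = {gg_df_effect:.3f}, {gg_df_error:.3f}")
```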
Tests of Within-Subjects Contrasts
Measure: MEASURE_1

Source               instruction   Type III Sum of Squares   df    Mean Square   F         Sig.   Partial Eta Squared   Noncent. Parameter   Observed Power(a)
instruction          Linear        143.872                   1     143.872       224.967   .000   .476                  224.967              1.000
                     Quadratic     3.373                     1     3.373         6.164     .014   .024                  6.164                .696
Error(instruction)   Linear        158.602                   248   .640
                     Quadratic     135.693                   248   .547

a. Computed using alpha = .05

Profile Plots
Analysis using Amos
The unrestricted means model
The restricted means model
This model fits significantly worse than the unrestricted means model, indicating that the restriction of
means to equality is not tenable. So the means must be unequal.
The “what is the form of differences?” question - Longitudinal Growth Models (LGMs)
Perhaps because much of the work in longitudinal models has been in the area of developmental
psychology, the emphasis of longitudinal models has been on growth or decline of means across time
periods. This emphasis has led to the development of what are called longitudinal growth models or
LGMs.
A longitudinal growth model is a model for the direction and shape of change in means across time
periods. The simplest of such is a model of linear change over time. Such a model adds two latent
variables to the observed variable only model above. This simplest LGM will be described first.
Later, LGMs which allow for nonlinear change across time – quadratic or cubic functions, for example –
will be estimated.
To simplify the notation, the observed variables will be denoted as AT1, AT2, AT3, and AT4, rather
than the more complicated Alc Use Time 1, Alc Use Time 2, etc.
Graphical representation of the data being modeled
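[Figure: Alcohol Use (vertical axis) plotted against Time (horizontal axis) at AT1, AT2, AT3, and AT4, with the Y-intercept and Slope of the growth line labeled.]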
The above is a graphical description of longitudinal growth.
The same people are observed at each time period.
Black circles are the means at the 4 time periods.
Linear Longitudinal Growth Model (Who would have thought of this??)
Below is a prototypical latent variable model for linear change over time.
The simplest model has two latent variables. The first latent variable in a linear LGM is an intercept
latent variable. It is indicated equally by all the observed variables, with regression weights set
equal to 1. Because the other latent variable's weight for the first time period is 0 (see below), this
latent variable represents the mean value of the dependent variable at time 1: an intercept parameter.
One other latent variable is in the model. The values of regression weights between this variable and
the observed variables are 0, 1, 2, etc. This choice of regression weights makes this second latent
variable a linear slope parameter, representing linear increase or decrease over time. (Other choices of
weights could be used to make it quadratic or cubic slope parameter.)
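This LGM is applied to the Alcohol use data from Duncan et al.
[Figure: path diagram of the linear LGM. The LGM Intercept latent variable has paths fixed at 1 to each of AT1, AT2, AT3, and AT4. The LGM Slope latent variable has paths to AT1 through AT4; the values 0 and 1 and the labels b32 and b43 appear on these paths in the diagram. Each observed variable has its intercept fixed at 0 and a residual (ea1 to ea4) with mean 0 and regression weight 1.]
Note that these are not means in this figure, they’re intercepts.
Maddening Detail 1: Intercepts of all indicators set to 0.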
Details of application
There are some maddening details associated with the application of such models.
For the simple model above, the only such detail has to do with the intercepts for the observed
variables – AT1, AT2, AT3, and AT4. All must be set to 0.
As always, residual regression weights = 1.
Application of the Linear LGM
Following is the result of application of the linear LGM.

Means: (Group number 1 - Intercepts 0)

       Estimate   S.E.   C.R.     P     Label
I      1.325      .066   20.046   ***
S      .317       .022   14.572   ***

Variances: (Group number 1 - Intercepts 0)

       Estimate   S.E.   C.R.     P     Label
I      1.283      .123   10.409   ***
S      .086       .017   5.031    ***
ea1    .355       .069   5.145    ***
ea2    .508       .051   9.938    ***
ea3    .679       .064   10.600   ***
ea4    .442       .082   5.360    ***

The LGM Intercept mean = 1.325. Across all participants, at time 1, average alcohol usage was 1.325.
The LGM Slope mean = .317. Across all participants, the average increase in alcohol usage was .317
per time period.
The LGM Intercept variance = 1.283. There was significant time 1 variance in alcohol usage across
participants. That is, some reported usage greater than 1.325, others reported usage less than 1.325.
The LGM Slope variance = .086. There was significant variance in rate of increase in alcohol usage
across participants. Some participants increased at a rate larger than .317 / time period, others increased
at a rate less than .317 / time period.
The significant negative covariance of -.09 (r = -.27) means that those who started higher increased at a
lower rate and those who started lower, increased at a higher rate.
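To make the interpretation concrete, the estimates reported above can be combined with the fixed loadings (1, 1, 1, 1 for the intercept and 0, 1, 2, 3 for the slope) to produce the means and variances the model implies for AT1 through AT4. This is a minimal sketch using the rounded estimates from the text; the variable names are chosen here for illustration.

```python
# Fixed loadings: one row per time point, columns = (intercept, slope).
loadings = [(1, 0), (1, 1), (1, 2), (1, 3)]

mean_i, mean_s = 1.325, 0.317                # latent means
var_i, var_s, cov_is = 1.283, 0.086, -0.09   # latent variances and covariance
resid_var = [0.355, 0.508, 0.679, 0.442]     # residual variances ea1..ea4

# Implied mean at each time = intercept mean + loading * slope mean.
implied_means = [li * mean_i + ls * mean_s for li, ls in loadings]

# Implied variance = variance of the weighted sum of the two latent
# variables plus the residual variance for that time point.
implied_vars = [
    li**2 * var_i + 2 * li * ls * cov_is + ls**2 * var_s + e
    for (li, ls), e in zip(loadings, resid_var)
]

# Intercept-slope correlation implied by the covariance and variances.
corr_is = cov_is / (var_i * var_s) ** 0.5

for t, (m, v) in enumerate(zip(implied_means, implied_vars), start=1):
    print(f"AT{t}: implied mean = {m:.3f}, implied variance = {v:.3f}")
print(f"intercept-slope correlation = {corr_is:.2f}")
```

The implied means can be compared with the freely estimated means from the general model (1.34, 1.59, 2.02, 2.26) to see how closely a straight line reproduces them.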
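[Figure: Linear LGM with unstandardized estimates. Duncan et al. (1999), p. 64. Alcohol Use across 4 time periods.]
Chi-square = 20.677, DF = 5, p = .001, RMSEA = .094
LGM Intercept: mean = 1.33, variance = 1.28
LGM Slope: mean = .32, variance = .09
Intercept-Slope covariance = -.09
Loadings on LGM Intercept (AT1 to AT4): 1.00, 1.00, 1.00, 1.00
Loadings on LGM Slope (AT1 to AT4): .00, 1.00, 2.00, 3.00
Indicator intercepts (AT1 to AT4): fixed at .00
Residuals ea1 to ea4: means fixed at 0; variances .35, .51, .68, .44; regression weights fixed at 1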
Visual Interpretation of the Linear LGM.
[Figure: plot of Mean Alcohol Use (vertical axis, marked at 1 and 2) against T1, T2, T3, and T4, drawn above the linear LGM path diagram for Alc Use Time 1 through Alc Use Time 4 (AT1 to AT4) with the same estimates as before: LGM Intercept 1.33, 1.28; LGM Slope .32, .09; covariance -.09; residual variances .35, .51, .68, .44.]
Parameter estimates of the model were obtained.
Note that the regression weights relating the LGM Intercept and LGM Slope latent variables to the
observed variables are fixed (1, 1, 1, 1 and 0, 1, 2, 3).
Note also that the AT1 . . . AT4 intercepts are fixed at 0.
LGM Intercept
Mean = 1.33. This is the Y-intercept of the function relating means to Time.
Variance = 1.28. This indicates that there is variability in the “initial” amount of alcohol use across the
sample. If significant, this would indicate variability in the population.
LGM Slope
Mean = .32. This is the average increase in Alcohol use per time period across the 4 time periods.
Variance = .09. This is the variability in slope for the sample. If significantly different from 0, this
would indicate that there is variability in the population.
Estimating a Quadratic Growth Curve
(And you thought you’d never see polynomial coefficients again!!)
A quadratic growth curve can be estimated by adding a third latent variable, whose loadings represent
quadratic growth.
The loadings representing quadratic growth are 0, 1, 4, and 9, the squares of those representing linear
growth.
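The fixed loadings for polynomial growth follow a simple pattern: with time scored 0, 1, 2, 3, the loading of each observed variable on the intercept, linear, and quadratic latent variables is the time score raised to the power 0, 1, and 2. A minimal sketch follows; the function name is hypothetical, and equally spaced time periods starting at 0 are assumed.

```python
def growth_loadings(n_times, degree):
    """Return one row of fixed loadings per time point.

    Row t is [t**0, t**1, ..., t**degree], with time scored 0, 1, 2, ...
    (equally spaced periods are assumed).
    """
    return [[t**p for p in range(degree + 1)] for t in range(n_times)]

# Four time periods, quadratic growth: columns are intercept, linear, quadratic.
for name, row in zip(["AT1", "AT2", "AT3", "AT4"], growth_loadings(4, 2)):
    print(name, row)
```

Setting degree to 1 gives the loadings of the linear LGM, and degree 3 would add a cubic column (0, 1, 8, 27).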
This model is shown below.
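[Figure: Quadratic LGM with unstandardized estimates. Duncan et al. (1999), p. 64. Alcohol Use across 4 time periods.]
Chi-square = 4.828, DF = 1, p = .028, RMSEA = .104
LGM Intercept: mean = 1.33, variance = 1.39
LGM Linear: mean = .30, variance = .61
LGM Quadratic: mean = .00, variance = .06
Covariances among the three latent variables shown in the figure (pairing not recoverable from the extracted text): .02, -.17, -.20
Loadings on LGM Intercept (AT1 to AT4): 1.00, 1.00, 1.00, 1.00
Loadings on LGM Linear (AT1 to AT4): .00, 1.00, 2.00, 3.00
Loadings on LGM Quadratic (AT1 to AT4): .00, 1.00, 4.00, 9.00
Indicator intercepts (AT1 to AT4): fixed at 0
Residuals ea1 to ea4: means fixed at 0; variances .19, .41, .62, .15; regression weights fixed at 1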
Note that the mean of the LGM Quadratic is essentially 0, and not significantly different from 0,
indicating that the growth in Alcohol use across time periods was essentially linear.
Means: (Group number 1 - Default model)

       Estimate   S.E.   C.R.     P      Label
I      1.331      .067   19.980   ***
S      .302       .063   4.774    ***
Q      .003       .020   .171     .864
Multiple indicators at each time period
Building on the previous example, suppose we have three indicators of substance use at each time period:
ATi: Alcohol use at Time i
TTi: Tobacco use at Time i
MTi: Marijuana use at Time i
These will be treated as indicators of a general Substance use latent variable at each time period. Thus
there will be four Use latent variables, UT1, UT2, UT3, and UT4 representing Use at Times 1, 2, 3, and
4.
The first model estimated will be a simple model without a growth curve imposed. This will allow a
general “are there any differences?” test of the null hypothesis of equal means across time periods. If
that null is not rejected, there’s no point in attempting to model a curve of differences that don’t exist.
Note several maddening details . . .
1) The loadings for Alcohol at each time period have been given the same name, La.
This will constrain their values to be equal. An equivalent constraint was applied to the loadings for
Marijuana, Lm.
2) Intercepts for Tobacco have been set to 0. This was done because the source, Duncan et al., used
Tobacco as the reference variable.
3) Intercepts for Alcohol and Marijuana are not equal across time periods. A model in which they were
equal would be more defensible.
4) Residuals for the four same-type observations were allowed to be correlated – alcohol use residuals,
tobacco use residuals, and marijuana use residuals. This is common in repeated measures data –
repeated measures of the same variable will tend to be correlated due to unique characteristics of that
variable that are common across time periods. A prime example of the need for such correlations is in
questionnaire data when the same items are responded to at different time periods.
The model with means estimated separately- unstandardized estimates . . .
The model with means constrained to equality – unstandardized estimates . . .
The chi-square difference is significant (120.832 – 59.115 = 61.717 with df = 3), indicating that the
means are not equal in the population.
Since the Use means are not equal, it makes sense to ask about the form of the differences across time
periods. So the next step is to apply a linear growth model to the data to determine if a linear increase is
consistent with the data.
The Linear Multiple-indicator LGM (MLGM)- unstandardized estimates . . .
The chi-square goodness of fit statistic is significant, but it was significant even for the
“means-allowed-to-be-themselves” model above. The RMSEA changed little and is still in the range (<.05)
that is acceptable. This suggests that the linear growth model is an adequate fit for these data.
Note that there is a small negative covariance (-.06) between the LGM Intercept latent variable and the
LGM Linear Slope latent variable. This value is significantly different from 0, suggesting that those
who started low increased their substance use at a higher rate than those who started high. But the
possibility of statistical artifact associated with an inadequate model should be considered.
Covariates analyses
One of the interesting applications of LGMs is in their ability to capture both predictors of change and
sequelae of change. Here’s an example from Duncan et al.
In this model, the relationship of the intercept, linear and quadratic slope parameters to the predictor,
age, is examined.
In addition, the effect of the intercept, and linear and quadratic slope parameters on the sequel, problem
behaviors, is studied.
Here are the results, for what they’re worth. See Duncan et al. for a complete description of the
problem. The standardized solution is printed.
Note that older kids started lower (-.21). Older kids increased faster (.49). I don’t know why the
standardized Age-> Q coefficient wasn’t printed.
Problem behaviors were positively related to how much alcohol kids were drinking at time 1 (the
intercept, .50). They were negatively related to the steepness of increase in alcohol use (-.20)?? Those
kids who increased their alcohol consumption more from one time to the next had fewer (??) problem
behaviors??