April 18

Contrasting Marginal and Mixed Effects Models
Recall: two approaches to handling dependence in Generalized
Linear Models:
Marginal models: based on the consequences of dependence
on estimating model parameters.
• Target of inference is the population.
Random effect models: based on the sources of dependence.
• Target of inference is the subject.
1
Special case: Linear models
Model for mean response:
E (Yi) = Xiβ
Alternative specifications for variance:
• covariance pattern
Cov (Yi) = Σi
e.g. AR(1) or Toeplitz;
2
• random effects
E (Yi|bi) = Xiβ + Z0i,j bi
with
E (bi) = 0,
Cov (bi) = G,
whence
Σi = Z0i,j GZ00i,j + σ 2Ini
3
Interpretation of β : in both cases,
E Yi,j
= X0i,j β =
X
Xi,j,k βk
k
so
∂E Yi,j
∂Xi,j,k
= βk .
But also, with random effects,
E Yi,j |bi = X0i,j β + Z0i,j bi
so, if the kth covariate is not associated with a random effect,
∂E Yi,j |bi
∂Xi,j,k
= βk .
4
That is, βk is the rate of change of the mean response as the
kth covariate changes, both
• on average across subjects (marginally), and
• for a specific subject with random effects b (conditionally).
This might be either a rate of change over time, if the covariate is time (within-subject factor), or a treatment effect, if the
covariate is a dummy variable (between-subject factor).
5
Generalized Linear Models
With link function g(·) and inverse link h(·) = g −1(·):
• the marginal model is
n
g E Yi,j
o
= X0i,j β ,
or
E Yi,j
0
= h Xi,j β ;
• the mixed effects model is
n
g E Yi,j |bi
o
= X0i,j β ∗ + Z0i,j bi
or
E Yi,j |bi
0
∗
0
= h Xi,j β + Zi,j bi .
6
In the mixed effects case,
n
E Yi,j = E E Yi,j |bi
o
o
0
∗
0
= E h Xi,j β + Zi,j bi
Z
0
∗
0
= h Xi,j β + Zi,j bi fb(bi)dbi,
n and in general
E Yi,j
6 h X0i,j β
=
for any β .
7
Logistic regression with random intercept:
n
logit E Yi,j |bi
= X0i,j β ∗ + bi
o
so
e
E Yi,j |bi =
X0i,j β ∗+bi
1+e
X0i,j β ∗+bi
and
E Yi,j = E




e
X0i,j β ∗+bi



1+e
=
Z ∞
−∞
e




X0i,j β ∗+bi
X0i,j β ∗+bi
1+e



X0i,j β ∗+bi
×q
1
2πσb2
1 2
2
e− 2 bi /σb dbi.
8
No closed-form solution, but
n
logit E Yi,j
o
≈q
√
X0i,j β ∗
1 + k2σb2
3 = 0.588 and k 2 = 0.346, so
where k = 16
15π
β≈q
β∗
1 + 0.346σb2
.
If σb2 = 8, then β ≈ β ∗/2
9
0.0
0.2
0.4
y
0.6
0.8
1.0
Example: one covariate, sample of size 13 with β1∗ = −1.5,
β2∗ = 0.75, g1,1 = σb2 = 4; population average is shown in red,
and the approximation in blue.
−4
−2
0
2
4
6
8
x
10
Case study: abnormal ECG in a drug trial.
options linesize = 80 pagesize = 21 nodate;
data ecg;
retain id 0;
infile ’ecg.txt’ firstobs = 32;
input seq r1 r2 count;
if seq = 1
then /* Placebo followed by drug */
do i = 1 to count;
id = id + 1;
trt = 0; y = r1; period = 0; output;
trt = 1; y = r2; period = 1; output;
end;
else /* Drug followed by placebo */
do i = 1 to count;
id = id + 1;
trt = 1; y = r1; period = 0; output;
trt = 0; y = r2; period = 1; output;
end;
run;
11
title1 ’Marginal Logistic Regression Model’;
title2 ’Crossover Trial on Cerebrovascular Deficiency’;
proc genmod descending;
class id;
model y = trt period / d=bin;
repeated subject=id / logor=fullclust;
run;
title1 ’Mixed Effects Logistic Regression Model (Random Intercept)’;
title2 ’Crossover Trial on Cerebrovascular Deficiency’;
proc nlmixed qpoints=100;
/* Initial values from GEE output: */
parms beta1=-1.2433 beta2=.5689 beta3=.2951 g11=1;
eta=beta1 + beta2*trt + beta3*period + b;
p=exp(eta)/(1 + exp(eta));
model y ~ binary(p);
random b ~ normal(0,g11) subject=id;
run;
SAS output
Marginal Logistic Regression Model
Crossover Trial on Cerebrovascular Deficiency
1
The GENMOD Procedure
Model Information
Data Set
Distribution
Link Function
Dependent Variable
Number
Number
Number
Number
of
of
of
of
Observations Read
Observations Used
Events
Trials
WORK.ECG
Binomial
Logit
y
134
134
42
134
12
Marginal Logistic Regression Model
Crossover Trial on Cerebrovascular Deficiency
2
The GENMOD Procedure
Class Level Information
Class
id
Levels
67
Values
1 2 3
21 22
38 39
55 56
4 5 6
23 24
40 41
57 58
7 8 9
25 26
42 43
59 60
10
27
44
61
11
28
45
62
12
29
46
63
13
30
47
64
14
31
48
65
15
32
49
66
16 17 18 19 20
33 34 35 36 37
50 51 52 53 54
67
13
Marginal Logistic Regression Model
Crossover Trial on Cerebrovascular Deficiency
3
The GENMOD Procedure
Response Profile
Ordered
Value
y
Total
Frequency
1
2
1
0
42
92
PROC GENMOD is modeling the probability that y=’1’.
14
Marginal Logistic Regression Model
Crossover Trial on Cerebrovascular Deficiency
4
The GENMOD Procedure
Parameter Information
Parameter
Effect
Prm1
Prm2
Prm3
Intercept
trt
period
Criteria For Assessing Goodness Of Fit
Criterion
Deviance
Scaled Deviance
Pearson Chi-Square
DF
Value
Value/DF
131
131
131
163.8863
163.8863
133.5123
1.2510
1.2510
1.0192
15
Marginal Logistic Regression Model
Crossover Trial on Cerebrovascular Deficiency
5
The GENMOD Procedure
Criteria For Assessing Goodness Of Fit
Criterion
Scaled Pearson X2
Log Likelihood
DF
Value
Value/DF
131
133.5123
-81.9432
1.0192
Algorithm converged.
16
Marginal Logistic Regression Model
Crossover Trial on Cerebrovascular Deficiency
6
The GENMOD Procedure
Analysis Of Initial Parameter Estimates
Parameter
DF
Estimate
Standard
Error
Intercept
trt
period
Scale
1
1
1
0
-1.2186
0.5582
0.2743
1.0000
0.3441
0.3784
0.3768
0.0000
Wald 95%
Confidence Limits
-1.8931
-0.1835
-0.4642
1.0000
-0.5440
1.2998
1.0129
1.0000
ChiSquare
Pr > ChiSq
12.54
2.18
0.53
0.0004
0.1402
0.4666
NOTE: The scale parameter was held fixed.
17
Marginal Logistic Regression Model
Crossover Trial on Cerebrovascular Deficiency
7
The GENMOD Procedure
GEE Model Information
Log Odds Ratio Structure
Subject Effect
Number of Clusters
Correlation Matrix Dimension
Maximum Cluster Size
Minimum Cluster Size
Fully Parameterized Clusters
id (67 levels)
67
2
2
2
18
Marginal Logistic Regression Model
Crossover Trial on Cerebrovascular Deficiency
8
The GENMOD Procedure
Log Odds Ratio
Parameter Information
Parameter
Group
Alpha1
(1, 2)
Algorithm converged.
19
Marginal Logistic Regression Model
Crossover Trial on Cerebrovascular Deficiency
9
The GENMOD Procedure
Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates
Standard
Parameter Estimate
Error
Intercept
trt
period
Alpha1
-1.2433
0.5689
0.2951
3.5617
0.2999
0.2335
0.2319
0.8148
95% Confidence
Limits
-1.8311
0.1112
-0.1593
1.9647
-0.6556
1.0266
0.7496
5.1587
Z Pr > |Z|
-4.15
2.44
1.27
4.37
<.0001
0.0148
0.2030
<.0001
20
Mixed Effects Logistic Regression Model (Random Intercept)
Crossover Trial on Cerebrovascular Deficiency
The NLMIXED Procedure
Specifications
Data Set
Dependent Variable
Distribution for Dependent Variable
Random Effects
Distribution for Random Effects
Subject Variable
Optimization Technique
Integration Method
WORK.ECG
y
Binary
b
Normal
id
Dual Quasi-Newton
Adaptive Gaussian
Quadrature
21
10
Mixed Effects Logistic Regression Model (Random Intercept)
Crossover Trial on Cerebrovascular Deficiency
The NLMIXED Procedure
Dimensions
Observations Used
Observations Not Used
Total Observations
Subjects
Max Obs Per Subject
Parameters
Quadrature Points
134
0
134
67
2
4
100
22
11
Mixed Effects Logistic Regression Model (Random Intercept)
Crossover Trial on Cerebrovascular Deficiency
The NLMIXED Procedure
Parameters
beta1
beta2
beta3
g11
NegLogLike
-1.2433
0.5689
0.2951
1
76.6705695
Iteration History
Iter
Calls
NegLogLike
Diff
MaxGrad
Slope
1
2
3
4
3
5
7
9
75.877218
71.2225189
70.9616922
69.3279449
0.793351
4.654699
0.260827
1.633747
3.143617
0.769971
0.959227
0.692069
-31.3385
-36.6519
-1.3592
-1.90957
23
12
Mixed Effects Logistic Regression Model (Random Intercept)
Crossover Trial on Cerebrovascular Deficiency
The NLMIXED Procedure
Iteration History
Iter
Calls
NegLogLike
Diff
MaxGrad
Slope
5
6
7
8
9
10
11
12
13
10
11
12
15
17
19
21
23
25
69.0008522
68.5033645
68.2882471
68.1688967
68.1357412
68.1309946
68.1302223
68.1301696
68.1301694
0.327093
0.497488
0.215117
0.11935
0.033156
0.004747
0.000772
0.000053
2.006E-7
0.679054
0.630885
0.611136
0.296866
0.085458
0.021189
0.013186
0.000147
0.000046
-0.73402
-0.57466
-0.17656
-0.0794
-0.01492
-0.00549
-0.00148
-0.0001
-3.62E-7
24
13
Mixed Effects Logistic Regression Model (Random Intercept)
Crossover Trial on Cerebrovascular Deficiency
The NLMIXED Procedure
NOTE: GCONV convergence criterion satisfied.
Fit Statistics
-2 Log Likelihood
AIC (smaller is better)
AICC (smaller is better)
BIC (smaller is better)
136.3
144.3
144.6
153.1
25
14
Mixed Effects Logistic Regression Model (Random Intercept)
Crossover Trial on Cerebrovascular Deficiency
15
The NLMIXED Procedure
Parameter Estimates
Parameter
beta1
beta2
beta3
g11
Estimate
Standard
Error
DF
t Value
Pr > |t|
Alpha
Lower
-4.0816
1.8631
1.0375
24.4355
1.6711
0.9269
0.8189
18.8489
66
66
66
66
-2.44
2.01
1.27
1.30
0.0173
0.0485
0.2096
0.1994
0.05
0.05
0.05
0.05
-7.4180
0.01241
-0.5974
-13.1974
26
Mixed Effects Logistic Regression Model (Random Intercept)
Crossover Trial on Cerebrovascular Deficiency
The NLMIXED Procedure
Parameter Estimates
Parameter
beta1
beta2
beta3
g11
Upper
Gradient
-0.7452
3.7137
2.6725
62.0685
-8.31E-6
0.000028
-0.00005
-1.34E-7
27
16
Notes
• The treatment effect is significant in both analyses, but much
larger in the mixed effects model than in the marginal model.
• The proc nlmixed fit has a very large ĝ1,1 of 24.4, but with
a large standard error and a non-significant t-statistic.
• But the proc nlmixed output gives AIC = 144.3 and the “independence” part of the proc genmod output gives Log Likelihood = -81.9432, whence
AIC = (−2) × (−81.9432) + 2 × 3 = 169.9,
showing that including the variance parameter leads to a
much better fit.
28