Correlated data
Non-normal outcomes

Lene Theil Skovgaard
Department of Biostatistics, Faculty of Health Sciences, University of Copenhagen
December 5, 2014

Correlated data, non-normal outcomes

- Generalized linear models
- Generalized linear mixed models
  - Population average models (PA)
  - Subject specific models (SS)
- Examples with counts
  - Leprosy
  - Seizures (briefly)
- Two examples with binary outcome
  - Amenorrhea (longitudinal)
  - Smoking among school children (cluster)

Non-normal data

Typical data from e.g. epidemiology are often not normally
distributed (binary, ordinal, counts, survival...)

Generalized linear models (in exponential families):
Multiple regression models,
on a scale that 'corresponds' to the data:
- Normal (link=identity),
  traditional linear models
- Binomial (link=logit),
  logistic regression
- Poisson (link=log),
  log-linear models, Poisson regression

Reminder on binary data

Examples of binary outcomes:
- infection after surgery
- smoking among school children
- amenorrhea among contracepting women

A binary variable U has a Bernoulli distribution, meaning that
- P(U = 1) = p
- P(U = 0) = 1 − p
For such an outcome, the mean value is p,
and the variance is p(1 − p)

Binomial data

If we sum up binary observations,
$Y = \sum_{i=1}^{n} U_i = U_1 + \cdots + U_n$
e.g.
- number of infections for each hospital
- number of smokers in each school class
- number of women with amenorrhea for each general practice
we get a Binomial distribution, Y ∼ Bin(n, p)

Examples of Binomial distributions

[Figure: Binomial distributions for n = 4, 20 and 50; p = 0.02, 0.2 and 0.5]

Binomial distribution, and approximations

The Binomial variable Y has point probabilities
$P(Y = m) = \binom{n}{m} p^m (1 - p)^{n-m}$
Its mean is np and its variance np(1 − p)

When n is large, these probabilities are cumbersome to compute,
so we use approximations
- p moderate (not too close to 0 or 1): Normal distribution
- p close to 0 (or close to 1, counting the non-events instead): Poisson distribution

Poisson distribution

Counts with no well-defined upper limit:
- the number of cancer cases in a specific community
  during a specific year
- the number of positive swabs over a certain period of time

Law of rare events:
As the count parameter n in a Binomial distribution gets larger
and the parameter p gets close to 0 (or close to 1, in which case
we count the non-events n − Y), the Binomial probabilities are
approximately equal to those of the Poisson distribution
$P(Y = m) = \frac{\lambda^m}{m!} \exp(-\lambda)$
where λ = np is the mean value,
as well as the variance.
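
The law of rare events can be checked numerically. The following is a minimal sketch, not part of the original slides: the function names and the chosen values of n and p are illustrative assumptions. It evaluates the Binomial and Poisson point probabilities from the two formulas above and reports the largest pointwise difference.

```python
from math import comb, exp, factorial

def binom_pmf(m, n, p):
    """Binomial point probability P(Y = m) for Y ~ Bin(n, p)."""
    return comb(n, m) * p**m * (1 - p) ** (n - m)

def poisson_pmf(m, lam):
    """Poisson point probability P(Y = m) with mean lam."""
    return lam**m * exp(-lam) / factorial(m)

def max_abs_diff(n, p, upto=20):
    """Largest pointwise difference between Bin(n, p) and Poisson(np) for m = 0..upto."""
    lam = n * p
    return max(abs(binom_pmf(m, n, p) - poisson_pmf(m, lam)) for m in range(upto + 1))

# Rare events (illustrative values): large n, small p, so the Poisson fits well.
rare = max_abs_diff(n=1000, p=0.002)
# Moderate p (illustrative values): the Poisson approximation is poor.
moderate = max_abs_diff(n=20, p=0.5)
print(f"n=1000, p=0.002: max difference {rare:.5f}")
print(f"n=20,   p=0.5  : max difference {moderate:.5f}")
```

With a moderate p the variance np(1 − p) is clearly smaller than the mean np, which is why the Poisson distribution (variance equal to the mean) no longer fits.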

Generalized linear models

Outcome variable Yi, with a distribution from an exponential
family (includes Normal, Binomial, Poisson, Gamma, ....), with
- Mean value: µi
- Link function: g assumed linear in covariates, i.e.
  $g(\mu_i) = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik} = X_i^T \beta$
  where Xi denotes the covariate vector for individual i.

- Normal (link=identity)
- Binomial (link=logit)
- Poisson (link=log)

Generalized linear MIXED models

Outcome variable Yij, e.g. j'th measurement time for individual i:
- Mean value: µij
- Link function: g, assumed linear in covariate vector Xij.

Two kinds of models:
- Population average models (PA):
  $g(\mu_{ij}) = \beta_0 + \beta_1 x_{ij1} + \cdots + \beta_k x_{ijk} = X_{ij}^T \beta$
  and $(Y_{ij_1}, Y_{ij_2})$ are associated (correlated)
- Subject-specific models (SS):
  $g(\mu_{ij}) = \beta_0 + \beta_1 x_{ij1} + \cdots + \beta_k x_{ijk} + b_i$
  $b_i \sim N(0, \sigma_b^2)$

The two model types

- Marginal models, or Population average (PA):
  Describe covariate effects on the population mean,
  e.g. expected difference between the effects of two
  treatments
  (corresponds to the repeated statement)
- Mixed effects models, or Subject specific (SS):
  Describe covariate effects on specific individuals (or
  clusters), e.g. expected change over time, or
  differences between boys and girls in the same school
  class
  (corresponds to the random statement)

Marginal models = Population Average (PA)

We specify only
- Marginal mean, E(Yij | Xij) = µij, where
  $g(\mu_{ij}) = X_{ij}^T \beta$, i.e. covariate effects as usual
- Distribution (Normal, Binomial, Poisson,...)
- Marginal variance, φV(µij), depending on distribution
- Some measure of association for Y's belonging to the same
  individual/unit

This creates problems:
- Multivariate Binomial and Poisson distributions do not exist
- It is more of an estimation procedure than a model

Marginal models, technicalities

Since we do not actually have a model,
we cannot use a maximum likelihood approach.
Instead, we use a
GEE: Generalized estimating equation,
(written in vector notation)
$\sum_i D_i^T V_i^{-1} (y_i - \mu_i) = 0$
where Vi is the (working) covariance matrix Cov(Yi) and Di is the
matrix of derivatives of the mean value µi with respect to β

The GEE-method

- requires an iterative procedure,
- gives consistent estimates of β
  (they approach the true value when the sample size is large),
  even if Cov(Yi) is incorrect
- the estimates are asymptotically Normal
  (i.e. for large sample size, we can construct confidence intervals
  with plus/minus 2 standard errors)
- standard error of β̂ should be based on the
  empirical sandwich estimator,
  to allow for possible overdispersion and general
  misspecification of Cov(Yi)

Residual variance for non-normal data

In general, there is no free variance parameter,
since the variance is determined from the mean value:
- Normal (link=identity), free variance parameter σ²
- Binomial (link=logit), variance np(1 − p)
- Poisson (link=log), variance λ = E(Y)

Overdispersion:
The variance may be seen to be larger than determined by the
distribution.

Overdispersion

can be caused by
- omitted covariates (isn't that always the case?)
- unrecognized clusters
- heterogeneity, e.g. a "zero"-group (non-susceptibles)

Traditional solution:
An over-dispersion parameter φ is estimated and multiplied onto
the variance
or more generally:
Use the empirical sandwich estimator of Cov(Yi)

Mixed effects models = Subject Specific models (SS)

Observations: Yij, covariate vector Xij
Additional covariate vector Zij, specifying the random effects.
We specify
- Mean, E(Yij | Xij, bi) = µij, where $g(\mu_{ij}) = X_{ij}^T \beta + Z_{ij}^T b_i$
- Distribution (Normal, Binomial, Poisson,...)
- Conditional variance, φV(µij)
- Variance of random effects, $b_i \sim N_p(0, G)$,
  where G is the matrix (and software) notation for $\sigma_b^2$
- Conditional independence,
  given the covariates and the random effects

Interpretation of SS

This is a real model, but
- Inference is conditional on random effects and therefore
  specific to the subject
- It is very difficult to interpret the effect of covariates that are
  constant within an individual (e.g. gender, treatment etc.)

It may be useful to think about it as
- The individual is a (class) covariate
- The effect of another covariate is interpreted as
  for "fixed value of all other covariates", including
  for fixed value of the individual

For traditional linear models (Normality)
with identity link:

Subject-specific model with random intercept/level
is equal to
Marginal model with compound symmetry covariance structure
(type=CS)

More generally:
The interpretation of the parameters β does not depend on the
way that we model the correlation
(although the estimate may change somewhat depending on the
assumed structure)
This implies that effects may either be interpreted cross-sectionally
(marginally, for comparison of different populations, say, of
different age) or subject-specific (effect of ageing for a single
individual)

For non-normal outcomes

The above is no longer true
due to non-linearity of the link-function
This means:
The interpretation of the parameters β does depend on the way
that we model the correlation.
And the interpretations of the parameters are different!

A very simple example

Two individuals

Individual   Baseline   Follow up   Difference   log(OR)    OR
1              0.2        0.4         0.2         0.981     2.67
2              0.6        0.8         0.2         0.981     2.67
Average        0.4        0.6         0.2         0.981

but the log odds ratio calculated from the average probabilities
is 0.811, and OR=2.25
The "average" of individual OR's is larger than the OR calculated
from average probabilities

Hypothetical example for illustration

Subject specific model with a covariate effect (x-axis) and 21
clusters (bi, individual curves).
The red curve denotes the population average curve
[Figure: 21 individual curves and the population average curve]
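
The numbers in the two-individual example can be reproduced directly from the definition of the odds ratio. This is a small illustrative sketch (the function name is an assumption); only the probabilities 0.2, 0.4, 0.6 and 0.8 come from the table.

```python
from math import exp, log

def log_odds_ratio(p_before, p_after):
    """Log odds ratio for a change in probability from p_before to p_after."""
    odds_before = p_before / (1 - p_before)
    odds_after = p_after / (1 - p_after)
    return log(odds_after / odds_before)

# Subject-specific changes: each individual has their own baseline level.
individual_1 = log_odds_ratio(0.2, 0.4)
individual_2 = log_odds_ratio(0.6, 0.8)
# Population average change: computed from the averaged probabilities.
from_averages = log_odds_ratio(0.4, 0.6)

print(f"individual 1: log(OR) = {individual_1:.3f}, OR = {exp(individual_1):.2f}")
print(f"individual 2: log(OR) = {individual_2:.3f}, OR = {exp(individual_2):.2f}")
print(f"averages    : log(OR) = {from_averages:.3f}, OR = {exp(from_averages):.2f}")
```

Both individuals have the same odds ratio, yet the odds ratio between the averaged probabilities is smaller: averaging on the probability scale and then taking logits is not the same as averaging on the logit scale.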

Population average on logit scale

SS specifies parallel lines on logit scale
but the PA deviates somewhat from a straight line,
and has a smaller slope (smaller effect of covariate x)
[Figure: individual curves and population average curve on the logit scale]

Interpretations

Example: The need for glasses increases with age

Marginally:
Odds ratio for being in need of glasses for a population with mean
age 50 compared to a population with mean age 30
is smaller than
Subject specific:
the odds ratio for needing glasses when you (a specific individual)
are of age 50 compared to when you were at age 30
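
The attenuation of the marginal effect can be illustrated numerically. The sketch below is an assumption-laden illustration, not the lecture's own computation: the parameter values (β0 = −1, β1 = 1, σb = 2) are hypothetical, and the marginal probability is obtained by a simple trapezoid-rule integration of the subject-specific probability over the Normal distribution of the random intercept.

```python
from math import exp, log, pi, sqrt

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + exp(-x))

def logit(p):
    return log(p / (1.0 - p))

def marginal_prob(eta, sigma_b, n_grid=2001, width=8.0):
    """Average expit(eta + b) over b ~ N(0, sigma_b^2) using the trapezoid rule."""
    lo = -width * sigma_b
    step = 2.0 * width * sigma_b / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        b = lo + k * step
        weight = 0.5 if k in (0, n_grid - 1) else 1.0
        density = exp(-0.5 * (b / sigma_b) ** 2) / (sigma_b * sqrt(2.0 * pi))
        total += weight * expit(eta + b) * density
    return total * step

# Hypothetical subject-specific model: logit(p) = beta0 + beta1 * x + b
beta0, beta1, sigma_b = -1.0, 1.0, 2.0

ss_log_or = beta1  # effect of x for one specific individual
pa_log_or = logit(marginal_prob(beta0 + beta1, sigma_b)) - logit(marginal_prob(beta0, sigma_b))

print(f"subject-specific log(OR): {ss_log_or:.3f}")
print(f"population average log(OR): {pa_log_or:.3f}")
```

The population average log odds ratio comes out smaller than the subject-specific one, and the gap grows with the random-effect standard deviation σb.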

Counts of leprosy bacilli

Reference: Snedecor, G.W. and Cochran, W.G. (1967).
Statistical Methods, (6th edn). Iowa State University Press

Controlled clinical trial:
- 10 patients treated with placebo P
- 10 patients treated with antibiotic A
- 10 patients treated with antibiotic B

Recording of the number of bacilli at six sites of the body,
i.e. a count variable
- before treatment (baseline, time=0)
- several months after treatment, (time=1)

Averages for the leprosy example

Analysis Variable : bacilli

drug   time   N Obs    N         Mean       Variance
-----------------------------------------------------
A      0       10     10     9.3000000    22.6777778
A      1       10     10     5.3000000    21.5666667
B      0       10     10    10.0000000    27.5555556
B      1       10     10     6.1000000    37.8777778
P      0       10     10    12.9000000    15.6555556
P      1       10     10    12.3000000    51.1222222
-----------------------------------------------------

Note: The variance is obviously bigger than the
average.....overdispersion

Spaghettiplot - the leprosy example

[Figure: individual profiles from baseline to follow-up. Legend: A solid line, B dotted line, P solid line]

Average plot - the leprosy example

[Figure: group averages at baseline and follow-up. Legend: A solid line, B dotted line, P solid line]
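
The overdispersion noted for the leprosy averages can be quantified as the ratio of variance to mean, which would be close to 1 for Poisson counts. A minimal sketch using the means and variances from the summary table (the dictionary layout and names are illustrative assumptions; the two variances for drug B are read from the extracted output):

```python
# (drug, time) -> (mean, variance), copied from the summary table
summary = {
    ("A", 0): (9.3, 22.6777778),
    ("A", 1): (5.3, 21.5666667),
    ("B", 0): (10.0, 27.5555556),
    ("B", 1): (6.1, 37.8777778),
    ("P", 0): (12.9, 15.6555556),
    ("P", 1): (12.3, 51.1222222),
}

# For a Poisson distribution the variance equals the mean, so the ratio is 1.
dispersion = {key: variance / mean for key, (mean, variance) in summary.items()}

for (drug, time), ratio in dispersion.items():
    print(f"drug {drug}, time {time}: variance / mean = {ratio:.2f}")
```

All six ratios exceed 1, so a plain Poisson variance would understate the variability in every group.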

Purpose of investigation

1. Evaluate the efficacy of antibiotics:
   red vs green lines
2. Compare the two drugs, A and B:
   solid vs dotted red lines
3. Quantify the effects of the two antibiotic drugs (SS)

Randomization:
At baseline, all patients have the same expected mean count
(mean value), but by chance, the placebo individuals have larger
values than the remaining groups.

Why is this not simple?

This is just a before-after study....
- But we are dealing with non-negative counts,
  so we do not have a normal distribution,
  although it may be a reasonable approximation...
- Can't we just take logarithms?
  No, because we have zeroes
- Some other transformation then?
  Yes, square roots, or arcsine,
  but the interpretation would suffer a lot
- Could we just condition on the baseline value?
  Yes, we could do that.....
  but it becomes more tricky when we have multiple time points

Model reflections

- We are dealing with counts, so it is natural to consider a
  Poisson distribution, with log-link (natural log)
- Because it is a randomized study, the mean values at baseline
  should be identical for the three groups
- We are prepared to see 3 different changes over time, but some of these may be identical
  (this is actually the main scientific question)
- Baseline and follow-up measurements are correlated within
  individuals

Model reflections, II

Parametrization of mean values (on the log-scale):

Treatment   Period      Mean (on log scale)
P           Baseline    β1
P           Follow-up   β1 + β2
A           Baseline    β1
A           Follow-up   β1 + β2 + β3
B           Baseline    β1
B           Follow-up   β1 + β2 + β4

β3 and β4 denote additional effects of A and B,
when compared to placebo
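
The parametrization table corresponds to four covariates per observation: an intercept, time, and the two drug-by-time products. The sketch below spells this out; the function names and the numeric β values are hypothetical placeholders chosen only so that each term is easy to trace, not estimates from the data.

```python
def design_row(drug, time):
    """Covariates (intercept, time, A_effect, B_effect) for one observation."""
    a_effect = time if drug == "A" else 0
    b_effect = time if drug == "B" else 0
    return (1, time, a_effect, b_effect)

def log_mean(drug, time, beta):
    """Linear predictor on the log scale: beta1 + beta2*time + beta3*A_effect + beta4*B_effect."""
    return sum(b * x for b, x in zip(beta, design_row(drug, time)))

# Hypothetical values for (beta1, beta2, beta3, beta4), for illustration only.
beta = (1.0, 0.1, 0.01, 0.001)

for drug in ("P", "A", "B"):
    for time in (0, 1):
        print(drug, time, design_row(drug, time), round(log_mean(drug, time, beta), 3))
```

At baseline (time=0) every group gets the row (1, 0, 0, 0), which is how the model forces identical baseline means in the three randomized groups.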

Marginal model (PA) for leprosy

data leprosy; set leprosy;   /* assumed data step creating the covariates */
  A_effect=(drug='A')*time;
  B_effect=(drug='B')*time;
run;

proc genmod data=leprosy;
  class id;
  model bacilli= time A_effect B_effect / d=poisson link=log;
  repeated subject=id / type=un corrw;
  contrast 'Antibiotic effect' A_effect 1, B_effect 1 / wald;
  contrast 'Effect of A equals B?' A_effect 1 B_effect -1 / wald;
  /* coefficients give beta4 - beta3, matching the label and the printed estimate */
  estimate 'Effect B minus A' A_effect -1 B_effect 1;
  estimate "changes for A" time 1 A_effect 1;
  estimate "changes for B" time 1 B_effect 1;
  output out=pa pred=pred_pa xbeta=xbeta_pa;
run;

Comments to code

- time indicates the change over time for the placebo group
  (the parameter β2)
  A_effect indicates the additional change over time for drug A
  (the parameter β3)
  B_effect indicates the additional change over time for drug B
  (the parameter β4)
- d=poisson: specifies the link-function as log, and the
  working variance as (proportional to) the mean
- link=log: may overrule the link-function from
  dist=poisson, if so needed
- repeated: specifies an association between measurements on
  the same id (corrw requests printing of the working correlation matrix)

Comments to code, II

- estimate statements:
  Estimate combination of the β's, here
  - β4 − β3
  - β2 + β3
  - β2 + β4
- contrast statements:
  Useful for testing several parameters simultaneously, here the
  tests
  - β3 = β4 = 0: No (extra) effect of either A or B
  - β3 = β4: Effects of A and B are equal
    (identical to the estimate-statement above)

Output

The GENMOD Procedure

Model Information
Data Set              WORK.LEPROSY
Distribution          Poisson
Link Function         Log
Dependent Variable    bacilli

Number of Observations Read    60
Number of Observations Used    60

Class Level Information
Class   Levels   Values
id      30       1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
                 21 22 23 24 25 26 27 28 29 30

Parameter Information
Parameter   Effect
Prm1        Intercept
Prm2        time
Prm3        A_effect
Prm4        B_effect

Output, II

GEE Model Information
Correlation Structure           Unstructured
Subject Effect                  id (30 levels)
Number of Clusters              30
Correlation Matrix Dimension    2
Maximum Cluster Size            2
Minimum Cluster Size            2

Algorithm converged.

Working Correlation Matrix
          Col1      Col2
Row1    1.0000    0.7966
Row2    0.7966    1.0000

Output, III: Estimation

The GENMOD Procedure
Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates

                        Standard     95% Confidence
Parameter   Estimate      Error          Limits             Z    Pr > |Z|
Intercept     2.3734     0.0801     2.2163    2.5304    29.62     <.0001
time         -0.0138     0.1573    -0.3222    0.2946    -0.09     0.9300
A_effect     -0.5406     0.2186    -0.9690   -0.1122    -2.47     0.0134
B_effect     -0.4791     0.2279    -0.9257   -0.0325    -2.10     0.0355

Output, IV (additional statements)

Contrast Estimate Results

                       Mean       Mean Confidence      L'Beta    Standard
Label                Estimate         Limits          Estimate     Error    Alpha
Effect B minus A      1.0635     0.6954    1.6264       0.0615    0.2168     0.05
changes for A         0.5744     0.4281    0.7707      -0.5544    0.1499     0.05
changes for B         0.6109     0.4478    0.8333      -0.4929    0.1585     0.05

                      L'Beta Confidence
Label                      Limits          ChiSquare   Pr > ChiSq
Effect B minus A     -0.3633    0.4864        0.08       0.7765
changes for A        -0.8483   -0.2605       13.67       0.0002
changes for B        -0.8035   -0.1823        9.68       0.0019

Contrast Results for GEE Analysis

Contrast                 DF   ChiSquare   Pr > ChiSq   Type
Antibiotic effect         2      6.99       0.0303     Wald
Effect of A equals B?     1      0.08       0.7765     Wald

But note:
It may not be reasonable to estimate the effect
of each single drug in a PA-model!

Interpretations

- There is a significant effect of antibiotics:
  6.99 ∼ χ²(2) ⇒ P = 0.03
- The effect of placebo is estimated to
  exp(β̂2) = exp(−0.0138) = 0.986, i.e. a decrease of 1.4%
- The additional effect of drug A is estimated to
  exp(β̂3) = 0.58, and the total effect to
  exp(β̂2 + β̂3) = exp(−0.5544) = 0.574,
  i.e. a decrease of 42.6%
- The two antibiotics are not significantly different:
  0.08 ∼ χ²(1) ⇒ P = 0.78
  (although the estimated effect is a tiny bit larger for drug A)
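
The interpretations above are back-transformations of the GEE estimates from the log scale. As a sketch (variable and function names are assumptions; the estimates and the standard error are those printed in the GENMOD output), the rate ratios, percentage decreases and a Wald confidence interval can be recomputed as follows:

```python
from math import exp

# GEE estimates on the log scale, from the GENMOD output
beta2, beta3, beta4 = -0.0138, -0.5406, -0.4791

rate_ratio_placebo = exp(beta2)    # change over time, placebo
rate_ratio_a = exp(beta2 + beta3)  # total change over time, drug A
rate_ratio_b = exp(beta2 + beta4)  # total change over time, drug B

def percent_decrease(rate_ratio):
    """Percentage decrease in the mean count implied by a rate ratio."""
    return 100.0 * (1.0 - rate_ratio)

# Wald 95% interval for 'changes for A': estimate +/- 1.96 standard errors on
# the log scale, then back-transformed with exp.
estimate_a, se_a = -0.5544, 0.1499
ci_a = (exp(estimate_a - 1.96 * se_a), exp(estimate_a + 1.96 * se_a))

print(f"placebo: ratio {rate_ratio_placebo:.3f}, decrease {percent_decrease(rate_ratio_placebo):.1f}%")
print(f"drug A : ratio {rate_ratio_a:.3f}, decrease {percent_decrease(rate_ratio_a):.1f}%")
print(f"drug B : ratio {rate_ratio_b:.3f}, decrease {percent_decrease(rate_ratio_b):.1f}%")
print(f"drug A 95% CI for the ratio: {ci_a[0]:.3f} to {ci_a[1]:.3f}")
```

The interval for drug A agrees with the "Mean Confidence Limits" column in the Contrast Estimate Results up to rounding.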

Predicted means from Population Average model (PA)

[Figure: predicted means from the PA model. Legend: A solid line, B dotted line, P solid line]

Wrong analysis
not taking the correlation into account

proc genmod data=leprosy;
  class id;
  model bacilli= time A_effect B_effect /
        d=poisson link=log modelse type3;
  * no repeated statement;
  contrast 'Antibiotic effect' A_effect 1, B_effect 1 / wald;
  contrast 'Effect of A equals B?' A_effect 1 B_effect -1 / wald;
  /* coefficients give beta4 - beta3, matching the label */
  estimate 'Effect B minus A' A_effect -1 B_effect 1;
  estimate "changes for A" time 1 A_effect 1;
  estimate "changes for B" time 1 B_effect 1;
run;

Output from wrong analysis

Analysis Of Maximum Likelihood Parameter Estimates

                             Standard    Wald 95% Confidence       Wald
Parameter   DF   Estimate      Error           Limits           Chi-Square   Pr>ChiSq
Intercept    1     2.3734     0.0557     2.2641     2.4826       1813.76      <.0001
time         1     0.1362     0.1060    -0.0715     0.3440          1.65      0.1987
A_effect     1    -0.8419     0.1643    -1.1639    -0.5198         26.25      <.0001
B_effect     1    -0.7013     0.1566    -1.0082    -0.3944         20.06      <.0001
Scale        0     1.0000     0.0000     1.0000     1.0000

NOTE: The scale parameter was held fixed.

Note:
- Larger effects
- Too small standard errors
- Much too small P-values

Mixed effects model (SS)

We now assume random intercepts, $b_i \sim N(0, \sigma_b^2)$, in order to
answer question 3 under "Purpose of investigation"
(quantify the effects of the two antibiotic drugs).

proc GLIMMIX data=leprosy method=quad(qpoints=50);
  class id;
  model bacilli = time A_effect B_effect /
        dist=poisson link=log solution;
  random intercept / subject=id type=vc g;
  contrast 'Drug x Time Interaction' A_effect 1, B_effect 1;
  contrast 'Effect of A equals B?' A_effect 1 B_effect -1;
  estimate "changes for A" time 1 A_effect 1;
  estimate "changes for B" time 1 B_effect 1;
  output out=ss pred=xbetamean pred(noblup)=xbeta_ss
         pred(ilink)=predmean pred(ilink noblup)=pres_ss;
run;

Comments to glimmix code

- method=quad: maximizes the likelihood function
- qpoints=50: the more quadrature points, the better accuracy
- random: here we have only one random intercept, so
  type=... is unimportant
- g: print the estimate of σb²
  (In glimmix, the parameter σb² is generally denoted G)
- The test of equality of A and B is hard to interpret
  and is only shown for making this comment on it

Output from glimmix analysis

Estimated G Matrix
Effect      Row     Col1
Intercept     1   0.2814

Covariance Parameter Estimates
                                   Standard
Cov Parm    Subject   Estimate       Error
Intercept   id          0.2814     0.09557

Solutions for Fixed Effects
                          Standard
Effect       Estimate       Error     DF   t Value   Pr > |t|
Intercept      2.2412      0.1148     29     19.53     <.0001
time         0.003088      0.1235     27      0.03     0.9802
A_effect      -0.6055      0.2036     27     -2.97     0.0061
B_effect      -0.5228      0.1963     27     -2.66     0.0129

Output from glimmix analysis, II

Estimates
                                 Standard
Label               Estimate       Error     DF   t Value   Pr > |t|
Effect B minus A    -0.08271      0.2242     27     -0.37     0.7151
changes for A        -0.6024      0.1657     27     -3.64     0.0012
changes for B        -0.5197      0.1567     27     -3.32     0.0026

Contrasts
                         Num   Den
Label                     DF    DF   F Value   Pr > F
Antibiotic effect          2    27      5.83   0.0079
Effect of A equals B?      1    27      0.14   0.7151

Note again: Only the drug-specific changes are readily interpreted

Predicted means from Subject Specific model (SS)

Note: Different scaling from the plot of predicted means from the PA model
[Figure: predicted means from the SS model. Legend: A solid line, B dotted line, P solid line]

Predicted individual means from Subject Specific model (SS)

[Figure: predicted individual means. Legend: A solid line, B dotted line, P solid line]

Predicted means from PA and SS

[Figure: predicted means from both models. Legend: A solid line, B dotted line, P solid line]

Comments on difference between PA and SS

The analysis uses a log-link, and since the logarithmic function is
concave, we have the following:
- The average of two logarithmic values (SS) is smaller than the
  logarithm of the average (PA)
- The difference between the two is largest for small values
- Therefore the effects on log-scale (SS) appear larger

Study on epilepsy

Reference: Thall, P.F. and Vail, S.C. (1990). Some covariance
models for longitudinal count data with overdispersion. Biometrics.

Controlled clinical trial:
- 30 treated with progabide
- 28 treated with placebo

Recording of the number of epileptic seizures during
- 8-week interval before treatment
- visits every second week after treatment, i.e.
  in 2-week intervals
- We consider rates, per week

Spaghettiplot - the epilepsy example

[Figure: individual profiles, panels for Progabide and Placebo]

Mean value plot

Number of seizures per week:
[Figure: mean number of seizures per week. Legend: Progabide, Placebo]

Purpose of investigation

1. Investigate what happens over time,
   does the number of seizures decrease?
2. Compare the decrease for a patient treated with progabide to
   the decrease for a similar patient in the placebo group
3. Compare the decrease for a population treated with progabide
   to the decrease for a population treated with placebo

Model building

Reasonable model (in principle) for the number of seizures:
- Poisson outcome
- Random regression, i.e.
  linear effect of week, with individual intercepts and slopes
- Mean value proportional to length of period (8 or 2 weeks)
  log(8) and log(2) used as offsets

Notation:
- Tij denotes the time span corresponding to the number of
  seizures, Yij,
  so Tij is either 2 or 8 weeks

This ensures that we model the ratio $Y_{ij} / T_{ij}$
(on log-scale)
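
The role of the offset can be seen in a few lines. This is an illustrative sketch only: the function names and the value of the linear predictor are hypothetical. With log(T) added to the linear predictor with a coefficient fixed at 1, the expected count scales with the length of the period while the weekly rate depends on the covariates alone.

```python
from math import exp, log

def expected_count(eta, weeks):
    """E(Y) when log E(Y) = eta + log(weeks), i.e. log(weeks) is an offset."""
    return exp(eta + log(weeks))

def weekly_rate(eta, weeks):
    """Expected number of seizures per week, E(Y) / T."""
    return expected_count(eta, weeks) / weeks

# Hypothetical linear predictor: a rate of 3.5 seizures per week.
eta = log(3.5)

print("expected count over 8 weeks:", round(expected_count(eta, 8), 2))
print("expected count over 2 weeks:", round(expected_count(eta, 2), 2))
print("weekly rate from either period:", round(weekly_rate(eta, 8), 2), round(weekly_rate(eta, 2), 2))
```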

Random regression, SS model in glimmix

proc glimmix data=seizures method=quad(qpoints=50);
  class id trt visit;
  model seizures = weeks trt trt*weeks /
        dist=poisson offset=lweeks link=log solution;
  random intercept weeks / subject=id type=un g;
  estimate 'weekly decline trt=0' weeks 1 weeks*trt 1 0;
  estimate 'weekly decline trt=1' weeks 1 weeks*trt 0 1;
run;

Output not shown.....

Ecological fallacy

Think about the research question:
- Do we want to say something about populations?
  between subject covariates
- or are we interested in specific individuals?
  within subject covariates

Example: suicide and religion

In a number of regions, we count:
- Number of suicides
  Outcome: % suicides (among all citizens)
- Number of protestants and catholics,
  Covariate: % protestants

Purpose of study:
Are protestants more likely to commit suicide?
Is this a precise question??

Analysis on population level: the regions

Percent of suicides increases with percent of protestants.
Do people kill themselves when they live among protestants?

Analysis on subject level

Subdivide each region into individual religion: protestants and
catholics:
More suicides among catholics in regions with many protestants
but they do not "count" as much, since they are a minor group

Amenorrhea example

1151 contracepting women were randomized in two groups,
receiving
- 100 mg of some drug (trt=0)
- 150 mg of the same drug (trt=1)

All women received injections at time points (time=1,2,3,4) with
intervals of 90 days (no measurement at baseline, time=0)
Each time, it was recorded whether the woman had experienced
amenorrhea (a suspected side effect of the drug) in the 90 days
following the last injection.

Amenorrhea example
Many drop-outs

The MEANS Procedure
Analysis Variable : y

trt   time   N Obs     N   N Miss        Mean     Variance
-----------------------------------------------------------
0     1        576   576        0   0.1857639    0.1515187
0     2        576   477       99   0.2620545    0.1937882
0     3        576   409      167   0.3887531    0.2382065
0     4        576   361      215   0.5013850    0.2506925
1     1        575   575        0   0.2052174    0.1633874
1     2        575   476       99   0.3361345    0.2236179
1     3        575   389      186   0.4935733    0.2506029
1     4        575   353      222   0.5354108    0.2494527
-----------------------------------------------------------

Mean value plot - amenorrhea

[Figure: observed proportion with amenorrhea over time, by dose group]
Note: Baseline is unmeasured (time=0)

Mean value plot - on logit scale

[Figure: observed proportions over time on the logit scale, by dose group]
Do we have linearity?
Not quite...

Purpose of the amenorrhea investigation

- Estimate time trend in the probability of side effects
  for each dose of the drug
- Compare the two doses

Model could include
- A time effect (linear or quadratic)
- A group difference
  but they should be equal at baseline (time=0)
- An interaction between group and time
  (different patterns in the two groups)
- A random level for each individual

Mixed effects model (SS)
with quadratic time effect

proc glimmix method=quad(qpoints=50) noclprint data=amen;
  class id;
  model amenorrhea = time time2 trt*time trt*time2 /
        dist=binomial link=logit solution;
  random intercept / subject=id g;
  contrast 'Interaction with time'
           trt*time 1, trt*time2 1 / chisq;
  output out=pred_ss pred(noblup ilink)=predicted_ss_mean;
run;

Beware: Test for interaction is difficult to interpret

Output from mixed effects model

Estimated G Matrix
Effect      Row     Col1
Intercept     1   5.0642

Solutions for Fixed Effects
                          Standard
Effect       Estimate       Error      DF   t Value   Pr > |t|
Intercept     -3.8058      0.3050    1150    -12.48     <.0001
time           1.1334      0.2682    2461      4.23     <.0001
time2        -0.04197     0.05481    2461     -0.77     0.4439
time*trt       0.5644      0.1922    2461      2.94     0.0034
time2*trt     -0.1095     0.04961    2461     -2.21     0.0273

Contrasts
                         Num    Den
Label                     DF     DF   Chi-Square   F Value   Pr > ChiSq   Pr > F
Interaction with time      2   2461        12.40      6.20       0.0020   0.0021

Interpretations

- Random effects variance G ($\hat\sigma_b^2 = 5.0642$):
  can be cautiously interpreted as a correlation
  $\frac{\hat\sigma_b^2}{\hat\sigma_b^2 + \pi^2/3} = 0.61$
- The interaction is hard to interpret as a within-subject
  covariate, since no individual has received both treatments.

Predicted profiles from SS-model

[Figure: predicted profiles from the SS model]
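
The correlation of 0.61 quoted in the interpretation of the random effects variance follows from the latent-variable view of the logistic model, where the residual on the logit scale has variance π²/3. A minimal check (the function name is an assumption; the variance estimate 5.0642 is the one from the Estimated G Matrix):

```python
from math import pi

def latent_scale_correlation(random_intercept_variance):
    """Within-subject correlation on the latent (logit) scale for a random-intercept logistic model."""
    residual_variance = pi**2 / 3  # variance of the standard logistic distribution
    return random_intercept_variance / (random_intercept_variance + residual_variance)

correlation = latent_scale_correlation(5.0642)
print(f"latent-scale correlation: {correlation:.2f}")
```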

Marginal model using GEE (PA)

proc genmod descending data=amen;
  class id;
  model amenorrhea = time time2 trt*time trt*time2 /
        dist=binomial link=logit;
  repeated subject=id / logor=fullclust;
  contrast 'Interaction with time'
           trt*time 1, trt*time2 1;
  output out=pred_pa pred=predicted_pa;
run;

Note: We have a missing value issue here,
because we cannot use maximum likelihood
(GEE estimates are only valid if the drop-out is completely at random)

Output from marginal model

Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates

                        Standard     95% Confidence
Parameter   Estimate      Error          Limits              Z    Pr > |Z|
Intercept    -2.2461     0.1765    -2.5921   -1.9001    -12.72     <.0001
time          0.7030     0.1581     0.3931    1.0129      4.45     <.0001
time2        -0.0323     0.0318    -0.0946    0.0299     -1.02     0.3089
time*trt      0.3380     0.1097     0.1230    0.5529      3.08     0.0021
time2*trt    -0.0683     0.0284    -0.1239   -0.0126     -2.40     0.0162
Alpha1        1.8475     0.1810     1.4928    2.2021     10.21     <.0001
Alpha2        1.4851     0.1985     1.0960    1.8742      7.48     <.0001
Alpha3        1.7605     0.2482     1.2740    2.2471      7.09     <.0001
Alpha4        2.1610     0.1761     1.8159    2.5060     12.27     <.0001
Alpha5        2.0665     0.2034     1.6679    2.4651     10.16     <.0001
Alpha6        2.2783     0.1827     1.9202    2.6364     12.47     <.0001

Contrast Results for GEE Analysis

Contrast                 DF   ChiSquare   Pr > ChiSq   Type
Interaction with time     2     12.39       0.0020     Score

Predicted profiles from PA-model

Note: New scaling
[Figure: predicted profiles from the PA model]

Comparison of predicted profiles

PA estimates are closer to 0.5 than SS estimates...
...and more so, if they are further away from 0.5
so effects are smaller for PA
[Figure: predicted profiles from the PA and SS models]
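
The comparison of predicted profiles can be reproduced approximately from the two sets of printed estimates. The sketch below covers the low-dose group (trt=0) only and makes two assumptions that are not stated in the output: that time2 is the square of time, and that the SS curve is the one for a subject with random effect bi = 0 (as with pred(noblup ilink)).

```python
from math import exp

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + exp(-x))

# (Intercept, time, time2) for trt=0: SS from GLIMMIX, PA from GENMOD/GEE
ss_beta = (-3.8058, 1.1334, -0.04197)
pa_beta = (-2.2461, 0.7030, -0.0323)

def predicted_prob(beta, time):
    """Predicted probability of amenorrhea, assuming time2 = time squared."""
    intercept, linear, quadratic = beta
    return expit(intercept + linear * time + quadratic * time**2)

times = (1, 2, 3, 4)
ss_probs = [predicted_prob(ss_beta, t) for t in times]
pa_probs = [predicted_prob(pa_beta, t) for t in times]

for t, ss, pa in zip(times, ss_probs, pa_probs):
    print(f"time {t}: SS (b=0) {ss:.3f}   PA {pa:.3f}")
```

At every time point the PA prediction lies closer to 0.5 than the SS prediction, with the largest gap at the early time points where the probabilities are furthest from 0.5.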

An alternative SS program

PROC NLMIXED
- very flexible, allows any (non-linear) mean value structure
- can only handle two "levels"
  (i.e. not pupils in classes in schools....)

PROC NLMIXED data=amen QPOINTS=50;
  PARMS beta0=-2.5 beta1=0.8 beta2=-0.03
        beta3=0.36 beta4=-0.07 g11=0 to 5 by 0.5;
  eta = beta0 + beta1*time + beta2*time2 + beta3*trt*time
        + beta4*trt*time2 + b1;
  mu = exp(eta)/(1+exp(eta));
  MODEL y ~ BINARY(mu);
  RANDOM b1 ~ NORMAL(0, g11) SUBJECT=id;
  PREDICT mu OUT=predmean;
run;

Output from NLMIXED

Parameter Estimates
                        Standard
Parameter   Estimate      Error      DF   t Value   Pr > |t|   Alpha
beta0        -3.8057     0.3050    1150    -12.48     <.0001    0.05
beta1         1.1332     0.2682    1150      4.22     <.0001    0.05
beta2       -0.04192    0.05481    1150     -0.76     0.4445    0.05
beta3         0.5644     0.1922    1150      2.94     0.0034    0.05
beta4        -0.1096    0.04961    1150     -2.21     0.0274    0.05
g11           5.0646     0.5840    1150      8.67     <.0001    0.05

Parameter Estimates
Parameter      Lower       Upper     Gradient
beta0        -4.4041     -3.2073     -0.00046
beta1         0.6069      1.6595     -0.00355
beta2        -0.1495     0.06561     -0.01548
beta3         0.1873      0.9416     0.000112
beta4        -0.2069    -0.01222     0.000034
g11           3.9187      6.2105      0.00014
university of copenhagen
d e pa rt m e n t o f b i o s tat i s t i c s
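The statement that PA estimates lie closer to 0.5 than SS estimates can be illustrated numerically with the NLMIXED estimates above: the population-averaged probability is the SS probability averaged over the random intercept distribution. The following is a minimal Python sketch under stated assumptions, not the method used for the slides: it assumes time2 = time squared and time = 1, ..., 4 (the coding is not shown in this section), and it uses a plain trapezoidal rule instead of the Gaussian quadrature used by SAS. The function names are illustrative.

```python
import math

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def marginal_prob(eta, var_b, n=4001, width=8.0):
    """E[expit(eta + b)] for b ~ N(0, var_b), trapezoidal rule on a standard normal grid."""
    sd = math.sqrt(var_b)
    h = 2.0 * width / (n - 1)
    total = 0.0
    for k in range(n):
        z = -width + k * h
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * expit(eta + sd * z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)

# SS estimates copied from the NLMIXED output
beta0, beta1, beta2, beta3, beta4 = -3.8057, 1.1332, -0.04192, 0.5644, -0.1096
g11 = 5.0646  # random intercept variance

for trt in (0, 1):
    for time in (1, 2, 3, 4):  # assumed coding of time
        eta = (beta0 + beta1 * time + beta2 * time**2
               + beta3 * trt * time + beta4 * trt * time**2)
        ss = expit(eta)               # probability for a subject with b1 = 0
        pa = marginal_prob(eta, g11)  # probability averaged over all subjects
        print(f"trt={trt} time={time}  SS={ss:.3f}  PA={pa:.3f}")
```

Under these assumptions, the PA column is pulled towards 0.5 relative to the SS column, which is the attenuation described on the comparison slide.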
Smoking among school children
Hierarchical (multilevel) design:
1498 children (i) → 90 classes (c) → 46 schools (s)
Outcome:
Individual smoking behaviour, smoker (0/1)
Purpose of investigation
- Find out how to make an intervention to prevent smoking
- Evaluate various covariate effects
Mette Rasmussen

Model for smoking
$Y_{sci} \sim \mathrm{Bernoulli}(p_{sci})$
$p_{sci}$: the probability that child i in class c at school s is a smoker.
Model:
$\mathrm{logit}(p_{sci}) = \text{school covariate effects} + A_s$
$\qquad + \text{school class covariate effects} + B_{sc}$
$\qquad + \text{individual covariate effects}$
$A_s \sim N(0, \omega^2)$: between school variation
$B_{sc} \sim N(0, \tau^2)$: between classes (within school) variation
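To make the model structure concrete, data can be simulated from it. The following Python sketch is only an illustration: the intercept and the two variances are hypothetical values, covariates are left out, and the allocation of classes to schools and of children to classes (round-robin) is an assumption, since the real allocation is not given here.

```python
import math
import random

def simulate_smoking(n_children=1498, n_classes=90, n_schools=46,
                     intercept=-1.5, omega2=0.1, tau2=0.3, seed=1):
    """Simulate (school, class, child, smoker) rows from a random-intercept logistic model.

    intercept, omega2 and tau2 are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    a = [rng.gauss(0.0, math.sqrt(omega2)) for _ in range(n_schools)]  # A_s
    b = [rng.gauss(0.0, math.sqrt(tau2)) for _ in range(n_classes)]    # B_sc
    school_of_class = [c % n_schools for c in range(n_classes)]        # assumed allocation
    rows = []
    for i in range(n_children):
        c = i % n_classes                      # assumed allocation of children to classes
        s = school_of_class[c]
        eta = intercept + a[s] + b[c]          # logit(p_sci), no covariates
        p = 1.0 / (1.0 + math.exp(-eta))
        y = 1 if rng.random() < p else 0       # Bernoulli(p_sci)
        rows.append((s, c, i, y))
    return rows

rows = simulate_smoking()
print(len(rows), sum(row[3] for row in rows) / len(rows))
```

Children in the same class share both A_s and B_sc, children in different classes at the same school share only A_s, which is how the model induces the correlation structure.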
Possible covariates, at various levels
- Individual (i):
  sex/gender, age, parental smoking behaviour, parental
  smoking attitude, parental labour market attachment, best
  friend smoking
- Class (c):
  sex ratio, number of pupils, grade, teachers
- School (s):
  Type of school (rural, urban)

Initial model
Too simple, but a starting point to gain understanding
Two-level model:
- no covariates
- only random school

proc glimmix data=smoke;
  class school sclass;
  model smoker(descending) =       /* nothing here: no covariates */
        / dist=binary link=logit ddfm=satterth s;
  random school;
run;
Important note
- A full maximum likelihood estimation (method=quad) with a
  sufficient number of qpoints is not feasible for this problem,
  because of insufficient “space” and “time”.
  Perhaps, some day....
- The default approximate solution is method=rspl
- The simplest model may be fitted with ML and this yields
  results quite close to the ones presented below

Interesting part of output
The GLIMMIX Procedure

Covariance Parameter Estimates
Cov                   Standard
Parm      Estimate       Error
SCHOOL      0.1557     0.08090

Solutions for Fixed Effects
                         Standard
Effect       Estimate       Error      DF   t Value   Pr > |t|
Intercept     -1.4767     0.09061   38.01    -16.30     <.0001

Interpretation of estimates
- Fixed effects:
  Only intercept, i.e. overall level: -1.4767
  Inverse logit-transformation:
  > exp(-1.4767)/(1+exp(-1.4767))
  [1] 0.1859264
  $\frac{\exp(-1.4767)}{1 + \exp(-1.4767)} = 0.1859264$
  Overall, approx. 18.6% of the pupils smoke
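The back-transformation of the intercept can be extended with an approximate confidence interval by transforming the limits on the logit scale. This is a minimal Python sketch; it assumes a normal quantile for simplicity, whereas GLIMMIX bases its limits on a t-distribution (here with 38.01 degrees of freedom), so the limits will differ slightly from what SAS would report.

```python
import math
from statistics import NormalDist

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

estimate, std_err = -1.4767, 0.09061   # intercept and standard error from the output
z = NormalDist().inv_cdf(0.975)        # assumption: normal instead of t quantile

prevalence = expit(estimate)
lower = expit(estimate - z * std_err)  # transform the limits, not the standard error
upper = expit(estimate + z * std_err)
print(f"{prevalence:.3f} ({lower:.3f}, {upper:.3f})")
```

Strictly speaking, the back-transformed intercept is the smoking probability at a school whose random effect is zero; with a small between-school variance this is close to the overall proportion.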
Interpretation of random effect
Estimated between-school variance: $\hat\sigma_b^2 = 0.1557$
- A cautious interpretation as a correlation:
  $\frac{\hat\sigma_b^2}{\hat\sigma_b^2 + \pi^2/3} \approx 0.045$
- Median Odds Ratio (MOR):
  For two randomly chosen individuals from different schools
  (with identical covariates), we calculate the median OR for the
  high risk individual compared to the low risk individual.
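The correlation interpretation rests on a latent-variable view of the logistic model, in which the individual-level residual follows a standard logistic distribution with variance π²/3 ≈ 3.29. A minimal Python sketch of that calculation (the function name is illustrative):

```python
import math

def latent_icc(var_between):
    """Intraclass correlation on the latent logistic scale.

    The level-1 residual variance is pi^2/3, the variance of the standard
    logistic distribution.
    """
    return var_between / (var_between + math.pi ** 2 / 3.0)

icc = latent_icc(0.1557)   # between-school variance from the GLIMMIX output
print(round(icc, 3))
```

The result refers to the unobserved latent scale, not to the correlation between the observed 0/1 outcomes, which is why the interpretation is called cautious.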
MOR in practice
Choose two random individuals from different schools:
The distribution of the OR between their risks of smoking (always
chosen as the ratio above 1) will have a median of
  $\mathrm{MOR} = \exp(0.954 \times \tilde\omega)$
and since $\tilde\omega = \sqrt{0.1557} = 0.3946$, we get
  $\mathrm{MOR} = \exp(0.954 \times 0.3946) = 1.46$

Interpretation of correlation structure
Pupils from the same school are correlated
in their inclination to smoke.
Pupils from the same class are no more correlated than pupils from
different classes at the same school.
This does not seem appropriate.
We must introduce an
extra correlation for pupils in the same class...

Inclusion of variation between school classes
proc glimmix data=smoke;
  class school sclass;
  model smoker(descending) =
        / dist=binary link=logit ddfm=satterth s;
  random school sclass;
run;

Covariance Parameter Estimates
Cov                   Standard
Parm      Estimate       Error
SCHOOL           0           .
sclass      0.3578      0.1176

Solutions for Fixed Effects
                         Standard
Effect       Estimate       Error      DF   t Value   Pr > |t|
Intercept     -1.5083     0.09318   82.83    -16.19     <.0001

Interpretation of results
- The variation between schools can be totally explained by the
  variation between school classes
- The intercept (level) changes slightly because of a different
  weighting of the observations
- Median Odds Ratio (MOR) for two children from different
  classes in the same school:
  $\exp(0.954 \times \sqrt{0.3578}) = 1.77$
- Median Odds Ratio (MOR) for two children from different
  classes in different schools:
  $\exp(0.954 \times \sqrt{0.3578 + 0}) = 1.77$
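The constant 0.954 in the MOR formula is √2 times the 75% quantile of the standard normal distribution: the difference between two independent random effects has variance 2ω², and the median of its absolute value is 0.6745 standard deviations. The Python sketch below computes the constant, reproduces the two MOR values above, and compares the formula with a small simulation. The function names, the number of simulated pairs and the seed are illustrative choices.

```python
import math
import random
from statistics import NormalDist, median

def mor(variance):
    """Median odds ratio for a random-intercept variance on the logit scale."""
    return math.exp(math.sqrt(2.0 * variance) * NormalDist().inv_cdf(0.75))

def simulated_mor(variance, n_pairs=100_000, seed=2014):
    """Median of exp(|b1 - b2|) over simulated pairs of independent random effects."""
    rng = random.Random(seed)
    sd = math.sqrt(variance)
    ratios = [math.exp(abs(rng.gauss(0.0, sd) - rng.gauss(0.0, sd)))
              for _ in range(n_pairs)]
    return median(ratios)

print(round(math.sqrt(2.0) * NormalDist().inv_cdf(0.75), 3))  # 0.954
print(round(mor(0.1557), 2), round(mor(0.3578), 2))           # 1.46 1.77
print(round(simulated_mor(0.1557), 2))                        # close to the formula value
```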
An illustrative figure
Three schools: blue, red, green
Variation between classes in each school,
but schools look alike

A possible third level....
Imagine an extra level/grouping: Gender group within class,
i.e. a subgrouping in boys and girls,
corresponding to an extra correlation
between pupils of the same gender in the same class.
Note: This is not the same as a gender effect
- it need not be a systematic difference
- the group definition is a substitute for cliques
  of which we know nothing
Modify the Random-statement to:
  random school sclass ggroup;
and remember ggroup in the Class-statement
One school, gender group effect

Output from 3-level model
The GLIMMIX Procedure

Covariance Parameter Estimates
Cov                   Standard
Parm      Estimate       Error
SCHOOL           0           .
sclass      0.1034      0.1562
GGROUP      0.4570      0.1948

Solutions for Fixed Effects
                         Standard
Effect       Estimate       Error      DF   t Value   Pr > |t|
Intercept     -1.5236     0.09263   83.96    -16.45     <.0001

Gender group/clique seems to be an important concept
Interpretation of results
- Median Odds Ratio (MOR) for two children of opposite sex
  (different gender groups) in the same class:
  $\exp(0.954 \times \sqrt{0.4570}) = 1.91$
- Median Odds Ratio (MOR) for two children (of either gender)
  in different classes (at same or different schools):
  $\exp(0.954 \times \sqrt{0.4570 + 0.1034}) = 2.04$
How much does systematic gender effect explain of the random
components?

Gender correlation - systematic effect?
A large part of the variation seems to be due to gender cliques,
or is it simply a systematic difference between boys and girls?

proc glimmix data=smoke;
  class school sclass ggroup sex;
  model smoker(descending) = sex
        / dist=binary link=logit ddfm=satterth s;
  random school sclass ggroup;
run;
One school, systematic gender effect

Output from 3-level model, with systematic gender effect
The GLIMMIX Procedure

Covariance Parameter Estimates
Cov                   Standard
Parm      Estimate       Error
SCHOOL           0           .
sclass      0.1263      0.1517
GGROUP      0.4027      0.1855

Solutions for Fixed Effects
                                 Standard
Effect      sex     Estimate       Error      DF   t Value   Pr > |t|
Intercept            -1.3254      0.1200   143.9    -11.05     <.0001
sex         boy      -0.4188      0.1698   89.17     -2.47     0.0156
sex         girl           0           .       .         .          .
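The coefficient for boys (with girls as reference) can be turned into an odds ratio for girls vs. boys by changing its sign and exponentiating. The Python sketch below also adds an approximate Wald interval; this is an assumption-laden illustration, since it uses a normal quantile where GLIMMIX would use a t-distribution with 89.17 degrees of freedom.

```python
import math
from statistics import NormalDist

coef_boy, std_err = -0.4188, 0.1698   # row 'sex boy' in the output; girl is the reference
z = NormalDist().inv_cdf(0.975)       # assumption: normal instead of t quantile

log_or = -coef_boy                    # girls vs. boys on the logit scale
odds_ratio = math.exp(log_or)
ci = (math.exp(log_or - z * std_err), math.exp(log_or + z * std_err))
print(f"OR girls vs. boys: {odds_ratio:.2f} ({ci[0]:.2f}, {ci[1]:.2f})")
```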
Interpretation of results
- Systematic effect of sex:
  OR = exp(0.4188) = 1.52 for girls vs. boys
- Median Odds Ratio (MOR) for two children in different
  cliques of the same class:
  $\exp(0.954 \times \sqrt{0.4027}) = 1.83$
- Median Odds Ratio (MOR) for two children in different
  classes (at same or different schools):
  $\exp(0.954 \times \sqrt{0.4027 + 0.1263}) = 2.00$
How much did systematic gender effect explain of the random
components?

Variance component estimates

model                             school   school class   gender group
school alone                      0.1557   -              -
school and school class           0        0.3578         -
school, class and gender group    0        0.1034         0.4570
as above, with sex                0        0.1263         0.4027

Note the increase in the class variation
MOR, and Odds ratios (OR) for gender

                                  In case of different:
model                             school   school class   gender group   gender
school alone                      1.46     -              -              -
school and school class           1.77     1.77           -              -
school, class and gender group    2.04     2.04           1.91           -
as above, with sex                2.00     2.00           1.83           1.52

Systematic gender effect and gender cliques seem to be the most
important determinants for smoking.
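The MOR table can be reproduced from the variance component estimates: two children who differ at a given level also differ at every level nested below it, so the relevant variances are added before the MOR formula is applied. A minimal Python sketch, where the dictionary layout and function names are illustrative:

```python
import math
from statistics import NormalDist

K = math.sqrt(2.0) * NormalDist().inv_cdf(0.75)   # approx. 0.954

def mor(*variances):
    """Median odds ratio for the sum of the given variance components."""
    return math.exp(K * math.sqrt(sum(variances)))

# (school, school class, gender group) variance estimates; None = level not in the model
models = {
    "school alone":                   (0.1557, None, None),
    "school and school class":        (0.0, 0.3578, None),
    "school, class and gender group": (0.0, 0.1034, 0.4570),
    "as above, with sex":             (0.0, 0.1263, 0.4027),
}

def mor_row(school, sclass, ggroup):
    """MOR for children differing at school, class or gender group level."""
    present = [v for v in (school, sclass, ggroup) if v is not None]
    row = {"school": round(mor(*present), 2)}
    if sclass is not None:
        below_school = [v for v in (sclass, ggroup) if v is not None]
        row["school class"] = round(mor(*below_school), 2)
    if ggroup is not None:
        row["gender group"] = round(mor(ggroup), 2)
    return row

table = {name: mor_row(*comps) for name, comps in models.items()}
for name, row in table.items():
    print(name, row)

gender_or = round(math.exp(0.4188), 2)   # systematic effect of sex, girls vs. boys
print("gender OR:", gender_or)
```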