A Two-Part Model

PART 8
Two Stage & Joint Models
Term 4, 2006
BIO656--Multilevel Models
Motivation:
SEERMED DATA
End of Life Colorectal Cancer Costs
[Figure: distribution of Expenditure, axis from $0 to $500,000]
Data
[Diagram labels: Patient – Physician; Professional Health-Care Services; Cancer Diagnosis; HMO; Hospice; FFS; Medicare; Private Ins.; Factors: Need-based, Enabling, Predisposing; Claims; Rejected; Allowed; Co-Pay; Deductibles; Medicare Payments; Terminal-Phase Costs; Death; 12 mos]
Data
[Diagram labels: Patient – Physician; Cancer Diagnosis; Medicare Payments; Terminal-Phase Costs; Death; 3 mos; 12 mos]
A “Normal” Distribution
[Figure: Density of Y]
A Complex Distribution
[Figure: Density of Y]
Complex Distributions
Mixtures of Simple Distributions
Finite Mixture Models (FMM): McLachlan, Peel (2001)
Mixtures-of-Experts Models (MEM): Jacobs, Jordan (1991), Neural Comp
[Figure: Density of Y]
A simple, two-part mixture
$0 vs. $+
1. P(Y>0)
2. E(Y|Y>0) = E(Y+)
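The two parts recombine into the overall mean as E(Y) = P(Y>0) × E(Y|Y>0). A minimal sketch of that decomposition on raw data follows; the function name and the cost values are hypothetical, chosen only for illustration.

```python
def two_part_summary(costs):
    """Split a list of non-negative costs into intensity and size."""
    positives = [y for y in costs if y > 0]
    intensity = len(positives) / len(costs)      # estimate of P(Y > 0)
    size = sum(positives) / len(positives)       # estimate of E(Y | Y > 0)
    return intensity, size, intensity * size     # product equals the overall mean

# Hypothetical monthly costs in dollars (illustrative values, not SEERMED data).
costs = [0, 0, 120, 0, 4500, 800, 0, 60]
intensity, size, overall = two_part_summary(costs)
print(intensity, size, overall)
```

The product of the two parts matches the plain sample mean, which is why the zeros and the positive amounts can be modeled separately without losing the overall mean.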
A Two-Part Model:
(Intensity & Size)
IS – logit/lognormal
1. \(\mathrm{logit}\{ \Pr(Y_i > 0) \} = x\alpha\)
2. i.) \(\log_{10}(Y_i^+) = x\beta + \varepsilon_i\)
   ii.) \(\varepsilon_i \sim N(0, \sigma^2)\)
0. “Tobit” model: Tobin (1958)
1. Selection (hurdle) models: Amemiya (1984); Heckman (1976)
2. Zero-inflated models: Lambert (1992); Greene (1994)
3. Two-part models: Manning (1981); Mullahy (1998)
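Under the logit/lognormal form, the overall mean needs a retransformation from the log10 scale. The sketch below assumes the linear predictors are already evaluated (the names eta0 and eta1 follow the SAS code later in these slides) and uses the standard lognormal mean, E(Y+) = 10^eta1 × exp{(ln 10 × sigma)^2 / 2}, which is not stated on the slide itself.

```python
import math

def expit(eta):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-eta))

def two_part_mean(eta0, eta1, sigma):
    """E(Y) under a logit intensity part and a log10-normal size part."""
    p_pos = expit(eta0)                                   # Pr(Y > 0)
    # log10(Y+) ~ N(eta1, sigma^2)  =>  E(Y+) = 10**eta1 * exp((ln 10 * sigma)^2 / 2)
    mean_pos = 10.0 ** eta1 * math.exp(0.5 * (math.log(10.0) * sigma) ** 2)
    return p_pos * mean_pos

# Illustrative parameter values only.
print(two_part_mean(eta0=1.0, eta1=3.0, sigma=0.5))
```

Dropping the exponential factor would understate the mean cost whenever sigma is positive, which is the retransformation issue discussed in Mullahy (1998).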
Another Two-Part Model:
(Intensity & Size)
IS – Probit/log-Gamma
1. \(\Phi^{-1}\{ \Pr(Y_i > 0) \} = x\alpha\)
2. i.) \(\log_{10}\{ E(Y_i^+) \} = x\beta\)
   ii.) \(Y_i^+ \sim \mathrm{Gamma}(\mu, \nu)\)
A Two-Part Model:
The Intensity-Size GLM
IS – GLM
h1: binary data link function
h2: continuous data link function
f: exponential family w/ dispersion \(\phi\)
Multiple Levels 1
[Diagram: outcome split into 0 and +]
Monthly SEERMED Data
[Diagram: Month 12, Month 11, Month 10, with labels 12, 11, 10 and 12+, 11+, 10+]
Multiple Levels 2
HMREM1
[Diagram: Month 12, Month 11, Month 10 with f12, f11, f10; each month split into 0 and +; node labels g1, g2, a, b; covariates X]
A 2-Part Model
1. Intensity: \(\mathrm{logit}(\pi_i) = x\alpha\)
2. Size:
   a) \(\mu_i = x\beta\)
   b) \(Y_i^+ \sim f(\mu_i, \phi)\)
A Longitudinal 2-Part Model
1. Intensity: \(\mathrm{logit}(\pi_{ic}) = x\alpha + z a_i\)
2. Size:
   a) \(\mu_{ic} = x\beta + z b_i\)
   b) \(Y_{ic}^+ \sim f(\mu_{ic}, \phi)\)
3. Random Effects: \(u_i = (a_i, b_i)^\top \sim N(0, \Sigma)\), \(\Sigma = \begin{pmatrix} \tau_{aa} & \tau_{ba} \\ \tau_{ba} & \tau_{bb} \end{pmatrix}\)
References:
1. Olsen, Schafer (2001)
2. Tooze, Grunwald, Jones (2002)
3. Yau, Lee, Ng (2002)
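To make the role of the correlated random effects concrete, here is a small simulation sketch of a longitudinal two-part model. It assumes a logit intensity part, a log10-normal size part, intercept-only predictors, and z = 1; the function name and all parameter values are hypothetical.

```python
import math
import random

def simulate_subject(rng, n_obs, alpha, beta, sigma, tau_aa, tau_ba, tau_bb):
    """Draw one subject's costs from a logit / log10-normal two-part model."""
    # Correlated random effects (a_i, b_i) via a hand-written 2x2 Cholesky factor.
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    a = math.sqrt(tau_aa) * z1
    b = (tau_ba / math.sqrt(tau_aa)) * z1 + math.sqrt(tau_bb - tau_ba ** 2 / tau_aa) * z2
    costs = []
    for _ in range(n_obs):
        p_pos = 1.0 / (1.0 + math.exp(-(alpha + a)))     # intensity part
        if rng.random() < p_pos:
            log10_y = beta + b + rng.gauss(0, sigma)      # size part
            costs.append(10.0 ** log10_y)
        else:
            costs.append(0.0)
    return costs

rng = random.Random(656)
panel = [simulate_subject(rng, 3, alpha=1.0, beta=3.0, sigma=0.4,
                          tau_aa=1.5, tau_ba=0.6, tau_bb=0.5) for _ in range(200)]
share_positive = sum(y > 0 for row in panel for y in row) / (200 * 3)
print(round(share_positive, 2))
```

A positive tau_ba means subjects who are more likely to incur any cost also tend to incur larger costs when they do, which is the dependence the joint model is meant to capture.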
Data Analysis: 3 General Steps
1. Exploration
2. Model Fitting and Estimation
3. Diagnostics
and the greatest of these is…
Uncooked Spaghetti Plot
[Figure: log10(Cost + 1), axis 0 to 5, by Month (10, 11, 12)]
Month 10 & Month 11 log10(Costs)
[Figure 5: Seermed log10 month 1 & 2. Density with labeled components: Bivariate Point Mass; Univariate Continuous Distbs.; Bivariate Continuous Distb.]
PRISM plot: Months 10 & 11 SEERMED Costs
PRISM = Paired Response Intensity Size Mixture plot
[Figure: log10(y11 + 1) vs. log10(y10 + 1), axes 0 to 5. Cell percentages: 70%, 10%, 7%, 13%. Statistics: OR = 12.9, Rho = 0.56, D1 = 0.77, D2 = 0.36. Annotations: aa, ba, bb]
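The OR on the PRISM plot summarizes the association between the two months' zero/positive indicators. A minimal sketch of the cross-product odds ratio follows, assuming that 70% and 13% are the concordant cells (both positive, both zero) and that 10% and 7% are the discordant ones; the slide does not label the cells, so this assignment is an assumption.

```python
def odds_ratio(both_pos, only_first, only_second, both_zero):
    """Cross-product odds ratio for a 2x2 table of zero/positive indicators."""
    return (both_pos * both_zero) / (only_first * only_second)

# Cell percentages read off the slide; which corner is which is assumed.
print(odds_ratio(70, 10, 7, 13))
```

With the rounded percentages this gives a value close to the 12.9 shown on the plot; the small gap is plausibly due to rounding of the displayed percentages.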
PRISM Matrix: Months 10-12
[Figure: matrix of pairwise PRISM plots for logy10, logy11, logy12 with Density panels, axes 0 to 6; each pairwise panel shows cell percentages and association statistics]
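SEERMED MREM
1. Intensity: \(h_1(\pi_{ic}) = \alpha_0 + \alpha_1 \mathrm{Obs} + \alpha_2 \mathrm{Male} + \alpha_3 \mathrm{Obs}*\mathrm{Male} + a_i\)
   Intensity: Probit, Logistic
2. Size:
   a) \(h_2(\mu_{ic}) = \beta_0 + \beta_1 \mathrm{Obs} + \beta_2 \mathrm{Male} + b_i\)
   b) \(Y_{ic}^+ \sim f(\mu_{ic}, \phi)\)
   Size: Lognormal, Gamma
3. Random Effects: \(u_i = (a_i, b_i)^\top \sim N(0, \Sigma)\), \(\Sigma = \begin{pmatrix} \tau_a^2 & \tau_{ba} \\ \tau_{ba} & \tau_b^2 \end{pmatrix}\)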
Estimation
Likelihood:
\(L_i(\theta)\)
Whoa.
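One way to write the likelihood contribution of subject i for a two-part random-effects model is sketched below. It assumes \(G_{ic} = 1\) when \(y_{ic} > 0\) and 0 otherwise, that \(\pi_{ic}\) and \(\mu_{ic}\) depend on the random effects \((a_i, b_i)\), and that \(\varphi_2\) denotes the bivariate normal density; this notation is an assumption chosen to match the SAS code in these slides, not a formula taken from the slide.

```latex
\[
L_i(\theta) \;=\; \int\!\!\int \prod_{c}
  \left(1 - \pi_{ic}\right)^{1 - G_{ic}}
  \left\{ \pi_{ic}\, f\!\left(y_{ic} \mid \mu_{ic}, \phi\right) \right\}^{G_{ic}}
  \; \varphi_2\!\left(a_i, b_i;\, 0, \Sigma\right) \, da_i \, db_i
\]
```

The double integral over the random effects has no closed form in general, which is why numerical methods are needed.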
But:
Non-Linear Mixed Model (NLMM)
• PQL, MCEM, MCMC, …
• Adaptive Quadrature – Newton-Raphson
Zeger, Karim (1991); Davidian, Giltinan (1993); Pinheiro, Bates (1995); McCulloch (1997); Booth et al. (2001); Rabe-Hesketh et al. (2004)
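To give a feel for quadrature, the sketch below integrates a random intercept out of a logistic intensity model with a plain 3-point Gauss-Hermite rule. This is a simplification: it is non-adaptive, uses only three nodes, and handles a single random effect, whereas the methods listed above adapt the nodes to each subject. The function name is hypothetical.

```python
import math

# 3-point Gauss-Hermite rule for the weight function exp(-x^2).
GH_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
GH_WEIGHTS = (math.sqrt(math.pi) / 6, 2 * math.sqrt(math.pi) / 3, math.sqrt(math.pi) / 6)

def marginal_prob_positive(eta, tau):
    """Approximate E_a[expit(eta + a)] for a ~ N(0, tau^2)."""
    total = 0.0
    for x, w in zip(GH_NODES, GH_WEIGHTS):
        a = math.sqrt(2.0) * tau * x                 # change of variables to N(0, tau^2)
        total += w / (1.0 + math.exp(-(eta + a)))
    return total / math.sqrt(math.pi)

# Illustrative values only.
print(marginal_prob_positive(eta=1.0, tau=1.2))
```

The marginal probability is pulled toward 0.5 relative to the conditional probability at a = 0, which is one reason to marginalize before comparing fitted values with observed proportions.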
Estimation: SAS
proc nlmixed data=SEERMED;
   parms / data=parms_start;
   *- 1) logistic: logit{Pr( Y>0 | a )} = Xalpha + a = “eta0” -*;
   eta0 = alpha0_c + alpha1_c*obs + alpha2_c*male + alpha3_c*obsmale + a;
   pi_c = exp(eta0) / (1+exp(eta0));
   *- 2) log-normal: E( log10(Y) | Y>0, b ) = XB + b = “eta1” -*;
   eta1 = beta0_c + beta1_c*obs + beta2_c*male + b;
   *- log-likelihood (Gpos is used as the indicator of y>0) -*;
   pi = CONSTANT('PI');
   if y=0 then ll1 = 0;
   else ll1 = -.5*log(2*pi*sigma**2) - .5*((log10y-eta1)/sigma)**2;
   ll = (1-Gpos)*log(1-pi_c) + Gpos*log(pi_c) + Gpos*(ll1);
   model y ~ GENERAL(ll);
   RANDOM a b ~ NORMAL([0,0],[tau_aa, tau_ba, tau_bb]) SUBJECT=id;
run;
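The conditional log-likelihood coded in the nlmixed program can be mirrored in a few lines of Python for checking by hand. This sketch assumes the size part is modeled on log10(y) and takes the linear predictors eta0 and eta1 (including the random effects) as given; the function name is hypothetical, and the SAS variable log10y may be defined differently in the original data step.

```python
import math

def two_part_loglik(y, eta0, eta1, sigma):
    """Conditional log-likelihood of one observation given the random effects."""
    pi_c = 1.0 / (1.0 + math.exp(-eta0))          # Pr(Y > 0 | a)
    if y == 0:
        return math.log(1.0 - pi_c)               # zero part only
    resid = (math.log10(y) - eta1) / sigma
    ll1 = -0.5 * math.log(2 * math.pi * sigma ** 2) - 0.5 * resid ** 2
    return math.log(pi_c) + ll1                   # positive part plus normal density

# Illustrative values only.
print(two_part_loglik(0.0, eta0=1.0, eta1=3.0, sigma=0.5))
print(two_part_loglik(1000.0, eta0=1.0, eta1=3.0, sigma=0.5))
```

Because the zero part and the size part enter additively on the log scale, the two only interact through the correlated random effects a and b.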
Estimation: SAS (better)
proc nlmixed data=sanfran qpoints=10;
   parms / data=parms_start;
   *-logit-*;
   eta0 = alpha0_c + alpha1_c*obs + alpha2_c*male + alpha3_c*obsmale + a;
   expeta = exp(eta0);
   pi_c = expeta / (1+expeta);
   *- log-scale parameter keeps tau_aa positive -*;
   tau_aa = exp(logtau_a)**2;
   *-lognormal-*;
   eta1 = beta0_c + beta1_c*obs + beta2_c*male + b;
   phi = 10**(log10phi);
   *std dev of log10(Y+1)|b;
   tau_bb = (10**(log10tau_b))**2;
   *- RE Var: inverse Fisher z keeps rho_ba between -1 and 1 -*;
   rho_ba = (exp(2*zrho_ba) - 1) / (exp(2*zrho_ba) + 1);
   tau_ba = rho_ba*(tau_aa*tau_bb)**.5;
   *- log-likelihood -*;
   pi = CONSTANT('PI');
   if y=0 then ll1 = 0;
   else ll1 = -.5*log(2*pi*phi**2) - .5*((log10y-eta1)/phi)**2;
   ll = (1-Gpos)*log(1-pi_c) + Gpos*log(pi_c) + Gpos*(ll1);
   model y ~ GENERAL(ll);
   RANDOM a b ~ NORMAL([0,0],[tau_aa, tau_ba, tau_bb]) SUBJECT=id;
   ods output ParameterEstimates = parms_new;
run;
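The second program estimates unconstrained parameters and maps them back to the random-effects covariance. A minimal sketch of that mapping, reusing the SAS parameter names as function arguments (the function name itself is hypothetical), shows why any real-valued inputs give a valid covariance matrix.

```python
import math

def re_covariance(logtau_a, log10tau_b, zrho_ba):
    """Map unconstrained parameters to (tau_aa, tau_ba, tau_bb)."""
    tau_aa = math.exp(logtau_a) ** 2
    tau_bb = (10.0 ** log10tau_b) ** 2
    rho_ba = math.tanh(zrho_ba)                   # same as (e^{2z} - 1) / (e^{2z} + 1)
    tau_ba = rho_ba * math.sqrt(tau_aa * tau_bb)
    return tau_aa, tau_ba, tau_bb

# Illustrative values only.
tau_aa, tau_ba, tau_bb = re_covariance(0.3, -0.2, 0.8)
print(tau_aa * tau_bb - tau_ba ** 2 > 0)
```

Since the variances are always positive and the correlation stays strictly inside (-1, 1), the optimizer can take unconstrained Newton-Raphson steps without leaving the parameter space.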
SEERMED MREM Results 1
MREM Profile Likelihood Plots for \(\alpha_3^c\)
[Figure: Scaled Profile Likelihood, Profile ll (alpha3), for four models: Probit*Lognormal, Probit*Gamma, Logit*Lognormal, Logit*Gamma. Annotation: LR 6]
Intensity model Obs*Male interaction term (\(\alpha_3^c\))
SEERMED MREM Results 2
But do these models fit?…
Data vs. MREM Models
[Figure: two panels by Month (10, 11, 12), shown for Male and Female. Intensity: Pr(Y>0), axis 0.0 to 1.0. Size: E(Y|Y>0), axis 0 to 12000. Obs: \(\hat{\pi}\), Y+. Exp: P, L, G]
Diagnostic PRISM Matrix: lognormal IS-GLMM Residuals
[Figure: QQ Plot1, QQ Plot2, QQ Plot3 of Observed vs. Expected, axes 0 to 6, with Observed and Expected cell percentages]
[Figure: Density panels for Res10, Res11, Res12, axes 0 to 6, with Observed and Expected cell percentages]
Review & Related Work
MEM
MREM
HMREM
HMMMM
Ideas
1. Simple Combinations of Simple Models
2. Complex (Multi-Level) Data: Many Models & Many Pictures
[Diagram labels: 0, +, 1, 2]
Data vs. HMREM and HMMMM Models
[Figure: two panels by Month (10, 11, 12), shown for Male and Female. Intensity: Pr(Y>0), axis 0.0 to 1.0. Size: E(Y|Y>0), axis 0 to 12000. Series labeled L, G, H alongside the observed values and Y+]
Review & Related Work
• These ideas are not just for Zero-Inflated Data
• Latent Variables are useful for “connecting” things
Opportunistic Infection & IDU
[Figure: Each line represents 1 subject’s time in the study, plotted by Day in Study, starting 6 months prior to 1st interview. Groups: Always Users, Intermittent Users, Never Users. Markers: Interview: Reported Drug Use; Interview: Reported No Drug Use; Opportunistic Infection]
But what about Possible Informative Missingness?
[Diagram labels: Drug Use; OI; Death / Dropout]
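Jointly Analyze Survival & OIs
1) Logistic model:
\(\mathrm{logit}\{ \Pr(\mathrm{OI}_{ij} \mid a_i) \} = \beta_0 + \beta_1 \mathrm{SU}_{ij} + \beta_2 \mathrm{SU}_{ij}*\mathrm{HCuse}_{ij} + \beta_3 \mathrm{AU}_{ij} + \beta_4 \mathrm{Period}_j + a_i\)
2) Survival model:
\(\log\{ \lambda(t) \} = \gamma_0 + \gamma_1 \mathrm{SU}_{ij} + \gamma_2 \mathrm{AU}_{ij} + a_i\)
3) Latent effects:
\(a_i \sim N(0, \tau)\)
Guo & Carlin (2004)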
Warning!
• But “Buyer Beware”
-- Model Assumptions
-- Identifiability
-- Model Fit
-- Marginalize & Check whenever possible
• MLMs require even more due-diligence than usual
References
• Mixture Models:
– McLachlan, G. J. and Peel, D. (2001), Finite Mixture Models, John Wiley & Sons.
– Jacobs, R. A. and Jordan, M. I. (1991), “Adaptive mixtures of local experts,” Neural Computation, 3, 79–87.
• Two-Part Models:
– Tobin, J. (1958), “Estimation of Relationships for Limited Dependent Variables,” Econometrica, 26, 24–36.
– Amemiya, T. (1984), “Tobit models: A survey,” Journal of Econometrics, 24, 3–61.
– Heckman, J. (1976), “The common structure of statistical models of truncation, sample selection, and limited dependent variables, and a simple estimator for such models,” Annals of Economic and Social Measurement, 5, 475–492.
– Lambert, D. (1992), “Zero-inflated Poisson regression, with an application to defects in manufacturing,” Technometrics, 34, 1–14.
– Greene, W. (1994), “Accounting for excess zeros and sample selection in Poisson and negative binomial regression models,” Working Paper EC-94-10, Department of Economics, New York University.
– Manning, W., Newhouse, J., Orr, L., Duan, N., Keeler, E., Leibowitz, A., Marquis, M., and Phelps, C. (1981), “A two-part model of the demand for medical care: Preliminary results from the health insurance experiment,” in Health, Economics, and Health Economics, eds. van der Gaag, J. and Perlman, M., pp. 103–104.
– Mullahy, J. (1998), “Much ado about two: reconsidering retransformation and the two part model in health economics,” Journal of Health Economics, 17, 247–281.
• Longitudinal 2-part models:
– Olsen, M. K. and Schafer, J. L. (2001), “A two-part random-effects model for semicontinuous longitudinal data,” Journal of the American Statistical Association, 96, 730–745.
– Tooze, J. A., Grunwald, G. K., and Jones, R. H. (2002), “Analysis of repeated measures data with clumping at zero,” Statistical Methods in Medical Research, 11, 341–355.
– Yau, K. K. W., Lee, A. H., and Ng, A. S. K. (2002), “A zero-augmented gamma mixed model for longitudinal data with many zeros,” The Australian and New Zealand Journal of Statistics, 44, 177–183.
• Estimation:
– Zeger, S. L. and Karim, M. R. (1991), “Generalized linear models with random effects: A Gibbs sampling approach,” Journal of the American Statistical Association, 86, 79–86.
– Davidian, M. and Giltinan, D. M. (1993), “Some general estimation methods for nonlinear mixed-effects models,” Journal of Biopharmaceutical Statistics, 3, 23–55.
– Pinheiro, J. C. and Bates, D. M. (1995), “Approximations to the log-likelihood function in the nonlinear mixed-effects model,” Journal of Computational and Graphical Statistics, 4, 12–35.
– McCulloch, C. E. (1997), “Maximum likelihood algorithms for generalized linear mixed models,” Journal of the American Statistical Association, 92, 162–170.
– Booth, J. G., Hobert, J. P., and Jank, W. (2001), “A survey of Monte Carlo algorithms for maximizing the likelihood of a two-stage hierarchical model,” Statistical Modelling: An International Journal, 1, 333–349.
– Rabe-Hesketh, S., Skrondal, A., and Pickles, A. (2004), “Maximum likelihood estimation of limited and discrete variable models with nested random effects,” Journal of Econometrics, in press.
• Other:
– Guo, X. and Carlin, B. P. (2004), “Separate and Joint Modeling of Longitudinal and Event Time Data Using Standard Computer Packages,” The American Statistician, 58, 16–24.