Linear and Generalized Linear Mixed Models with Flexible Random Effects Distribution for Longitudinal Data

Daowen Zhang
North Carolina State University
[email protected]
http://www.stat.ncsu.edu/dzhang2/

Based on joint work with Marie Davidian and Junliang Chen

OUTLINE

1. Introduction and motivation
2. Linear mixed models with flexible random effects distribution
   - SNP for random effects distribution
   - Estimation and inference
3. Generalized linear mixed models with flexible random effects distribution
   - Monte Carlo EM algorithm with "double" rejection sampling
4. Application
5. Simulation studies
6. Summary and discussion

1. Introduction and Motivation

Longitudinal data: clinical trials, epidemiological studies, social science studies, etc.

Features of longitudinal data:
- Each subject has repeated measures over time.
- Our interest is not limited to the subjects in the sample; instead, we want to make inference for the population from which the sample is drawn.
- Measurements from the same subject tend to be more similar to each other than to measurements from other subjects, which induces correlation.
- Correlation has to be taken into account to yield valid inference.

An example: Framingham study

In this study, each of 2634 participants was examined every 2 years over a 10-year period for his/her cholesterol level.

Study objectives:
1. How does cholesterol level change over time?
2. How are the cholesterol level and its change associated with sex and baseline age?

A subset of 200 subjects' data is used for illustrative purposes.
[Figure: Cholesterol level over time for a subset of 200 subjects from the Framingham study. Horizontal axis: time in years (0 to 10); vertical axis: cholesterol level.]

Linear Mixed Model for Longitudinal Data

Data:
- Response $y_{ij}$ for subject $i$ at the $j$th time point.
- Covariates: $x_{ij}$ ($p \times 1$) and $s_{ij}$ ($q \times 1$).

Model:

$$y_{ij} = x_{1ij}\beta_1 + \cdots + x_{pij}\beta_p + s_{ij}^T b_i + U_i(t_{ij}) + \epsilon_{ij},$$

where the $\beta$'s are fixed effects, $b_i$ are random effects, $U_i(t_{ij})$ is a stochastic process, and $\epsilon_{ij}$ is "measurement error".

Common assumptions:
- $b_i \sim N(\mu, D(\omega))$.
- $U_i(t_{ij})$ is a mean-zero Gaussian stochastic process.
- $\epsilon_{ij} \sim N(0, \sigma^2)$.

$\Rightarrow$ The likelihood function has a closed form.

For the Framingham data, we may entertain the following model to address some of the questions:

$$y_{ij} = b_{0i} + b_{1i} t_{ij} + \beta_1\,\mathrm{age}_i + \beta_2\,\mathrm{sex}_i + \beta_3\,\mathrm{age}_i t_{ij} + \beta_4\,\mathrm{sex}_i t_{ij} + \epsilon_{ij},$$

where $y_{ij}$ is the cholesterol level of subject $i$ measured at the $j$th time point, $\epsilon_{ij} \sim N(0, \sigma^2)$, and $b_i = (b_{0i}, b_{1i})^T$ is assumed to have a bivariate normal distribution

$$\begin{pmatrix} b_{0i} \\ b_{1i} \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_0 \\ \mu_1 \end{pmatrix}, \begin{pmatrix} D_{00} & D_{01} \\ D_{01} & D_{11} \end{pmatrix} \right).$$

However, this assumption may be too restrictive, yielding invalid or inefficient inference for fixed effects and random effects.

[Figure: Histogram of 200 estimated subject-specific intercepts. Horizontal axis: estimates of subject-specific intercepts; vertical axis: percentage.]

2. Linear Mixed Models with Flexible Random Effects Distribution

Model:

$$y_{ij} = x_{ij}^T \beta + s_{ij}^T b_i + U_i(t_{ij}) + \epsilon_{ij}, \quad i = 1, \ldots, m; \; j = 1, \ldots, n_i,$$

where $b_i$ has a smooth but unspecified distribution.

Q: How to model the distribution of $b_i$?

A: Seminonparametric (SNP) representation of Gallant and Nychka (1987):

$$b_i = \mu + R Z_i,$$

where $\mu$ is $q \times 1$, $R$ is $q \times q$ lower triangular, and $Z_i$ is $q \times 1$.

Approximate the density $h(z)$ of $Z_i$ by

$$h_K(z) = P_K^2(z)\,\varphi_q(z),$$

where $P_K(z)$ is a $K$th-order polynomial, $\varphi_q(z)$ is the density of $N(0, I_q)$, and $K$ is a tuning parameter.

When $K = 2$, $q = 2$, $z = (z_1, z_2)$:

$$P_2(z) = a_{00} + a_{10} z_1 + a_{01} z_2 + a_{20} z_1^2 + a_{11} z_1 z_2 + a_{02} z_2^2.$$

$h_K(z)$ is a density $\Rightarrow$

$$\int h_K(z)\,dz = 1 \iff E\{P_K^2(U)\} = 1, \quad U \sim N(0, I_q).$$

Since $E\{P_K^2(U)\} = a^T A a$ ($A > 0$), the density constraint becomes

$$a^T A a = 1 \iff a^T B^2 a = 1 \iff c^T c = 1,$$

where $c = Ba$.

When $K = 0$, $h_K(z) = N(0, I_q)$.

Polar coordinate transformation (with $d$ the length of $c$):

$$c_1 = \sin(\psi_1)$$
$$c_2 = \cos(\psi_1)\sin(\psi_2)$$
$$\vdots$$
$$c_{d-1} = \cos(\psi_1)\cos(\psi_2)\cdots\cos(\psi_{d-2})\sin(\psi_{d-1})$$
$$c_d = \cos(\psi_1)\cos(\psi_2)\cdots\cos(\psi_{d-2})\cos(\psi_{d-1})$$

with $\psi_t \in (-\pi/2, \pi/2]$, $t = 1, \ldots, d-1$.

Note: The tuning parameter $K$ need not be very large to make the SNP density flexible.

[Figure: Some SNP densities for K = 2 (two panels). Horizontal axis: z; vertical axis: density.]

Estimation and Inference

Model in matrix notation: substituting $b_i = \mu + R Z_i$ into the model

$$\Rightarrow \; y_{ij} = x_{ij}^T \beta + s_{ij}^T(\mu + R Z_i) + e_{ij} \quad (e_{ij} = U_i(t_{ij}) + \epsilon_{ij})$$

$$\Rightarrow \; Y_i = V_i \delta + S_i^T R Z_i + e_i,$$

where $\delta = (\beta^T, \mu^T)^T$.

Likelihood:

$$f(Y_i; \theta) = \int f(Y_i \mid z; \theta)\, P_K^2(z)\, \varphi_q(z)\, dz.$$

Given $K$, the log-likelihood of the model parameters $\theta$ is

$$\ell(\theta; Y) = \sum_{i=1}^m \log\{f(Y_i; \theta)\}.$$

Note: $f(Y_i \mid z; \theta)$ is a normal density with mean $V_i \delta + S_i^T R z$ and variance $\mathrm{Var}(e_i)$

$\Rightarrow$ $f(Y_i \mid z; \theta)\,\varphi_q(z)$ is joint normal when $Z_i \sim N(0, I_q)$,

$$\Rightarrow \; f(Y_i \mid Z_i; \theta)\,\varphi_q(Z_i) = g(Y_i; \theta)\, g(Z_i \mid Y_i; \theta),$$

where $Z_i \mid Y_i$ is normal with some mean and variance (depending on $\theta$).

$$\Rightarrow \; f(Y_i; \theta) = g(Y_i; \theta) \int P_K^2(z)\, g(z \mid Y_i; \theta)\, dz = g(Y_i; \theta)\, E_{Z_i \mid Y_i; \theta}\{P_K^2(Z_i)\},$$

$$\Rightarrow \; \ell(\theta; Y) = \sum_{i=1}^m \log\{g(Y_i; \theta)\} + \sum_{i=1}^m \log\left[E_{Z_i \mid Y_i; \theta}\{P_K^2(Z_i)\}\right],$$

which has a closed-form expression!
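The SNP construction described above (the polynomial-times-normal density with the polar-coordinate reparameterization of the constraint) can be sketched numerically for the scalar case. The following is only an illustration under stated assumptions: it takes q = 1, the function names `snp_coefficients` and `snp_density` are hypothetical, the angle values are arbitrary, and B is taken to be the symmetric square root of A (the slides only require A = B²).

```python
import numpy as np

def snp_coefficients(psi):
    """Map K polar angles to the K+1 polynomial coefficients a (scalar case, q = 1)."""
    psi = np.asarray(psi, dtype=float)
    d = psi.size + 1
    # Point c on the unit sphere: c_1 = sin(psi_1), c_2 = cos(psi_1) sin(psi_2), ...
    c = np.empty(d)
    running_cos = 1.0
    for t in range(d - 1):
        c[t] = running_cos * np.sin(psi[t])
        running_cos *= np.cos(psi[t])
    c[d - 1] = running_cos

    # A[j, k] = E[U^(j+k)] for U ~ N(0, 1): zero for odd powers, (n-1)!! for even n.
    def normal_moment(n):
        return 0.0 if n % 2 else float(np.prod(np.arange(n - 1, 0, -2)))

    A = np.array([[normal_moment(j + k) for k in range(d)] for j in range(d)])
    # Assumption: B is the symmetric square root of A, so that A = B @ B and c = B a.
    eigval, eigvec = np.linalg.eigh(A)
    B = eigvec @ np.diag(np.sqrt(eigval)) @ eigvec.T
    a = np.linalg.solve(B, c)
    return a, A

def snp_density(z, a):
    """Evaluate h_K(z) = P_K(z)^2 * phi(z) with P_K(z) = a_0 + a_1 z + ... + a_K z^K."""
    z = np.asarray(z, dtype=float)
    poly = np.polynomial.polynomial.polyval(z, a)
    phi = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return poly**2 * phi

# K = 2 with two arbitrary angles in (-pi/2, pi/2].
a, A = snp_coefficients([0.4, -1.1])
grid = np.linspace(-12.0, 12.0, 24001)
dz = grid[1] - grid[0]
area = float(np.sum(snp_density(grid, a)) * dz)   # numerical integral of h_K
constraint = float(a @ A @ a)                     # density constraint a^T A a
```

Because any choice of angles gives a unit-length c, the constraint holds automatically and the angles can be optimized without restriction, which is the practical point of the transformation. With no angles (K = 0) the sketch returns a = (1), i.e. the standard normal density.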
Inference for fixed effects:

- The optimizer nlpqn in SAS is used to maximize $\ell(\theta; Y)$.
- An initial value of $\theta$ can be obtained by maximizing

$$\ell_p(\theta; Y) = \ell(\theta; Y) - N(a^T A a - 1).$$

- Variance of $\hat\theta$:

$$\mathrm{Var}(\hat\theta) = \left[ -\frac{\partial^2 \ell(\hat\theta; Y)}{\partial\theta\,\partial\theta^T} \right]^{-1}.$$

Inference for $b_i$: $\hat b_i = \hat\mu + \hat R \hat Z_i$, where $\hat Z_i$ is the posterior mode or posterior mean (has a closed form).

Choice of K

1. Akaike Information Criterion (AIC):

$$\mathrm{AIC} = \ell(\hat\theta; Y) - p_{net}.$$

2. Schwarz Bayesian Information Criterion (BIC):

$$\mathrm{BIC} = \ell(\hat\theta; Y) - 0.5\, p_{net} \log(N).$$

3. Hannan-Quinn Criterion (HQ):

$$\mathrm{HQ} = \ell(\hat\theta; Y) - p_{net} \log(\log(N)).$$

Larger is better! AIC prefers larger models, BIC prefers smaller models, HQ is intermediate.

Reference: D. Zhang and M. Davidian (Biometrics, 2001).

3. Generalized Linear Mixed Models with Flexible Random Effects Distribution

Data:
- Response: $y_{ij} \mid b_i$ conditionally independent with $\mu_{ij}^b = E[y_{ij} \mid b_i]$, $\mathrm{Var}[y_{ij} \mid b_i] = \phi\, w_{ij}^{-1} v(\mu_{ij}^b)$.
- Covariates: $x_{ij}$ and $s_{ij}$.

Model: GLMM with SNP random effects

$$g(\mu_{ij}^b) = x_{ij}^T \beta + s_{ij}^T b_i, \quad i = 1, \ldots, m; \; j = 1, \ldots, n_i,$$

where $g(\cdot)$ is a link function, such as the logit link for binary data, and the distribution of $b_i$ is approximated by SNP:

$$b_i = \mu + R Z_i.$$

Challenge: The likelihood function does not have a closed form!

EM algorithm: Treat $y$ as observed data, $b = (b_1, \ldots, b_m)$ as missing data, and $(y, b)$ as "complete data". Given $\theta^{(r)}$:

E-step:

$$Q(\theta \mid \theta^{(r)}) = E\{\log f(y, b; \theta) \mid y, \theta^{(r)}\} = \int \log f(y, b; \theta)\, f(b \mid y; \theta^{(r)})\, db.$$

M-step: Maximize $Q(\theta \mid \theta^{(r)})$ w.r.t. $\theta$ to get $\theta^{(r+1)}$. Return to the E-step with $\theta^{(r+1)}$ until convergence.

Advantage of the EM algorithm:

$$\ell(\theta^{(r+1)}; y) \ge \ell(\theta^{(r)}; y) \quad \text{for all } r.$$

Problems with using the EM algorithm for GLMMs with SNP random effects:
- The E-step has to calculate integrals.
- The M-step is not easy to carry out.
Monte Carlo EM algorithm:

E-step: Obtain a random sample $b^{(1)}, \ldots, b^{(L)}$ from $f(b \mid y; \theta^{(r)})$ and approximate $Q(\theta \mid \theta^{(r)})$ by the MC average

$$Q_L(\theta \mid \theta^{(r)}) = \frac{1}{L} \sum_{l=1}^{L} \log f(y, b^{(l)}; \theta) = \frac{1}{L} \sum_{l=1}^{L} \sum_{i=1}^{m} \sum_{j=1}^{n_i} \log f(y_{ij} \mid b_i^{(l)}; \beta, \phi) + \frac{1}{L} \sum_{l=1}^{L} \sum_{i=1}^{m} \log f_K(b_i^{(l)}; \theta).$$

M-step: Maximize $Q_L(\theta \mid \theta^{(r)})$ w.r.t. $\theta$ to get $\theta^{(r+1)}$. Return to the E-step with $\theta^{(r+1)}$ until convergence. Update $L$ if necessary.

Question: How to get a random sample $b^{(1)}, \ldots, b^{(L)}$?

"Double" rejection sampling scheme

First rejection sampling, from $f_K(b_i; \theta^{(r)})$:

- Form an envelope for $h_K(z; \theta^{(r)})$: $0 \le h_K(z; \theta^{(r)}) \le d_K(z; \theta^{(r)})$.
- Standardize $d_K$:

$$g_K(z; \theta^{(r)}) = \frac{d_K(z; \theta^{(r)})}{\int d_K(t; \theta^{(r)})\, dt}$$

($g_K$ is a weighted sum of densities of variables of the form $(-1)^V \sqrt{\chi^2}$, with $V \sim \mathrm{Bernoulli}(0.5)$).

(a) Generate $u \sim U(0, 1)$ and $z \sim g_K(z; \theta^{(r)})$;
(b) If $u \le h_K(z; \theta^{(r)}) / d_K(z; \theta^{(r)})$, accept $z$; otherwise go to (a) until a $z$ is obtained (called $z_i$).

Set $b_i = \mu^{(r)} + R^{(r)} z_i$.

Second rejection sampling, from $f_K(b_i \mid y_i; \theta^{(r)})$:

1. Generate $b_i$ from the first rejection sampling scheme.
2. Generate $u \sim U(0, 1)$. If

$$u \le f(y_i \mid b_i; \beta^{(r)}, \phi^{(r)}) / \tau_i, \quad \tau_i = \sup_b f(y_i \mid b; \beta^{(r)}, \phi^{(r)}),$$

then accept $b_i$; otherwise, return to step 1 until a $b_i$ is accepted.

Note:
- The acceptance rate of the first rejection sampling is usually high ($\ge 50\%$).
- Depending on the data, the acceptance rate of the second rejection sampling can be very low.

Monte Carlo EM algorithm with "double" rejection:

1. Choose $K$, $\theta^{(0)}$, $L$. Set $r = 0$.
2. Generate $b^{(l)}$ from $f(b \mid y; \theta^{(r)})$ ($l = 1, \ldots, L$) using "double" rejection sampling.
3. Calculate $Q_L(\theta \mid \theta^{(r)})$.
4. Maximize $Q_L(\theta \mid \theta^{(r)})$ w.r.t. $\theta$ to get $\theta^{(r+1)}$.
5. Construct a $100(1-\alpha)\%$ confidence ellipsoid (CE) for $\theta^{(r+1)}$. If $\theta^{(r)}$ is inside the CE, then set $L = L + [L/k]$ ($k = 3$, say).
6. At convergence, set $\theta^{(r+1)}$ to be the MLE of $\theta$; otherwise go to step 2.

Benefit: Allows the MC error to be calculated at each iteration.
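The first rejection step can be illustrated in the simplest non-normal case, q = 1 and K = 1, where the target is h(z) = (a0 + a1 z)² φ(z) with a0² + a1² = 1. The sketch below is not the authors' envelope: it assumes an envelope built from the elementary bound (x + y)² ≤ 2x² + 2y², which gives d(z) = 2 a0² φ(z) + 2 a1² z² φ(z). Its standardized form is a two-component mixture of N(0, 1) and the density z² φ(z), the latter being sampled as (−1)^V times the square root of a chi-square variable with 3 degrees of freedom. The function name `sample_snp_k1` and all numeric values are hypothetical.

```python
import numpy as np

def sample_snp_k1(a0, a1, n, rng):
    """Draw n values from h(z) = (a0 + a1 z)^2 phi(z), assuming a0^2 + a1^2 = 1.

    Returns the draws and the empirical acceptance rate.
    """
    w0 = a0**2  # weight of the N(0, 1) component in the proposal mixture g
    draws, n_proposed, n_kept = [], 0, 0
    while n_kept < n:
        batch = 2 * (n - n_kept) + 16
        normal_part = rng.standard_normal(batch)
        # (-1)^V * sqrt(chi2_3) with V ~ Bernoulli(0.5) has density z^2 phi(z).
        sign = np.where(rng.random(batch) < 0.5, -1.0, 1.0)
        chi_part = sign * np.sqrt(rng.chisquare(3, batch))
        z = np.where(rng.random(batch) < w0, normal_part, chi_part)
        # Target and envelope, both divided by the common factor phi(z).
        h = (a0 + a1 * z) ** 2
        d = 2.0 * (a0**2 + (a1 * z) ** 2)
        accepted = z[rng.random(batch) * d <= h]   # accept when u <= h / d
        draws.append(accepted)
        n_proposed += batch
        n_kept += accepted.size
    return np.concatenate(draws)[:n], n_kept / n_proposed

rng = np.random.default_rng(0)
z, acc_rate = sample_snp_k1(0.8, 0.6, 20000, rng)
# Hypothetical location and scale: b_i = mu + R z_i with mu = 1.5, R = 0.7.
b = 1.5 + 0.7 * z
```

Under this particular envelope the integral of d is 2, so the theoretical acceptance rate is one half. The draws can be sanity-checked against the moments of h: E(Z) = 2 a0 a1 and E(Z²) = a0² + 3 a1². The second rejection step would then thin these draws using the conditional likelihood of the observed responses.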
Variance of $\hat\theta$:

$$\mathrm{Var}(\hat\theta) = \left[ \sum_{i=1}^{m} \frac{\partial \log f(y_i; \hat\theta)}{\partial\theta} \frac{\partial \log f(y_i; \hat\theta)}{\partial\theta^T} \right]^{-1}.$$

$f(y_i; \theta)$ can be written as

$$f(y_i; \theta) = \int f(y_i \mid z; \theta)\, P_K^2(z; \theta)\, \varphi_q(z)\, dz = E\{f(y_i \mid Z; \theta)\, P_K^2(Z; \theta)\},$$

where $Z \sim N(0, I_q)$.

Approximate $f(y_i; \theta)$ by

$$\hat f(y_i; \theta) = \frac{1}{L} \sum_{l=1}^{L} \left\{ f(y_i \mid z^{(l)}; \theta)\, P_K^2(z^{(l)}; \theta) \right\},$$

where $z^{(1)}, \ldots, z^{(L)}$ is a random sample from $N(0, I_q)$.

Reference: J. Chen, D. Zhang and M. Davidian (Biostatistics, 2001(?)).

4. Application to Framingham Data

Data: $y_{ij}$ = cholesterol level/100, $t_{ij}$ = (year − 5)/10, sex and baseline age.

Model:

$$y_{ij} = b_{0i} + b_{1i} t_{ij} + \beta_1\,\mathrm{age}_i + \beta_2\,\mathrm{sex}_i + \beta_3\,\mathrm{age}_i t_{ij} + \beta_4\,\mathrm{sex}_i t_{ij} + \epsilon_{ij}.$$

The distribution of $(b_{0i}, b_{1i})$ is approximated by a bivariate SNP density with $K = 0, 1, 2$.

$\beta_3$ ($\beta_4$) tells how baseline age (sex) affects the change of cholesterol level.

Model Selection Criteria

Criterion         K = 0        K = 1        K = 2
Log-likelihood    -147.3518    -135.4209    -135.3278
AIC               -157.3518    -147.4209    -150.3279
BIC               -182.1059    -177.1258    -187.459
HQ                -166.7404    -158.6873    -164.4107

All criteria selected K = 1.

Regression Coefficient Estimates

                   K = 0              K = 1              K = 2
Parameter          Estimate (SE)      Estimate (SE)      Estimate (SE)
β1 (age)            0.0148 (0.0035)    0.0115 (0.0032)    0.0128 (0.0032)
β2 (sex)           -0.0064 (0.0549)   -0.0011 (0.0473)   -0.0285 (0.0462)
β3 (age × year)    -0.0114 (0.0028)   -0.0112 (0.0028)   -0.0104 (0.0028)
β4 (sex × year)     0.1799 (0.0450)    0.1799 (0.0454)    0.1677 (0.0453)
E(b0i)              1.7219 (0.1505)    1.8608 (0.1404)    1.8161 (0.1407)
E(b1i)              0.6800 (0.1213)    0.6711 (0.1226)    0.6419 (0.1225)

Males tend to have a larger change rate than females; older people tend to have smaller change rates, etc.

[Figure: Estimated density for (b0i, b1i), surface plot. Axes: intercept, slope, density.]

[Figure: Contour plot of the estimated density for (b0i, b1i). Horizontal axis: intercept; vertical axis: slope.]
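The selection criteria defined under "Choice of K" can be checked against the model selection table for the Framingham fits. The sketch below is illustrative only. The slides do not state the parameter counts or the sample size, so p_net = 10, 12, 15 and N = 1044 are assumptions back-solved from the reported AIC and BIC values, and the function name `selection_criteria` is hypothetical.

```python
import math

def selection_criteria(loglik, p_net, n_obs):
    """AIC, BIC and HQ in the 'larger is better' form used in the slides."""
    return {
        "AIC": loglik - p_net,
        "BIC": loglik - 0.5 * p_net * math.log(n_obs),
        "HQ": loglik - p_net * math.log(math.log(n_obs)),
    }

# Log-likelihoods are taken from the table. ASSUMPTION: the parameter counts
# (10, 12, 15) and N_OBS = 1044 are inferred from the table, not given in the source.
fits = {0: (-147.3518, 10), 1: (-135.4209, 12), 2: (-135.3278, 15)}
N_OBS = 1044

table = {k: selection_criteria(ll, p, N_OBS) for k, (ll, p) in fits.items()}
# For each criterion, pick the K with the largest value.
best = {name: max(table, key=lambda k: table[k][name]) for name in ("AIC", "BIC", "HQ")}
```

With these assumed inputs the computed values agree with the table to roughly three decimal places, and `best` maps every criterion to K = 1, matching the conclusion stated with the table.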
[Figure: Estimated marginal density for b0i. Horizontal axis: intercept; vertical axis: density.]

[Figure: Estimated marginal density for b1i. Horizontal axis: slope; vertical axis: density.]

5. Simulation Studies

True model:

$$y_{ij} = b_i + t_{ij}\beta_1 + w_i\beta_2 + \epsilon_{ij}, \quad i = 1, \ldots, 100; \; j = 1, \ldots, 5.$$

- $t_{ij} = j - 3$, $\beta_1 = 2$; $w_i = I(i \le 50)$, $\beta_2 = 1$; $\epsilon_{ij} \sim N(0, 0.5^2)$.
- Case 1: $b_i \sim 0.7\,N(-3, 1) + 0.3\,N(2, 1)$ (mixture of normals).
- Case 2: $b_i \sim N(-1.5, 6.25)$.

100 data sets were simulated. The model was fitted with $K = 0, 1, 2$ to each data set.

- Case 1: AIC preferred K = 1, 2 in 35%, 65% of the data sets (BIC: 76%, 24%; HQ: 56%, 44%).
- Case 2: AIC preferred K = 0, 1, 2 in 84%, 7%, 9% of the data sets (BIC: 97%, 3%, 0%; HQ: 89%, 5%, 6%).

Simulation results, 100 data sets: MC Ave. and MC SD are the average and standard deviation of the estimates, Ave. SE is the average of the estimated standard errors, and RE is the Monte Carlo mean square error for the indicated fit divided by that for K = 0. True values of parameters are in parentheses.

                 K = 0                      Preferred by BIC                  Preferred by HQ
Parameter        MC Ave.  MC SD  Ave. SE    MC Ave.  MC SD  Ave. SE  RE      MC Ave.  MC SD  Ave. SE  RE

(a) Mixture Scenario
β1 (2)            2.000   0.017  0.016       2.000   0.017  0.016   1.00      2.000   0.017  0.016   1.00
β2 (1)            1.158   0.472  0.493       1.034   0.234  0.209   0.21      1.028   0.230  0.208   0.23
E(b) (-1.5)      -1.614   0.369  0.349      -1.552   0.275  0.269   0.52     -1.549   0.273  0.269   0.52
var(b) (6.25)     6.045   0.638  0.862       6.098   0.654  0.690   1.01      6.099   0.655  0.695   1.00
σ (0.5)           0.498   0.018  0.018       0.498   0.018  0.018   1.00      0.498   0.018  0.018   1.00

(b) Normal Scenario
β1 (2)            2.000   0.017  0.016       2.000   0.017  0.016   1.00      2.000   0.017  0.016   1.00
β2 (1)            0.994   0.512  0.489       0.987   0.533  0.487   1.17      0.990   0.550  0.479   1.08
E(b) (-1.5)      -1.491   0.363  0.346      -1.487   0.373  0.345   1.09     -1.489   0.380  0.343   1.05
var(b) (6.25)     5.955   0.789  0.849       5.957   0.790  0.861   1.00      5.958   0.790  0.863   1.00
σ (0.5)           0.498   0.018  0.018       0.498   0.018  0.018   1.00      0.498   0.018  0.018   1.00

[Figure, two panels. (a) True (solid) and estimated densities: normal (long dashed), BIC (dotted). (b) Estimated densities by HQ. Horizontal axis: x; vertical axis: densities.]

6. Summary and Discussion

- The proposed models can be useful for analyzing longitudinal data when the distributional assumption on the random effects in mixed models is violated.
- The SNP approach is capable of detecting departures from normality.
- Simulation studies show satisfactory performance of the inference procedure.
- Computation is relatively straightforward for normal data, but could be intensive for non-normal data.
- Current research: extend the SNP approach to other popular models.