STAT 5200 Handout #27
ANCOVA: Including Covariates in Models (Ch. 17)

Up until now, all of our models have used experimental factors (A) whose levels have been assigned (at random) to experimental units. Sometimes, however, experimental units have additional characteristics (not assigned to them at random) that may affect the response variable (Y). Such characteristics are generally considered uncontrolled nuisance variables; we refer to them as “covariates”, and some literature uses the term “concomitant variables”.

Analysis of Covariance (ANCOVA) involves adding these variables (X) to our model in an appropriate way. They are usually treated similarly to blocking factors (due to the lack of randomization), and including them yields a variance reduction similar to that obtained from blocking. We generally add them as linear effects (with “slope” parameter β):

ANOVA:
    Y_ij = µ + A_i + ε_ij

ANCOVA (assuming additive effects; note that this assumes equal slopes):
    Y_ij = µ + A_i + β X_ij + ε_ij

ANCOVA (allowing interaction; note that this allows nonparallel slopes):
    Y_ij = µ + A_i + β_i X_ij + ε_ij

Extensions:
  o for designs other than the completely randomized design
  o for mixed models
  o for quadratic (or other non-linear) covariate effects

Note: A covariate must be observed before treatment application; otherwise it may in fact be a secondary response variable. Simply including a secondary response variable in the model will cause interpretation to suffer; see text section 17.2.

Example: In an experiment studying treatments for leprosy, a pre-treatment score of leprosy bacilli was recorded, and then subjects were randomly assigned to one of three drugs (antibiotics ‘a’ and ‘d’, control ‘f’). After a certain length of time on treatment, the leprosy patients were again scored on leprosy bacilli.
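For this example, take Y to be the post-treatment score and X the pre-treatment score. A tempting alternative to ANCOVA is to analyze the change score Y − X with an ordinary ANOVA. The sketch below shows how that relates to the equal-slopes model. The indexing (i over drugs a, d, f; j = 1, ..., 10 subjects per drug) is an assumption read off the data that follow, not something the handout states.

```latex
\begin{align*}
\text{ANCOVA (equal slopes):}\quad
  & Y_{ij} = \mu + A_i + \beta X_{ij} + \varepsilon_{ij},
  \qquad i \in \{a, d, f\},\quad j = 1, \dots, 10 \\[4pt]
\text{Change-score ANOVA:}\quad
  & Y_{ij} - X_{ij} = \mu + A_i + \varepsilon_{ij}
  \;\iff\;
  Y_{ij} = \mu + A_i + 1 \cdot X_{ij} + \varepsilon_{ij}
\end{align*}
```

In other words, analyzing the difference amounts to the equal-slopes ANCOVA with the slope fixed at β = 1 rather than estimated from the data.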
data drugtest;
  input Drug $ PreTreatment PostTreatment @@;
  datalines;
a 11  6   a  8  0   a  5  2   a 14  8   a 19 11
a  6  4   a 10 13   a  6  1   a 11  8   a  3  0
d  6  0   d  6  2   d  7  3   d  8  1   d 18 18
d  8  4   d 19 14   d  8  9   d  5  1   d 15  9
f 16 13   f 13 10   f 11 18   f  9  5   f 21 23
f 16 12   f 12  5   f 12 16   f  7  1   f 12 20
;

/* Fit ANOVA model */
proc glimmix data=drugtest;
  class Drug;
  model PostTreatment = Drug;
run;

Type III Tests of Fixed Effects
Effect   Num DF   Den DF   F Value   Pr > F
Drug          2       27      3.98   0.0305

/* Fit ANOVA model of difference */
data drugtest;
  set drugtest;
  Diff = PostTreatment - PreTreatment;
run;
proc glimmix data=drugtest;
  class Drug;
  model Diff = Drug;
run;
/* Note this assumes beta==1 */

Type III Tests of Fixed Effects
Effect   Num DF   Den DF   F Value   Pr > F
Drug          2       27      2.42   0.1078

/* Fit ANCOVA model */
proc glimmix data=drugtest plots=residualpanel;
  class Drug;
  model PostTreatment = Drug PreTreatment;
  title1 'ANCOVA Model';
run;

/* Consider transformation */
data drugtest;
  set drugtest;
  Post1 = PostTreatment + 1;   /* shift by 1: Box-Cox needs a positive response, and some scores are 0 */
run;
proc transreg data=drugtest;
  model boxcox(Post1 / lambda=-1 to 1 by 0.05) = class(Drug) identity(PreTreatment);
  title1 'Box-Cox on response';
run;

data drugtest;
  set drugtest;
  newPost = sqrt(PostTreatment);
  newPre = sqrt(PreTreatment);
  /* Do this to preserve scale; can run TRANSREG with this
     to ensure sqrt still okay for PostTreatment */
run;

proc glimmix data=drugtest plots=residualpanel;
  class Drug;
  model newPost = Drug newPre;
  output out=out1 pred=newPosthat;
  title1 'ANCOVA Model on sqrt scale';
run;

ANCOVA Model on sqrt scale
Type III Tests of Fixed Effects
Effect   Num DF   Den DF   F Value   Pr > F
Drug          2       26      1.27   0.2988
newPre        1       26     41.91   <.0001

data out1;
  set out1;
  if Drug='a' then PredA = newPosthat;
  if Drug='d' then PredD = newPosthat;
  if Drug='f' then PredF = newPosthat;
run;
proc sort data=out1;
  by newPre;   /* To make connected lines go left-to-right in plot */
run;
proc sgplot data=out1;
  scatter x=newPre y=newPost / group=Drug;
  series x=newPre y=PredA / lineattrs=(pattern=thindot thickness=5);
  series x=newPre y=PredD / lineattrs=(pattern=longdash thickness=2);
  series x=newPre y=PredF / lineattrs=(pattern=solid thickness=2);
  xaxis label='Square Root of PreTreatment Score';
  yaxis label='Square Root of PostTreatment Score';
  title1 'ANCOVA Model: Leprosy Data';
run;

/* Consider Interaction */
proc glimmix data=drugtest plots=residualpanel;
  class Drug;
  model newPost = Drug | newPre;
  output out=out1 pred=newPosthat;
  title1 'Interaction ANCOVA Model on sqrt scale';
run;

Interaction ANCOVA Model on sqrt scale
Type III Tests of Fixed Effects
Effect        Num DF   Den DF   F Value   Pr > F
Drug               2       24      0.09   0.9164
newPre             1       24     36.02   <.0001
newPre*Drug        2       24      0.13   0.8750

data out1;
  set out1;
  if Drug='a' then PredA = newPosthat;
  if Drug='d' then PredD = newPosthat;
  if Drug='f' then PredF = newPosthat;
run;
proc sort data=out1;
  by newPre;
run;
proc sgplot data=out1;
  scatter x=newPre y=newPost / group=Drug;
  series x=newPre y=PredA / lineattrs=(pattern=thindot thickness=5);
  series x=newPre y=PredD / lineattrs=(pattern=longdash thickness=2);
  series x=newPre y=PredF / lineattrs=(pattern=solid thickness=2);
  xaxis label='Square Root of PreTreatment Score';
  yaxis label='Square Root of PostTreatment Score';
  title1 'Interaction ANCOVA Model: Leprosy Data';
run;

(More on such quantitative predictors in STAT 5100 Linear Regression and Time Series)
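The denominator degrees of freedom reported in the Type III tables can be checked by counting the mean parameters in each model. This is a quick sanity check rather than anything stated in the handout; it assumes N = 30 observations (10 per drug, counted from the data listing) and a completely randomized design.

```latex
\begin{align*}
\text{ANOVA (3 drug means):}\quad
  & 30 - 3 = 27 \\
\text{ANCOVA, equal slopes (3 intercepts + 1 slope):}\quad
  & 30 - (3 + 1) = 26 \\
\text{ANCOVA, separate slopes (3 intercepts + 3 slopes):}\quad
  & 30 - (3 + 3) = 24
\end{align*}
```

The numerator df of 2 for newPre*Drug corresponds to going from one common slope to three separate slopes (3 − 1 = 2). With F = 0.13 and p = 0.8750 for that term, the data give little evidence against parallel slopes on the square-root scale.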