Math 141
Lecture 23: More Dummy Variables: Interactions

Albyn Jones
Library 304
[email protected]
www.people.reed.edu/~jones/courses/141


Analysis of Covariance

We will study the relationship between the linear models below:

    Model                                        R formula
    Y = β0 + βx X + ε                            Y ~ X
    Y = β0 + βx X + βd D + ε                     Y ~ X + D
    Y = β0 + βx X + βd D + βx:d (X · D) + ε      Y ~ X * D

where X is numeric and D is a dummy variable.


Example: Berkeley Longitudinal Study, take 1

    > B.lm1 <- lm(ht18 ~ ht2, data = Berkeley)
    > summary(B.lm1)
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  32.1203    26.6572   1.205    0.233
    ht2           1.5998     0.3031   5.278  2.2e-06
    ---
    Residual standard error: 7.572 on 56 degrees of freedom


One Line to Fit them All!

[Figure: scatterplot of ht18 (160 to 190) against ht2 (82 to 94) with the single fitted line; plot title "ht18 ~ ht2"]


Example: Berkeley Longitudinal Study, take 2

    > B.lm2 <- lm(ht18 ~ ht2 + sex, data = Berkeley)
    > summary(B.lm2)
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  49.4174    16.4052   3.012  0.00391
    ht2           1.3416     0.1873   7.162 2.05e-09
    sexMale      12.0192     1.2356   9.727 1.49e-13
    ---
    Residual standard error: 4.633 on 55 degrees of freedom


Parallel Lines: No Interaction

[Figure: scatterplot of ht18 against ht2 with two parallel fitted lines, one per sex; plot title "ht18 ~ ht2 + sex"]


Parallel Lines: Making the Plot

    > coef(B.lm2)
    (Intercept)         ht2     sexMale
      49.417437    1.341649   12.019233

    # Points coloured by sex: blue for Male, red otherwise
    with(Berkeley, plot(ht2, ht18, pch = 19,
         col = ifelse(sex == 'Male', 'blue', 'red')))
    # abline(a, b) draws the line with intercept a and slope b
    abline(49.41, 1.34, lty = 2, lwd = 2, col = 'red')           # baseline group
    abline(49.41 + 12.02, 1.34, lty = 2, lwd = 2, col = 'blue')  # Male: intercept shifted by sexMale
    title('ht18 ~ ht2 + sex')


The ht2 by sex Interaction Model

    > B.lm3 <- lm(ht18 ~ ht2 * sex, data = Berkeley)
    > summary(B.lm3)
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  54.7541    20.9285   2.616   0.0115
    ht2           1.2806     0.2391   5.356 1.79e-06
    sexMale      -2.2400    34.3193  -0.065   0.9482
    ht2:sexMale   0.1619     0.3895   0.416   0.6792
    ---
    Residual standard error: 4.668 on 54 degrees of freedom


Interaction Model: Non-parallel Lines

[Figure: scatterplot of ht18 against ht2 with two non-parallel fitted lines, one per sex; plot title "ht18 ~ ht2 * sex"]


Interaction: Making the Plot

    > coef(B.lm3)
    (Intercept)         ht2     sexMale ht2:sexMale
     54.7541433   1.2806343  -2.2400286   0.1619488

    # Points coloured by sex: blue for Male, red otherwise
    with(Berkeley, plot(ht2, ht18, pch = 19,
         col = ifelse(sex == 'Male', 'blue', 'red')))
    abline(54.75, 1.28, lty = 2, lwd = 2, col = 'red')                  # baseline group
    abline(54.75 - 2.24, 1.28 + 0.16, lty = 2, lwd = 2, col = 'blue')   # Male: both intercept and slope change
    title('ht18 ~ ht2 * sex')


Both Models

[Figure: scatterplot of ht18 against ht2 showing the fitted lines from both models; plot title "Both Models"]

The dotted lines are for the interaction model.


Interpretation of Coefficients

- Y ~ X: β0 is the intercept, βx is the slope.
- Y ~ X + D: β0 is the intercept for the baseline group (D = 0), β0 + βd is the intercept for the other group (D = 1), and βx is the common slope.
- Y ~ X * D: β0 is the intercept for the baseline group (D = 0) and βx is the slope for that group. β0 + βd is the intercept for the other group (D = 1) and βx + βx:d is the slope for that group.


What does Interaction mean?

[Figure: "Interaction", Y against X, both axes from −2 to 2]

Interaction: the distance between the two lines depends on the X coordinate. Equivalently, the slope depends on the group.


What does No Interaction mean?

[Figure: "No Interaction", Y against X, both axes from −2 to 2]

Additive or Parallel Lines: the distance between the two lines is constant.


Interpretation Again

- Additive or No Interaction: The difference between the groups is constant; it does not depend on the value of the covariate X.
- Additive or No Interaction: The slope for X does not depend on group membership.
- Interaction: The difference between the groups is not constant; it varies with the value of the covariate X.
- Interaction: The slope for X depends on group membership.


Generic Interpretation

- Additive or No Interaction: The response to one factor does not depend on the value of the other.
- Interaction: The response to one factor depends on the value of the other.

Not Interaction!! Interaction does not mean the explanatory variables are correlated with each other!!


Analysis of Covariance

The models again:

    Model                                        R formula
    Y = β0 + βx X + ε                            Y ~ X
    Y = β0 + βx X + βd D + ε                     Y ~ X + D
    Y = β0 + βx X + βd D + βx:d (X · D) + ε      Y ~ X * D

where X is numeric and D is a dummy variable.


Analysis of Covariance: Summary

For the model with parallel lines,

    Y = β0 + βx X + βd D + ε

the coefficient for the dummy variable, βd, represents the (constant!) distance between the lines.

For the interaction model,

    Y = β0 + βx X + βd D + βx:d (X · D) + ε

the coefficient for the dummy variable, βd, represents the difference between the intercepts for the two lines; the interaction coefficient βx:d represents the difference between their slopes.


What if there are more categories?

Suppose we have a variable called Group with 3 categories: A, B and C.

- R will automatically code two dummy variables: GroupB and GroupC.
- The model Y ~ X + Group represents 3 parallel lines.
- The model Y ~ X * Group represents 3 arbitrary lines.


Interactions between Numeric Variables

One can fit models like

    lm(Y ~ X*U)
    # the same as
    XU <- X*U
    lm(Y ~ X + U + XU)

where X and U are both numeric variables. This is really a special case of the full quadratic model:

    X2 <- X^2
    U2 <- U^2
    XU <- X*U
    lm(Y ~ X + U + X2 + XU + U2)


Summary

3 types of models:

- No Group dependence: Y ~ X
- Additive models: Y ~ X + G
- Interaction models: Y ~ X * G
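

A sketch: reading group lines off an interaction fit

The coefficient interpretations above can be checked numerically. The sketch below is not from the lecture: it uses simulated data, and the names (Group, shift, group.lines, separate) and the simulated coefficients are assumptions chosen only for illustration. It fits Y ~ X * Group with three categories, rebuilds each group's intercept and slope from the dummy and interaction coefficients, and compares them with a separate regression for each group.

```r
# Simulated data; nothing here comes from the Berkeley study.
set.seed(141)
n <- 90
Group <- factor(rep(c("A", "B", "C"), each = n / 3))   # A is the baseline level
X <- runif(n, 82, 94)
shift <- c(A = 0, B = 12, C = -5)[as.character(Group)] # hypothetical group offsets
Y <- 50 + 1.3 * X + shift + rnorm(n, sd = 4)

# Interaction model: R codes the dummies GroupB and GroupC automatically.
fit <- lm(Y ~ X * Group)
b <- coef(fit)
print(names(b))

# Intercept and slope for each group, built from the coefficients.
group.lines <- rbind(
  A = c(intercept = b[["(Intercept)"]],
        slope     = b[["X"]]),
  B = c(intercept = b[["(Intercept)"]] + b[["GroupB"]],
        slope     = b[["X"]] + b[["X:GroupB"]]),
  C = c(intercept = b[["(Intercept)"]] + b[["GroupC"]],
        slope     = b[["X"]] + b[["X:GroupC"]])
)
print(group.lines)

# Separate simple regressions, one per group, for comparison.
separate <- t(sapply(levels(Group), function(k) {
  coef(lm(Y ~ X, subset = Group == k))
}))
print(separate)

# The two sets of lines should agree up to rounding error.
stopifnot(isTRUE(all.equal(unname(group.lines), unname(separate))))
```

The fitted lines agree because the interaction model puts no constraint linking one group's line to another's; only the residual standard error is pooled across groups. Replacing X * Group with X + Group would force a common slope, and the comparison with the separate fits would then generally fail.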