Polynomial Terms in Regression
Interaction Terms in Regression
Lecture 21
November 30, 2006
Psychology 790
Today’s Lecture
● Polynomial regression models.
● Interaction terms in regression models.
Our New Schedule
Date    Topic                                              Chapter
11/28   Statistical Case Studies
11/30   Polynomial Terms and Interactions in Regression    K 8.1, 8.2
12/5    Qualitative and Quantitative Predictors            K 8.3-8.7
12/7    Final exam discussion
Announcements
● Interested in learning more statistics?
● Here are three courses you should consider:
  ✦ Psych 791: it goes without saying, but learn why the general linear model is so cool.
  ✦ Psych 892: Test Theory.
    ■ Learn about how we develop scales and questionnaires.
  ✦ Psych 993: Statistical Consulting.
    ■ Have data and need stats help?
    ■ Or, do you want hands-on stats experience under my guidance?
● Having taken 790, you are prepared for all of these courses.
Terminology Review
Quantitative vs. Qualitative Predictor
● As a brief review: what is the difference between a quantitative and a qualitative predictor variable?
● A quantitative predictor is one that is measured on a continuum, or can be thought of as a continuous variable (age, weight).
● A qualitative predictor is one that is measured in categories, which can be either ordered (Likert scale) or unordered (male or female).
● While we have only been using continuous variables in regression up to now, we can use a mix of qualitative and quantitative predictors, or qualitative predictors by themselves (à la ANOVA).
Model Order
● You have probably noticed that the book always refers to model 6.5 and calls it a 'first order regression model'.
● So, why is it called a first order model?
● The order of the model is defined by the largest exponent on the X variables.
● Model 6.5 looks like this:
$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_{p-1} X_{i,p-1} + \epsilon_i$$
● All of the exponents on the X variables are 1.
● Hence the term 'first order'.
Higher Order Model
● This chapter begins with a discussion of higher order models, that is, regression models that have an exponent larger than 1 on at least one of their X variables.
● The order number is equal to the largest exponent.
● Here are some examples:
$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i1}^2 + \epsilon_i$$
● This is a second order model: the highest exponent is 2.
$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i1}^2 + \beta_3 X_{i1}^3 + \epsilon_i$$
● Note that interaction terms also make a model higher order. A higher order term can involve any predictor in the model:
$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i2}^3 + \epsilon_i$$
Book Model Terminology
● The book refers to models by both their order and the number of predictor variables.
● To name the model, we just count the number of different X's and then find the highest exponent.
● Let's try this:
$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i1}^2 + \epsilon_i$$
$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i1}^2 + \beta_3 X_{i2} + \epsilon_i$$
$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i1}^2 + \beta_3 X_{i1}^3 + \beta_4 X_{i2} + \beta_5 X_{i2}^2 + \epsilon_i$$
Polynomial Regression Models
● Polynomial regression models are regression models that have higher order terms in them.
● There are two basic types of uses for these models:
  ✦ When the true curvilinear response function is indeed a polynomial function.
  ✦ When the true curvilinear response function is unknown but a polynomial function is a good approximation to the true function.
● In other words, when the model fits your data.
Model Appearance
● First order model has a linear response function.
[Figure: first order model, Y = 1.0 + 2.0 X, plotted as y against x for x from 0 to 10.]
Model Appearance
● Second order model has a quadratic response function: a parabola.
[Figure: second order model, Y = 1.0 + 2.0 X − 0.2 X^2, plotted as y against x for x from 0 to 10.]
Model Appearance
● Third order model looks like a line that has been pulled in two directions.
[Figure: third order model, Y = 0.1 + 0.2 X − 0.2 X^2 + 0.1 X^3, plotted as y against x for x from −10 to 10.]
Estimating Polynomial Models
● Estimates are found in the same way as before.
● Use SAS proc glm or proc reg.
● SAS ends up using the same matrix calculations: now our X matrix will include a column for the quadratic term.
● What is the matrix equation for the least squares estimates of the regression weights anyway?
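As a reminder of what that calculation looks like, here is a sketch of the usual least squares solution for a second order model in one predictor. The symbols b (vector of estimated weights), X (design matrix), and Y (response vector) are notation assumed for this sketch rather than taken from the slides.

```latex
\mathbf{b} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y},
\qquad
\mathbf{X} =
\begin{bmatrix}
1 & X_{11} & X_{11}^2 \\
1 & X_{21} & X_{21}^2 \\
\vdots & \vdots & \vdots \\
1 & X_{n1} & X_{n1}^2
\end{bmatrix}
```

The only change from the first order case is the extra column of squared values; a cubic term would add one more column in the same way.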
How do we know to use a higher order term?
● We will use the partial (extra) sums of squares that we learned about before Thanksgiving:
1. First, fit the first order model.
2. Second, add the second order term. Test that term given that the first order term is already in the model (i.e., x^2 | x).
3. If it is significant, keep it in the model. If it is not, stop.
4. Then add the third order term, conditional on the first and second order terms (x^3 | x, x^2).
5. If that is significant, keep that term in the model.
6. Keep going until you find a nonsignificant term. The highest order whose term was significant is the order that fits your data best.
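The steps above can be sketched in a single proc glm run, because Type I sums of squares are sequential: each term is tested given the terms listed before it. This is only a sketch under assumptions: the data set name mydata and the variables y and x are hypothetical placeholders.

```sas
/* Hypothetical names: data set mydata, response y, predictor x. */
proc glm data=mydata;
  /* In proc glm, crossing a continuous variable with itself
     gives a polynomial term, so x*x is the squared term.    */
  /* ss1 asks for Type I (sequential) sums of squares only:  */
  /*   x, then x*x given x, then x*x*x given x and x*x.      */
  model y = x x*x x*x*x / ss1;
run;
quit;
```

One caution when reading such a run: every Type I F test uses the error mean square of the largest model listed, so an F value can differ slightly from the one obtained by fitting the lower order model by itself. Fitting the models one at a time, as in the steps above, avoids that.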
Example Data
● From Pedhazur (1997), p. 522:
“Suppose that we are interested in the effect of time
spent in practice on the performance of a visual
discrimination task. Subjects are randomly assigned to
different levels of practice, following which a test of
visual discrimination is administered, and the number of
correct responses is recorded for each subject. As
there are six levels the highest-degree polynomial
possible for these data is the fifth. Our aim, however, is
to determine the lowest degree-polynomial that best fits
the data.”
A Visual Discrimination Task
Data Plot
[Figure: scatter plot of the number correct against practice.]
First Order Model
data vis;
set vis;
practice2=practice**2;
practice3=practice**3;
run;
proc glm data=vis;
model correct=practice;
run;
First Order Model
The GLM Procedure
Dependent Variable: correct

                              Sum of
Source             DF        Squares    Mean Square   F Value   Pr > F
Model               1    509.1857143    509.1857143    121.03   <.0001
Error              16     67.3142857      4.2071429
Corrected Total    17    576.5000000

R-Square    Coeff Var    Root MSE    correct Mean
0.883236     14.47858    2.051132        14.16667

Source       DF      Type I SS    Mean Square   F Value   Pr > F
practice      1    509.1857143    509.1857143    121.03   <.0001

Source       DF    Type III SS    Mean Square   F Value   Pr > F
practice      1    509.1857143    509.1857143    121.03   <.0001

                               Standard
Parameter        Estimate         Error    t Value    Pr > |t|
Intercept     3.266666667    1.10245037       2.96      0.0092
practice      1.557142857    0.14154156      11.00      <.0001
Model and Data Plot
axis1 label=('Practice');
axis2 label=(angle=90, 'Correct');
symbol1 v=dot i=rl width=2 cv=black ci=red;
proc gplot data=vis;
title2 'First Order Model';
plot correct*practice=1/haxis=axis1 vaxis=axis2 regeqn;
run;
Test the Squared Term
● Test the squared term to see whether it is needed or whether the first order model is enough.
● Really, we need to see whether the regression sum of squares for the 2nd order model is significantly larger than $SS(X_1)$ was. That increase is the extra sum of squares $SS(X_1^2 \mid X_1)$.
proc glm data=vis;
model correct=practice practice2;
run;
Second Order Model
Dependent Variable: correct

                              Sum of
Source             DF        Squares    Mean Square   F Value   Pr > F
Model               2    543.5071429    271.7535714    123.55   <.0001
Error              15     32.9928571      2.1995238
Corrected Total    17    576.5000000

R-Square    Coeff Var    Root MSE    correct Mean
0.942770     10.46879    1.483079        14.16667

Source       DF      Type I SS    Mean Square   F Value   Pr > F
practice      1    509.1857143    509.1857143    231.50   <.0001
practice2     1     34.3214286     34.3214286     15.60   0.0013

Source       DF    Type III SS    Mean Square   F Value   Pr > F
practice      1    106.9989477    106.9989477     48.65   <.0001
practice2     1     34.3214286     34.3214286     15.60   0.0013

                                Standard
Parameter         Estimate         Error    t Value    Pr > |t|
Intercept     -1.900000000    1.53171758      -1.24      0.2339
practice       3.494642857    0.50104575       6.97      <.0001
practice2     -0.138392857    0.03503445      -3.95      0.0013
Model and Data Plot
axis1 label=('Practice');
axis2 label=(angle=90, 'Correct');
symbol2 v=dot i=rq width=2 cv=black ci=red;
proc gplot data=vis;
title2 'Second Order Model';
plot correct*practice=2/haxis=axis1 vaxis=axis2 regeqn;
run;
Test the Squared Term
● Recall from our previous class where to look to find $SS(X_1^2 \mid X_1)$.
  ✦ We look at the Type I Sum of Squares...

Source       DF      Type I SS    Mean Square   F Value   Pr > F
practice      1    509.1857143    509.1857143    231.50   <.0001
practice2     1     34.3214286     34.3214286     15.60   0.0013

● Because the F test on the Type I SS for practice2 is significant (F = 15.60, p = 0.0013), we conclude that the squared term is needed in the model.
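To see where that F value comes from, the arithmetic can be reproduced from the first order and second order ANOVA tables shown earlier. The abbreviations SSR (regression sum of squares) and MSE (error mean square) are notation assumed for this sketch.

```latex
\begin{aligned}
SS(X_1^2 \mid X_1) &= SSR(X_1, X_1^2) - SSR(X_1) = 543.5071 - 509.1857 = 34.3214 \\[4pt]
F &= \frac{SS(X_1^2 \mid X_1)/1}{MSE(X_1, X_1^2)} = \frac{34.3214}{2.1995} \approx 15.60,
\qquad df = (1,\ 15)
\end{aligned}
```

The numerator has 1 degree of freedom because one term was added; the denominator uses the error mean square of the second order model, which has 15 degrees of freedom.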
Test the Cubic Term
● Now we must test the cubic term to see whether it is needed or whether the second order model is enough.
● Really, we need to see whether the regression sum of squares for the 3rd order model is significantly larger than it was for the 2nd order model. That increase is the extra sum of squares $SS(X_1^3 \mid X_1, X_1^2)$.
proc glm data=vis;
model correct=practice practice2 practice3;
run;
Third Order Model
The GLM Procedure
Dependent Variable: correct

                              Sum of
Source             DF        Squares    Mean Square   F Value   Pr > F
Model               3    545.5238095    181.8412698     82.18   <.0001
Error              14     30.9761905      2.2125850
Corrected Total    17    576.5000000

R-Square    Coeff Var    Root MSE    correct Mean
0.946269     10.49983    1.487476        14.16667

Source       DF      Type I SS    Mean Square   F Value   Pr > F
practice      1    509.1857143    509.1857143    230.13   <.0001
practice2     1     34.3214286     34.3214286     15.51   0.0015
practice3     1      2.0166667      2.0166667      0.91   0.3559

Source       DF    Type III SS    Mean Square   F Value   Pr > F
practice      1     2.51380085     2.51380085      1.14   0.3045
practice2     1     0.46197584     0.46197584      0.21   0.6547
practice3     1     2.01666667     2.01666667      0.91   0.3559

                                Standard
Parameter         Estimate         Error    t Value    Pr > |t|
Intercept      0.666666667    3.09642834       0.22      0.8326
practice       1.880291005    1.76404484       1.07      0.3045
practice2      0.128968254    0.28224300       0.46      0.6547
practice3     -0.012731481    0.01333558      -0.95      0.3559
Model and Data Plot
axis1 label=('Practice');
axis2 label=(angle=90, 'Correct');
symbol3 v=dot i=rc width=2 cv=black ci=red;
proc gplot data=vis;
title2 'Third Order Model';
plot correct*practice=3/haxis=axis1 vaxis=axis2 regeqn;
run;
Test the Cubic Term
● Recall from our previous class where to look to find $SS(X_1^3 \mid X_1, X_1^2)$.
  ✦ We look at the Type I Sum of Squares...

Source       DF      Type I SS    Mean Square   F Value   Pr > F
practice      1    509.1857143    509.1857143    230.13   <.0001
practice2     1     34.3214286     34.3214286     15.51   0.0015
practice3     1      2.0166667      2.0166667      0.91   0.3559

● Because the F test on the Type I SS for practice3 is not significant (F = 0.91, p = 0.3559), we conclude that the cubic term is not needed in the model and our model fitting exercise is finished.
● What model do we end up with?
● What is our resulting $R^2$?
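One way to work through these two questions is to read the numbers straight off the output shown earlier. The sketch below rounds the second order estimates to three decimals and writes X for practice; that shorthand is an assumption of the sketch.

```latex
\begin{aligned}
F_{\text{cubic}} &= \frac{SS(X^3 \mid X, X^2)/1}{MSE(X, X^2, X^3)} = \frac{2.0167}{2.2126} \approx 0.91,
\qquad df = (1,\ 14) \\[4pt]
\hat{Y} &= -1.900 + 3.495\,X - 0.138\,X^2, \qquad R^2 = 0.9428
\end{aligned}
```

Dropping the cubic term costs very little fit: the third order model has an R-Square of 0.9463, only about 0.0035 higher than the second order model.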
Residual Diagnostics
● As with first order regression models, you still need to check your error terms.
● The same checks apply: you need normally distributed error terms and independent residuals (the residual plot should look like a big mess of dots).
● Remember that residual plots can reveal nonlinear trends, which is of particular use here.
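A minimal sketch of these checks for the second order model follows. It assumes the vis data set with practice2 already created as in the earlier data step; the output data set name visres and the variable names pred and resid are hypothetical.

```sas
/* Fit the second order model and save predicted values and residuals. */
proc glm data=vis;
  model correct = practice practice2;
  output out=visres p=pred r=resid;
run;
quit;

/* Residuals against predicted values: a leftover curve suggests    */
/* that a higher order term is still missing from the model.        */
symbol1 v=dot i=none;
proc gplot data=visres;
  plot resid*pred=1 / vref=0;
run;
quit;

/* Normality tests and a normal Q-Q plot for the residuals. */
proc univariate data=visres normal;
  var resid;
  qqplot resid / normal(mu=est sigma=est);
run;
```

Running the same steps on the first order model is a useful comparison: if the quadratic term matters, its residual plot would be expected to show a bend that is absent here.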
Multicollinearity
● There is one major problem with polynomial regression: multicollinearity.
● A common way to reduce this multicollinearity is to run a mean centered regression.
● In mean centered regression, you first subtract the mean of X from each X value. You then use the centered variable in your model (i.e., you square or cube the centered variable).
Multicollinearity Example
/* Center practice by subtracting 7, the mean used in this example, */
/* then form the powers of the centered variable.                   */
data vis;
set vis;
cpractice=practice-7;
cpractice2=cpractice**2;
cpractice3=cpractice**3;
run;

/* Correlations among the raw powers. */
proc corr data=vis;
var practice practice2 practice3;
run;

/* Correlations among the centered powers. */
proc corr data=vis;
var cpractice cpractice2 cpractice3;
run;

Pearson Correlation Coefficients, N = 18

              practice    practice2    practice3
practice       1.00000      0.97892      0.93793
practice2      0.97892      1.00000      0.98845
practice3      0.93793      0.98845      1.00000

Pearson Correlation Coefficients, N = 18

              cpractice    cpractice2    cpractice3
cpractice       1.00000       0.00000       0.93446
cpractice2      0.00000       1.00000       0.00000
cpractice3      0.93446       0.00000       1.00000
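The tables show that centering removes the correlation between the linear and quadratic terms, while the centered linear and cubic terms remain correlated at 0.93446, so centering is a partial remedy. As a sketch of the same idea without typing the mean by hand, the code below computes it with proc means and then compares variance inflation factors. The names pmean, practice_mean, and vis_c are hypothetical, and practice2 is assumed to exist from the earlier data step.

```sas
/* Compute the mean of practice instead of hard-coding it. */
proc means data=vis noprint;
  var practice;
  output out=pmean mean=practice_mean;
run;

data vis_c;
  /* Read the one-row mean once; its value is retained for every row. */
  if _n_ = 1 then set pmean(keep=practice_mean);
  set vis;
  cpractice  = practice - practice_mean;
  cpractice2 = cpractice**2;
run;

/* Variance inflation factors before centering ... */
proc reg data=vis;
  model correct = practice practice2 / vif;
run;
quit;

/* ... and after centering. */
proc reg data=vis_c;
  model correct = cpractice cpractice2 / vif;
run;
quit;
```

Centering changes the estimates of the lower order coefficients and their standard errors, but it should not change the fitted values or the test of the highest order term.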
Interaction Regression Models
● These models are formed by adding an interaction term.
● An interaction term is the product of other variables in the model.
● A simple example is as follows:
$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i1} X_{i2} + \epsilon_i$$
● These models are fit in the same manner as all the other models that we have looked at.
● Can you guess the matrix equation for the least squares estimates of the regression weights?
Interpretation of Interaction Term
● Fitting the model isn't the hard part; figuring out what it means is.
● We had a clear interpretation for the main effects, so what happens to the interpretation of our model once we include an interaction term?
● $\beta_1$ is no longer the increase in Y for a unit increase in $X_1$ with $X_2$ held constant. That increase is now $\beta_1 + \beta_3 X_2$.
● The same is true for $X_2$: the change in Y is $\beta_2 + \beta_3 X_1$ for each unit increase in $X_2$.
● The reason we have to include the other variable is the interaction: the effect of one variable is no longer a single number, because it changes with the value at which the other variable is held.
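The slope for each predictor can be derived directly from the interaction model by comparing the expected response before and after a one unit increase. The notation E{Y | X1, X2} for the expected response is an assumption of this sketch.

```latex
\begin{aligned}
E\{Y \mid X_1, X_2\} &= \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 \\[4pt]
E\{Y \mid X_1 + 1, X_2\} - E\{Y \mid X_1, X_2\} &= \beta_1 + \beta_3 X_2 \\[4pt]
E\{Y \mid X_1, X_2 + 1\} - E\{Y \mid X_1, X_2\} &= \beta_2 + \beta_3 X_1
\end{aligned}
```

When $\beta_3 = 0$ both differences reduce to the familiar constant slopes $\beta_1$ and $\beta_2$ of the first order model.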
Looking at Interactions
● Recall the concept that a multiple regression with two X variables fits a plane as the model.
  ✦ The plane in this case is "flat", without any curvature.
● The inclusion of an interaction term in the regression model will curve or warp the plane.
  ✦ The extent of the warp depends on the degree of the interaction.
● Let's examine the plots given in Kutner, p. 310.
Fitting the Interaction Model
● If you believe there is an interaction, first fit the first order model.
● Next, add the interaction term.
● Test whether you should keep the interaction term.
● If collinearity is high, try the mean centered approach.
● If the interaction is significant, examine the effect.
Interaction Example
● To illustrate regression interactions, we use the data set described on page 1348, from the Study on the Efficacy of Nosocomial Infection Control (SENIC).
● The researchers want to test whether there is an interaction between Age ($X_1$) and Infection Risk ($X_2$) in predicting the length of hospital stay ($Y$).
● From Wikipedia: Nosocomial infections are those which are a result of treatment in a hospital or hospital-like setting, but secondary to the patient's original condition.
proc glm data=senic;
model lengthstay=age infectionrisk age*infectionrisk;
run;
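If the product term turns out to be highly collinear with Age and InfectionRisk, the mean centered approach can be sketched as follows. The names smeans, age_mean, risk_mean, senic_c, c_age, and c_risk are hypothetical; only senic, age, infectionrisk, and lengthstay come from the lecture's code.

```sas
/* Means of the two predictors. */
proc means data=senic noprint;
  var age infectionrisk;
  output out=smeans mean=age_mean risk_mean;
run;

data senic_c;
  /* Read the one-row means once; the values are retained for every row. */
  if _n_ = 1 then set smeans(keep=age_mean risk_mean);
  set senic;
  c_age  = age - age_mean;
  c_risk = infectionrisk - risk_mean;
run;

/* Same interaction model, built from the centered predictors. */
proc glm data=senic_c;
  model lengthstay = c_age c_risk c_age*c_risk / ss1;
run;
quit;
```

Centering leaves the interaction coefficient and its test unchanged. What changes are the two lower order coefficients, which become the slope of each predictor at the mean of the other.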
Interaction Example
The GLM Procedure
Dependent Variable: LengthStay

                               Sum of
Source              DF        Squares    Mean Square   F Value   Pr > F
Model                3    134.5690082     44.8563361     17.80   <.0001
Error              109    274.6413723      2.5196456
Corrected Total    112    409.2103805

R-Square    Coeff Var    Root MSE    LengthStay Mean
0.328850     16.45198    1.587339           9.648319

Source               DF      Type I SS    Mean Square   F Value   Pr > F
Age                   1     14.6041001     14.6041001      5.80   0.0177
InfectionRisk         1    116.3558517    116.3558517     46.18   <.0001
Age*InfectionRisk     1      3.6090564      3.6090564      1.43   0.2340

Source               DF    Type III SS    Mean Square   F Value   Pr > F
Age                   1     0.35746329     0.35746329      0.14   0.7072
InfectionRisk         1     0.82771847     0.82771847      0.33   0.5677
Age*InfectionRisk     1     3.60905640     3.60905640      1.43   0.2340

                                        Standard
Parameter                 Estimate         Error    t Value    Pr > |t|
Intercept              8.625989523    5.80641807       1.49      0.1403
Age                   -0.040052373    0.10633647      -0.38      0.7072
InfectionRisk         -0.704181502    1.22860710      -0.57      0.5677
Age*InfectionRisk      0.026835030    0.02242203       1.20      0.2340
Test the Interaction Term
● Recall from our previous class where to look to find $SS(X_1 X_2 \mid X_1, X_2)$.
  ✦ We look at the Type I Sum of Squares...

Source               DF      Type I SS    Mean Square   F Value   Pr > F
Age                   1     14.6041001     14.6041001      5.80   0.0177
InfectionRisk         1    116.3558517    116.3558517     46.18   <.0001
Age*InfectionRisk     1      3.6090564      3.6090564      1.43   0.2340

● Because the F test on the Type I SS for Age*InfectionRisk is not significant (F = 1.43, p = 0.2340), we conclude that the interaction term is not needed in the model.
● What model do we end up with?
● What is our resulting $R^2$?
Interaction Example
Dependent Variable: LengthStay

                               Sum of
Source              DF        Squares    Mean Square   F Value   Pr > F
Model                2    130.9599518     65.4799759     25.89   <.0001
Error              110    278.2504287      2.5295494
Corrected Total    112    409.2103805

R-Square    Coeff Var    Root MSE    LengthStay Mean
0.320031     16.48428    1.590456           9.648319

Source           DF      Type I SS    Mean Square   F Value   Pr > F
Age               1     14.6041001     14.6041001      5.77   0.0179
InfectionRisk     1    116.3558517    116.3558517     46.00   <.0001

Source           DF    Type III SS    Mean Square   F Value   Pr > F
Age               1     14.5140962     14.5140962      5.74   0.0183
InfectionRisk     1    116.3558517    116.3558517     46.00   <.0001

                                   Standard
Parameter            Estimate         Error    t Value    Pr > |t|
Intercept         2.043031367    1.86379463       1.10      0.2754
Age               0.080685389    0.03368383       2.40      0.0183
InfectionRisk     0.760127416    0.11207632       6.78      <.0001
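The two sets of output can be tied together with a full versus reduced model comparison, which gives the same F as the Type I test of the interaction. In this sketch the subscripts F and R (full and reduced model) and the rounding of the estimates are assumptions of the sketch; the numbers are read from the two ANOVA tables.

```latex
\begin{aligned}
F &= \frac{(SSE_R - SSE_F)/(df_R - df_F)}{MSE_F}
   = \frac{(278.2504 - 274.6414)/(110 - 109)}{2.5196}
   = \frac{3.6091}{2.5196} \approx 1.43 \\[4pt]
\widehat{\text{LengthStay}} &= 2.043 + 0.081\,\text{Age} + 0.760\,\text{InfectionRisk},
\qquad R^2 = 0.3200
\end{aligned}
```

Dropping the interaction lowers R-Square only from 0.3289 to 0.3200, which matches the nonsignificant test.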
Final Thought
● Interaction terms and polynomial terms are very similar.
● All lower order terms must be included in any model that contains higher order or interaction terms.
● The Type I SS are very useful for testing the contribution of higher order terms in a model.
● Multicollinearity is a big problem in these models.
Next Time
● Next time: the rest of Chapter 8 of Kutner.
● Regression models for quantitative and qualitative predictors.
● A surprise.