Lecture 20: Regression Models for Quantitative Predictors
Chapter 8, Section 8.1
Polynomial Regression Models
Polynomial regression models have two basic uses:
1. The true curvilinear regression function is a polynomial.
2. The true curvilinear regression function is an unknown (or very complicated) function, but
a polynomial function is a good approximation to the true function.
Second Order Polynomials in One Variable
The model is:
$$Y_i = \beta_0 + \beta_1 x_i + \beta_{11} x_i^2 + \varepsilon_i, \quad \text{with } x_i = X_i - \bar{X}.$$
The normal equations, in matrix notation, are
$$(X'X)b = X'Y,$$
where
$$X = \begin{bmatrix} 1 & x_1 & x_1^2 \\ \vdots & \vdots & \vdots \\ 1 & x_n & x_n^2 \end{bmatrix}, \qquad
b = \begin{bmatrix} b_0 \\ b_1 \\ b_{11} \end{bmatrix}, \qquad
Y = \begin{bmatrix} Y_1 \\ \vdots \\ Y_n \end{bmatrix}.$$
The least squares estimates are
$$b = (X'X)^{-1} X'Y.$$
Note: The explanatory variable is centered. The reason is that $X$ and $X^2$ are often highly correlated (multicollinearity problems!), whereas $x$ and $x^2$ are less likely to be highly correlated. It can be shown that when the values of the predictor variable are equally spaced, $x$ and $x^2$ are uncorrelated.
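A sketch of why the last claim holds (derivation added here, not in the original notes): centering equally spaced values makes them symmetric about zero, so every odd-power sum vanishes:
$$\sum_{i=1}^{n} x_i = 0, \qquad \sum_{i=1}^{n} x_i^3 = 0
\quad\Longrightarrow\quad
\widehat{\operatorname{Cov}}(x, x^2)
  = \frac{1}{n}\sum_{i=1}^{n} x_i \, x_i^2 \;-\; \bar{x}\,\overline{x^2}
  = \frac{1}{n}\sum_{i=1}^{n} x_i^3 \;-\; 0 \;=\; 0.$$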
Third Order Polynomials in One Variable
The model is:
$$Y_i = \beta_0 + \beta_1 x_i + \beta_{11} x_i^2 + \beta_{111} x_i^3 + \varepsilon_i, \quad \text{with } x_i = X_i - \bar{X}.$$
The normal equations (in matrix notation) are as above, with $X$ and $b$ now
$$X = \begin{bmatrix} 1 & x_1 & x_1^2 & x_1^3 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & x_n & x_n^2 & x_n^3 \end{bmatrix}
\qquad \text{and} \qquad
b = \begin{bmatrix} b_0 \\ b_1 \\ b_{11} \\ b_{111} \end{bmatrix}.$$
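As a minimal SAS sketch (dataset and variable names here are hypothetical, not from the notes), the centered second- and third-order fits can be carried out by building the powers of the centered predictor in a data step and passing them to Proc Reg:

* Hypothetical dataset "mydata" with response Y and predictor X;
proc means data=mydata noprint;
   var X;
   output out=m mean=Xbar;
run;

data poly;
   if _n_=1 then set m;    * read the mean once; it is retained thereafter;
   set mydata;
   x  = X - Xbar;          * centered predictor;
   x2 = x**2;              * quadratic term;
   x3 = x**3;              * cubic term;
run;

proc reg data=poly;
   model Y = x x2;         * second-order model;
   model Y = x x2 x3;      * third-order model;
run;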
Example: [Plots of the quadratic response functions $E(Y) = 52 + 8x - 2x^2$ and $E(Y) = 18 - 8x + 2x^2$, and of the cubic response functions $E(Y) = 16.3 - 1.45x - 0.15x^2 - 0.35x^3$ and $E(Y) = 22.45 + 1.45x + 0.15x^2 + 0.35x^3$, each plotted for $x$ between $-3$ and $2$.]
Remarks:
1. Any continuous function can be closely approximated by a polynomial function.
2. When using polynomial regression, it is important not to make predictions beyond the scope of the data.
3. Polynomials of order higher than 3 should be used with extreme caution; they are seldom used in practice.
4. Because of the above drawbacks, we sometimes prefer nonlinear models or linear models with transformed variables, as sketched below.
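As a minimal illustration of Remark 4 (hypothetical dataset and variable names again), one often tries a simple transformation of the response before resorting to a high-order polynomial:

data trans;
   set mydata;
   logY = log(Y);     * transformed response;
run;

proc reg data=trans;
   model logY = X;    * linear model in the transformed variable;
run;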
Second Order Polynomials in Two Variables
The model is
$$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_{11} x_{i1}^2 + \beta_{22} x_{i2}^2 + \beta_{12} x_{i1} x_{i2} + \varepsilon_i,$$
where $x_{i1} = X_{i1} - \bar{X}_1$ and $x_{i2} = X_{i2} - \bar{X}_2$.
Remark: $\beta_{12}$ is called the interaction coefficient. The normal equations (in matrix notation) are as above, with $X$ and $b$ now
$$X = \begin{bmatrix}
1 & x_{11} & x_{12} & x_{11}^2 & x_{12}^2 & x_{11}x_{12} \\
1 & x_{21} & x_{22} & x_{21}^2 & x_{22}^2 & x_{21}x_{22} \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
1 & x_{n1} & x_{n2} & x_{n1}^2 & x_{n2}^2 & x_{n1}x_{n2}
\end{bmatrix}
\qquad \text{and} \qquad
b = \begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_{11} \\ b_{22} \\ b_{12} \end{bmatrix}.$$
Contour Plot: For each fixed value of $E(Y)$,
$$E(Y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{11} x_1^2 + \beta_{22} x_2^2 + \beta_{12} x_1 x_2$$
represents a conic section in the plane. The graph of these conic sections for various fixed values of $E(Y)$ is called a contour plot. The contour curves correspond to different response levels and show the various combinations of levels of the two predictor variables that yield the same level of response.
Example: The following produce a 3D plot and a contour plot of $E(Y) = 1740 - 4x_1^2 - 3x_2^2 - 3x_1x_2$.
data new;
   do X1=-10 to 10 by 0.25;
      do X2=-10 to 10 by 0.25;
         EY=1740-4*X1**2-3*X2**2-3*X1*X2;
         output;
      end;
   end;
run;

title 'Response Surface';
proc g3d data=new;
   plot X2*X1=EY;
run;
quit;

title 'Contour Curves';
proc gcontour data=new;
   plot X2*X1=EY;
run;
quit;
Second Order Polynomials in Three Variables
The model is
$$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \beta_{11} x_{i1}^2 + \beta_{22} x_{i2}^2 + \beta_{33} x_{i3}^2 + \beta_{12} x_{i1}x_{i2} + \beta_{13} x_{i1}x_{i3} + \beta_{23} x_{i2}x_{i3} + \varepsilon_i,$$
where $x_{i1} = X_{i1} - \bar{X}_1$, $x_{i2} = X_{i2} - \bar{X}_2$, and $x_{i3} = X_{i3} - \bar{X}_3$.
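Proc Reg does not build polynomial terms itself, so the squares and cross products have to be created in a data step first. A sketch, assuming a hypothetical dataset with response Y and centered variables x1-x3 already present:

data poly3;
   set mydata3;                     * hypothetical dataset with Y and centered x1-x3;
   sqx1 = x1**2;  sqx2 = x2**2;  sqx3 = x3**2;
   x1x2 = x1*x2;  x1x3 = x1*x3;  x2x3 = x2*x3;
run;

proc reg data=poly3;
   model Y = x1 x2 x3 sqx1 sqx2 sqx3 x1x2 x1x3 x2x3;
run;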
Implementation of Polynomial Regression Models
Having selected a polynomial regression model, we fit it to the data using Proc Reg in SAS. If all regression coefficients in the model are significant at some reasonable significance level, we are done. When there are many terms in the model, however, this is seldom the case. Terms whose coefficients are not significant on the individual t-tests should not be discarded immediately. Instead, proceed as follows.
First test the hypothesis that all of these regression coefficients are simultaneously zero. Suppose, for example, we fit the second-order, three-variable model and find that $\beta_{12}$, $\beta_{13}$, and $\beta_{22}$ are not significant. We would then test the hypotheses:
$$H_0: \beta_{12} = \beta_{13} = \beta_{22} = 0 \quad \text{vs.} \quad H_a: \text{not all of } \beta_{12}, \beta_{13}, \beta_{22} \text{ equal zero}.$$
If we cannot reject $H_0$ at some reasonable level of significance, simply drop the three terms $x_{i1}x_{i2}$, $x_{i1}x_{i3}$, and $x_{i2}^2$ from the model, and we are finished.
Note: The above hypotheses are tested using the test statistic
$$F^* = \frac{SSR(x_2^2, x_1x_2, x_1x_3 \mid x_1, x_2, x_3, x_1^2, x_3^2, x_2x_3)/3}{MSE} = \frac{MSR(x_2^2, x_1x_2, x_1x_3 \mid x_1, x_2, x_3, x_1^2, x_3^2, x_2x_3)}{MSE},$$
and the test is easiest to perform using the TEST statement in Proc Reg.
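Continuing the sketch above (same hypothetical variable names), the joint hypotheses $H_0: \beta_{12} = \beta_{13} = \beta_{22} = 0$ reduce to a single TEST statement:

proc reg data=poly3;
   model Y = x1 x2 x3 sqx1 sqx2 sqx3 x1x2 x1x3 x2x3;
   test x1x2=0, x1x3=0, sqx2=0;   * partial F test that the three coefficients are jointly zero;
run;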
If we reject the hypothesis that these regression coefficients are simultaneously zero, we run three more polynomial regressions, one with each of the three suspect terms $x_{i1}x_{i2}$, $x_{i1}x_{i3}$, and $x_{i2}^2$ deleted from the model. If exactly one of these models has all individual t-tests significant, that is the model to use. If two of these models have all individual t-tests significant, go with the model having the higher adjusted $R^2$.
If none of the regressions described above has all individual t-tests significant, permanently discard the term whose deletion from the model leads to the highest adjusted $R^2$, and continue examining regressions involving the remaining terms as described above until you reach a model in which all regression coefficients are significant (except possibly the intercept). If several such models emerge, go with the one having the highest adjusted $R^2$.
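The repeated adjusted-$R^2$ comparisons can be automated with Proc Reg's SELECTION= option; a sketch (same hypothetical dataset) that ranks subsets of the candidate terms by adjusted $R^2$:

proc reg data=poly3;
   model Y = x1 x2 x3 sqx1 sqx2 sqx3 x1x2 x1x3 x2x3
         / selection=adjrsq best=5;   * list the five best subsets by adjusted R-square;
run;

Note that automatic selection does not respect the hierarchy of polynomial terms (a retained x1x2 should normally keep x1 and x2 in the model), so the listed subsets still need to be screened by hand.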
Case Example
Consider the example in section 8.1 of your text.
$Y$ -- the life of a power cell, measured as the number of discharge-charge cycles the cell underwent before it failed.
$X_1$ -- the charging rate.
$X_2$ -- the temperature.
Second Order Model
The researcher decided to fit the data using the second-order polynomial regression model:
$$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_{11} x_{i1}^2 + \beta_{22} x_{i2}^2 + \beta_{12} x_{i1} x_{i2} + \varepsilon_i,$$
where $x_{i1} = \dfrac{X_{i1} - \bar{X}_1}{0.4}$ and $x_{i2} = \dfrac{X_{i2} - \bar{X}_2}{10}$.
Data life;
   infile 'E:\teaching\STAT512\data\CH08TA01.txt';
   input Y CX1 CX2;
run;

proc means Noprint N Mean Data=life;
   output out=life1 N=N Mean=Ybar CX1bar CX2bar;
run;

data life2;
   set life1;
   Do k=1 to N;
      output;
   end;
   drop k _Type_ _Freq_ Ybar N;
run;

data lifenew;
   merge life life2;
   x1=(CX1-CX1bar)/.4;
   x2=(CX2-CX2bar)/10;
run;
Obs     Y    CX1   CX2   CX1bar   CX2bar   x1   x2
  1   150    0.6    10        1       20   -1   -1
  2    86    1.0    10        1       20    0   -1
  3    49    1.4    10        1       20    1   -1
  4   288    0.6    20        1       20   -1    0
  5   157    1.0    20        1       20    0    0
  6   131    1.0    20        1       20    0    0
  7   184    1.0    20        1       20    0    0
  8   109    1.4    20        1       20    1    0
  9   279    0.6    30        1       20   -1    1
 10   235    1.0    30        1       20    0    1
 11   224    1.4    30        1       20    1    1
data lifenew;
   set lifenew;
   sqCX1=CX1**2;    sqx1=x1**2;
   sqCX2=CX2**2;    sqx2=x2**2;
   CX1CX2=CX1*CX2;  x1x2=x1*x2;
   drop CX1bar CX2bar;
run;

proc corr;
   var CX1 CX2 sqCX1 sqCX2 CX1CX2;
run;

proc corr;
   var x1 x2 sqx1 sqx2 x1x2;
run;

proc reg;
   model Y=x1 x2 sqx1 sqx2 x1x2;
run;
Pearson Correlation Coefficients, N = 11

             CX1       CX2     sqCX1     sqCX2    CX1CX2
CX1      1.00000   0.00000   0.99103   0.00000   0.60532
CX2      0.00000   1.00000   0.00000   0.98609   0.75665
sqCX1    0.99103   0.00000   1.00000   0.00592   0.59989
sqCX2    0.00000   0.98609   0.00592   1.00000   0.74613
CX1CX2   0.60532   0.75665   0.59989   0.74613   1.00000

Pearson Correlation Coefficients, N = 11

              x1        x2      sqx1      sqx2      x1x2
x1       1.00000   0.00000   0.00000   0.00000   0.00000
x2       0.00000   1.00000   0.00000   0.00000   0.00000
sqx1     0.00000   0.00000   1.00000   0.26667   0.00000
sqx2     0.00000   0.00000   0.26667   1.00000   0.00000
x1x2     0.00000   0.00000   0.00000   0.00000   1.00000
Analysis of Variance

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              5            55366         11073     10.57   0.0109
Error              5       5240.43860    1048.08772
Corrected Total   10            60606

Root MSE          32.37418    R-Square   0.9135
Dependent Mean   172.00000    Adj R-Sq   0.8271
Coeff Var         18.82220
Parameter Estimates

Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|    Type I SS
Intercept    1            162.84211         16.60761      9.81     0.0002       325424
x1           1            -55.83333         13.21670     -4.22     0.0083        18704
x2           1             75.50000         13.21670      5.71     0.0023        34202
sqx1         1             27.39474         20.34008      1.35     0.2359   1645.96667
sqx2         1            -10.60526         20.34008     -0.52     0.6244    284.92807
x1x2         1             11.50000         16.18709      0.71     0.5092    529.00000
The estimated regression function is:
$$\hat{Y} = 162.84 - 55.83x_1 + 75.50x_2 + 27.39x_1^2 - 10.61x_2^2 + 11.50x_1x_2.$$
Model checking:
1. Residual plots, including a normal probability plot.

proc reg;
   model Y=x1 x2 sqx1 sqx2 x1x2;
   output out=out1 r=resid p=Fitted stdr=Std_Res;
run;

symbol1 c=green v=dot width=5;
axis2 label=('Residual');

title 'residual plot against fitted';
proc gplot;
   plot Resid*Fitted=1/ vaxis=axis2;
run;

title 'residual plot against x1';
proc gplot;
   plot Resid*x1=1/ vaxis=axis2;
run;

title 'residual plot against x2';
proc gplot;
   plot Resid*x2=1/ vaxis=axis2;
run;

title 'normal probability plot';
proc univariate data=out1 plot normal;
   var Resid;
   qqplot Resid/ normal (L=1 mu=est sigma=est);
run;
[Plots: residuals against the predicted value of Y, residuals against x1, residuals against x2, and residuals against the normal quantiles (Q-Q plot).]
2. Test of fit.
proc glm data=lifenew;
class x1 x2;
model Y=x1*x2;
run;
The GLM Procedure

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              8      59201.33333    7400.16667     10.54   0.0895
Error              2       1404.66667     702.33333
Corrected Total   10      60606.00000
$$F^* = \frac{(SSE(R) - SSE(F))/(c - p)}{SSE(F)/(n - c)} = \frac{(5240.44 - 1404.67)/(9 - 6)}{1404.67/(11 - 9)} = 1.82.$$
data;
   pvalue=1-probf(1.82,3,2);
run;

Obs    pvalue
  1    0.37385
Hence, the $P$-value is $P(F(3,2) > 1.82) = 0.37385$, and the second-order polynomial regression function is a good fit.
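An alternative worth knowing (a sketch, not part of the original analysis): Proc Rsreg fits the full second-order response-surface model directly from the two linear predictors, and its LACKFIT option prints the same lack-of-fit decomposition:

proc rsreg data=lifenew;
   model Y = x1 x2 / lackfit;   * second-order model in x1, x2 with a lack-of-fit test;
run;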
Partial F Test
Since the last three partial slopes are not statistically significant, we test the hypotheses:
$$H_0: \beta_{11} = \beta_{22} = \beta_{12} = 0 \quad \text{vs.} \quad H_a: \text{not all of } \beta_{11}, \beta_{22}, \beta_{12} \text{ equal zero}.$$
The test statistic is:
$$F^* = \frac{SSR(x_1^2, x_2^2, x_1x_2 \mid x_1, x_2)/3}{SSE(x_1, x_2, x_1^2, x_2^2, x_1x_2)/(11 - 6)} = \frac{(1646.0 + 284.9 + 529.0)/3}{1048.1} = 0.78.$$
The above F test can also be carried out using the TEST statement in SAS:
proc reg data=lifenew;
model Y=x1 x2 sqx1 sqx2 x1x2/ss1;
test sqx1=sqx2=x1x2=0;
run;
Test 1 Results for Dependent Variable Y

Source        DF   Mean Square   F Value   Pr > F
Numerator      3     819.96491      0.78   0.5527
Denominator    5    1048.08772

Hence, a first-order model is adequate for the data.
First Order Model
proc reg data=lifenew;
model Y=x1 x2;
run;
The REG Procedure

Analysis of Variance

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              2            52906         26453     27.48   0.0003
Error              8       7700.33333     962.54167
Corrected Total   10            60606

Root MSE          31.02486    R-Square   0.8729
Dependent Mean   172.00000    Adj R-Sq   0.8412
Coeff Var         18.03771

Parameter Estimates

Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1            172.00000          9.35435     18.39     <.0001
x1           1            -55.83333         12.66584     -4.41     0.0023
x2           1             75.50000         12.66584      5.96     0.0003
The estimated response function is:
$$\hat{Y} = 172.00 - 55.83x_1 + 75.50x_2 \quad \text{or} \quad \hat{Y} = 160.58 - 139.58X_1 + 7.55X_2.$$
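The second form follows from the coding $x_{i1} = (X_{i1} - 1)/0.4$ and $x_{i2} = (X_{i2} - 20)/10$ (arithmetic spelled out here as a check):
$$\hat{Y} = 172.00 - \frac{55.83}{0.4}(X_1 - 1) + \frac{75.50}{10}(X_2 - 20)
= (172.00 + 139.58 - 151.00) - 139.58X_1 + 7.55X_2
= 160.58 - 139.58X_1 + 7.55X_2.$$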
Estimation of Regression Coefficients
We construct the 90% Bonferroni simultaneous CIs for $\beta_1$, $\beta_2$ of the model as follows:
$$B = t(1 - \alpha/2g;\ n - p) = t(1 - 0.10/4;\ 8) = 2.306,$$
$$s\{b_1\} = 12.67/0.4 = 31.68, \qquad s\{b_2\} = 12.67/10 = 1.267.$$
The simultaneous CIs are:
$$\beta_1: -139.58 \pm 2.306 \times 31.68, \qquad \beta_2: 7.55 \pm 2.306 \times 1.267.$$
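Carrying out the multiplications (numeric intervals added here as a check):
$$\beta_1: -139.58 \pm 73.05 = (-212.63,\ -66.53), \qquad \beta_2: 7.55 \pm 2.92 = (4.63,\ 10.47).$$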
Interaction Regression Models
1. An interaction model with two predictor variables is written as follows:
$$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i1}x_{i2} + \varepsilon_i, \quad \text{with } x_{i1} = X_{i1} - \bar{X}_1 \text{ and } x_{i2} = X_{i2} - \bar{X}_2.$$
To illustrate the interaction effects, we consider the following three response functions:
a) $E(Y) = 10 + 2X_1 + 5X_2$
b) $E(Y) = 10 + 2X_1 + 5X_2 + 0.5X_1X_2$
c) $E(Y) = 10 + 2X_1 + 5X_2 - 0.5X_1X_2$
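A short way to see the interaction (derivation added here): in (b) and (c) the slope in $X_1$ depends on the level of $X_2$:
$$\text{(b): } \frac{\partial E(Y)}{\partial X_1} = 2 + 0.5X_2 \ \text{(slope grows with } X_2\text{: reinforcement)}, \qquad
\text{(c): } \frac{\partial E(Y)}{\partial X_1} = 2 - 0.5X_2 \ \text{(slope shrinks: interference)}.$$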
Conditional Effects Plot: The following plots are conditional effects plots. They show the effects of $X_1$ on the mean response at different levels of $X_2$.
[Conditional effects plots: (a) additive model, (b) reinforcement interaction effect, (c) interference interaction effect; each panel plots $E(Y)$ against $X_1$ for $X_2 = 1$ and $X_2 = 3$.]
Response Surfaces and Contour plots: See Page 310 of your text.
Interaction regression models with Curvilinear Effects
Example: Consider the following estimated regression function in your text (Page 309):
$$E(Y) = 10 + 3X_1 + 4X_2 - 10X_1^2 - 15X_2^2 + 35X_1X_2.$$
By looking at the response surface, contour curves, and conditional effects plot, we find:
1. The contour curves are not parallel.
2. The interaction effect is of the interference type, since the two lines intersect.
Implementation of Interaction Regression models
See the example in your text (Page312-313). It is omitted here.
Homework #6: due Nov. 12, 2009, at the beginning of class.
Pages 336-338 of the textbook: 8.8acd, 8.16, 8.20