ST512_Solution_Practice_QUIZ3.pdf

NCSU ST512 – Sum2 2011
Quiz 3 Practice
The following exercises were taken from Dr. D. Dickey's website for ST512:
http://www.stat.ncsu.edu/people/dickey/courses/st512/
1. A multiple regression equation
Yt = Beta0 + Beta1 X1t + Beta2 X2t + Beta3 X3t + et
is fit to some data.
We obtain:

                      Sum of Squares
  Source     df     Type I     Type II
  X1          1      180         40
  X2          1       90        100
  X3          1       50         50
  Error      20      300
              +                          +
              :  .03   .05   .04   .01   :
      -1      :  .05   .30   .20   .12   :
 (X'X)    =   :  .04   .20   .48   .10   :
              :  .01   .12   .10   .01   :
              +                          +
Give, if possible, the computed F statistic for testing:
(a)  H0: Beta2 = Beta3 = 0        F = 4.7
(b)  H0: Beta2 = 0                F = 6.7
(c)  H0: Beta1 = Beta3 = 0        F = NP (not possible)
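SS Model = R(Beta1, Beta2, Beta3 | Beta0) = 180 + 90 + 50 = 320
a) H0: Beta2 = Beta3 = 0
   Reduced model: Yt = Beta0 + Beta1 X1t + et
   SS(hypothesis) = R(Beta2, Beta3 | Beta0, Beta1) = 90 + 50 = 140 (Type I SS for X2 and X3)
   df(hypothesis) = 2
   MS(hypothesis) = 140/2 = 70
   MSE = 300/20 = 15
   Calculated F = 70/15 = 4.67
b) H0: Beta2 = 0
   SS(hypothesis) = R(Beta2 | Beta0, Beta1, Beta3) = 100 (Type II SS for X2)
   df(hypothesis) = 1
   MS(hypothesis) = 100/1 = 100
   MSE = 300/20 = 15
   Calculated F = 100/15 = 6.67
c) H0: Beta1 = Beta3 = 0
   Reduced model: Yt = Beta0 + Beta2 X2t + et
   SS(hypothesis) = R(Beta1, Beta3 | Beta0, Beta2) = SS Model - R(Beta2 | Beta0) = 320 - R(Beta2 | Beta0)
   R(Beta2 | Beta0) cannot be obtained from the Type I and Type II sums of
   squares given, because X2 entered the model after X1. The F statistic
   cannot be computed (NP).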
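The arithmetic for parts (a) and (b) can be checked with a few lines of Python. This is a minimal sketch and not part of the original quiz: the dictionary names and the helper `partial_f` are illustrative assumptions, and the sums of squares are copied from the table in problem 1.

```python
# Sums of squares copied from the problem 1 table.
type1_ss = {"X1": 180.0, "X2": 90.0, "X3": 50.0}   # sequential (Type I)
type2_ss = {"X2": 100.0, "X3": 50.0}               # partial (Type II) entries used here
sse, dfe = 300.0, 20
mse = sse / dfe                                     # 300/20 = 15


def partial_f(extra_ss, num_df, mse):
    """F statistic for a hypothesis with extra sum of squares extra_ss (hypothetical helper)."""
    return (extra_ss / num_df) / mse


# (a) H0: Beta2 = Beta3 = 0 uses R(Beta2, Beta3 | Beta0, Beta1),
#     the sum of the Type I SS for X2 and X3.
f_a = partial_f(type1_ss["X2"] + type1_ss["X3"], 2, mse)

# (b) H0: Beta2 = 0 uses R(Beta2 | Beta0, Beta1, Beta3), the Type II SS for X2.
f_b = partial_f(type2_ss["X2"], 1, mse)

# (c) would need R(Beta2 | Beta0), which the table does not supply,
#     so no F statistic is computed for it.
print(round(f_a, 2), round(f_b, 2))
```

The Type I entries can be added only because X2 and X3 enter the model last and in sequence; the same shortcut is not available for part (c).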
ST512
SUMMER2 jul19 2011
2. I ran a regression of Y on X1, X2, X3, X4 and X5 to estimate the
parameters of this model:
Y = beta_0 + beta_1 X1 + beta_2 X2 + beta_3 X3 + beta_4 X4 + beta_5 X5 + e
PROC REG gave me this fitted model
Y-hat = 50 + 3 X1 + 5 X2 - 4 X3 + 2 X4 + 7 X5
and these sums of squares and error mean square:
               Type I SS    Type II SS
  intercept      10000         5000
  X1               800          550
  X2               600          350
  X3               200          220
  X4               150          110
  X5                50           50

Regression SS = 800 + 600 + 200 + 150 + 50 = 1800
Regression df = 5
Regression MS = 1800/5 = 360
MSE = 100 with 20 degrees of freedom.
a) Compute the overall model F test (as PROC REG does): F = 360/100 = 3.6
b) Write down the hypothesis being tested.
   H0: beta_1 = beta_2 = beta_3 = beta_4 = beta_5 = 0
   H1: not all beta_i = 0 (i = 1, ..., 5)
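As a quick check of part (a), the overall F statistic can be rebuilt from the Type I sums of squares. This is a small Python sketch with illustrative variable names (not from the quiz). It relies on the fact that the Type I sums of squares for the regressors add up to the regression sum of squares, with the intercept line excluded.

```python
# Type I (sequential) sums of squares for X1..X5 from problem 2.
type1_ss = [800.0, 600.0, 200.0, 150.0, 50.0]
mse = 100.0                                  # error mean square, 20 df

regression_ss = sum(type1_ss)                # intercept SS is not part of the model SS
regression_df = len(type1_ss)                # one df per regressor
regression_ms = regression_ss / regression_df
overall_f = regression_ms / mse
print(regression_ss, regression_ms, overall_f)
```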
3) I have 2 replicates for each of 4 treatments A, B, C, and D in a
completely randomized design.
a) Fill in the blanks in the X matrix. Use dummy variables to identify each
treatment. Note that the X matrix is such that the X'X matrix is nonsingular
(i.e., it has an inverse).
       Trt    Int   A   B   C
        A      1    1   0   0
        A      1    1   0   0
        B      1    0   1   0
 X =    B      1    0   1   0
        C      1    0   0   1
        C      1    0   0   1
        D      1    0   0   0
        D      1    0   0   0
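The nonsingularity claim can be verified directly. The sketch below (Python, standard library only; the function names are illustrative assumptions) builds the design matrix with dummies for A, B, and C, then contrasts it with a design that has a dummy for every treatment, which is the parameterization PROC GLM reports as singular.

```python
from fractions import Fraction

treatments = ["A", "A", "B", "B", "C", "C", "D", "D"]


def design_matrix(dummy_levels):
    """Intercept column plus one 0/1 indicator column per listed level."""
    return [[1] + [1 if t == level else 0 for level in dummy_levels]
            for t in treatments]


def cross_product(X):
    """Compute X'X by direct summation."""
    p = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(p)]
            for i in range(p)]


def determinant(matrix):
    """Determinant by Gaussian elimination with exact fractions."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n, det = len(m), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:          # no nonzero pivot: the matrix is singular
            return Fraction(0)
        if pivot != col:           # a row swap flips the sign of the determinant
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return det


xtx_reduced = cross_product(design_matrix(["A", "B", "C"]))     # D is the reference level
xtx_full = cross_product(design_matrix(["A", "B", "C", "D"]))   # one dummy per level

det_reduced = determinant(xtx_reduced)
det_full = determinant(xtx_full)
print(det_reduced, det_full)
```

With all four dummies the intercept column equals the sum of the dummy columns, so the determinant is exactly zero. Dropping one dummy removes that linear dependence.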
Instead of PROC REG, I ran this:
PROC GLM; CLASS TRT; MODEL Y = TRT/ SOLUTION;
and here is part of my output:
Source     DF     Type III SS     Mean Square     F Value     Pr > F
TRT         3       156.00000        52.00000        5.78     0.0616

                                   T for H0:                  Std Error of
Parameter          Estimate        Parameter=0    Pr > |T|      Estimate
INTERCEPT       29.00000000 B         13.67        0.0002      2.12132034
TRT   A         -9.00000000 B         -3.00        0.0399      3.00000000
      B         -2.00000000 B         -0.67        0.5415      3.00000000
      C          3.00000000 B          1.00        0.3739      3.00000000
      D          0.00000000 B           .           .           .
NOTE: The X'X matrix has been found to be singular and a generalized
inverse was used to solve the normal equations. Estimates followed by
the letter 'B' are biased, and are not unique estimators of the
parameters.
(a) Compute, if possible, the sum of (Ybar_i. - Ybar..)^2, where Ybar_i. is
    the ith treatment mean, as usual.
    Sum over all 8 observations of (Ybar_i. - Ybar..)^2 = SS(Treatment) = 156.00
(b) Compute the predicted mean value for each treatment. Calculate
    the standard error of the mean.

    Treatment    Predicted mean
    A                20.0
    B                27.0
    C                30.0
    D                29.0
s.e.(mu-hat_A) = s.e.(mu-hat_B) = s.e.(mu-hat_C) = s.e.(mu-hat_D)
               = sqrt(MSE / # rep) = sqrt(8.9965 / 2) = 2.1209

where MSE is recovered from the treatment F test:
Treatment F = 5.78 = MS Treatment / MSE = (156/3) / MSE = 52 / MSE,
so MSE = 52 / 5.78 = 8.9965
4) On the next page is the output from a PROC REG run. You can see that I
have a dependent variable Y and 3 explanatory variables W, X, and Z so I
am fitting a model of the form
Y = B0 + B1 W + B2 X + B3 Z + e
Below the printout are the usual matrices X'X, X'Y, and (X'X)^-1 in case
you need them.
a) Fill in the blanks in the regression printout.
b) I see Prob>F listed as 0.0001. In terms of my parameters B0, B1, B2,
and B3, what hypothesis is being tested by this F?
   H0: B1 = B2 = B3 = 0
   H1: not all Bi = 0
and what do I decide about that hypothesis?
   (Reject H0)
c) My obnoxious friend notices that all the P-values of my t-tests exceed
0.05, for example 0.2847 > 0.05 for variable W. She states that none
of my coefficients is significant and thus I have failed to find any
important explanatory variable to explain the variation in Y. How
should I reply to her accusation? Is there some flaw in her arguments?
The overall F statistic (220.97) is greater than the tabular F (num df = 3,
den df = 16), so we conclude that not all regression coefficients in the
model are equal to 0. The individual t tests of H0: Bi = 0 each fail to
reject H0, suggesting that the partial contribution of each explanatory
variable, after accounting for the remaining ones, is very small. This is
evidence of multicollinearity, and we should search for a model with a
smaller number of explanatory variables.
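One way to see the collinearity numerically is to convert the (X'X)^-1 matrix given with the printout into correlations between the coefficient estimates. This Python sketch is an illustration, not part of the quiz. It assumes the rows and columns of (X'X)^-1 are ordered intercept, W, X, Z, as in the Parameter Estimates table, and the function name is hypothetical.

```python
import math

# (X'X)^-1 from the printout; order assumed to be intercept, W, X, Z.
xtx_inv = [
    [4.5690, -0.4901, -0.4067, 0.3700],
    [-0.4901, 0.3361, 0.3269, -0.3221],
    [-0.4067, 0.3269, 0.3218, -0.3170],
    [0.3700, -0.3221, -0.3170, 0.3127],
]
names = ["INTERCEP", "W", "X", "Z"]


def estimate_correlation(i, j):
    """Correlation between estimates i and j: c_ij / sqrt(c_ii * c_jj)."""
    return xtx_inv[i][j] / math.sqrt(xtx_inv[i][i] * xtx_inv[j][j])


# Correlations among the three slope estimates only.
slope_correlations = {
    (names[i], names[j]): estimate_correlation(i, j)
    for i in range(1, 4)
    for j in range(i + 1, 4)
}
for pair, r in slope_correlations.items():
    print(pair, round(r, 3))
```

All three correlations come out very close to +1 or -1. The slope estimates are therefore nearly interchangeable, which is why each t test is weak even though the overall F is large.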
d) If I omit W from the model and regress Y on just X and Z with an
intercept, what will be my new error sum of squares? 538.0155
(compute R(B1 | B2, B3, B0) = t^2 * MSE for W and add it to the current
error sum of squares)
e) My model tells me something about the population of Ys having X=12,
W=12, and Z=12 and about the population of Ys having X=12, W=12, and
Z=10 (so only Z is different). Both populations have the same variance,
and the means of the conditional distributions of Y given W=12, X=12,
Z=12 and given W=12, X=12, Z=10 differ by B3*(12-10) = 2*B3.
f) If possible, estimate the difference in the means of these populations:
   2*(-2.160736) = -4.321472
and estimate the variances of the two populations:
   the variance of each population is estimated by MSE = 31.23376
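The two estimates in part (f) can be reproduced from the printout. A brief Python sketch, with illustrative variable names that are not from the quiz:

```python
# Coefficient estimate for Z from the Parameter Estimates table.
b3 = -2.160736
z_first, z_second = 12, 10

# W and X are equal in the two populations, so only the Z term differs.
mean_difference = b3 * (z_first - z_second)

# The common variance is estimated by the error mean square, SSE / dfe.
sse, dfe = 499.74011, 16
variance_estimate = sse / dfe

print(round(mean_difference, 6), round(variance_estimate, 5))
```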
============================== PROC REG output ==============================
Model: MODEL1
Dependent Variable: Y

                          Analysis of Variance

                          Sum of          Mean
Source          DF       Squares         Square      F Value     Prob>F
Model            3    20705.45989      6901.82        220.97     0.0001
Error           16      499.74011        31.23376
C Total         19    21205.20000

Root MSE      5.58872      R-square = .9764
Dep Mean    202.80000
C.V.          2.75578

                          Parameter Estimates

                    Parameter       Standard      T for H0:
Variable    DF       Estimate          Error     Parameter=0    Prob > |T|
INTERCEP     1       1.818409    11.94605523         0.152        0.8809
W            1       3.586294     3.23983701         1.107        0.2847
X            1       5.023960     3.1703            XXXXXX        0.1326
Z            1      -2.160736     3.12524165        -0.691        0.4992
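The summary quantities in the Analysis of Variance block all follow from the sums of squares, the degrees of freedom, and the dependent mean. A minimal Python sketch of those relationships, with illustrative variable names that are not from the quiz:

```python
import math

# Sums of squares and degrees of freedom from the PROC REG printout.
ss_model, df_model = 20705.45989, 3
ss_error, df_error = 499.74011, 16
dep_mean = 202.8

ms_model = ss_model / df_model            # model mean square
mse = ss_error / df_error                 # error mean square
f_value = ms_model / mse                  # overall F
root_mse = math.sqrt(mse)
ss_total = ss_model + ss_error            # corrected total SS
r_square = ss_model / ss_total
cv = 100 * root_mse / dep_mean            # coefficient of variation, in percent

print(round(ms_model, 2), round(mse, 5), round(f_value, 2))
print(round(root_mse, 5), round(r_square, 4), round(cv, 5))
```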
=============================== Some Matrices ===============================

            |    20      399     1209     1613 |
 X'X    =   |   399     8213    23878    32195 |
            |  1209    23878    75835   100046 |
            |  1613    32195   100046   132683 |

            |  4.5690   -0.4901   -0.4067    0.3700 |
     -1     | -0.4901    0.3361    0.3269   -0.3221 |
(X'X)   =   | -0.4067    0.3269    0.3218   -0.3170 |
            |  0.3700   -0.3221   -0.3170    0.3127 |

            |   4056 |
 X'Y    =   |  80577 |
            | 252651 |
            | 334328 |

=============================================================================
s.e.(b2) = sqrt(MSE * c22), where c22 = 0.3218 is the diagonal element of (X'X)^-1 for X
> 31.23376*0.3218
[1] 10.05102
> sqrt(31.23376*0.3218)
[1] 3.170335
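The same calculation also gives the t statistic for X that is blanked out in the printout. This Python sketch uses illustrative names (not from the quiz) and the rounded values shown in the printout, so the last digits may differ slightly from what SAS would report.

```python
import math

mse = 31.23376        # error mean square from the ANOVA table
c_xx = 0.3218         # diagonal element of (X'X)^-1 for X
b_x = 5.023960        # parameter estimate for X

se_x = math.sqrt(mse * c_xx)   # standard error of the estimate
t_x = b_x / se_x               # t statistic for H0: B2 = 0

print(round(se_x, 4), round(t_x, 3))
```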
Calculate R(B1 | B2, B3, B0):
R(B1, B2, B3 | B0) = 20705.45989
                   = R(B2, B3 | B0) + R(B1 | B2, B3, B0)
R(B1 | B2, B3, B0) = t^2 * MSE = 1.107^2 * 31.23376 = 38.27538
If W is dropped from the model, the error sum of squares for the reduced
model is
499.74011 + 38.27538 = 538.01549
with dfe = 16 + 1 = 17
> 1.107^2 * 31.23376
[1] 38.27538
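For a single coefficient, the extra sum of squares given all other terms equals t^2 * MSE, which is the same as b^2 / c, where c is the matching diagonal element of (X'X)^-1. The Python sketch below computes the reduced-model error sum of squares from the first form and uses the second as a cross-check. Variable names are illustrative, and the inputs are the rounded values from the printout, so the two routes agree only approximately.

```python
# Values for W from the PROC REG printout.
t_w = 1.107
b_w = 3.586294
c_ww = 0.3361                  # diagonal element of (X'X)^-1 for W
mse = 31.23376
sse_full, dfe_full = 499.74011, 16

# R(B1 | B2, B3, B0) computed two equivalent ways.
extra_ss_w = t_w ** 2 * mse
extra_ss_check = b_w ** 2 / c_ww

# Dropping W moves that sum of squares, and one df, into error.
sse_reduced = sse_full + extra_ss_w
dfe_reduced = dfe_full + 1

print(round(extra_ss_w, 5), round(extra_ss_check, 2))
print(round(sse_reduced, 4), dfe_reduced)
```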