Exercises_Quiz2.pdf

St512 Exercises SSII10
1.
I have 40 small pine seedlings scattered in a large forest. Using various
instruments I measure S = sunlight reaching the seedling, R = rain reaching the
seedling, A = acidity of air near the seedling, and G = growth all over one
growing season. I regress G on all the other variables using PROC REG, obtaining
corrected regression sum of squares 800 and corrected total sum of squares 1000.
My estimated regression equation is
$\hat{G} = 10 + 0.2S + 0.5R - 0.3A$
Letting X and Y be the usual regression matrices, find, if possible:
(A) $(X'X)^{-1}(X'Y)$ = ________
(B) The error mean square MSE = ________
(C) The overall model F test _________
Choose the correct answer:
If the P-value for this F test is .0014 and we are doing a 5%
significance level test (as usual) this tells us:
i. We can leave out all three explanatory variables (S,R, and A)
ii. We cannot leave out any of the three explanatory variables
iii. We cannot leave out all three explanatory variables
iv. We can leave out the intercept or one of the explanatory variables.
(D) $b'X'Y - n\bar{Y}^2$ = ________
(E) Suppose also that the bottom right corner element of $(X'X)^{-1}$ is 0.0024.
i. Compute from this the standard error for the coefficient of air acidity
A in the model _______
ii. Compute, if possible, a t test that the coefficient of A is just an
estimate of 0.
t = ___________
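The quantities in parts (B), (C), and (E) follow from the two sums of squares and the sample size. The sketch below is one way to check that arithmetic. It assumes the columns of X are ordered (1, S, R, A), so that the bottom right element of $(X'X)^{-1}$ belongs to A. The variable names are illustrative and not part of the exercise.

```python
import math

n = 40            # seedlings
k = 3             # explanatory variables S, R, A
ss_reg = 800.0    # corrected regression sum of squares
ss_tot = 1000.0   # corrected total sum of squares

ss_err = ss_tot - ss_reg          # error sum of squares
df_err = n - k - 1                # error degrees of freedom
mse = ss_err / df_err             # (B) error mean square
f_model = (ss_reg / k) / mse      # (C) overall model F on (k, df_err) df

# (E) assumption: the last diagonal element of (X'X)^-1 corresponds to A
c_aa = 0.0024
b_a = -0.3                        # estimated coefficient of A
se_a = math.sqrt(c_aa * mse)      # standard error of the A coefficient
t_a = b_a / se_a                  # t statistic for H0: coefficient of A = 0

print(f"MSE = {mse:.4f}, F = {f_model:.2f}")
print(f"SE(A) = {se_a:.4f}, t = {t_a:.3f}")
```

Parts (A) and (D) need no arithmetic: $(X'X)^{-1}(X'Y)$ is the vector of estimated coefficients, and $b'X'Y - n\bar{Y}^2$ is the corrected regression sum of squares.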
2.
All questions refer to this:
I have regressed Y (yield) on
P = soil pH,
M = soil moisture,
T = soil temperature, and
MT, where MT = M*T is the product of moisture and temperature.
As usual, let Y be the column vector of my yields and X my X matrix containing a
column of 1s, plus columns for my explanatory variables.
(taken from DD ST512 notes)
Here is a partial PROC REG output:

Dependent Variable: Y

Analysis of Variance

                      Sum of          Mean
Source       DF      Squares        Square    F Value    Prob>F
Model     (___)   2214.67512     553.66878     34.051    0.0001
Error     (___)    747.95233    (________)
C Total      50   2962.62745

Parameter Estimates

                   Parameter       Standard      T for H0:
Variable    DF      Estimate          Error    Parameter=0    Prob > |T|
INTERCEP     1    -34.468222    93.43489693         -0.369        0.7139
P            1     -1.866855     2.16613910         -0.862        0.3932
M            1      1.854880     2.40241982          0.772        0.4440
T            1      1.611417     1.39034011          1.159        0.2524
MT           1      0.009275     0.03598928          0.258        0.7978

Variable    DF       Type I SS      Type II SS
INTERCEP     1         1249887        2.212767
P            1    (__________)       12.077159
M            1     1406.756208        9.692811
T            1      792.359286       21.841861
MT           1        1.079988    (__________)

a) How many rows _____ and columns _____ does my X matrix have?
b) Where possible, fill in the blanks in the above output.
c) Compute the element in row 3, column 3 of the inverse of X'X: _____
d) Suppose my moisture level is M=50 and pH is 5.5.
According to the model, if I increase T (temperature) by 1 unit
without changing moisture or pH, what happens to my predicted yield?
(increase, decrease) by _________
e) Would this answer be the same if M had been given as 60 (and pH=5.5)?
yes, no
f) Would this answer be the same if pH had been given as 6 (and M=50)?
yes, no
g) Your advisor says that temperature has no effect on yield and no term involving
temperature should be included in the regression.
She says, "look at the p-values 0.2524 and 0.7978 and you will see you can
leave out both terms with T."
Is she right?
yes, no
If she is right, explain exactly what the p-values are (i.e. what is 0.7978?)
Use a graph of the t distribution if you like.
If not, compute the proper test statistic _____ and give the degrees of
freedom that are needed to look up the critical value for the test df=______________
h) I see sequential sum of squares 1406.756208 and partial sum of squares 9.692811
for variable M. I could divide each of these by my MSE. For one of these two
this results in an F statistic whose p-value I know. Find this F______ and give
its p-value _______.
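The blanks and test statistics in this problem can be checked from the printed output alone. The sketch below is one way to do that, under two assumptions that are not stated in the exercise: the columns of X are ordered (1, P, M, T, MT), and the Type I (sequential) sums of squares were computed in that same order. The variable names are illustrative.

```python
# Values copied from the PROC REG output above
ss_model, ms_model = 2214.67512, 553.66878
ss_error, df_total = 747.95233, 50

df_model = round(ss_model / ms_model)   # model df = SS / MS
df_error = df_total - df_model          # error df
mse = ss_error / df_error               # error mean square
n_rows, n_cols = df_total + 1, df_model + 1   # (a) dimensions of X

# Type I SS add up to the model SS, so the blank for P is the remainder
type1 = {"M": 1406.756208, "T": 792.359286, "MT": 1.079988}
type1_p = ss_model - sum(type1.values())
# For the last variable entered, Type II SS equals its Type I SS
type2_mt = type1["MT"]

# (c) assumption: row 3, column 3 belongs to M, and SE(b_M)^2 = MSE * c_33
se_m = 2.40241982
c_33 = se_m ** 2 / mse

# (d) change in predicted yield per unit of T at moisture M: b_T + b_MT * M
b_t, b_mt = 1.611417, 0.009275
slope_t_at_50 = b_t + b_mt * 50

# (g) joint F for dropping T and MT, on (2, df_error) df, from sequential SS
f_t_mt = ((type1["T"] + type1["MT"]) / 2) / mse

# (h) partial SS for M over MSE; this equals the square of M's t statistic
f_m = 9.692811 / mse

print(n_rows, n_cols, df_model, df_error, round(mse, 4))
print(round(type1_p, 4), type2_mt, round(c_33, 4))
print(round(slope_t_at_50, 4), round(f_t_mt, 2), round(f_m, 3))
```

Because the slope in T depends on M but not on P, rerunning the last step with a different M changes the answer to part d), while changing pH does not.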
3.
I have 5 fish tanks with water at each of 4 pH values (4, 5, 6, and 7) in a
completely randomized design. In each of these 20 tanks I have 1 fish so
altogether there are 20 fish.
The weight gain of the fish in tank j at pH level i is Y(ij).
a) Write down the linear model.
b) Write down the X matrix.
c) I am interested in analyzing the following comparisons between the pH levels:
a. The largest vs. the smallest pH level
b. The two central pH levels
c. Find a third comparison that is orthogonal to the first two.
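With equal replication, two comparisons are orthogonal when the products of their coefficients sum to zero. The sketch below builds an X matrix for the one-way model with a column of 1s plus one indicator column per pH level, then checks a candidate set of contrasts. It assumes that "the two central" means pH 5 against pH 6. The third contrast shown is only one candidate, and all names are illustrative.

```python
levels = [4, 5, 6, 7]      # pH values
tanks_per_level = 5        # one fish per tank

# X for Y(ij) = mu + tau(i) + e(ij): a column of 1s plus one indicator per level
X = []
for i in range(len(levels)):
    for _ in range(tanks_per_level):
        X.append([1] + [1 if j == i else 0 for j in range(len(levels))])

# Contrast coefficients over the level means, in the order pH 4, 5, 6, 7
contrasts = {
    "largest vs smallest": [-1, 0, 0, 1],
    "two central levels": [0, -1, 1, 0],
    "outer vs central (candidate)": [1, -1, -1, 1],
}

def dot(u, v):
    """Sum of elementwise products of two coefficient lists."""
    return sum(a * b for a, b in zip(u, v))

names = list(contrasts)
# Each contrast's coefficients must sum to zero
sums = [sum(contrasts[name]) for name in names]
# Each pair must have a zero dot product to be orthogonal
pair_dots = [
    dot(contrasts[names[a]], contrasts[names[b]])
    for a in range(len(names))
    for b in range(a + 1, len(names))
]

print(len(X), "rows,", len(X[0]), "columns")
print("coefficient sums:", sums)
print("pairwise dot products:", pair_dots)
```

This X has five columns but rank four, since the indicator columns add up to the column of 1s.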
4.
Here is a PROC PRINT and a PROC REG where you see I have 5 treatments A through E
(from a completely randomized design) and some ORTHOGONAL columns C1 through C4.
My model is the usual:
Y(ij) = Mu + Tau(i) + e(ij)
SAS output:

5 treatments, completely randomized design

TRT     Y    C1    C2    C3    C4
A      ??     1    -1     0     2
A      ??     1    -1     0     2
A      ??     1    -1     0     2
B      ??     0     0    -1    -3
B      ??     0     0    -1    -3
B      ??     0     0    -1    -3
C      ??    -1    -1     0     2
C      ??    -1    -1     0     2
C      ??    -1    -1     0     2
D      ??     0     2     0     2
D      ??     0     2     0     2
D      ??     0     2     0     2
E      ??     0     0     1    -3
E      ??     0     0     1    -3
E      ??     0     0     1    -3

Model: MODEL1
Dependent Variable: Y

Analysis of Variance

                    Sum of          Mean
Source     DF      Squares        Square    F Value    Prob>F
Model       4    631.00403     157.75101     32.621    0.0001
Error      10     48.35803       4.83580
C Total    14    679.36207

Parameter Estimates

                  Parameter      Standard      T for H0:
Variable    DF     Estimate         Error    Parameter=0    Prob > |T|    TYPE I SS
INTERCEP     1    36.501133    0.56779123         64.286        0.0001     19984.98
C1           1     4.271500    0.89775677          4.758        0.0008     109.4742
C2           1    -0.350833    0.51832011         -0.677        0.5138     2.215507
C3           1    -7.514333    0.89775677         -8.370        0.0001      338.791
C4           1     1.416267    0.23179980          6.110        0.0001     180.5230

a) Compute, if possible, the F test for the null hypothesis that the 5 treatments all
have the same mean.
F = _________
b) Compute, if possible, the F test for the null hypothesis that treatments E and B
have the same mean in the population.
(H0: Tau(2) = Tau(5) )
F = _________
c) Let b2 and b3 denote the estimated regression coefficients for columns C2 and C3
respectively.
Compute the standard error of the difference b2 - b3 from the standard
errors of the two coefficients. Standard error of b2 - b3 is ____
d) Compute the partial SS for C2, C4.
e) I want to test the null hypothesis that the coefficients of C2 and C3 can
simultaneously be set to 0 in the above regression.
Give the F test statistic $F^{2}_{10}$ = _________ for this hypothesis.
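Because the columns C1 through C4 are orthogonal, the estimated coefficients are uncorrelated and each partial sum of squares equals the corresponding Type I sum of squares. The sketch below uses those two facts to check parts b) through e). It reads column C3 in the PROC PRINT listing (coded -1 for B and +1 for E) as the E versus B comparison. The variable names are illustrative.

```python
import math

mse = 4.83580   # error mean square from the Analysis of Variance table
se = {"C2": 0.51832011, "C3": 0.89775677}
type1 = {"C1": 109.4742, "C2": 2.215507, "C3": 338.791, "C4": 180.5230}

# (b) C3 is the E vs B contrast, so its 1-df SS over MSE tests Tau(2) = Tau(5)
f_e_vs_b = type1["C3"] / mse

# (c) orthogonal columns give zero covariance between b2 and b3
se_diff = math.sqrt(se["C2"] ** 2 + se["C3"] ** 2)

# (d) with orthogonal columns, partial SS equals Type I SS
partial = {name: type1[name] for name in ("C2", "C4")}

# (e) joint F for C2 and C3 on (2, 10) df
f_c2_c3 = ((type1["C2"] + type1["C3"]) / 2) / mse

print(round(f_e_vs_b, 2), round(se_diff, 4))
print(partial, round(f_c2_c3, 2))
```

The F statistic for part a) is the overall model F already printed in the Analysis of Variance table, and the F in part b) agrees with the square of the t statistic for C3 up to rounding.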