Test1_f10.pdf

Test 1, St711, Fall 2010, Dickey
v1
Throughout, make the usual assumptions (independent, $N(0, \sigma^2)$) on the error terms, but
make no other arbitrary assumptions on model parameters.
1. I ran a completely randomized design experiment with three treatment groups. My
model, as usual, was $Y_{ij} = \mu + \tau_i + e_{ij}$. Here are the data and treatment means.

Treatment   Data           mean $\bar{Y}_{i\cdot}$   $\sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot})^2$   degrees of freedom
A           8 10           9                         ______                                             ______
B           18 23 22       21                        ______                                             ______
C           14 19 18 21    18                        ______                                             ______
(A) (18 pts.) Fill in the 6 blank cells in the table to show each treatment group’s
contribution to the error sum of squares and to the error degrees of freedom.
(B) (12 pts.) Compute the error mean square _______ and its degrees of freedom _____
for the analysis of variance table for the data.
(C) (8 pts.) We discussed the fact that there are infinitely many solutions for the
parameter estimates that give the least squares fit $\hat{\mu} + \hat{\tau}_i$ for each treatment i = 1, 2, or 3.
Give the particular solution known as the reference cell coding solution (the one SAS
uses):
$\hat{\mu}$ = _______, $\hat{\tau}_1$ = ______, $\hat{\tau}_2$ = ______, $\hat{\tau}_3$ = _______
(D) (18 pts.) We discussed estimable functions. First, mark any nonestimable functions
with “NE”. Next, for each of the other functions, give its estimate for the effects coding,
for the reference cell coding, and for the cell means coding:
Function             Reference cell estimate   Effects estimate   Cell means estimate
$\mu$                ______                    ______             ______
$\mu + \tau_2$       ______                    ______             ______
$\tau_1 - \tau_2$    ______                    ______             ______
(E) (15 pts.) My friend uses the overall mean (17) of all 9 data points as an estimate
of $\mu + \bar{\tau}_{\cdot}$. Compute the standard error _________ of this estimate. Compute the bias (0 or
otherwise) _________________ of the overall mean as an estimate of $\mu + \bar{\tau}_{\cdot}$.
Is $\mu + \bar{\tau}_{\cdot}$ an estimable function? (yes, no). If so, give the (best) estimate _______ and if
not, explain briefly.
2. (15 pts.) I am going to run a completely randomized design experiment with 5
replicates in each of 4 treatment groups (20 observations total). My model is, of course,
$Y_{ij} = \mu + \tau_i + e_{ij}$ with $e_{ij} \sim N(0, \sigma^2)$. I will perform a 5% level analysis of variance F test
for the null hypothesis of no treatment effects. Here are three scenarios:
Scenario A: $\tau_1 = 13$, $\tau_2 = 7$, $\tau_3 = 10$, $\tau_4 = 14$
Scenario B: $\tau_1 = 22$, $\tau_2 = 19$, $\tau_3 = 18$, $\tau_4 = 13$
Scenario C: $\tau_1 = 20$, $\tau_2 = 23$, $\tau_3 = 16$, $\tau_4 = 21$
Under which of these scenarios is my F test most likely ______ and under which is it
least likely ______ to reject the null hypothesis of no treatment effects?
The answer to this question depends on the computation of a well known parameter for
each of the three scenarios. Assuming $\sigma^2 = 10$, compute that parameter for each of the
scenarios: A _____ B______ and C ______
3. (14 pts.) I used 12 jars each containing 600 insects. I randomly assigned jars to four
pesticides (A,B,C,D), three jars per pesticide. After introducing the pesticides into the
jars, I recorded the numbers of insects killed in each jar as follows:
A    10    38   500   ______
B     5   122   307   ______
C    40   181   190   ______
D    29    30    52   ______
The Kruskal-Wallis test would be appropriate here. It starts by computing a certain sum
(denoted $\sum_{j=1}^{r_i} S_{ij}$ in the notes) for each group. Fill in the 4 blanks with those sums.
Using PROC NPAR1WAY, I let SAS do the rest of the computation for me, and I get
this output.
Kruskal-Wallis Test
Chi-Square         1.5641
DF                 _____
Pr > Chi-Square    0.6676
Fill in the missing degree of freedom number and label the picture of a Chi-Square
distribution on the right above to show what the other two numbers represent.
*****************answers **************************
SSq: 2, 14, 26 (sum is 42) df 1, 2, 3 (sum is 6) MSE = 42/6=7, 6 df
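The error sums of squares and degrees of freedom above can be checked with a few lines of Python. This is only a sketch using the data from question 1; the names `data`, `within_ss`, `ss`, `df`, `sse`, `dfe` and `mse` are illustrative and not part of the course notes.

```python
# Data from question 1, keyed by treatment label.
data = {"A": [8, 10], "B": [18, 23, 22], "C": [14, 19, 18, 21]}

def within_ss(ys):
    """Sum of squared deviations of one group's observations from its own mean."""
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

ss = {trt: within_ss(ys) for trt, ys in data.items()}   # each group's contribution to error SS
df = {trt: len(ys) - 1 for trt, ys in data.items()}     # each group's contribution to error df

sse = sum(ss.values())   # error sum of squares
dfe = sum(df.values())   # error degrees of freedom
mse = sse / dfe          # error mean square

print(ss, df, sse, dfe, mse)
```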
$\hat{\mu} = 18$, $\hat{\tau}_1 = -9$, $\hat{\tau}_2 = 3$, $\hat{\tau}_3 = 0$
Note that this is one of many solutions and recall that for this solution, the $\hat{\mu}$ in the
solution is actually an estimate of $\mu + \tau_3$.
$\mu$: NE
$\mu + \tau_2$: 21, 21, 21 (definition: estimable means the same number for ALL solutions)
$\tau_1 - \tau_2$: -12, -12, -12
The standard error of the ordinary mean of 9 numbers is the square root of MSE/9, that is,
$\sqrt{7/9} \approx 0.88$. The average of the 9 numbers has expected value
$(9\mu + 2\tau_1 + 3\tau_2 + 4\tau_3)/9$, which differs from the target $\mu + \bar{\tau}_{\cdot}$ by
$(9\mu + 2\tau_1 + 3\tau_2 + 4\tau_3)/9 - (9\mu + 3\tau_1 + 3\tau_2 + 3\tau_3)/9 = (-\tau_1 + \tau_3)/9$,
and this is the bias. (Note that the bias involves parameters, whereas an estimate of the
bias would be a number; they are not the same thing.)
If we average the three sample means, we will get the estimate of $\mu + \bar{\tau}_{\cdot}$ that we seek,
namely (9+21+18)/3 = 16. The following statement will verify this in SAS.
estimate "mn" intercept 3 trt 1 1 1/divisor=3;
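For readers without SAS at hand, the same numbers can be reproduced with a small Python sketch. It contrasts the overall mean, which weights each treatment by its sample size, with the average of the three treatment means, which weights them equally. The MSE of 7 is taken from the answer to part (B); variable names are illustrative.

```python
import math

data = {"A": [8, 10], "B": [18, 23, 22], "C": [14, 19, 18, 21]}
mse = 7.0   # error mean square from part (B)

# Overall mean of all 9 observations: weights treatments by n_i = 2, 3, 4.
all_obs = [y for ys in data.values() for y in ys]
overall_mean = sum(all_obs) / len(all_obs)
se_overall = math.sqrt(mse / len(all_obs))

# Average of the three treatment means: weights treatments equally.
trt_means = [sum(ys) / len(ys) for ys in data.values()]
unweighted_mean = sum(trt_means) / len(trt_means)

print(overall_mean, round(se_overall, 2), unweighted_mean)
```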
The rejection probabilities are functions of the noncentrality parameter (bigger implies
more likely to reject). With r=5 and the known variance 10 we compute 5/10 times each
sum of squares: 0.5(4+16+1+9)=15 for A, 21 for B which is the most likely, and 13 for C
which is the least likely.
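The arithmetic for the three noncentrality parameters can be sketched in Python as below, assuming the form r times the sum of squared deviations of the scenario values from their average, divided by the error variance. The function name `noncentrality` and the dictionary `scenarios` are illustrative.

```python
r = 5          # replicates per treatment
sigma2 = 10    # error variance, taken as known

scenarios = {
    "A": [13, 7, 10, 14],
    "B": [22, 19, 18, 13],
    "C": [20, 23, 16, 21],
}

def noncentrality(values, r, sigma2):
    """r * sum of squared deviations from the average, divided by sigma^2."""
    center = sum(values) / len(values)
    return r * sum((v - center) ** 2 for v in values) / sigma2

ncp = {name: noncentrality(vals, r, sigma2) for name, vals in scenarios.items()}
most_likely = max(ncp, key=ncp.get)    # largest noncentrality: most likely to reject
least_likely = min(ncp, key=ncp.get)   # smallest noncentrality: least likely to reject

print(ncp, most_likely, least_likely)
```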
Inserting ranks in parentheses, we have
A    10 (2)     38 (5)    500 (12)   ___(19)___
B     5 (1)    122 (8)    307 (11)   ___(20)___
C    40 (6)    181 (9)    190 (10)   ___(25)___
D    29 (3)     30 (4)     52 (7)    ___(14)___
The KW test has 4-1 = 3 df with 4 groups like this.
1.56 = location of vertical line on horizontal axis
0.6676 = area to right of vertical line
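The rank sums, the chi-square statistic and the tail area can be reproduced with the Python sketch below. It is not what PROC NPAR1WAY does internally; it uses the textbook Kruskal-Wallis formula without a tie correction (these data have no ties) and a closed-form upper tail for a chi-square with 3 df. The helper name `chi2_sf_3df` is hypothetical.

```python
import math

kills = {
    "A": [10, 38, 500],
    "B": [5, 122, 307],
    "C": [40, 181, 190],
    "D": [29, 30, 52],
}

# Rank all 12 observations together (1 = smallest); there are no ties here.
pooled = sorted(y for ys in kills.values() for y in ys)
rank = {y: i + 1 for i, y in enumerate(pooled)}

# Sum of ranks within each pesticide group.
rank_sums = {g: sum(rank[y] for y in ys) for g, ys in kills.items()}

n = len(pooled)
h = 12 / (n * (n + 1)) * sum(s ** 2 / len(kills[g]) for g, s in rank_sums.items()) - 3 * (n + 1)
df = len(kills) - 1

def chi2_sf_3df(x):
    """Area to the right of x under a chi-square density with 3 df (closed form)."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

p_value = chi2_sf_3df(h)
print(rank_sums, round(h, 4), df, round(p_value, 4))
```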