AMS572.01 Midterm Exam Fall, 2010

Name ___________________________ID ________________Signature____________________ AMS Major? ______
Instructions: This is a closed-book exam. Anyone who cheats on the exam shall receive a grade of F. Please enter “Yes” or
“No” for “AMS Major”. Please provide complete solutions for full credit. The exam runs 12:50-2:10pm. Good luck!
1. (for all students) In order to test the accuracy of speedometers purchased from a subcontractor, the purchasing
department of an automaker orders a test of a sample of speedometers at a controlled speed of 55 mph. At this speed, it is
estimated that the variance of the readings is 1.
(a) Set up the hypotheses to detect if the speedometers have any bias.
(b) How many speedometers need to be tested to have a 95% power to detect a bias of 0.5 mph or greater using a 0.01
level test?
(c) A sample of the size determined in (b) has a mean of 55.2 and standard deviation of 0.8. Can you conclude that the
speedometers have a bias?
(d) Calculate the power of the test if 50 speedometers are tested and the actual bias is 0.5 mph. Assume a population
standard deviation of 0.8.
SOLUTION: This is basically #7.10 in your homework #3.
(a)
The appropriate hypotheses are:
$H_0: \mu = 55$ vs. $H_a: \mu \neq 55$
(b)
It is estimated that $\sigma = 1$. To ensure 95% power for detecting a bias of 0.5 mph or greater, use $\beta = 1 - \text{Power} = 0.05$.
Then, from equation (7.11),
\[ n = \left[ \frac{(z_{\alpha/2} + z_{\beta})\,\sigma}{\delta} \right]^2 = \left[ \frac{(2.576 + 1.645) \times 1}{0.5} \right]^2 = 71.27 \]
Therefore, 72 speedometers should be tested.
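As a quick numerical check of the sample-size calculation above, the following Python sketch recomputes n from the normal quantiles. It uses only the standard library; the variable names are illustrative and are not part of the original solution. The unrounded quantiles give a value that differs slightly in the second decimal from the hand calculation, which uses table values rounded to three decimals.

```python
import math
from statistics import NormalDist

alpha = 0.01   # significance level of the two-sided test
power = 0.95   # required power, so beta = 1 - power = 0.05
sigma = 1.0    # population standard deviation (the variance is estimated to be 1)
delta = 0.5    # bias to be detected, in mph

std_normal = NormalDist()                      # standard normal distribution
z_alpha2 = std_normal.inv_cdf(1 - alpha / 2)   # upper alpha/2 critical point
z_beta = std_normal.inv_cdf(power)             # upper beta critical point

n_exact = ((z_alpha2 + z_beta) * sigma / delta) ** 2
n = math.ceil(n_exact)                         # round up to a whole number of units

print(f"n_exact = {n_exact:.2f}, n = {n}")
```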
(c)
The test statistic is
\[ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{55.2 - 55.0}{0.8/\sqrt{72}} = 2.121 \]
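Since $|t| = 2.121 < t_{n-1,\alpha/2} = t_{71,0.005} \approx 2.576$, our conclusion is to not reject $H_0$ at level $\alpha = 0.01$. There is not sufficient
evidence that the speedometers have a bias. (The value 2.576 is $z_{0.005}$, the large-sample approximation; the exact $t_{71,0.005}$ is about 2.65, which leads to the same conclusion.)
Note: Since the sample size is large and the problem does not explicitly state that the population is normal,
it is also acceptable to use the approximate Z-test here.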
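The test statistic in (c) can be reproduced with a few lines of Python. This is only a sketch: it uses the large-sample normal critical value from the standard library (the exact t quantile is not available there), and the variable names are illustrative.

```python
import math
from statistics import NormalDist

x_bar = 55.2   # sample mean
mu0 = 55.0     # hypothesized mean under H0
s = 0.8        # sample standard deviation
n = 72         # sample size from part (b)
alpha = 0.01   # significance level

t_stat = (x_bar - mu0) / (s / math.sqrt(n))

# Large-sample critical value z_{alpha/2}; an approximation to t_{71, 0.005}
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject_h0 = abs(t_stat) > z_crit
print(f"t = {t_stat:.3f}, critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```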
(d)
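Since the population standard deviation is given ($\sigma = 0.8$), the suitable test here is the Z-test. If the bias is 0.5 mph, the power of
this 2-sided test is
\[ \text{Power} = \Phi\left(-z_{\alpha/2} + \frac{(\mu_0 - 55.5)\sqrt{n}}{\sigma}\right) + \Phi\left(-z_{\alpha/2} + \frac{(55.5 - \mu_0)\sqrt{n}}{\sigma}\right) \]
\[ = \Phi\left(-2.576 + \frac{(55.0 - 55.5)\sqrt{50}}{0.8}\right) + \Phi\left(-2.576 + \frac{(55.5 - 55.0)\sqrt{50}}{0.8}\right) \]
\[ = \Phi(-6.995) + \Phi(1.843) \approx 0 + 0.967 = 0.967 \]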
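The power in (d) can be checked numerically. The sketch below evaluates the two-sided Z-test power at a true mean of 55.5 using the standard normal CDF from the Python standard library; the variable names are illustrative.

```python
import math
from statistics import NormalDist

alpha = 0.01   # significance level
sigma = 0.8    # assumed population standard deviation
n = 50         # number of speedometers tested
mu0 = 55.0     # mean under H0
mu1 = 55.5     # true mean when the bias is 0.5 mph

std_normal = NormalDist()
z_alpha2 = std_normal.inv_cdf(1 - alpha / 2)

# Standardized distance between the true mean and the null mean
shift = (mu1 - mu0) * math.sqrt(n) / sigma

# Probability of rejecting in the lower tail plus the upper tail
power = std_normal.cdf(-z_alpha2 - shift) + std_normal.cdf(-z_alpha2 + shift)
print(f"power = {power:.3f}")
```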
2A. (for AMS majors) Suppose we have two independent random samples from two normal populations, i.e.,
$X_1, X_2, \ldots, X_{n_1} \sim N(\mu_1, \sigma_1^2)$ and $Y_1, Y_2, \ldots, Y_{n_2} \sim N(\mu_2, \sigma_2^2)$.
(a). At the significance level α, please construct a test of the hypothesis $H_0: a\sigma_1^2 = b\sigma_2^2$ vs. $H_1: a\sigma_1^2 \neq b\sigma_2^2$. Here a, b
are known constants.
(b). Suppose we have confirmed that $a\sigma_1^2 = b\sigma_2^2$. At the significance level α, please construct a test of whether
$c\mu_1 = d + e\mu_2$ or not using the pivotal quantity method. Here c, d, e are known constants. Please include the derivation
of the pivotal quantity, the proof of its distribution, and the derivation of the rejection region for full credit.
SOLUTION:
This is inference on two normal population means, independent samples.
(a) This is the usual F-test on two normal population variances: $H_0: \sigma_1^2/\sigma_2^2 = b/a$ versus $H_a: \sigma_1^2/\sigma_2^2 \neq b/a$.
The test statistic is:
\[ F_0 = \frac{S_1^2/S_2^2}{\sigma_{1,0}^2/\sigma_{2,0}^2} = \frac{S_1^2/S_2^2}{b/a} \overset{H_0}{\sim} F_{n_1-1,\,n_2-1} \]
At the significance level α, we will reject $H_0$ if $F_0$ is smaller than $F_{n_1-1,n_2-1,\alpha/2,L}$ or $F_0$ is greater than $F_{n_1-1,n_2-1,\alpha/2,U}$.
(b) Given that $a\sigma_1^2 = b\sigma_2^2$, we set $\sigma_2^2 = \sigma^2$ and thus $\sigma_1^2 = \frac{b}{a}\sigma^2$. Here is a simple outline of the derivation of the test:
$H_0: c\mu_1 = d + e\mu_2$ versus $H_a: c\mu_1 \neq d + e\mu_2$, which are equivalent to: $H_0: c\mu_1 - e\mu_2 = d$ versus
$H_a: c\mu_1 - e\mu_2 \neq d$.
(i) We start with the point estimator for the parameter of interest $c\mu_1 - e\mu_2$: $c\bar{X} - e\bar{Y}$. Its distribution is
$N\left(c\mu_1 - e\mu_2,\ \sigma^2\left[\frac{c^2 b}{a n_1} + \frac{e^2}{n_2}\right]\right)$, using the mgf for $N(\mu, \sigma^2)$, which is $M(t) = \exp\left(\mu t + \sigma^2 t^2/2\right)$,
and the independence properties of the random samples. From this we have
\[ Z = \frac{(c\bar{X} - e\bar{Y}) - (c\mu_1 - e\mu_2)}{\sigma\sqrt{\frac{c^2 b}{a n_1} + \frac{e^2}{n_2}}} \sim N(0,1). \]
Unfortunately, Z cannot serve as the pivotal quantity because σ is unknown.
(ii) We next look for a way to get rid of the unknown σ, following an approach similar to the construction of the pooled-variance
t-statistic. We find that
\[ W = \frac{\frac{a}{b}(n_1-1)S_1^2 + (n_2-1)S_2^2}{\sigma^2} \sim \chi^2_{n_1+n_2-2}, \]
using the mgf for $\chi^2_k$, which is $M(t) = \left(\frac{1}{1-2t}\right)^{k/2}$, and the independence properties of the random samples.
(iii) Then we find, from the theorem on sampling from the normal population and the independence properties of the
random samples, that Z and W are independent. Therefore, by the definition of the t-distribution, we obtain
our pivotal quantity:
\[ T = \frac{Z}{\sqrt{W/(n_1+n_2-2)}} = \frac{(c\bar{X} - e\bar{Y}) - (c\mu_1 - e\mu_2)}{\sqrt{\frac{\frac{a}{b}(n_1-1)S_1^2 + (n_2-1)S_2^2}{n_1+n_2-2}}\ \sqrt{\frac{c^2 b}{a n_1} + \frac{e^2}{n_2}}} \sim t_{n_1+n_2-2}. \]
(iv) The rejection region is derived from $P(|T_0| \geq k \mid H_0) = \alpha$, where k denotes the critical value (not the constant c) and
\[ T_0 = \frac{(c\bar{X} - e\bar{Y}) - d}{\sqrt{\frac{\frac{a}{b}(n_1-1)S_1^2 + (n_2-1)S_2^2}{n_1+n_2-2}}\ \sqrt{\frac{c^2 b}{a n_1} + \frac{e^2}{n_2}}} \overset{H_0}{\sim} t_{n_1+n_2-2}. \]
Thus $k = t_{n_1+n_2-2,\alpha/2}$. Therefore, at the
significance level of α, we reject $H_0$ in favor of $H_a$ iff $|T_0| \geq t_{n_1+n_2-2,\alpha/2}$.
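To make the two statistics in Problem 2A concrete, here is a minimal Python sketch that computes $F_0$ and $T_0$ from raw samples. The function name `scaled_two_sample_stats` and its argument order are hypothetical, chosen only for illustration. As a sanity check it is applied to the weight-gain data of Problem 2B with a = b = c = e = 1 and d = 0, in which case $T_0$ reduces to the ordinary pooled-variance t-statistic.

```python
import math
from statistics import mean, variance

def scaled_two_sample_stats(x, y, a, b, c, d, e):
    """Return (F0, T0, df) for H0: a*var1 = b*var2 and H0: c*mu1 = d + e*mu2.

    Hypothetical helper for illustration; assumes a, b > 0 and normal samples.
    """
    n1, n2 = len(x), len(y)
    s1_sq, s2_sq = variance(x), variance(y)   # sample variances (n - 1 divisor)

    # F statistic: ratio of sample variances divided by the hypothesized ratio b/a
    f0 = (s1_sq / s2_sq) / (b / a)

    # Pooled estimator of sigma^2 = sigma_2^2, rescaling S1^2 by a/b
    df = n1 + n2 - 2
    pooled_var = ((a / b) * (n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df

    # Variance multiplier of c*Xbar - e*Ybar, in units of sigma^2
    scale = c**2 * b / (a * n1) + e**2 / n2

    t0 = (c * mean(x) - e * mean(y) - d) / math.sqrt(pooled_var * scale)
    return f0, t0, df

# Weight gains from Problem 2B (formula A and formula B)
formula_a = [5, 7, 8, 9, 6, 7, 10, 8, 6]
formula_b = [9, 10, 8, 6, 8, 7, 11, 10, 9]

f0, t0, df = scaled_two_sample_stats(formula_a, formula_b, a=1, b=1, c=1, d=0, e=1)
print(f"F0 = {f0:.3f}, T0 = {t0:.3f}, df = {df}")
```

The statistics are then compared with the F and t critical values from tables, exactly as in the derivation above.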
2B. (for non-AMS majors) A group of babies all of whom weighed approximately the same at birth are randomly
divided into two groups. The babies in sample 1 were fed formula A; those in sample 2 were fed formula B. The weight
gains attained from birth to age six months were recorded for each baby. The results were as follows:
Sample 1:   5   7   8   9   6   7  10   8   6
Sample 2:   9  10   8   6   8   7  11  10   9
(a). Please construct a 95% confidence interval for the mean differences in weight gains between the two formulas.
(b). Use suitable tests to investigate the differences between the weight gains of the two groups (Use α =.05. Please state
the assumption(s) of the tests.)
(c). Please write up the entire SAS program necessary to answer questions raised in (b). Please include the data step as
well as tests for testing for various assumptions.
SOLUTION: Inference on two population means. Two small and independent samples.
Formula A (sample 1): $\bar{X}_1 = 7.33$, $s_1 = 1.58$ ($s_1^2 = 2.50$), $n_1 = 9$
Formula B (sample 2): $\bar{X}_2 = 8.67$, $s_2 = 1.58$ ($s_2^2 = 2.50$), $n_2 = 9$
Under the normality assumption, we first test if the two population variances are equal. That is, $H_0: \sigma_1^2 = \sigma_2^2$ versus
$H_a: \sigma_1^2 \neq \sigma_2^2$. The test statistic is
\[ F_0 = \frac{s_1^2}{s_2^2} = \frac{1.58^2}{1.58^2} = 1, \qquad F_{8,8,0.05,U} = 3.44. \]
Since $F_0 < 3.44$, we cannot reject $H_0$. Therefore it is reasonable to assume that $\sigma_1^2 = \sigma_2^2$.
(a) The 95% C.I. for the mean difference is
\[ \bar{X}_1 - \bar{X}_2 \pm t_{16,0.025}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = (7.33 - 8.67) \pm 2.12 \times 1.58 \sqrt{2/9}, \]
where
\[ s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}} = 1.58. \]
Therefore the 95% C.I. is [-2.92, 0.24].
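The interval can be recomputed from the raw data with the short Python sketch below. The critical value 2.12 is the table value $t_{16,0.025}$ quoted in the solution (the standard library has no t quantile function), and the variable names are illustrative. Because the script carries unrounded means and standard deviations, its endpoints can differ from the hand calculation in the second decimal place.

```python
import math
from statistics import mean, variance

formula_a = [5, 7, 8, 9, 6, 7, 10, 8, 6]     # sample 1
formula_b = [9, 10, 8, 6, 8, 7, 11, 10, 9]   # sample 2
n1, n2 = len(formula_a), len(formula_b)

# Pooled standard deviation from the two sample variances
pooled_var = ((n1 - 1) * variance(formula_a) + (n2 - 1) * variance(formula_b)) / (n1 + n2 - 2)
s_p = math.sqrt(pooled_var)

t_crit = 2.12   # t_{16, 0.025}, table value taken from the solution
diff = mean(formula_a) - mean(formula_b)
half_width = t_crit * s_p * math.sqrt(1 / n1 + 1 / n2)

lower, upper = diff - half_width, diff + half_width
print(f"s_p = {s_p:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```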
(b)
Next we perform the pooled-variance t-test with hypotheses $H_0: \mu_1 - \mu_2 = 0$ versus $H_a: \mu_1 - \mu_2 \neq 0$.
\[ t_0 = \frac{\bar{X}_1 - \bar{X}_2 - 0}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{(7.33 - 8.67) - 0}{1.58\sqrt{\frac{1}{9} + \frac{1}{9}}} = -1.80 \]
Since $|t_0| = 1.80$ is smaller than $t_{16,0.025} = 2.12$, we cannot reject $H_0$. We have insufficient evidence to reject the
hypothesis that there is no difference in the mean weight gain between the two formulas.
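For readers who want to verify the pooled-variance t-statistic without SAS, a minimal Python sketch follows. It again uses the table value $t_{16,0.025} = 2.12$ from the solution as the critical value, and the variable names are illustrative. With unrounded inputs the statistic is about -1.79, which rounds to the same decision as the hand-computed -1.80.

```python
import math
from statistics import mean, variance

formula_a = [5, 7, 8, 9, 6, 7, 10, 8, 6]     # sample 1
formula_b = [9, 10, 8, 6, 8, 7, 11, 10, 9]   # sample 2
n1, n2 = len(formula_a), len(formula_b)

# Pooled variance with n1 + n2 - 2 degrees of freedom
pooled_var = ((n1 - 1) * variance(formula_a) + (n2 - 1) * variance(formula_b)) / (n1 + n2 - 2)
std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

t0 = (mean(formula_a) - mean(formula_b) - 0) / std_error

t_crit = 2.12                  # t_{16, 0.025}, table value taken from the solution
reject_h0 = abs(t0) > t_crit   # two-sided decision rule
print(f"t0 = {t0:.2f}, reject H0: {reject_h0}")
```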
(b) Assumptions: (1) Both populations are normally distributed.
(2) $\sigma_1^2 = \sigma_2^2$
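(c) /*Problem #2B*/
data babies;
input formula wt_gain;  /* formula: 1 = A, 2 = B; wt_gain: weight gain */
datalines;
1 5
1 7
1 8
1 9
1 6
1 7
1 10
1 8
1 6
2 9
2 10
2 8
2 6
2 8
2 7
2 11
2 10
2 9
;
run;
/* Check normality within each formula group */
proc univariate data=babies normal;
class formula;
var wt_gain;
title 'Check for normality';
run;
/* Pooled and Satterthwaite t-tests; the output also includes the folded F test for equal variances */
proc ttest data=babies;
class formula;
var wt_gain;
title 'Independent samples t-test';
run;
/* Wilcoxon rank-sum test, for use if normality is doubtful */
proc npar1way data=babies wilcoxon;
class formula;
var wt_gain;
title 'Nonparametric test for two-mean comparisons';
run;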
3. (for all students) Arctic and Alpine Research investigated the relationship between the mean daily air temperature and the
cocoon temperature of woolly bear caterpillars of the High Arctic.
(a) According to the data, can you conclude, at the significance level of 0.05, that the caterpillar’s body temperature
is higher than the outside air temperature?
(b) What assumptions are necessary for the above test?
(c) Please write the entire SAS code to check the assumptions necessary in (b) and to perform the test asked for in (a).
        Temperature (ºC)
Day     Air     Cocoon
1       10      15
2       9       14
3       2       7
4       3       6
5       5       10
Solution: By taking the paired differences (Diff) between the cocoon and the air temperatures for each day sampled, this
problem reduces to a one-sample t-test on Diff.
        Temperature (ºC)
Day     Air     Cocoon  Diff
1       10      15      5
2       9       14      5
3       2       7       5
4       3       6       3
5       5       10      5
(a). Sample statistics: n = 5, $\bar{x} = 4.6$, s = 0.894 (about 0.9). Hypotheses: $H_0: \mu = 0$ versus $H_a: \mu > 0$.
Test statistic:
\[ t_0 = \frac{\bar{x} - 0}{s/\sqrt{n}} = \frac{4.6 - 0}{0.894/\sqrt{5}} = 11.5 \]
Since $t_0 = 11.5 > t_{4,0.05} = 2.13$, we reject $H_0$ in favor of $H_a$ at the 0.05 significance level. That is, we conclude, at the
significance level of 0.05, that the caterpillar’s body temperature is higher than the outside air temperature.
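The paired calculation can be checked with the following Python sketch, which forms the differences and computes the one-sample t-statistic. The critical value 2.13 is the table value $t_{4,0.05}$ quoted in the solution, and the variable names are illustrative.

```python
import math
from statistics import mean, stdev

air = [10, 9, 2, 3, 5]        # mean daily air temperature, days 1 to 5
cocoon = [15, 14, 7, 6, 10]   # cocoon temperature, days 1 to 5

# Paired differences: cocoon minus air for each day
diff = [c - a for a, c in zip(air, cocoon)]
n = len(diff)

d_bar = mean(diff)
s_d = stdev(diff)                      # sample standard deviation (n - 1 divisor)
t0 = (d_bar - 0) / (s_d / math.sqrt(n))

t_crit = 2.13                          # t_{4, 0.05}, table value taken from the solution
reject_h0 = t0 > t_crit                # one-sided (upper-tail) decision rule
print(f"mean = {d_bar:.1f}, s = {s_d:.3f}, t0 = {t0:.1f}, reject H0: {reject_h0}")
```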
(b). The assumption is that the population distribution of “Diff” is normal.
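(c) The SAS code is as follows:
data temp;
input day air cocoon @@;  /* @@ allows several observations per data line */
diff=cocoon-air;          /* paired difference, cocoon minus air */
datalines;
1 10 15 2 9 14 3 2 7 4 3 6 5 5 10
;
run;
/* The NORMAL option tests the normality of diff. The "Tests for Location" table
   reports the Student's t test of H0: mean = 0; its p-value is two-sided, so
   halve it for the one-sided alternative in (a). */
proc univariate data=temp normal;
var diff;
run;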
Oh well, you have done your best for your midterm. Now cheer up please. You will have
plenty of chances to improve your scores – the quizzes, the team project, and the final
exam.
Enjoy the following fun animations:
1. Biostatistics vs. Lab Research
http://www.youtube.com/watch?v=PbODigCZqL8
2. Biostat Tutorial: Communicating Results
http://www.youtube.com/watch?v=s5tV727P0Gc&feature=related
And I wish you all a very happy and safe Halloween!