June Exam 2013-MEMORANDUM

EXAMINATIONS – JUNE 2013 (MEMORANDUM)
FACULTY OF SCIENCE AND AGRICULTURE
DEPARTMENT OF MATHEMATICAL SCIENCES
SSTT 321- Experimental Designs
DURATION: 3 HOURS
MARKS: 100
SUBMINIMUM: 40%
Internal Examiner(s): J. M. Batidzirai
Moderator: Mr. Chifurira
External Examiner: Mr. Chiruka (UFH- Alice Campus)
INSTRUCTIONS TO CANDIDATES
1. Please ascertain that this question paper has five (5) pages.
2. The paper consists of 6 Questions, attempt them all.
3. All questions should be answered on the provided answer sheet [100 Marks].
4. Calculators may be used and only round final answers.
5. Inform the invigilator during the examination of any problem.
6. Start each new question on a new page.
7. Number your questions correctly.
8. Statistical tables are provided on the last page.
QUESTION 1 (10 MARKS)
a. Define the following terms as they are used in experimental design.
i. Experimental units: the objects on which the response and factors are observed or measured. [1 mark]
ii. Treatments: the factor-level combinations used in an experiment. [1 mark]
iii. Response variable: the variable of interest to be measured in the experiment. We also refer to the response as the dependent variable. [1 mark]
iv. Completely Randomized Design: a design in which treatments are randomly assigned to the experimental units, OR a design in which independent random samples of experimental units are selected for each treatment. [1 mark]
v. Factors: those variables whose effect on the response is of interest to the experimenter, also known as independent variables. Quantitative factors are measured on a numerical scale, whereas qualitative factors are not (naturally) measured on a numerical scale. [1 mark]
vi. ANOVA: a statistical procedure that is used to test the null hypothesis that the means of three or more populations are equal. It is the cornerstone of an experimental design. In the t-test we tested the null hypothesis that the means of only two populations are equal, but in ANOVA we test the means of three or more populations. [1 mark]
b. Assumptions of ANOVA
1. The populations from which the samples are drawn are (approximately) normal. [1 mark]
2. The populations from which the samples are drawn have the same variance (or standard deviation), i.e. σ1 = σ2 = ... = σk. [1 mark]
3. The p treatments are randomly assigned, one treatment to each experimental unit within each block. [1 mark]
c. In experimental design, what does it mean if a design is called a 2 X 3 X 2 design?
It is a design with 3 factors (or independent variables): one with two levels, one with three levels and one with two levels. [1 mark]
******************************************************************************
QUESTION 2 [10 Marks]
a. State any three advantages of a completely randomized design. [3 marks]
i. It is easy to construct.
ii. It is easy to analyze, even if sample sizes differ between treatments.
iii. The design may be used for any number of treatments.
b. Prove that, for a Completely Randomized Design, the total sum of squares decomposes as
$$SS_{Tot} = SS_{Treat} + SS_{Error}.$$
Let $\bar{T}_i$ be the mean of treatment $i$, $\bar{y}$ the overall mean, $p$ the number of treatments and $n_i$ the number of observations on treatment $i$. Then
$$
\begin{aligned}
SS_{Tot} &= \sum_{i=1}^{p}\sum_{j=1}^{n_i}\left(y_{ij}-\bar{y}\right)^2 \\
&= \sum_{i=1}^{p}\sum_{j=1}^{n_i}\left(y_{ij}-\bar{T}_i+\bar{T}_i-\bar{y}\right)^2 \\
&= \sum_{i=1}^{p}\sum_{j=1}^{n_i}\left(y_{ij}-\bar{T}_i\right)^2
 + 2\sum_{i=1}^{p}\sum_{j=1}^{n_i}\left(y_{ij}-\bar{T}_i\right)\left(\bar{T}_i-\bar{y}\right)
 + \sum_{i=1}^{p}\sum_{j=1}^{n_i}\left(\bar{T}_i-\bar{y}\right)^2 \\
&= SS_{Error}
 + 2\sum_{i=1}^{p}\left(\bar{T}_i-\bar{y}\right)\sum_{j=1}^{n_i}\left(y_{ij}-\bar{T}_i\right)
 + \sum_{i=1}^{p} n_i\left(\bar{T}_i-\bar{y}\right)^2 \\
&= SS_{Error} + 0 + SS_{Treat}.
\end{aligned}
$$
The cross-product term vanishes because, summing over $j$ within each treatment,
$$\sum_{j=1}^{n_i}\left(y_{ij}-\bar{T}_i\right) = \sum_{j=1}^{n_i} y_{ij} - n_i\bar{T}_i = 0,$$
and therefore
$$2\sum_{i=1}^{p}\sum_{j=1}^{n_i}\left(y_{ij}-\bar{T}_i\right)\left(\bar{T}_i-\bar{y}\right) = 0.$$
Hence $SS_{Tot} = SS_{Error} + SS_{Treat}$.
[7 marks]
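The identity can also be checked numerically. The sketch below is only an illustration: the data are hypothetical (not from the paper) and deliberately unbalanced, to show that the cross-product term is zero even when the group sizes differ.

```python
# Hypothetical, unbalanced data: three treatments with 3, 4 and 2 observations.
groups = [[12.0, 15.0, 11.0], [20.0, 18.0, 22.0, 19.0], [9.0, 14.0]]

all_obs = [y for g in groups for y in g]
grand_mean = sum(all_obs) / len(all_obs)
group_means = [sum(g) / len(g) for g in groups]

# Total, error (within-treatment) and treatment (between-treatment) sums of squares.
ss_tot = sum((y - grand_mean) ** 2 for y in all_obs)
ss_error = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)
ss_treat = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))

# Cross-product term from the expansion; it should be zero up to rounding.
cross = sum((y - m) * (m - grand_mean) for g, m in zip(groups, group_means) for y in g)

print(ss_tot, ss_treat + ss_error, cross)
```

The two printed sums of squares agree and the cross term is zero up to floating-point rounding.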
***************************************************************************
QUESTION 3 [25 Marks]
Four different diets were compared to test which one has an effect on the coagulation rates of patients. The diets were labeled A, B, C and D. We are interested in how the diets affect the coagulation rates of patients. The coagulation rate is the time in seconds that it takes for a cut to stop bleeding. 16 patients were available for the experiment, so we will use 4 on each diet. Randomization was done to assign the patients to the four treatment groups. The measured coagulation times for each diet are given below:

DIET      A       B       C       D
          62      63      68      56
          60      67      66      62
          63      71      71      60
          59      64      67      61
Mean      61      66.25   68      59.75

a. Test at the 5% significance level whether there is a statistically significant difference in the coagulation times. [11 marks]
b. Calculate Fisher's LSD at the 0.05 level of significance. [2 marks]
c. Use Fisher's LSD in b) above to perform pairwise comparison tests to determine which diets appear to be different. [7 marks]
d. Construct a 95% confidence interval for the mean difference between diet A and diet B. [3 marks]
e. Do the conclusions from your answer in d) above suggest the same conclusions as in a)? [2 marks]
SOLUTION
i.
STEP 1 (State the hypotheses)
Write down the null hypothesis and the alternative hypothesis used to test for a statistically significant difference in coagulation time.
$H_0: \mu_A = \mu_B = \mu_C = \mu_D$ (means are equal for all the diets, i.e. there is no treatment effect) [1 mark]
$H_1$: not all of $\mu_A, \mu_B, \mu_C, \mu_D$ are equal (at least one of the means differs from the others, i.e. there is an effect due to the treatment) [1 mark]
STEP 2 (Significance level and distribution)
Test at the 0.05 level of significance.
Since we are testing the means of more than two samples, we use an F-distribution. [1 mark]
STEP 3 (Rejection region)
$F_{crit} = F_{0.05}(3; 12) = 3.49$
Therefore, we reject $H_0$ if $F_{cal} > F_{crit} = 3.49$. [1 mark]
STEP 4 (Test Statistic)
Construct the ANOVA table for the data above, where $T_i$ is the total of the observations on diet $i$.
$$CM = \frac{\left(\sum_{i=1}^{p}\sum_{j=1}^{n_i} y_{ij}\right)^2}{n} = \frac{(62+60+63+\dots+61)^2}{16} = \frac{1020^2}{16} = 65025$$
$$TSS = \sum_{i=1}^{p}\sum_{j=1}^{n_i} y_{ij}^2 - CM = \left(62^2+60^2+63^2+\dots+61^2\right) - 65025 = 65300 - 65025 = 275$$
$$SST = \sum_{i=1}^{p}\frac{T_i^2}{n_i} - CM = \left(\frac{244^2}{4}+\frac{265^2}{4}+\frac{272^2}{4}+\frac{239^2}{4}\right) - CM = 65216.5 - 65025 = 191.5$$
$$SSE = TSS - SST = 275 - 191.5 = 83.5$$
[3 marks]
The ANOVA Table

SOURCE      df    SS       MS        F
Treatment   3     191.5    63.833    9.1737
Error       12    83.5     6.95833
Total       15    275
[3 marks]
STEP 5 (Conclusion)
Since $F_{cal} = 9.1737$ (from the ANOVA) $> F_{crit} = 3.49$, we reject $H_0$ and conclude that at least one of the mean coagulation times is different among the four diets. Hence there is an effect due to diet. [1 mark]
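The hand calculation for the diet data can be reproduced with a short script. This is a minimal sketch using the correction-for-the-mean formulas from STEP 4; the function name `one_way_anova` and the dictionary layout are illustrative choices, not part of the memorandum.

```python
# Coagulation times (seconds) for the four diets, as given in the question.
diets = {
    "A": [62, 60, 63, 59],
    "B": [63, 67, 71, 64],
    "C": [68, 66, 71, 67],
    "D": [56, 62, 60, 61],
}

def one_way_anova(groups):
    """One-way (CRD) ANOVA quantities via the correction for the mean (CM)."""
    obs = [y for g in groups.values() for y in g]
    n, p = len(obs), len(groups)
    cm = sum(obs) ** 2 / n                                        # correction for the mean
    tss = sum(y * y for y in obs) - cm                            # total sum of squares
    sst = sum(sum(g) ** 2 / len(g) for g in groups.values()) - cm # treatment SS
    sse = tss - sst                                               # error SS
    mst, mse = sst / (p - 1), sse / (n - p)
    return {"CM": cm, "TSS": tss, "SST": sst, "SSE": sse,
            "MST": mst, "MSE": mse, "F": mst / mse}

result = one_way_anova(diets)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

The printed F statistic should be compared with the tabulated critical value 3.49 quoted in STEP 3.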
ii.
Calculate Fisher's LSD at the 0.05 level of significance.
$$LSD = t_{\alpha/2}(n-p)\sqrt{MSE\left(\frac{1}{n_i}+\frac{1}{n_j}\right)} = t_{0.05/2}(16-4)\sqrt{6.9583\left(\frac{1}{4}+\frac{1}{4}\right)}$$
$$= t_{0.025}(12)\sqrt{3.47915} = 2.1788(1.865248) = 4.064002$$
[2 marks]
iii.
Use Fisher's LSD in b) above to perform pairwise comparisons to test which diets appear to be different.
For each pair of diets $(i, j)$ among (A, B), (A, C), (A, D), (B, C), (B, D) and (C, D):
$H_0: \mu_i = \mu_j$
$H_1: \mu_i \neq \mu_j$
[2 marks]
Reject $H_0$ if $|\bar{y}_i - \bar{y}_j| > LSD = 4.064$. [1 mark]

PAIR     |ȳi − ȳj|   Comparison to LSD   Conclusion
A & B    5.25        5.25 > LSD          Reject
A & C    7           7 > LSD             Reject
A & D    1.25        1.25 < LSD          Do not reject
B & C    1.75        1.75 < LSD          Do not reject
B & D    6.5         6.5 > LSD           Reject
C & D    8.25        8.25 > LSD          Reject
[3 marks]
Therefore A and D, as well as B and C, appear to be the same, whilst the other pairs are different. [1 mark]
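The pairwise comparisons can be automated once the LSD is known. The sketch below takes the critical value $t_{0.025}(12) = 2.1788$ from the memorandum rather than computing it, and the variable names are illustrative only.

```python
from itertools import combinations
from math import sqrt

# Diet means and error mean square from the one-way ANOVA of the coagulation data.
means = {"A": 61.0, "B": 66.25, "C": 68.0, "D": 59.75}
mse = 83.5 / 12          # SSE / error df
n_per_diet = 4
t_crit = 2.1788          # t_{0.025}(12), table value quoted in the memorandum

# Fisher's least significant difference for equal group sizes.
lsd = t_crit * sqrt(mse * (1 / n_per_diet + 1 / n_per_diet))

different = []
for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    verdict = "reject H0" if diff > lsd else "do not reject H0"
    if diff > lsd:
        different.append((a, b))
    print(f"{a} & {b}: |diff| = {diff:.2f} vs LSD = {lsd:.3f} -> {verdict}")
```

Only the pairs collected in `different` have mean differences that exceed the LSD.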
iv.
Construct a 95% confidence interval for the mean difference between diet A and diet B.
$$CI = (\bar{y}_A - \bar{y}_B) \pm LSD = -5.25 \pm 4.064002 = (-9.314;\ -1.186)$$
[3 marks]
v.
Do the conclusions from your answer in d) above suggest the same conclusions as in a)?
Since the confidence interval above does not include zero, we reject $H_0: \mu_A = \mu_B$ and conclude that the means of diets A and B are not the same. [1 mark]
This agrees with the conclusion in a) above that the diet means are not all equal. [1 mark]
QUESTION 4 [15 Marks]
A chemist wishes to test the effect of four chemical agents on the strength of a particular type of cloth. Because there might be variability from one bolt to another, the chemist decides to use a randomized block design, with the bolts of cloth considered as blocks. She selects five bolts and applies all four chemicals in random order to each bolt. The resulting tensile strengths follow. Analyze the data from this experiment (use α = 0.05) and draw appropriate conclusions.

                      Bolt
Chemical    1     2     3     4     5
1           73    68    74    71    67
2           73    67    75    72    70
3           75    68    78    73    68
4           73    71    75    75    69

a. From the information given, name which ones are the following:
i. Experimental units: BOLTS [1 mark]
ii. Treatment: CHEMICAL [1 mark]
iii. Blocks: BOLTS [1 mark]
iv. Response variable: TENSILE STRENGTH [1 mark]
b. How many factor levels are there in each factor? 5 LEVELS for BOLTS and 4 LEVELS for CHEMICAL. [1 mark]
c.
STEP 1 (State the hypotheses)
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$ (means are equal for all the chemicals)
$H_1$: not all of $\mu_1, \mu_2, \mu_3, \mu_4$ are equal (means are not the same for the chemicals) [1 mark]
$H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5$ (means are equal for all the bolts)
$H_1$: not all of $\beta_1, \dots, \beta_5$ are equal (means are not the same for the bolts) [1 mark]
STEP 2 (Significance level and distribution)
Test at the 0.05 level of significance.
Since we are testing the means of more than two samples, we use an F-distribution.
STEP 3 (Rejection region)
For chemicals, we reject $H_0$ if $F_{cal} > F_{crit} = F_{0.05}(3; 12) = 3.49$. [1 mark]
For bolts, we reject $H_0$ if $F_{cal} > F_{crit} = F_{0.05}(4; 12) = 3.26$. [1 mark]
STEP 4 (Test Statistic)
ANOVA

Source      Df    SS        MS       F
Chemical    3     12.95     4.32     2.38
Block       4     157       39.25    21.5659
Error       12    21.8      1.82
Total       19    191.75
[4 marks]
STEP 5 (Conclusion)
- For chemicals, since $F_{cal} = 2.38$ (from the ANOVA) $< F_{crit} = 3.49$, we do not reject $H_0$ and conclude that, at the 5% level, there is no significant difference among the chemical agents. [1 mark]
- For bolts, since $F_{cal} = 21.5659$ (from the ANOVA) $> F_{crit}$, we reject $H_0$ and conclude that, at the 5% level, there is a significant difference among the bolts. [1 mark]
******************************************************************************
QUESTION 5 (15 Marks)
An engineer is designing a battery for use in a device that will be subjected to some extreme variations in temperature. The only design parameter that he can select at this point is the plate material for the battery, and he has three possible choices. When the device is manufactured and shipped to the field, the engineer has no control over the temperature extremes that the device will encounter, and he knows from experience that temperature will probably affect the effective battery life. However, temperature can be controlled in the product development laboratory for the purposes of a test. The engineer decides to test all three plate materials at three temperature levels, 15 °C, 70 °C and 125 °C, because these temperature levels are consistent with the product end-use environment. Four batteries are tested at each combination of plate material and temperature, and all 36 tests are done in random order. The experiment and the resulting observed battery life data are given in the table below:

Material                              Temperature
Type       15 °C                 70 °C                 125 °C                Total
1          130  155  74   180    34   40   80   75     20   70   82   58
           y11. = 539            y12. = 229            y13. = 230            y1.. = 998
2          150  188  159  126    136  122  106  115    25   70   58   45
           y21. = 623            y22. = 479            y23. = 198            y2.. = 1300
3          138  110  168  160    174  120  150  139    96   104  82   60
           y31. = 576            y32. = 583            y33. = 342            y3.. = 1501
Total      y.1. = 1738           y.2. = 1291           y.3. = 770            y... = 3799

a. Test at 5% significance level if material type, and temperature have an effect on the life
of a battery
[12 marks]
b. Is there a choice of material that would give uniformly long life regardless of
temperature?
[3 marks]
SOLUTION.
a.
STEP 1 (State the hypotheses)
Write down the null hypotheses and the alternative hypotheses used to test for statistically significant differences in battery life.
$H_0$: There is no effect due to material type
$H_1$: There is an effect due to material type [1 mark]
$H_0$: There is no effect due to temperature
$H_1$: There is an effect due to temperature [1 mark]
$H_0$: There is no interaction between material type and temperature
$H_1$: There is an interaction between material type and temperature [1 mark]
STEP 2 (Significance level and distribution)
Test at the 0.05 level of significance.
Since we are testing the means of more than two samples, we use an F-distribution.
STEP 3 (Rejection region)
For material type, we reject $H_0$ if $F_{cal} > F_{crit} = F_{0.05}(2; 27) = 3.35$.
For temperature, we reject $H_0$ if $F_{cal} > F_{crit} = F_{0.05}(2; 27) = 3.35$.
For material type*temperature, we reject $H_0$ if $F_{cal} > F_{crit} = F_{0.05}(4; 27) = 2.73$. [1 mark]
STEP 4 (Test Statistic)
Construct the ANOVA table for the data above, with $a = 3$ material types, $b = 3$ temperatures and $r = 4$ replicates.
$$TSS = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{r}\left(y_{ijk}-\bar{y}_{...}\right)^2 = (130-105.53)^2 + (155-105.53)^2 + \dots + (60-105.53)^2 = 77646.97$$
$$SSA = br\sum_{i=1}^{a}\left(\bar{y}_{i..}-\bar{y}_{...}\right)^2 = (3)(4)\left[(83.1667-105.53)^2 + (108.333-105.53)^2 + (125.083-105.53)^2\right] = 10683.7222$$
$$SSB = ar\sum_{j=1}^{b}\left(\bar{y}_{.j.}-\bar{y}_{...}\right)^2 = (3)(4)\left[(144.833-105.53)^2 + (107.5833-105.53)^2 + (64.1667-105.53)^2\right] = 39118.7222$$
$$SSAB = r\sum_{i=1}^{a}\sum_{j=1}^{b}\left(\bar{y}_{ij.}-\bar{y}_{i..}-\bar{y}_{.j.}+\bar{y}_{...}\right)^2$$
$$= 4\left[(134.75-83.1667-144.833+105.53)^2 + (57.25-83.1667-107.5833+105.53)^2 + \dots + (85.5-125.083-64.1667+105.53)^2\right] = 9613.7778$$
$$SSE = TSS - SSA - SSB - SSAB = 77646.9722 - 10683.7222 - 39118.7222 - 9613.7778 = 18230.75$$
[1 mark]
ANOVA Table

Source of variation            Df    SS          MS          F
Material Types                 2     10683.72    5341.86     7.91
Temperature                    2     39118.72    19559.36    28.97
Material Types*Temperature     4     9613.78     2403.44     3.56
ERROR                          27    18230.75    675.21
TOTAL                          35    77646.97
[4 marks]
STEP 5 (Conclusion)
- For material type, since $F_{cal} = 7.91$ (from the ANOVA) $> F_{crit} = 3.35$, we reject $H_0$ and conclude there is a significant effect on the life of a battery due to material type. [1 mark]
- For temperature, since $F_{cal} = 28.97$ (from the ANOVA) $> F_{crit} = 3.35$, we reject $H_0$ and conclude there is a significant effect on the life of a battery due to temperature. [1 mark]
- For the interaction, since $F_{cal} = 3.56$ (from the ANOVA) $> F_{crit} = 2.73$, we reject $H_0$ and conclude there is a significant interaction between material type and temperature in their effect on the life of a battery. [1 mark]
b.
$\bar{y}_{22.} = 119.75$ and $\bar{y}_{32.} = 145.75$
$$TSD = q_{0.05}(3, 27)\sqrt{\frac{MSE}{n}} = 3.5\sqrt{\frac{675.21}{4}} = 45.47$$
[1 mark]
$$|\bar{y}_{22.} - \bar{y}_{32.}| = |119.75 - 145.75| = 26 < TSD = 45.47$$
[1 mark]
So we do not reject $H_0$. Hence the analysis shows that at temperature 70 °C, the mean battery life is the same for material types 2 and 3. [1 mark]
c. What percentage of the variability in battery life is explained by the plate material in the battery, temperature and interaction (of plate material in the battery * temperature)?
$$SS_{Model} = SS_{Material} + SS_{Temperature} + SS_{Interaction}$$ [1 mark]
$$= 10683.72 + 39118.72 + 9613.78 = 59416.22$$ [1 mark]
$$\text{So, } R^2 = \frac{SS_{Model}}{SS_{Total}} = \frac{59416.22}{77646.97}$$ [1 mark]
$$= 0.7652$$ [1 mark]
Therefore, 76.52% of the variability in battery life is explained by the plate material in the battery, temperature and interaction (of plate material in the battery * temperature). [1 mark]
*********************************************************************
QUESTION 6 (10 Marks)
a. Define a $2^k$ factorial design.
It is an experiment whose design consists of k factors, [1 mark]
each with 2 values (or levels), [1 mark]
and whose experimental units take on all possible combinations of these levels across all such factors. [1 mark]
b. Give a general statistical model for the $2^k$ factorial designs.
$$Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk} + \varepsilon_{ijk}$$
[2 marks]
c. A Randomized Block Design experiment was carried out and the following ANOVA table was constructed.

SOURCE       df    SS        MS       F
Blocks       3     3.4767    1.16     15.26
Treatment    2     5.4767    2.74     36.05
Error        6     0.4533    0.076
Total        11    9.4067

Calculate the relative efficiency of the Randomized Block Design to the Completely Randomized Design and interpret it.
$$\text{Relative efficiency} = \frac{MSE_{C}}{MSE_{R}}$$ [1 mark]
$$= \frac{(b-1)MSB + b(p-1)MSE}{(bp-1)MSE}$$ [1 mark]
$$= \frac{(4-1)(1.16) + 4(3-1)(0.076)}{(12-1)(0.076)} = 4.89 \approx 5$$ [1 mark]
This means that 5 times as many observations of each treatment would be required in a CRD to get the same accuracy and precision for treatment comparisons as with the RBD. [2 marks]
********************************************************************
QUESTION 7 (10 Marks)
The computer output in Appendix A was extracted. Give a short description and analysis of the output.
SOLUTION
- Discuss the descriptives, including that the overall mean for these 20 observations is 335.8585 and the overall standard deviation is 30.63702. [3 marks]
- For data analysis, Levene's test for equality of error variances shows a p-value of 0.867. Since p > 0.05, we do not reject the hypothesis that the error variances are equal across all groups. [1 mark]
-From the ANOVA table, CLASS has a p-value of 0.47. So since p<0.05, we reject H o :
and conclude that CLASS has a significant effect on the number of points in class. We
should keep it in the model.
1
-Also GPA has a p-value of 0.08 < 0.05, which shows, again, that GPA has a significant
effect on the number of points in class. We should keep it in the model.
1
-The interaction of CLASS*GPA also has a p-value of 0.31 < 0.05. So it has a significant
effect on the number of points in class. We should keep it in the model.

The model is also has a p-value of 0.12 < 0.05, which shows that it is sufficient.
The model has the form $Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$. [2 marks]
- The $R^2$ value is 0.486. This means that 48.6% of the total variation is explained by the model. [1 mark]
Remember:
$$R^2 = 1 - \frac{SS_{residual}}{SS_{model} + SS_{residual}} = \frac{SS_{model}}{SS_{Total}}$$
- The adjusted $R^2$ is 0.39. This means that 39% is the amount of variation around the mean explained by the model, adjusted for the number of terms in the model. [1 mark]
Remember:
$$R^2_{adj} = 1 - \frac{SS_{residual} / DF_{residual}}{\left(SS_{model} + SS_{residual}\right) / \left(DF_{model} + DF_{residual}\right)}$$
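The two formulas can be written as small functions. Since Appendix A is not reproduced here, the sketch below illustrates them with the sums of squares and degrees of freedom from the battery ANOVA in Question 5 (model df = 2 + 2 + 4 = 8, residual df = 27); the function names are illustrative.

```python
def r_squared(ss_model, ss_residual):
    """R^2 = 1 - SS_residual / (SS_model + SS_residual)."""
    return 1 - ss_residual / (ss_model + ss_residual)

def adjusted_r_squared(ss_model, ss_residual, df_model, df_residual):
    """Adjusted R^2: ratio of residual to total mean squares, subtracted from 1."""
    ms_residual = ss_residual / df_residual
    ms_total = (ss_model + ss_residual) / (df_model + df_residual)
    return 1 - ms_residual / ms_total

# Illustration with the Question 5 battery ANOVA.
r2 = r_squared(59416.22, 18230.75)
adj_r2 = adjusted_r_squared(59416.22, 18230.75, df_model=8, df_residual=27)
print(round(r2, 4), round(adj_r2, 4))
```

The adjusted value is always at most the unadjusted one, because it penalises the model for the degrees of freedom it uses.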
END