THE USE OF A PRELIMINARY TEST FOR INTERACTIONS
IN THE ESTIMATION OF FACTORIAL MEANS

by

Atindramohan Gun

Institute of Statistics
Mimeograph Series No. 436
July 1965
ACKNOWLEDGEMENTS

This investigation was suggested by Dr. R. L. Anderson and was carried out under the guidance of the chairman of my advisory committee, Dr. H. R. van der Vaart. I owe a debt of gratitude to them for taking a lively interest in my work and for helping me cross many a hurdle on the way. I am also grateful to the other members of the committee for going through the rough draft and making some helpful suggestions. The late Professor S. N. Roy, as a former member of my committee and otherwise, had been a constant source of inspiration. In his death I suffered a personal loss.

My thanks are due to Mrs. Linda Kessler and Mrs. Ann Bellflower for doing the difficult typing of this dissertation. I am also thankful to Mrs. Diana Smith who did the programming for most of the numerical part of the work. I should also put on record my appreciation of the help I received from Miss Helen Ruffin, Mrs. Dorothy Green and Miss Lillian Hunter on numerous occasions.

Finally, I would like to express my gratitude to the U. S. Educational Foundation in India, the Institute of International Education and North Carolina State for financial assistance, and to the Government of West Bengal, India, for granting me leave of absence from my job in Presidency College, Calcutta.
TABLE OF CONTENTS

                                                                        Page

LIST OF TABLES                                                            vi
LIST OF ILLUSTRATIONS                                                    vii

1. INTRODUCTION                                                            1
   1.1. Statement of the Problem                                           1
   1.2. Review of Literature                                               7

2. THEORETICAL ASPECTS OF 2x2 CASE                                        10
   2.1. Notation                                                          10
   2.2. Preliminary Test Procedure and Definition of the Estimator T      10
   2.3. Distribution of the Estimator T                                   16
   2.4. Expected Value of the Estimator and Its Mean Square Error         17
   2.5. Some Properties of the Q Function                                 23
   2.6. General Observations                                              25
   2.7. Appendix: Monotonicity of Incomplete Beta Function                26

3. FURTHER STUDY OF 2x2 CASE WITH SOME NUMERICAL RESULTS                  29
   3.1. Computational Procedure                                           29
   3.2. Percentage Points of x                                            30
   3.3. Bias as a Proportion of \gamma_{11}                               31
   3.4. Relative Mean Square Error: MSE(T)/(\sigma^2/r)                   34
   3.5. Comments on MSE                                                   37
   3.6. Optimal Choice of \alpha                                          40
   3.7. An Alternative Formula for MSE                                    48
   3.8. Appendix: Proof of Lemma 3.5.1                                    52

4. THEORETICAL ASPECTS OF pxq CASE                                        55
   4.1. Notation                                                          55
   4.2. Estimation of a Single Cell Mean                                  56
   4.3. Simultaneous Estimation of Cell Means                             58
   4.4. Some Relations between \psi and \psi*                             61
   4.5. Preliminary Test Procedure                                        62
   4.6. Distribution of the Estimator of a Cell Mean                      64
   4.7. Expected Value and MSE of the Estimator of a Cell Mean, and AMSE  67
   4.8. General Observations                                              71
   4.9. An Alternative Approach                                           73

5. FURTHER STUDY OF pxq CASE WITH SOME NUMERICAL RESULTS                  75
   5.1. Percentage Points of x                                            75
   5.2. Bias of T_{ij} as a Proportion of \gamma_{ij}                     76
   5.3. Relative AMSE: AMSE/(\sigma^2/r)                                  78
   5.4. Comments on the Values of AMSE                                    84
   5.5. Optimal Choice of \alpha                                          85

6. SOME DECISION-THEORETIC RESULTS                                        90
   6.1. Introduction                                                      90
   6.2. Fixed-sample Decision Problem                                     90
   6.3. Decision-theoretic Formulation of a TE Procedure                  94
   6.4. Bayes Rules for Simple TE Problems                                95
   6.5. Bayes Rules for Simultaneous Estimation of Cell Means in a
        Factorial Experiment                                             100
   6.6. Minimax Decision Rules                                           102

7. SUMMARY AND CONCLUSIONS                                               104
   7.1. Summary                                                          104
   7.2. Conclusions                                                      106
   7.3. Suggestions for Further Research                                 107

LIST OF REFERENCES                                                       111
LIST OF TABLES

                                                                        Page

3.2.1  Upper \alpha-points of x (2x2 case)                                31
3.3.1  Values of \gamma_{11}^{-1} E(\tilde{\mu}_{11} - T), r=2            32
3.3.2  Values of \gamma_{11}^{-1} E(\tilde{\mu}_{11} - T), r=5            33
3.3.3  Values of \gamma_{11}^{-1} E(\tilde{\mu}_{11} - T), r=10           33
3.4.1  Values of the ratio MSE(T)/(\sigma^2/r), r=2                       35
3.4.2  Values of the ratio MSE(T)/(\sigma^2/r), r=5                       36
3.4.3  Values of the ratio MSE(T)/(\sigma^2/r), r=10                      36
3.6.1  Levels of significance for ANOVA test corresponding to
       \alpha = .25 and .50                                               47
5.1.1  Upper \alpha-points of x (pxq case)                                76
5.2.1  Values of \gamma_{ij}^{-1} E(\tilde{\mu}_{ij} - T_{ij}), p=q=3     77
5.2.2  Values of \gamma_{ij}^{-1} E(\tilde{\mu}_{ij} - T_{ij}), p=q=5     77
5.3.1  Values of the ratio AMSE/(\sigma^2/r), p=q=3, r=3                  80
5.3.2  Values of the ratio AMSE/(\sigma^2/r), p=q=3, r=5                  81
5.3.3  Values of the ratio AMSE/(\sigma^2/r), p=q=5, r=3                  82
5.3.4  Values of the ratio AMSE/(\sigma^2/r), p=q=5, r=5                  83
5.3.5  Values of inf r AMSE*(T|\alpha)/\sigma^2                           87
5.5.2  Levels of significance for ANOVA test corresponding to
       \alpha = .25, .50 and .75                                          89
LIST OF ILLUSTRATIONS

                                                                        Page

3.5.1  Variation of rMSE/\sigma^2 with \psi for different significance
       levels (with r=5)                                                  38
1. INTRODUCTION

1.1. Statement of the Problem

Consider a factorial experiment with two factors, A and B, having p and q levels, respectively. Suppose that each of the pq treatment combinations has r (\ge 2) replications.

Denoting by y_{ijk} the response to the treatment combination (A_i B_j) in the k-th experimental unit, let

    y_{ijk} = \mu_{ij} + \epsilon_{ijk},   i = 1,2,...,p;  j = 1,2,...,q;  k = 1,2,...,r,   (1.1.1)

where \mu_{ij} is the expected response to the combination (A_i B_j) per experimental unit, and \epsilon_{ijk} is the random component, assumed NIID(0,\sigma^2). 1/

We shall further assume that for our experiment the 'fixed effects model' (Eisenhart's Model I) is appropriate, so that we may write

    \mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij},   (1.1.2)

where

    \mu        = general effect,
    \alpha_i   = (main) effect of A_i,
    \beta_j    = (main) effect of B_j,
    \gamma_{ij} = interaction effect of A_i and B_j,

and

    \sum_{i=1}^{p} \alpha_i = 0,   \sum_{j=1}^{q} \beta_j = 0,
    \sum_{j=1}^{q} \gamma_{ij} = 0   for i = 1,2,...,p,
    \sum_{i=1}^{p} \gamma_{ij} = 0   for j = 1,2,...,q.   (1.1.3)

1/ NIID(0,\sigma^2) means 'normally, independently and identically distributed with mean 0 and variance \sigma^2'.
Now suppose our objective is to estimate a cell mean, say \mu_{ij}. Let 1/

    \bar{y}_{ij} = \sum_{k=1}^{r} y_{ijk}/r   (i = 1,2,...,p;  j = 1,2,...,q),
    \bar{y}_{i.} = \sum_{j=1}^{q} \bar{y}_{ij}/q,
    \bar{y}_{.j} = \sum_{i=1}^{p} \bar{y}_{ij}/p,
    \bar{y}      = \sum_{i=1}^{p} \sum_{j=1}^{q} \bar{y}_{ij}/pq.   (1.1.4)

The least squares estimators of the parameters in model (1.1.2) are

    \hat{\mu} = \bar{y},   \hat{\alpha}_i = \bar{y}_{i.} - \bar{y},   \hat{\beta}_j = \bar{y}_{.j} - \bar{y},
    \hat{\gamma}_{ij} = \bar{y}_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}.   (1.1.5)

1/ For simplicity, we prefer these symbols to the more elaborate \bar{y}_{ij.}, \bar{y}_{i..}, \bar{y}_{.j.} and \bar{y}_{...}.
Thus the least squares estimator of \mu_{ij} under model (1.1.2), which is in consequence unbiased and minimum-variance under that model, is

    \tilde{\mu}_{ij} = \hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j + \hat{\gamma}_{ij} = \bar{y}_{ij}.   (1.1.6)

We have

    E(\tilde{\mu}_{ij}) = \mu_{ij}   (1.1.6a)

and

    var(\tilde{\mu}_{ij}) = \sigma^2/r.   (1.1.6b)

If interaction effects are supposed to be absent, then model (1.1.2) reduces to

    \mu_{ij} = \mu + \alpha_i + \beta_j,   (1.1.2a)

with the same restrictions on the \alpha's and \beta's as in (1.1.3). Under this no-interaction model, the least squares estimator of \mu_{ij} is

    \hat{\mu}_{ij} = \hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j,   (1.1.7)

which is unbiased and minimum-variance under this model.
Now suppose (1.1.2), and not (1.1.2a), is the appropriate model for our experiment. Noting that \hat{\mu}, \hat{\alpha}_i, \hat{\beta}_j and \hat{\gamma}_{ij} are unbiased estimators of the corresponding parameters, that

    Var(\hat{\mu})       = \sigma^2/pqr,
    Var(\hat{\alpha}_i)  = \sigma^2(q-1)/pqr,
    Var(\hat{\beta}_j)   = \sigma^2(p-1)/pqr,
    Var(\hat{\gamma}_{ij}) = \sigma^2(p-1)(q-1)/pqr,   (1.1.8)

and that these components are mutually independently distributed, we then have

    E(\hat{\mu}_{ij}) = \mu_{ij} - \gamma_{ij}   (1.1.9a)

and

    var(\hat{\mu}_{ij}) = \sigma^2[1 + (q-1) + (p-1)]/pqr
                        = (\sigma^2/r)[1 - (p-1)(q-1)/pq].   (1.1.9b)
r
Suppose further that minimum mean square error (MSE),
rather than unbiasedness plus minimum variance, is our
good eptimator.
eriterio~
of a
This means in effect that we have in m+qd a loss function
of the fonn
where c is a positive quantity which may
is the proposed estimator of
~ij'
and
depe~d
o~r
on the parameters, and T
goal is to control the risk
It is seen that in this situation, even though (1.1.2) is the proper
model,
~ij'
times.
For
rather than
..
~ij'
will
...
MSE(~ij)
b~
the more desirable estimator at
..
= var(~ij)
= cr2/r
(1.1.10 )
and
MSE(~ij)
-
= var(~ij)
= cr
2
r
so that
-
2
+ Yij
[1 _ (p-l)( q-:q] +'Y 2
pq
1j
.
MSE(~ij) ~ MSE(~ij)
(1.1.11)
5
as long as
2 < a
(p-l)(q-l) ~
rij-r'
pq
2
(1.1.12)
As a consequence, if the relevant parameters were known, the following estimation procedure would be advisable: use \hat{\mu}_{ij} if (1.1.12) holds, and \tilde{\mu}_{ij} otherwise. However, since these parameters are unknown, the natural procedure that suggests itself is to substitute the outcome of a preliminary test for the parameter inequalities. This leads to the following estimation procedure:

1) First, test the hypothesis that \gamma_{ij}^2 / var(\hat{\gamma}_{ij}), or possibly some average of all pq such quantities (see Chapter 4), is less than or equal to unity.

2) In case significance is found in step 1), take \tilde{\mu}_{ij} as the estimate of \mu_{ij}; otherwise take \hat{\mu}_{ij}.

Hence, if \varphi(x) is the probability of rejection of the hypothesis \gamma_{ij}^2 \le var(\hat{\gamma}_{ij}) (or similar hypotheses in the p x q case), the ultimate estimator T has the form 1/

    T = \{1 - \varphi(x)\}\hat{\mu}_{ij} + \varphi(x)\tilde{\mu}_{ij}.

1/ A next step, not taken in this work, might be to look into more general functions of \hat{\mu}_{ij}, \tilde{\mu}_{ij}, and some test statistic.
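To make the procedure concrete, the following is a minimal computational sketch (ours, not part of the original text) of the TE estimator of a single cell mean in the 2x2 case studied in Chapters 2 and 3. The function name and the use of scipy are illustrative choices; the critical value is the upper \alpha-point of the non-central F distribution with non-centrality 1 that is developed in Section 2.2.

```python
# Illustrative sketch: TE estimate of mu_11 in a 2x2 factorial with r replications per cell.
import numpy as np
from scipy.stats import ncf

def te_estimate(y, alpha=0.25):
    """y: array of shape (2, 2, r); returns the TE estimate of mu_11."""
    r = y.shape[2]
    ybar = y.mean(axis=2)                        # cell means  \bar{y}_ij
    g = ybar.mean()                              # grand mean  \bar{y}
    a1 = ybar[0].mean() - g                      # \hat{alpha}_1
    b1 = ybar[:, 0].mean() - g                   # \hat{beta}_1
    w = g + a1 + b1                              # no-interaction estimator  \hat{mu}_11
    z = ybar[0, 0] - w                           # \hat{gamma}_11
    s2 = ((y - ybar[..., None]) ** 2).sum() / (4 * (r - 1))   # error mean square, nu = 4(r-1)
    F = 4 * r * z**2 / s2                        # preliminary test statistic
    F_alpha = ncf.ppf(1 - alpha, dfn=1, dfd=4 * (r - 1), nc=1)  # material-significance cut-off (psi = 1)
    return w + z if F > F_alpha else w           # mu_tilde_11 if significant, else mu_hat_11
```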
The consequences of such an estimation procedure following a test of significance (a TE procedure, for short) form the subject-matter of the present investigation. The heart of the difficulty is that the interpretation of the joint application of a test and an estimation procedure has to be different from the interpretation that would be valid for their individual applications. The fact that a test and an estimation procedure are carried out on the same data causes these two procedures to be interdependent in the probability sense, which entails rather cumbersome distribution problems.

We shall derive the distribution of the suggested estimator, its bias and mean square error. Next, we shall study how the bias and the mean square error are affected by our choice of r, the number of replications, and \alpha, the level of significance of the preliminary test.

A rather detailed study will be made for the simple case where p = q = 2 (Chapters 2 and 3). For the general case, directions in which our findings for the 2x2 case will need to be modified will be indicated and these modifications worked out to some extent (Chapters 4 and 5).
In Chapter 6, we shall set up the general problem of estimation
after a preliminary test as a problem in decision-making and shall
investigate the nature of Bayes rules and minimax rules.
We shall first
make some general remarks and then devote our attention to some special
cases, viz., the estimation of the mean of a normal population when the
variance is known as well as when it is unknown, and the estimation of
cell means in a factorial experiment.
1.2. Review of Literature

We have taken as the starting point of our discussion a paper by Anderson (1960) in which he considered the MSE's of parameters as a basis of comparison between alternative models and alternative modes of analysis for factorial experiments.

The problem of estimation after a preliminary test of significance has been studied in a variety of contexts (all different from ours) by a number of investigators. All have examined, as we shall do, how the biases and MSE's are affected by following a TE procedure.
Mosteller (1948), Kitagawa (1950) and Bennett (1952) dealt with a TE procedure for estimating the mean of a normal population N(\mu_1, \sigma^2), when samples from this and another population N(\mu_2, \sigma^2), with the same variance, are given. If we denote by n_1, n_2 the two sample sizes and by \bar{x}_1, \bar{x}_2 the two sample means, the procedure consists of making a test for the equality of \mu_1, \mu_2, and taking \bar{x}_1 as the estimate in case of significance and taking the pooled value

    (n_1\bar{x}_1 + n_2\bar{x}_2)/(n_1 + n_2)

otherwise.

Bancroft (1944) considered a similar situation, where samples of sizes n_1, n_2 drawn from normal populations N(\mu_1, \sigma_1^2) and N(\mu_2, \sigma_2^2) are given and the problem is to estimate \sigma_1^2. The hypothesis of equality of the two variances is first tested by means of the sample variances s_1^2 and s_2^2. In the case of significance, one takes s_1^2 as the estimate of \sigma_1^2; otherwise, the pooled value

    [(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2]/(n_1 + n_2 - 2)

is taken as the estimate.
Kitagawa (1950) also considered the problem of sometimes pooling in interpenetrating sampling, where actually one has a sample from a bivariate normal population with means \mu_x and \mu_y, say, and one wants to estimate \mu_x. Supposing \bar{x} and \bar{y} are the two sample means, the method followed is to take \bar{x} or (\bar{x} + \bar{y})/2 as the estimate, according as the hypothesis of equality of \mu_x and \mu_y is rejected or not in a preliminary test of significance.

Bancroft (1944) and Kitagawa (1963) investigated the application of TE procedures to linear regression analysis. Larson and Bancroft (1963) dealt with some aspects of the same problem. Kitagawa (1963) also considered the use of TE procedures in response surface analysis.

Asano (1960) considered the application of TE procedures to some problems in biometrical and pharmaceutical research. The procedures boil down to the estimation of means after testing for variances, and the estimation of a variance-like function of effects in analysis of variance after a preliminary test of these effects.
Bennett (1956) discussed TE procedures in which ultimately one wants interval estimators, rather than point estimators, of means and variances.

Huntsberger (1955) generalized the TE procedure by considering a weighted sum of simple estimators, the weights being determined by the observed value of the test statistic. In the case of normally distributed estimators, he showed that the weighting procedure on the whole effects a greater degree of control on the MSE than does the TE procedure.

Kitagawa has brought together many of these findings in his recent publication (1963), where he has discussed some other aspects of the TE procedure as well. A mention may also be made of the review article by Bancroft (1964).
There is one aspect in which our TE procedure differs from those envisaged in the investigations cited above. In our case, the hypothesis involved in the preliminary test has the form

    (parameter)^2 \le \lambda (variance of its estimator),

\lambda being an appropriate positive constant, whereas in the investigations referred to, the hypothesis is of the usual null type:

    parameter = 0.

We are thus concerned with 'material significance' rather than with ordinary significance. The notion of material significance and some methods of testing for material significance in certain specific problems have been discussed by Hodges and Lehmann (1954).
2. THEORETICAL ASPECTS OF 2x2 CASE

2.1. Notation

In the 2x2 case we may consider the problem of estimating the cell mean \mu_{11}. We shall see (in Section 5.1) that the results for this mean hold with little modification for the other cell means as well.

We shall denote by \alpha (0 \le \alpha \le 1) the level of significance at which the preliminary test is made. T will denote the estimator of \mu_{11}, which equals \tilde{\mu}_{11} in the case of significance and \hat{\mu}_{11} otherwise.

For simplicity, we shall write

    w = \bar{y} + (\bar{y}_{1.} - \bar{y}) + (\bar{y}_{.1} - \bar{y}) = \bar{y}_{1.} + \bar{y}_{.1} - \bar{y} = \hat{\mu}_{11}

and

    z = \bar{y}_{11} - \bar{y}_{1.} - \bar{y}_{.1} + \bar{y} = \hat{\gamma}_{11}.   (2.1.1)

Note that \hat{\gamma}_{11} is the least squares estimator of \gamma_{11} in the interaction model (1.1.2).

The sample error variance (under the interaction model) will be denoted by s^2 and the number of its degrees of freedom by \nu. Thus here

    \nu = 4(r-1)

and

    s^2 = \sum_i \sum_j \sum_k (y_{ijk} - \bar{y}_{ij})^2 / \nu.
2.2. Preliminary Test Procedure and Definition of the Estimator T

From Chapter 1, p. 5, it is seen that prior to estimation we are to test the hypothesis

    H_0: \gamma_{11}^2 \le \sigma^2/4r,

or, equivalently,

    H_0: \psi \le 1,   (2.2.1a)

against the alternative

    H_1: \psi > 1,

where

    \psi = 4r\gamma_{11}^2/\sigma^2.   (2.2.2)

Considerations of invariance under the customary transformation groups dictate that this test should be based solely on the statistic 4rz^2/s^2 (vide Lehmann (1959), Chapter 7, Section 5, and Hodges and Lehmann (1954)).

Let \alpha be the specified level of significance; i.e., suppose we require that

    sup_{\theta \in \Theta_0} E_\theta[\varphi(4rz^2/s^2)] = \alpha,   (2.2.3)

where \varphi(4rz^2/s^2) = probability of rejecting H_0 at the point 4rz^2/s^2, \Theta_0 denotes the part of the parameter space \Theta corresponding to H_0, and \Theta_1 will denote the part corresponding to H_1.

Note that generally in testing a hypothesis the goal is to maximize the power uniformly over \Theta_1, i.e., in our case to maximize

    E_\theta[\varphi(4rz^2/s^2)]   for each \theta \in \Theta_1,   (2.2.4)

subject to (2.2.3), and it is known that the UMP invariant test is given by

    \varphi = 1   if 4rz^2/s^2 > C;   \varphi = 0   if 4rz^2/s^2 \le C.   (2.2.5)

However, in a TE procedure the purpose should be to minimize, uniformly over \Theta, the mean square error of the ultimate estimator (see Section 1.1)

    T = w + \varphi(4rz^2/s^2) z.   (2.2.6a)

Thus we want to minimize (by making a suitable choice of \varphi)

    MSE_\theta(T) = E_\theta(T - \mu_{11})^2 = E_\theta[w + \varphi(4rz^2/s^2) z - \mu_{11}]^2   (2.2.6b)

uniformly over \Theta. Note that for non-randomized tests

    MSE_\theta(T) = E_\theta[(1 - \varphi(4rz^2/s^2))(w - \mu_{11})^2 + \varphi(4rz^2/s^2)(w + z - \mu_{11})^2].   (2.2.6c)

We should, therefore, look for an invariant test \varphi(4rz^2/s^2) which has level \alpha and which minimizes (2.2.6b) for each \theta \in \Theta. We shall call an invariant test minimizing (2.2.6b) for each \theta \in \Theta a uniformly TE-best invariant test.

However, the following theorem shows that a uniformly TE-best invariant test does not exist.

Theorem 2.2.1. For \alpha \ne 0 there is no invariant test \varphi(4rz^2/s^2), randomized or non-randomized, such that

    sup_{\theta \in \Theta_0} E_\theta[\varphi(4rz^2/s^2)] = \alpha   (2.2.7a)

and such that

    MSE_\theta(T)   (see (2.2.6b))   (2.2.7b)

is a minimum for each \theta \in \Theta.
Proof: Consider the test of level 0 (and hence of level \alpha)

    \varphi(4rz^2/s^2) \equiv 0.   (2.2.8)

For all points of \Theta for which \psi = 0 (for definition of \psi see (2.2.2)), this is the TE-best invariant test. For from (2.2.6b) it follows that if \psi = 0 (implying \gamma_{11} = 0), then

    MSE_\theta(T) = E_\theta[(w - \mu_{11})^2 + 2\varphi(4rz^2/s^2)(w - \mu_{11})z + \varphi^2(4rz^2/s^2)z^2]
                 = E_\theta[(w - \mu_{11})^2] + E_\theta[\varphi^2(4rz^2/s^2)z^2] 1/
                 \ge E_\theta[(w - \mu_{11})^2]

(since w, z and s are independently distributed and E(w) = \mu_{11} when \gamma_{11} = 0), the equality holding iff \varphi = 0 a.e.

It will be enough to show that there exists another test of level \alpha which for \psi \to \infty is better than the above test. Put

    \varphi(4rz^2/s^2) = 1   if 4rz^2/s^2 > F_\alpha;   = 0   if 4rz^2/s^2 \le F_\alpha.   (2.2.9)

Here and elsewhere in this thesis F_\alpha stands for the upper \alpha-point of a certain non-central F distribution (see (2.2.13a)). We shall see in what follows that for this test

    sup_{\theta \in \Theta_0} E_\theta[\varphi(4rz^2/s^2)] = \alpha,

so that it satisfies (2.2.7a) as well as \varphi \equiv 0 does.

From (1.1.11) it follows that for the estimator corresponding to the preliminary test (2.2.8)

    MSE_\theta(T) \to \infty   as \psi \to \infty.

On the other hand, Theorem 3.5.1 shows that for the estimator corresponding to the preliminary test (2.2.9)

    MSE_\theta(T) \to \sigma^2/r   as \psi \to \infty.

1/ Since this theorem admits randomized tests, \varphi may assume values other than 0 or 1. Hence \varphi^2 does not reduce to \varphi here, as it did in (2.2.6c).
Consequently, it does not seem unreasonable to use the UMP invariant test of level \alpha as the preliminary test in our TE procedure and to study the properties of the resulting estimator. Thus we shall be using the preliminary test (2.2.5), where C is so chosen that

    sup_{\theta \in \Theta_0} Pr[4rz^2/s^2 > C | \theta] = \alpha.   (2.2.10)

Since the statistic 4rz^2/s^2 has a distribution dependent on \psi = 4r\gamma_{11}^2/\sigma^2 alone, and since the power function of this test is monotonically increasing in \psi, the above is equivalent to the condition

    Pr[4rz^2/s^2 > C | \psi = 1] = \alpha.

Here 4rz^2/s^2 has the non-central F distribution with (1,\nu) d.f. (or, equivalently, the non-central t^2 distribution with \nu d.f.) and non-centrality parameter

    \psi = 4r\gamma_{11}^2/\sigma^2.   (2.2.11)

If F_\alpha denotes the upper \alpha-point of the non-central F distribution with (1,\nu) d.f. and \psi = 1, then (2.2.10) implies that

    C = F_\alpha,   (2.2.12)

and then (2.2.5) coincides with (2.2.9).

The probability element of F = 4rz^2/s^2, given, e.g., in Graybill (1961), is

    \exp(-\psi/2) \sum_{m=0}^{\infty} \frac{(\psi/2)^m}{m!} \cdot \frac{1}{B(m+1/2,\nu/2)} \cdot \frac{(f/\nu)^{m-1/2}}{(1+f/\nu)^{\nu/2+m+1/2}} \, d(f/\nu).   (2.2.13)

Hence F_\alpha is given by the equation

    \alpha = \int_{F_\alpha}^{\infty} \exp(-1/2) \sum_{m=0}^{\infty} \frac{(1/2)^m}{m! \, B(m+1/2,\nu/2)} \cdot \frac{(f/\nu)^{m-1/2}}{(1+f/\nu)^{\nu/2+m+1/2}} \, d(f/\nu).   (2.2.13a)

For notational simplicity, we shall deal with

    x = \frac{4rz^2/\nu s^2}{1 + 4rz^2/\nu s^2} = \frac{F/\nu}{1 + F/\nu}   (2.2.14)

and

    x_\alpha = \frac{F_\alpha/\nu}{1 + F_\alpha/\nu}   (2.2.14a)

rather than with F and F_\alpha. This x_\alpha is then given by

    1 - \alpha = \exp(-1/2) \sum_{m=0}^{\infty} \frac{(1/2)^m}{m!} I_{x_\alpha}(m+1/2, \nu/2),   (2.2.14b)

where I_{x_\alpha}(r,s) is the incomplete beta function

    I_{x_\alpha}(r,s) = \frac{1}{B(r,s)} \int_0^{x_\alpha} x^{r-1}(1-x)^{s-1} dx.
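As a numerical illustration of (2.2.14b), the cut-off x_\alpha can be found by solving Q_{1/2}(1/2, \nu/2, x) = 1 - \alpha for x. The sketch below is our own (not part of the original); the series is truncated after a modest number of terms because the Poisson weights with parameter 1/2 decay rapidly.

```python
# Illustrative sketch: solve Q_{1/2}(1/2, nu/2, x) = 1 - alpha for the cut-off x_alpha.
from math import exp
from scipy.special import betainc
from scipy.optimize import brentq

def Q(j, psi, nu, x, terms=60):
    """Q_j(psi/2, nu/2, x) = exp(-psi/2) * sum_m (psi/2)^m/m! * I_x(m+j, nu/2)."""
    total, weight = 0.0, exp(-psi / 2)          # Poisson(psi/2) probability at m = 0
    for m in range(terms):
        total += weight * betainc(m + j, nu / 2, x)
        weight *= (psi / 2) / (m + 1)           # next Poisson weight
    return total

def x_alpha(alpha, nu):
    """Upper alpha-point of x for the 2x2 preliminary test, from (2.2.14b) with psi = 1."""
    return brentq(lambda x: Q(0.5, 1.0, nu, x) - (1 - alpha), 1e-12, 1 - 1e-12)
```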
2.3. Distribution of the Estimator T

Note that the TE estimator (2.2.6a) of the cell mean \mu_{11}, in case the preliminary test is defined by (2.2.9), is

    T = w          if F \le F_\alpha,
    T = w + z      if F > F_\alpha,   (2.3.1)

where F = 4rz^2/s^2 and F_\alpha is still defined as in equation (2.2.13a).

Hence the probability element of T is

    Pr(dt) = g_1(t | F \le F_\alpha) Pr(F \le F_\alpha) dt + g_2(t | F > F_\alpha) Pr(F > F_\alpha) dt,   (2.3.2)

where g_1 and g_2 are the conditional density functions of w and w+z respectively.

Noting that w, z and s^2 are mutually independently distributed (the first two normally, as N(\mu_{11} - \gamma_{11}, 3\sigma^2/4r) and N(\gamma_{11}, \sigma^2/4r), respectively, and the last one as \chi^2_\nu \sigma^2/\nu, \chi^2_\nu being the (central) \chi^2 statistic with \nu d.f.), we may write the expression (2.3.2) as

    Pr(dt) = h_1(t)\iint_R h_2(z) h_3(s^2) \, dz \, d(s^2) \, dt + \iint_{\bar{R}} h_4^*(t|z) h_2(z) h_3(s^2) \, dz \, d(s^2) \, dt
           = h_4(t,\cdot) dt + \iint_R [h_1(t) - h_4^*(t|z)] h_2(z) h_3(s^2) \, dz \, d(s^2) \, dt.   (2.3.3)

Here h_1, h_2 and h_3 are the density functions of w, z and s^2; h_4(t,z) is the joint density of t = w+z and z; h_4(t,\cdot) is the marginal density of t = w+z; and h_4^* is the conditional density of t given z. The region of integration R is defined by

    R = \{(z, s^2): 4rz^2 \le s^2 F_\alpha\},   (2.3.4)

and \bar{R} is the complement of R.

Since the marginal distribution of t = w+z is N(\mu_{11}, \sigma^2/r) and the conditional distribution of t = w+z given z is N(\mu_{11} + z - \gamma_{11}, 3\sigma^2/4r), (2.3.3) may be written as

    Pr(dt) = (2\pi\sigma^2/r)^{-1/2}\exp\{-r(t-\mu_{11})^2/(2\sigma^2)\} dt
           + \Big[\iint_R \Big((2\pi \cdot 3\sigma^2/4r)^{-1/2}\exp\{-4r(t-\mu_{11}+\gamma_{11})^2/(2 \cdot 3\sigma^2)\}
           - (2\pi \cdot 3\sigma^2/4r)^{-1/2}\exp\{-4r(t-\mu_{11}-z+\gamma_{11})^2/(2 \cdot 3\sigma^2)\}\Big) h_2(z) h_3(s^2) \, dz \, d(s^2)\Big] dt.   (2.3.5)

2.4. Expected Value of the Estimator and Its Mean Square Error

The result E_\theta(T) = \int_{-\infty}^{\infty} t \, Pr(dt) yields from (2.3.5) the expected value of T:

    E_\theta(T) = \mu_{11} + \iint_R [(\mu_{11} - \gamma_{11}) - (\mu_{11} + z - \gamma_{11})] h_2(z) h_3(s^2) \, dz \, d(s^2)
                = \mu_{11} - \iint_R z \, h_2(z) h_3(s^2) \, dz \, d(s^2)   (2.4.1)

(where, as before, the subscript \theta serves to remind the reader that these expected values depend on the \mu_{ij}, \gamma_{ij} and \sigma^2).

Again, the result E_\theta(T - \mu_{11})^2 = \int_{-\infty}^{\infty} (t - \mu_{11})^2 \, Pr(dt) yields from (2.3.5) the MSE of T:

    E_\theta(T - \mu_{11})^2 = \sigma^2/r + 2\gamma_{11}\iint_R z \, h_2(z) h_3(s^2) \, dz \, d(s^2) - \iint_R z^2 \, h_2(z) h_3(s^2) \, dz \, d(s^2).   (2.4.2)

Putting

    J_k = \iint_R z^k h_2(z) h_3(s^2) \, dz \, d(s^2)   (k a positive integer),   (2.4.3)

we thus have

    E_\theta(T) = \mu_{11} - J_1   (2.4.4a)

and

    E_\theta(T - \mu_{11})^2 = \sigma^2/r + 2\gamma_{11} J_1 - J_2.   (2.4.4b)

Now

    J_k = \iint_R (2\pi\sigma^2/4r)^{-1/2} z^k \exp\{-4r(z - \gamma_{11})^2/(2\sigma^2)\} h_3(s^2) \, dz \, d(s^2).   (2.4.5)

On making the transformation u = 4rz^2/\sigma^2, the region R becomes \{(u, s^2): u \le \nu s^2 x_\alpha/[\sigma^2(1 - x_\alpha)]\}; the exponential is written as \exp(-\psi/2)\exp(-u/2)\exp(\pm\sqrt{\psi u}), expanded in powers of \sqrt{\psi u}, and integrated term by term, z^k being (\sigma^2 u/4r)^{k/2} for z \ge 0 and (-1)^k(\sigma^2 u/4r)^{k/2} for z < 0.

Since \nu s^2/\sigma^2 has the (central) \chi^2 distribution with \nu d.f., we may use the result, given, e.g., in Rao (1952),

    Pr\Big[\frac{u}{u + \nu s^2/\sigma^2} \le w\Big] = I_w(p_1, p_2) = \frac{1}{B(p_1,p_2)} \int_0^w v^{p_1-1}(1-v)^{p_2-1} dv   (2.4.6)

with the appropriate numbers of degrees of freedom. Consequently, term-by-term integration gives

    J_1 = \gamma_{11} \exp(-\psi/2) \sum_{m=0}^{\infty} \frac{(\psi/2)^m}{m!} I_{x_\alpha}(m+3/2, \nu/2)

and

    J_2 = \frac{\sigma^2}{4r} \exp(-\psi/2) \sum_{m=0}^{\infty} \frac{(\psi/2)^m}{m!} (2m+1) I_{x_\alpha}(m+3/2, \nu/2).

The change of the order of integration and summation made here is valid by virtue of Theorem 21B in Halmos (1950).

Let us define the symbol Q_j(\psi/2, \nu/2, x_\alpha) by the equation 1/

    Q_j(\psi/2, \nu/2, x_\alpha) = \exp(-\psi/2) \sum_{m=0}^{\infty} \frac{(\psi/2)^m}{m!} I_{x_\alpha}(m+j, \nu/2),   j > 0.   (2.4.7)

We may then write

    J_1 = \gamma_{11} Q_{3/2}(\psi/2, \nu/2, x_\alpha)

and

    J_2 = \frac{\sigma^2}{4r}[Q_{3/2}(\psi/2, \nu/2, x_\alpha) + \psi Q_{5/2}(\psi/2, \nu/2, x_\alpha)].

In terms of this Q function, (2.2.14b), (2.4.1) and (2.4.2) are as follows:

    Q_{1/2}(1/2, \nu/2, x_\alpha) = 1 - \alpha   (2.4.8)

(which determines the cut-off point x_\alpha),

    E_\theta(T) = \mu_{11} - \gamma_{11} Q_{3/2}(\psi/2, \nu/2, x_\alpha),   (2.4.9)

and

    E_\theta(T - \mu_{11})^2 = \frac{\sigma^2}{r} + 2\gamma_{11}^2 Q_{3/2} - \frac{\sigma^2}{4r}[Q_{3/2} + \psi Q_{5/2}]
     = \frac{\sigma^2}{r}\Big[1 - \frac{1}{4}Q_{3/2}(\psi/2, \nu/2, x_\alpha) + \frac{\psi}{4}\{2Q_{3/2}(\psi/2, \nu/2, x_\alpha) - Q_{5/2}(\psi/2, \nu/2, x_\alpha)\}\Big].   (2.4.10)

1/ Sometimes we shall write simply Q_j.
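The following sketch (our own, not the author's program) evaluates the Q function of (2.4.7) and then the relative bias of (2.4.9) and the relative MSE of (2.4.10); the truncation length and helper names are illustrative.

```python
# Illustrative sketch: relative bias Q_{3/2} and relative MSE from (2.4.9)-(2.4.10).
from math import exp
from scipy.special import betainc

def Q(j, psi, nu, x_a, terms=100):
    """Q_j(psi/2, nu/2, x_alpha) of (2.4.7), summed with running Poisson weights."""
    total, weight = 0.0, exp(-psi / 2)
    for m in range(terms):
        total += weight * betainc(m + j, nu / 2, x_a)
        weight *= (psi / 2) / (m + 1)
    return total

def relative_bias(psi, nu, x_a):
    """E(mu_tilde_11 - T)/gamma_11 = Q_{3/2}(psi/2, nu/2, x_alpha)."""
    return Q(1.5, psi, nu, x_a)

def relative_mse(psi, nu, x_a):
    """r MSE(T)/sigma^2 = 1 - Q_{3/2}/4 + (psi/4)(2 Q_{3/2} - Q_{5/2})."""
    q32, q52 = Q(1.5, psi, nu, x_a), Q(2.5, psi, nu, x_a)
    return 1 - q32 / 4 + (psi / 4) * (2 * q32 - q52)
```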
2.5. Some Properties of the Q Function

(a) Since

    0 \le I_{x_\alpha}(m+j, \nu/2) \le 1,

we also have, on multiplying each term by (\psi/2)^m/m! and adding over m from 0 to \infty,

    0 \le Q_j(\psi/2, \nu/2, x_\alpha) \le 1,   (2.5.1)

the lower and upper bounds being attained only if x_\alpha = 0 and x_\alpha = 1 respectively.

For 0 < x_\alpha < 1, somewhat better bounds are obtained as follows. Making the transformation x = z x_\alpha, we have

    \int_0^{x_\alpha} x^{m+j-1}(1-x)^{\nu/2-1} dx = x_\alpha^{m+j}\int_0^1 z^{m+j-1}(1 - x_\alpha z)^{\nu/2-1} dz
     > x_\alpha^{m+j}\int_0^1 z^{m+j-1}(1-z)^{\nu/2-1} dz.

Hence

    I_{x_\alpha}(m+j, \nu/2) > x_\alpha^{m+j}

and

    Q_j(\psi/2, \nu/2, x_\alpha) > x_\alpha^j \exp\{-\psi(1 - x_\alpha)/2\}.

Again, making the transformation 1 - x = z(1 - x_\alpha), we have in a similar manner

    1 - I_{x_\alpha}(m+j, \nu/2) > (1 - x_\alpha)^{\nu/2}

and

    Q_j(\psi/2, \nu/2, x_\alpha) < 1 - (1 - x_\alpha)^{\nu/2}.

(b) Other things remaining constant, I_{x_\alpha} increases monotonically as x_\alpha increases (i.e., as \alpha decreases). Hence Q_j(\psi/2, \nu/2, x_\alpha) also increases monotonically as x_\alpha increases from 0 to 1 (or, equivalently, as \alpha decreases from 1 to 0), j, \nu and \psi remaining fixed.

(c) As shown by Jordan (1962), p. 85, for 0 < x_\alpha < 1, I_{x_\alpha}(p, q) is a decreasing function of p (see Section 2.7 for the generalization to real p). Hence, for any j, j' with 0 < j < j',

    Q_j(\psi/2, \nu/2, x_\alpha) > Q_{j'}(\psi/2, \nu/2, x_\alpha).   (2.5.4)

(d) Differentiating (2.4.7) term by term gives

    \partial Q_j/\partial\psi = \tfrac{1}{2}[Q_{j+1}(\psi/2, \nu/2, x_\alpha) - Q_j(\psi/2, \nu/2, x_\alpha)] < 0

because of (2.5.4), showing that, other things remaining the same, Q_j is a monotonically decreasing function of \psi, provided 0 < x_\alpha < 1. In Section 3.5 we shall show that actually Q_j \to 0 as \psi \to \infty.
2.6. General Observations

On the basis of the results mentioned in the preceding section, some general comments can be made on the bias and MSE of the estimator T (with 0 < \alpha < 1).

The bias of T is

    E_\theta(T) - \mu_{11} = -\gamma_{11} Q_{3/2}(\psi/2, \nu/2, x_\alpha),   (2.6.1)

and in magnitude is

    |\gamma_{11}| Q_{3/2}(\psi/2, \nu/2, x_\alpha).   (2.6.2)

Because of (2.5.1), then,

    0 \le |E_\theta(T) - \mu_{11}| \le |\gamma_{11}|,

i.e., the bias of T lies between that of \tilde{\mu}_{11} and that of \hat{\mu}_{11}. Further, Section 2.5(d) implies that the bias becomes smaller and smaller as \psi increases from 0 to \infty.

Again, consider the MSE of T. We have

    MSE_\theta(T) = \frac{\sigma^2}{r}\Big[1 - \tfrac{1}{4}Q_{3/2} + \tfrac{\psi}{4}(2Q_{3/2} - Q_{5/2})\Big].   (2.6.4)

Hence

    MSE_\theta(T) - \frac{\sigma^2}{r}\Big(\tfrac{3}{4} + \tfrac{\psi}{4}\Big) = \frac{\sigma^2}{4r}\big[(1-\psi)(1 - Q_{3/2}) + \psi(Q_{3/2} - Q_{5/2})\big] > 0   for 0 \le \psi \le 1.   (2.6.5)

This means that in the domain 0 \le \psi \le 1 the MSE of T is necessarily greater than that of \hat{\mu}_{11}. Also

    MSE_\theta(T) - \frac{\sigma^2}{r} = \frac{\sigma^2}{4r}\big[(\psi - 1)Q_{3/2} + \psi(Q_{3/2} - Q_{5/2})\big] > 0   for 1 \le \psi < \infty.   (2.6.6)

It follows that in the domain 1 \le \psi < \infty the MSE of T is necessarily greater than that of \tilde{\mu}_{11}.

For \psi = 0,

    MSE_\theta(T) = \frac{\sigma^2}{r}\Big[1 - \tfrac{1}{4}Q_{3/2}(0, \nu/2, x_\alpha)\Big],

so that at this value of \psi the MSE is increased by taking a smaller value of x_\alpha (i.e., a larger value of \alpha).

Another result that follows from (2.5.1) is that for \psi \le 1/2

    MSE_\theta(T)/(\sigma^2/r) \le 1 - \tfrac{1}{4}(1 - 2\psi)Q_{3/2} - \tfrac{\psi}{4}Q_{5/2} < 1,   (2.6.8)

so that if \psi \le 1/2, then T is necessarily better than \tilde{\mu}_{11}.
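As an independent check on (2.6.4), the MSE of T can be estimated by simulation and compared with the series formula. The sketch below is our own Monte Carlo illustration (not in the original); it draws w, z and s^2 from the distributions stated in Section 2.3 and uses the non-central F cut-off of Section 2.2.

```python
# Illustrative Monte Carlo check of the MSE formula (2.6.4) for the 2x2 TE estimator.
import numpy as np
from scipy.stats import ncf

def mc_relative_mse(psi, r, alpha, n_rep=200_000, sigma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    nu = 4 * (r - 1)
    gamma11 = np.sqrt(psi * sigma**2 / (4 * r))       # psi = 4 r gamma_11^2 / sigma^2
    F_alpha = ncf.ppf(1 - alpha, dfn=1, dfd=nu, nc=1)
    mu11 = 0.0                                        # location does not affect the MSE
    w = rng.normal(mu11 - gamma11, np.sqrt(3 * sigma**2 / (4 * r)), n_rep)
    z = rng.normal(gamma11, np.sqrt(sigma**2 / (4 * r)), n_rep)
    s2 = sigma**2 * rng.chisquare(nu, n_rep) / nu
    T = np.where(4 * r * z**2 / s2 > F_alpha, w + z, w)
    return r * np.mean((T - mu11) ** 2) / sigma**2    # compare with 1 - Q32/4 + (psi/4)(2Q32 - Q52)
```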
2.7. Appendix: Monotonicity of Incomplete Beta Function

The incomplete beta function

    I_x(p,q) = \int_0^x y^{p-1}(1-y)^{q-1} dy \Big/ \int_0^1 y^{p-1}(1-y)^{q-1} dy,

where 0 \le x \le 1 and p, q > 0, is known to have the properties that

    I_x(p,q) > I_x(p+1,q)   and   I_x(p,q) < I_x(p,q+1),

provided 0 < x < 1 (see, e.g., Jordan (1962), p. 85). We shall show that these results can be generalized, that in fact I_x(p,q) is, for 0 < x < 1, a monotonically decreasing (increasing) function of the real variable p (of the real variable q).

Let p' > p > 0. Then, denoting the complete beta function by B(p,q), we have

    B(p,q)B(p',q)[I_x(p,q) - I_x(p',q)]
     = \int_0^x\int_0^1 y^{p-1}(1-y)^{q-1} z^{p'-1}(1-z)^{q-1} dz \, dy
       - \int_0^x\int_0^1 y^{p'-1}(1-y)^{q-1} z^{p-1}(1-z)^{q-1} dz \, dy
     = \int_0^x\int_x^1 \big[y^{p-1}z^{p'-1} - y^{p'-1}z^{p-1}\big](1-y)^{q-1}(1-z)^{q-1} dz \, dy,   (2.7.1)

the contributions from the region 0 < y, z < x cancelling by symmetry. Now, in the domain

    \{(y,z): 0 < y < x,  x < z < 1\},

we have z/y > 1, so that also

    (z/y)^{p'-p} > 1,

i.e.,

    y^{p-1}z^{p'-1} > y^{p'-1}z^{p-1}.

This implies that the expression (2.7.1) is positive and hence that

    I_x(p,q) > I_x(p',q)   (2.7.2)

if p' > p > 0. In a similar way, one can show that

    I_x(p,q) < I_x(p,q')   (2.7.3)

if q' > q > 0.
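A quick numerical illustration (ours, not part of the original) of the monotonicity just proved: for fixed 0 < x < 1, I_x(p,q) decreases in p and increases in q.

```python
# Illustrative check of the monotonicity of the incomplete beta function I_x(p, q).
from scipy.special import betainc

x = 0.3
vals_p = [betainc(p, 2.0, x) for p in (0.5, 1.5, 2.5, 3.5)]   # should be strictly decreasing in p
vals_q = [betainc(1.5, q, x) for q in (0.5, 1.5, 2.5, 3.5)]   # should be strictly increasing in q
assert all(a > b for a, b in zip(vals_p, vals_p[1:]))
assert all(a < b for a, b in zip(vals_q, vals_q[1:]))
```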
3. FURTHER STUDY OF 2x2 CASE WITH SOME NUMERICAL RESULTS

3.1. Computational Procedure

Note that

    \exp(-\psi/2)(\psi/2)^m/m!

is the probability that a Poisson r.v. with parameter \psi/2 takes the value m (m = 0,1,2,...). These terms have been tabulated, e.g., in the tables published by the Defense Systems Department of GE (1962). Also values of the incomplete beta function for various combinations of p and q and for x ranging from 0 to 1 have been tabulated in Pearson (1934). Using these two sets of tables and applying, wherever necessary, the relation

    I_x(p,q) = 1 - I_{1-x}(q,p),

we have found values of Q_{1/2}(1/2, \nu/2, x) and from these, by inverse interpolation, those of the upper \alpha-point, x_\alpha, of our test statistic

    x = \frac{F/\nu}{1 + F/\nu},

as defined in (2.2.14).

The same procedure of combining tables of the Poisson distribution with those of the incomplete beta function yields values of Q_{3/2}(\psi/2, \nu/2, x_\alpha) and Q_{5/2}(\psi/2, \nu/2, x_\alpha), whence, using (2.4.9) and (2.4.10), we have obtained the bias of T as a proportion of \gamma_{11} and rMSE(T)/\sigma^2 for various combinations of \psi, \alpha and \nu = 4(r-1). But here we had to resort entirely to the electronic computer, since the tables are not as extensive as needed for our purpose. The computer generated series of terms of the Poisson law and of values of the incomplete beta function and then combined them in the desired manner.

It is to be noted that the formulas for bias and MSE are essentially the same for each of the cell means \mu_{ij} (i = 1,2 and j = 1,2). This is because \gamma_{12} = \gamma_{21} = -\gamma_{11} and \gamma_{22} = \gamma_{11}, while the preliminary test in each case is based on

    F = 4rz^2/s^2.

Hence the biases of the estimators of all four cell means have the same magnitude but are either positive or negative, while the MSE's are all equal.

All values appearing in the tables (in this chapter and in Chapter 5) are correct to the last decimal shown.
3.2. Percentage Points of x

In order to study the variation of the bias and MSE with r, \alpha and \psi = 4r\gamma_{11}^2/\sigma^2, we have considered three values of r, viz., 2, 5 and 10, which correspond to \nu = 4, 16 and 36, respectively. For each of these values of r, we have considered procedures based on tests of levels \alpha = .01, .05, .25 and .50. The two trivial values, \alpha = 0 (which corresponds to making no test and taking \hat{\mu}_{11} as the estimate) and \alpha = 1 (which corresponds to making no test and taking \tilde{\mu}_{11} as the estimate) may also be taken into account, and for these the percentage points of x are

    x_0 = 1   and   x_1 = 0,

irrespective of the number of replications.

The percentage points x_\alpha for \alpha = .01, .05, .25 and .50 are shown in Table 3.2.1.

Table 3.2.1.  Upper \alpha-points of x (2x2 case)

  Number of              Level of significance (\alpha)
  replications (r)     .01      .05      .25      .50
       2              .908     .789     .492     .246
       5              .479     .342     .160     .0670
      10              .258     .173     .0754    .0303
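The entries of Table 3.2.1 can also be reproduced (up to rounding) directly from the non-central F distribution with \psi = 1, as in the following sketch of ours using scipy.

```python
# Illustrative reproduction of Table 3.2.1: upper alpha-points of x for r = 2, 5, 10.
from scipy.stats import ncf

for r in (2, 5, 10):
    nu = 4 * (r - 1)
    row = []
    for alpha in (0.01, 0.05, 0.25, 0.50):
        F_a = ncf.ppf(1 - alpha, dfn=1, dfd=nu, nc=1)   # upper alpha-point at psi = 1
        row.append((F_a / nu) / (1 + F_a / nu))         # x_alpha of (2.2.14a)
    print(r, [round(v, 4) for v in row])
```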
3.3. Bias as a Proportion of \gamma_{11}

We have seen in Section 2.6 that

    \gamma_{11}^{-1} E(\tilde{\mu}_{11} - T) = Q_{3/2}(\psi/2, \nu/2, x_\alpha).

Also 0 \le Q_{3/2} \le 1. Hence Q_{3/2} is the proportion of \gamma_{11} that constitutes the bias of the estimator T. The following tables show how the relative bias varies with \psi for our chosen values of r and \alpha. Note that Q_{3/2} is also the proportion of \gamma_{ij} that constitutes the bias of the estimator T_{ij} of \mu_{ij} for any i,j (i,j = 1,2).

The values corresponding to \alpha = 0 and \alpha = 1 are not shown here. For these two cases, the biases are \gamma_{11} and 0, irrespective of r and \psi, since the estimator T equals \hat{\mu}_{11} in one case and \tilde{\mu}_{11} in the other.

Table 3.3.1.  Values of \gamma_{11}^{-1} E(\tilde{\mu}_{11} - T)  (r = 2)

   \psi       .01      .05      .25      .50
   0         .985     .923     .608     .260
   0.2       .983     .915     .586     .243
   0.4       .981     .907     .565     .228
   0.6       .979     .899     .544     .213
   1.0       .975     .883     .505     .186
   1.4       .971     .866     .468     .163
   2.0       .964     .842     .418     .133
   3.0       .952     .800     .344     .095
   4.0       .939     .758     .283     .068
   6.0       .912     .676     .189     .034
  10.0       .850     .524     .082     .009

Table 3.3.2.  Values of \gamma_{11}^{-1} E(\tilde{\mu}_{11} - T)  (r = 5)

   \psi       .01      .05      .25      .50
   0         .987     .925     .588     .233
   0.2       .983     .912     .561     .216
   0.4       .980     .899     .534     .200
   0.6       .976     .885     .509     .186
   1.0       .967     .858     .461     .159
   1.4       .958     .830     .416     .137
   2.0       .942     .786     .357     .109
   3.0       .911     .711     .275     .074
   4.0       .874     .637     .210     .050
   6.0       .790     .498     .120     .023
  10.0       .601     .278     .037     .005

Table 3.3.3.  Values of \gamma_{11}^{-1} E(\tilde{\mu}_{11} - T)  (r = 10)

   \psi       .01      .05      .25      .50
   0         .988     .926     .577     .228
   0.2       .984     .912     .548     .211
   0.4       .980     .897     .520     .195
   0.6       .975     .882     .494     .181
   1.0       .965     .851     .444     .155
   1.4       .954     .820     .399     .132
   2.0       .934     .770     .339     .104
   3.0       .895     .687     .256     .070
   4.0       .849     .605     .192     .047
   6.0       .744     .453     .106     .021
  10.0       .516     .229     .031     .004
From the above tables it is seen that, for fixed r and \alpha, the bias decreases as \psi increases from zero. This is just a verification of what we proved theoretically in Section 2.5(d), viz., that Q_{3/2} is a monotonically decreasing function of \psi. For a given \alpha, the bias may start (at \psi = 0) at a slightly higher level for a higher value of r. But the rate at which it decreases with increasing \psi is higher, the higher the value of r; and, on the whole, there is a definite gain in taking a larger number of replications.

As regards the effect of the level of significance \alpha, it is seen that as \alpha increases, r and \psi being kept fixed, the bias decreases. This, again, is a verification of the result in Section 2.5(b) that Q_{3/2} is a monotonically decreasing function of \alpha. The relative bias may be considered moderate (even when \psi is small) for a value of \alpha like .50 or even .25. However, the main subject of our investigation is not the bias but the MSE, which will be discussed in the next section.
3.4. Relative Mean Square Error: MSE(T)/(\sigma^2/r)

Formula (2.6.4) shows that for fixed r, \alpha and \psi, the MSE of the proposed estimator T is given by

    \frac{r \, MSE(T)}{\sigma^2} = 1 - \tfrac{1}{4}Q_{3/2} + \tfrac{\psi}{4}(2Q_{3/2} - Q_{5/2}).   (3.4.1)

In this section we shall study the behaviour of this ratio for varying r, \alpha and \psi.

We first note that for \alpha = 0, x_\alpha = 1 (the case that corresponds to taking T equal to \hat{\mu}_{11}, the least squares estimator under the no-interaction model), Q_{3/2} = Q_{5/2} = 1 for all r and \psi; hence

    \frac{MSE(T)}{\sigma^2/r}\Big|_{\alpha=0} = \tfrac{3}{4} + \tfrac{\psi}{4}.

On the other hand, for \alpha = 1, x_\alpha = 0 (the case that corresponds to taking T equal to \tilde{\mu}_{11}, the least squares estimator under the interaction model), Q_{3/2} = Q_{5/2} = 0 for all r and \psi; hence

    \frac{MSE(T)}{\sigma^2/r}\Big|_{\alpha=1} = 1,   identically in \psi.

We have, therefore, excluded \alpha = 1 from the following tables, where we give, for r = 2, 5 and 10 and \psi = 0, 0.2, 0.4, 0.6, 1.0, 1.4, 2.0, 3.0, 4.0, 6.0 and 10.0, the values of the ratio for five levels of significance, viz., \alpha = 0, .01, .05, .25 and .50.

Table 3.4.1.  Values of the ratio MSE(T)/(\sigma^2/r)  (r = 2)

   \psi       0        .01      .05      .25      .50
   0         .75      .754     .769     .848     .935
   0.2       .80      .804     .821     .894     .959
   0.4       .85      .855     .872     .936     .981
   0.6       .90      .905     .922     .976    1.000
   1.0      1.00     1.005    1.020    1.047    1.031
   1.4      1.10     1.105    1.115    1.109    1.055
   2.0      1.25     1.252    1.252    1.185    1.078
   3.0      1.50     1.495    1.463    1.273    1.096
   4.0      1.75     1.731    1.652    1.324    1.097
   6.0      2.25     2.183    1.966    1.352    1.078
  10.0      3.25     2.993    2.135    1.272    1.047

Table 3.4.2.  Values of the ratio MSE(T)/(\sigma^2/r)  (r = 5)

   \psi       0        .01      .05      .25      .50
   0         .75      .753     .769     .853     .942
   0.2       .80      .805     .824     .901     .965
   0.4       .85      .857     .878     .946     .985
   0.6       .90      .908     .932     .987    1.003
   1.0      1.00     1.011    1.035    1.058    1.030
   1.4      1.10     1.113    1.133    1.116    1.050
   2.0      1.25     1.264    1.270    1.182    1.069
   3.0      1.50     1.506    1.468    1.247    1.080
   4.0      1.75     1.733    1.624    1.272    1.077
   6.0      2.25     2.123    1.819    1.253    1.056
  10.0      3.25     2.590    1.845    1.140    1.020

Table 3.4.3.  Values of the ratio MSE(T)/(\sigma^2/r)  (r = 10)

   \psi       0        .01      .05      .25      .50
   0         .75      .753     .769     .856     .943
   0.2       .80      .805     .825     .905     .966
   0.4       .85      .857     .880     .949     .986
   0.6       .90      .909     .934     .989    1.003
   1.0      1.00     1.013    1.039    1.059    1.030
   1.4      1.10     1.117    1.138    1.115    1.050
   2.0      1.25     1.269    1.275    1.178    1.067
   3.0      1.50     1.511    1.468    1.237    1.077
   4.0      1.75     1.734    1.615    1.256    1.073
   6.0      2.25     2.098    1.777    1.229    1.052
  10.0      3.25     2.436    1.727    1.118    1.018
3.5. Comments on MSE

An examination of the values tabulated in Section 3.4 indicates how the MSE of T changes with r, \alpha and \psi.

First, for fixed r and \alpha (0 < \alpha < 1), as \psi increases from zero the quantity rMSE(T)/\sigma^2 increases for a time, reaches a maximum and then tends to decrease. For small \alpha, this tendency to decrease may not be apparent from Tables 3.4.1-3.4.3, since we have not considered sufficiently large \psi-values. However, the fact that this is so will follow from Theorem 3.5.1. We first state the following lemma, which is proved in Section 3.8:

Lemma 3.5.1. For a > 0, \alpha \ne 0,

    \lim_{\psi \to \infty} \psi^a Q_j(\psi/2, q, x_\alpha) = 0.

Since rMSE(T|\alpha)/\sigma^2 - 1 is a linear combination of Q_{3/2}, \psi Q_{3/2} and \psi Q_{5/2}, we immediately have from the lemma

Theorem 3.5.1. For \alpha > 0,  rMSE(T|\alpha)/\sigma^2 \to 1  as  \psi \to \infty.

Secondly, for given \alpha and \psi, the change in rMSE(T)/\sigma^2 is only slight for a change in r. Indeed, in the range from \psi = 0 to \psi = 4.0, this ratio is almost the same for r = 2 as for r = 10. This is not to say that the MSE itself remains constant. On the contrary, the MSE varies almost as \sigma^2/r does: it varies almost inversely as r. Also, for higher values of \psi the ratio itself decreases appreciably as r increases.

Lastly, for fixed r and \psi, the MSE is highly sensitive to a change in \alpha (see Figure 3.5.1).

[Figure 3.5.1. Variation of rMSE/\sigma^2 with \psi for different significance levels (with r = 5).]

We found in Section 2.6 that for 0 < \alpha < 1 the MSE is greater than (3/4 + \psi/4)\sigma^2/r if \psi \le 1 and greater than \sigma^2/r if \psi > 1. The higher the value of \alpha, the higher generally is the difference between the MSE and (3/4 + \psi/4)\sigma^2/r for \psi \le 1, but the lower is the difference between the MSE and \sigma^2/r on the larger part of the range (1,\infty). Further, the maximum regret 1/ in (0,1] increases with increasing \alpha, while the maximum regret in (1,\infty) decreases with increasing \alpha. The value of \psi at which this second maximum occurs also draws nearer to \psi = 1 as \alpha approaches unity. We can, however, show that this \psi must always be greater than 5/2. In fact, using the result from Section 2.5(d) that \partial Q_j/\partial\psi = (Q_{j+1} - Q_j)/2, we have

    8 \, \frac{\partial}{\partial\psi}\Big[\frac{rMSE(T)}{\sigma^2}\Big] = (5 - 2\psi)Q_{3/2} + 3(\psi - 1)Q_{5/2} - \psi Q_{7/2}.

But, since Q_{7/2} \le Q_{5/2},

    (5 - 2\psi)Q_{3/2} + 3(\psi - 1)Q_{5/2} - \psi Q_{7/2} \ge (5 - 2\psi)Q_{3/2} + (2\psi - 3)Q_{5/2}
     = (5 - 2\psi)(Q_{3/2} - Q_{5/2}) + 2Q_{5/2} \ge 2Q_{5/2} > 0   if \psi \le 5/2,

showing that the point at which the derivative is zero (giving a maximum) must be to the right of 5/2.

It is thus seen that, on the whole, by choosing a low level of significance we reduce the MSE for \psi \in (0,1], while we increase the MSE for \psi \in (1,\infty). On the other hand, a high level of significance, by and large, leads to a slight increase in the MSE for \psi \in [0,1] but to a comparatively big reduction for \psi \in (1,\infty). This would seem understandable from the nature of the power function of the test we are using. Note that this power function is monotonically increasing, as a function of either \psi or \alpha. Because of what we said in Section 1.1, w may be called the 'proper estimator' for \psi \le 1 and w + z the 'proper estimator' for \psi > 1. The probability of choosing the proper statistic then equals 1 - (value of the power function) for \psi \le 1 and equals the power itself for \psi > 1. The monotonic character of the power function, therefore, implies that the probability of choosing the proper statistic is smaller (larger) in [0,1] and larger (smaller) in (1,\infty), the larger (smaller) the value of \alpha.

1/ The difference rMSE(T|\alpha)/\sigma^2 - \min_\alpha rMSE(T|\alpha)/\sigma^2 may be called our 'regret' resulting from our choice of \alpha. It is a function of the parameter vector \theta (in the present case of \psi alone).
3.6. Optimal Choice of \alpha

We have seen that when the test procedure suggested in (2.2.5) is used, the ratio rMSE(T)/\sigma^2 depends on \psi, r and \alpha. The non-centrality parameter, \psi, is generally an unknown quantity and hence is beyond the experimenter's control. The number of replications, r, also is in most cases determined by the resources available to the experimenter. However, the level of significance of the preliminary test, \alpha, is completely within his control, and he may choose any \alpha that he considers appropriate. The aim, of course, is to keep the ratio rMSE(T)/\sigma^2 at a low level. The difficulty stems from the fact that an \alpha that is good for some \psi-value may be bad for some other. As always in decision theory, the problem is how to cope with this non-uniformity. We may suggest four approaches, which are discussed below.

(a) Bayes Approach: Suppose that the experimenter has some prior knowledge of the likely magnitude of \psi and that this knowledge can be mathematically represented by a distribution function, say \xi(\psi). In that case the value of \alpha (or, equivalently, of x_\alpha) that minimizes the average

    ER(\psi) = \int_0^\infty R(\psi) \, d\xi(\psi),   (3.6.1)

where

    R(\psi) = rMSE(T)/\sigma^2,   (3.6.2)

may be said to be optimal for the purpose of the preliminary test.

We illustrate this procedure of choosing \alpha with a gamma-type prior distribution on \psi with parameters \beta > 0 and s > 0. Substituting (2.4.7) and (2.4.10) into (3.6.1) and integrating term by term, ER becomes a series whose m-th term involves \Gamma(s+m+1)\beta^s/(1+\beta)^{s+m+1} together with I_{x_\alpha}(m+3/2, \nu/2) and I_{x_\alpha}(m+5/2, \nu/2). Equating the derivative of ER with respect to x_\alpha to zero and making some simplifications, we find that the optimal x_\alpha (hence the optimal \alpha) is given by an equation involving the hypergeometric function {}_2F_1(\nu/2 + 5/2, s+1; 5/2; \cdot). For given \beta and s, this equation may be solved by using some method of iteration. Since our choice of the prior distribution was rather arbitrary, we shall not pursue this matter.

However, some general observations may be made without going into the mathematics of a Bayes solution. Thus, if the experimenter's prior knowledge indicates that the probability is very high that \psi < 1, he should choose a small \alpha. On the other hand, if this probability is small, then an \alpha near to unity ought to be selected.
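A purely numerical version of the Bayes approach can avoid the closed-form derivative altogether: average R(\psi) over the prior on a grid and pick the \alpha with the smallest average. The sketch below is ours; the grid, the prior parameters and the candidate \alpha's are illustrative assumptions, not values used in the thesis.

```python
# Illustrative sketch: Bayes choice of alpha by numerically averaging R(psi) = r MSE(T)/sigma^2
# over a gamma prior on psi (shape s, rate beta are illustrative).
import numpy as np
from math import exp
from scipy.stats import gamma, ncf
from scipy.special import betainc

def Q(j, psi, nu, x_a, terms=100):
    total, weight = 0.0, exp(-psi / 2)
    for m in range(terms):
        total += weight * betainc(m + j, nu / 2, x_a)
        weight *= (psi / 2) / (m + 1)
    return total

def expected_R(alpha, r, s=2.0, beta=1.0, n_grid=400):
    nu = 4 * (r - 1)
    F_a = ncf.ppf(1 - alpha, dfn=1, dfd=nu, nc=1)
    x_a = (F_a / nu) / (1 + F_a / nu)
    psis = np.linspace(1e-4, 40, n_grid)
    R = np.array([1 - Q(1.5, p, nu, x_a) / 4
                  + (p / 4) * (2 * Q(1.5, p, nu, x_a) - Q(2.5, p, nu, x_a)) for p in psis])
    weights = gamma.pdf(psis, a=s, scale=1 / beta)
    return np.trapz(R * weights, psis)

best_alpha = min((0.01, 0.05, 0.25, 0.50), key=lambda a: expected_R(a, r=5))
```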
(b) Minimax Approach: In the absence of any prior information regarding the likely magnitude of \psi, one may like to guard against the worst and choose \alpha in such a way that

    \max_{\psi \ge 0} \, rMSE(T)/\sigma^2   (3.6.5)

is a minimum. The result (2.6.6) then shows that this 'minimax \alpha' is \alpha = 1. In other words, according to this approach, we should use the estimate \tilde{\mu}_{11}, irrespective of the sample observations. Since we had set out to investigate if we could improve upon \tilde{\mu}_{11}, this result is not very interesting to us. Of course, if the nature of a concrete problem calls for a choice of a loss function leading to (3.6.5) and for the application of the minimax criterion, then the estimator \tilde{\mu}_{11} is the obvious choice.
(c) Minimizing the L1-norm of the Regret Function: Consider for each \psi the regret

    s(\psi,\alpha) = rMSE(T|\alpha)/\sigma^2 - \min_\alpha rMSE(T|\alpha)/\sigma^2.   (3.6.6)

Evidently, this regret function equals

    s(\psi,\alpha) = rMSE(T|\alpha)/\sigma^2 - (3/4 + \psi/4)   for \psi \in [0,1],
    s(\psi,\alpha) = rMSE(T|\alpha)/\sigma^2 - 1               for \psi \in (1,\infty).   (3.6.7)

An approach that might seem appealing would be to choose an \alpha that minimizes the total area

    \int_0^\infty s(\psi,\alpha) \, d\psi

under the regret-function curve 1/. However, the following theorem shows that this approach (of minimizing the L1-norm of the regret function) again leads to the estimator \tilde{\mu}_{11}, regardless of the observations taken.

Theorem 3.6.1: The level of significance \alpha for which \int_0^\infty s(\psi,\alpha) d\psi is a minimum is \alpha = 1.

Proof:

    \int_0^\infty s(\psi,\alpha) d\psi
     = \int_0^1 \{rMSE/\sigma^2 - (3/4 + \psi/4)\} d\psi + \int_1^\infty (rMSE/\sigma^2 - 1) d\psi
     = \tfrac{1}{4}\int_0^1 \{(1-\psi)(1 - Q_{3/2}) + \psi(Q_{3/2} - Q_{5/2})\} d\psi
       + \tfrac{1}{4}\int_1^\infty \{(\psi-1)Q_{3/2} + \psi(Q_{3/2} - Q_{5/2})\} d\psi
     = \tfrac{1}{4}\int_0^1 (1-\psi) d\psi + \tfrac{1}{4}\int_0^\infty \{(\psi-1)Q_{3/2} + \psi(Q_{3/2} - Q_{5/2})\} d\psi
     = \tfrac{1}{8} + \tfrac{1}{2}\sum_{m=0}^\infty \big[(2m+1)I_{x_\alpha}(m+3/2,\nu/2) + 2(m+1)\{I_{x_\alpha}(m+3/2,\nu/2) - I_{x_\alpha}(m+5/2,\nu/2)\}\big]
     \ge \tfrac{1}{8},

the equality holding iff x_\alpha = 0, i.e., iff \alpha = 1. A remark similar to the one made at the end of Subsection (b) pertains to this result.

1/ The difference between our approach and that of Huntsberger (1955) in his Theorem 2 is to be noted. Our approach would correspond to Huntsberger's if we minimized \int_0^\infty (rMSE/\sigma^2 - 1) d\psi.
(d) Minimax Regret Approach (minimizing the Chebyshev norm of the regret function): Another approach which might be suggested, and which would seem quite appropriate in the absence of any prior information regarding the variability of \psi, would be to choose \alpha in such a manner as to minimize

    \max_\psi s(\psi,\alpha).

An exact determination of this 'minimax regret \alpha' presents some problems because of the complicated expression for the regret function and has not been attempted. As for a promising suggestion for numerical work, see the end of Section 3.7.

Anyway, it is seen from Tables 3.4.1-3.4.3 that this minimax regret \alpha lies between .25 and .50. A value less than .25 makes the maximum in (1,\infty) too large, while a value greater than .50 increases the maximum in [0,1] to too high a level.

Apart from any theoretical criteria, it would appear from the tables that a value of \alpha between .25 and .50 (perhaps even closer to .50)
;:; "S'1 + '21 E
also effects a satisfactory degree of control over the MSE throughout the range of \psi-values. Moreover, for such a value of \alpha the relative bias of the estimator is also only of a moderate magnitude.

Note that in our discussion we have been concerned with a test for the hypothesis

    H_0: \psi \le 1

against the alternative

    H_1: \psi > 1.

However, the usual tables refer to tests for

    H_0: \psi = 0

against the alternative

    H_1: \psi > 0.

Noting that in both cases the critical regions have the form 4rz^2/s^2 > constant, and that the power function for such a critical region is monotonically increasing in \psi, it is possible to convert the values of \alpha related to our test into levels of significance related to the usual ANOVA test. This requires only the determination of the value of the power function at \psi = 0 when the value at \psi = 1 is \alpha. Since the power function of our test is

    P(\psi) = Pr[4rz^2/s^2 > F_\alpha | \psi],

it means that we have to find the value of P(0) when P(1) = \alpha. We show these values corresponding to \alpha = .25 and \alpha = .50 in the following table, for the set of r-values considered in the tables for bias and MSE.

Table 3.6.1.  Levels of significance for ANOVA test corresponding to \alpha = .25 and .50

      r      \alpha = .25    \alpha = .50
      2          .12             .32
      5          .10             .30
     10          .10             .30

Thus we may conclude that for r between 2 and 10 (and likely beyond) the preliminary test may be conducted as the usual ANOVA test, however, at a level of significance between .10 and .30, most likely closer to .30 than to .10. This would seem to corroborate the suggestion made by Anderson (1960), p. 52, that the preliminary ANOVA test be made at about the 25% level. One thing is clear: the customary levels of significance, .01 and .05, are completely unsuitable for a TE procedure in the 2x2 case, unless there is prior information that \psi is very likely to lie in [0,1]. For such a value may reduce the MSE appreciably in (0,1] but will make it inordinately high in (1,\infty).
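The conversion behind Table 3.6.1 amounts to evaluating, at \psi = 0, the power of the region whose cut-off was set at \psi = 1. The sketch below is ours and should reproduce the table up to rounding.

```python
# Illustrative computation of Table 3.6.1: the central-F (ANOVA) level of significance
# whose critical value equals the psi = 1 cut-off used in the TE procedure.
from scipy.stats import ncf, f

def anova_level(alpha, r):
    nu = 4 * (r - 1)
    F_a = ncf.ppf(1 - alpha, dfn=1, dfd=nu, nc=1)   # cut-off set at psi = 1
    return f.sf(F_a, 1, nu)                         # probability of rejection at psi = 0

for r in (2, 5, 10):
    print(r, round(anova_level(0.25, r), 2), round(anova_level(0.50, r), 2))
```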
3.7. An Alternative Formula for MSE

The formula for the MSE of T in terms of the Q function will not be suitable for many purposes. One would like to have an expression for it in a closed form rather than in the form of an infinite series. In this section a general method is suggested, and the actual formula obtained for the special case of r = 2.
The ratio rMSE/\sigma^2 can be written in the form

    \frac{rMSE}{\sigma^2} = 1 + \frac{\sqrt{\psi}}{2\sqrt{2\pi}}\exp(-\psi/2)\int_0^\infty \exp(-u/2)\sinh(\sqrt{\psi u})\{1 - \gamma(x, \nu/2)\} du
     - \frac{1}{4\sqrt{2\pi}}\exp(-\psi/2)\int_0^\infty \sqrt{u}\,\exp(-u/2)\cosh(\sqrt{\psi u})\{1 - \gamma(x, \nu/2)\} du,   (3.7.1)

where u = 4rz^2/\sigma^2, x = \nu u/(2F_\alpha), \sqrt{\psi} = \sqrt{4r}\,\gamma_{11}/\sigma, and \gamma(x, \nu/2) is the incomplete gamma function

    \gamma(x, \nu/2) = \int_0^x \exp(-t) t^{\nu/2-1} dt \big/ \Gamma(\nu/2).

Now put s = \nu/2 = 2(r-1), an even integer. Then

    1 - \gamma(x, s) = \exp(-x)\sum_{i=0}^{s-1} x^i/i!,

so that each of the integrals in (3.7.1) reduces to a finite sum of integrals of the form

    \int_0^\infty u^c \exp\{-u(1/2 + \nu/(2F_\alpha))\}\exp(\pm\sqrt{\psi u}) du.

On substituting y for \sqrt{u} and completing the square in the exponent, every such integral becomes a Gaussian-type integral of a polynomial in y, i.e., an expression of the form

    \int_{-\infty}^{\infty} \exp\Big\{-\tfrac{1}{2}\big[(y-1)^2 + 2sy^2/F_\alpha\big]\Big\} P(y) \, dy,

where P(y) is a polynomial whose coefficients involve s, F_\alpha and \psi, the odd powers of y arising from the sinh term and the even powers from the cosh term. Such integrals are evaluated in closed form by a further linear substitution in y together with the result

    \int_{-\infty}^{\infty} \exp\big(-\tfrac{1}{2}t^2/x_\alpha\big) t^n dt = 0   if n is odd,
     = (2x_\alpha)^{(n+1)/2}\,\Gamma\big(\tfrac{n+1}{2}\big)   if n is even.

Collecting terms gives a closed-form expression, (3.7.6), for rMSE/\sigma^2 involving only elementary functions of \psi, F_\alpha and s. Thus, e.g., for the case r = 2, s = 2, formula (3.7.6) gives an explicit closed-form expression for rMSE/\sigma^2.
Note that, for small \nu and for some analytical purposes, a formula of this type can be more easily used than the infinite series in (2.6.4); e.g., it gives a direct proof of Theorem 3.5.1. (We have retained the earlier proof because it hardly needs any modification in the general context of a p x q experiment, and the generalization of the present formula is bound to be cumbersome.) For numerical purposes too, it would be more convenient to use than formula (2.6.4), at least with the usual number of replications, since in that case it would require the evaluation of considerably fewer terms. Thus it is much easier to use in locating the point on the \psi-axis for which the MSE is a maximum or in the determination of the minimax regret \alpha.

An iterative procedure for determining the minimax regret \alpha is suggested here. First note that for each \alpha the regret function has two local maxima, one for \psi \in [0,1] and the other for \psi \in (1,\infty). Our problem is then to determine the \alpha for which the larger of the two is least. Supposing we draw a graph for each of the local maxima as a function of \alpha, we are to find, proceeding from \alpha = 1 backwards, the abscissa of the first point of intersection of the two curves. The idea of proceeding from \alpha = 1 backwards is suggested by the results from Tables 3.4.1-3.4.3, which seem to indicate that the maximum in (1,\infty) is a monotonically decreasing function of \alpha.
3.8. Appendix: Proof of Lemma 3.5.1

From (2.7.3), for \alpha > 0, i.e., for x_\alpha < 1, Q_j(\psi/2, q, x_\alpha) is a monotonically increasing function of q. If q_0 = [q+1], the highest integer contained in q+1, we then have

    Q_j(\psi/2, q, x_\alpha) \le Q_j(\psi/2, q_0, x_\alpha).   (3.8.1)

Now

    Q_j(\psi/2, q_0, x_\alpha) = \exp(-\psi/2)\sum_{m=0}^\infty \frac{(\psi/2)^m}{m!} I_{x_\alpha}(m+j, q_0)
     = \exp(-\psi/2)\int_0^{x_\alpha}(1-y)^{q_0-1}\sum_{m=0}^\infty \frac{(\psi/2)^m}{m!}\frac{y^{m+j-1}}{B(m+j, q_0)} \, dy,

the change of the order of summation and integration being admissible by virtue of Theorem 21B in Halmos (1950). Also

    \sum_{m=0}^\infty \frac{(\psi/2)^m}{m!}\frac{y^{m+j-1}}{B(m+j, q_0)}
     = \frac{1}{\Gamma(q_0)}\sum_{m=0}^\infty \frac{(\psi/2)^m}{m!}(m+j)(m+j+1)\cdots(m+j+q_0-1)\,y^{m+j-1}
     = \frac{1}{\Gamma(q_0)}\frac{d^{q_0}}{dy^{q_0}}\Big\{y^{j+q_0-1}\exp(\psi y/2)\Big\}
     = \frac{1}{\Gamma(q_0)}\exp(\psi y/2)\sum_{i=0}^{q_0} a_i\, y^{j+q_0-1-i},

where the a_i are constants obtained from Leibniz's rule, each a_i containing the factor (\psi/2)^{q_0-i}, i = 0,1,...,q_0.

Hence

    Q_j(\psi/2, q_0, x_\alpha) \le \exp\{-\psi(1 - x_\alpha)/2\}\sum_{i=0}^{q_0} b_i (\psi/2)^i,   (3.8.2)

where the b_i are constants not depending on \psi. Since \sum_i b_i(\psi/2)^i is a polynomial in \psi, the right-hand side of (3.8.2) \to 0, implying that Q_j(\psi/2, q, x_\alpha) also \to 0, as \psi \to \infty; and so does \psi^a Q_j(\psi/2, q, x_\alpha) for a > 0.
4. THEORETICAL ASPECTS OF pxq CASE

4.1. Notation

As in the 2x2 case, here also we shall denote by \alpha (0 \le \alpha \le 1) the level of significance at which any preliminary test is made. T_{ij} will denote the estimator of \mu_{ij}, the population mean for the cell (i,j). This T_{ij} equals \tilde{\mu}_{ij} in the case of significance at the preliminary test and equals \hat{\mu}_{ij} otherwise.

In line with (2.1.1), we shall write

    w_{ij} = \bar{y} + (\bar{y}_{i.} - \bar{y}) + (\bar{y}_{.j} - \bar{y})

and

    z_{ij} = \bar{y}_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}.   (4.1.1)

w_{ij} is the same as \hat{\mu}_{ij}, and z_{ij} is the same as \hat{\gamma}_{ij}, the least squares estimator of \gamma_{ij} under the interaction model (1.1.2).

With (1.1.2) supposed to be the appropriate model for our experiment, we shall denote by s_1^2 the interaction mean square and by \nu_1 its d.f. The sample error variance will be denoted by s^2 and its d.f. by \nu. Thus

    \nu_1 = (p-1)(q-1),   \nu = pq(r-1),   (4.1.2)

and

    s_1^2 = r\sum_i\sum_j z_{ij}^2/\nu_1,   s^2 = \sum_i\sum_j\sum_k (y_{ijk} - \bar{y}_{ij})^2/\nu.   (4.1.3)
4.2. Estimation of a Single Cell Mean

When we proceed to the case of factors, A and B, at least one of which has more than two levels, the problem of the estimation of any particular cell mean and that of the estimation of all pq cell means simultaneously are to be distinguished from each other. The first problem is discussed in the present section mainly for the sake of completeness. It is not to be expected that the need for it will often arise in practice, since usually one will conduct a different type of experiment if one is interested only in one treatment combination.

As we stated in Chapter 1, we have taken low MSE as the criterion of a good estimator, and we are making a choice between the two sets of estimators, \tilde{\mu}_{ij} and \hat{\mu}_{ij}, on the basis of this criterion.

Looking at (1.1.12), we see that, when p = q = 2, the procedure for estimating a cell mean is the same for all four cells. In fact, in this case, since \gamma_{ij}^2 = \gamma_{11}^2, the hypotheses are the same for all four cells, and since z_{ij}^2 = z_{11}^2, the corresponding tests of significance are also the same for all four cells. For the same reasons, the simultaneous estimation of all four means follows the same procedure.

However, the above results do not hold when p > 2 and/or q > 2, and so the estimation of a single cell mean and that of all cell means simultaneously will then be entirely different problems.

Suppose our problem is to estimate a single cell mean \mu_{ij}. Referring back to (1.1.12), we would choose either \hat{\mu}_{ij} or \tilde{\mu}_{ij}, according as

    \gamma_{ij}^2 \le \frac{\sigma^2}{r}\cdot\frac{(p-1)(q-1)}{pq}

or

    \gamma_{ij}^2 > \frac{\sigma^2}{r}\cdot\frac{(p-1)(q-1)}{pq},

if \gamma_{ij} and \sigma^2 were known quantities. As these are unknown, we make our choice on the basis of a test for

    H_0: \frac{pqr\,\gamma_{ij}^2}{(p-1)(q-1)\sigma^2} \le 1

against the alternative

    H_1: \frac{pqr\,\gamma_{ij}^2}{(p-1)(q-1)\sigma^2} > 1.

The UMP invariant test of level \alpha for H_0, under appropriate transformation groups (vide Lehmann (1959), Chapter 7, Section 5), has the rejection region

    \frac{pqr\,z_{ij}^2}{(p-1)(q-1)s^2} > F_\alpha(1,\nu),

where F_\alpha(1,\nu) is the upper \alpha-point of the non-central F distribution with (1,\nu) d.f. and non-centrality parameter

    \frac{pqr\,\gamma_{ij}^2}{(p-1)(q-1)\sigma^2} = 1.

The problem and the expressions being similar to those in Chapters 2 and 3, we can say off-hand that the findings of those chapters in essence hold here also, if we bear in mind that here pqr\,\gamma_{ij}^2/[(p-1)(q-1)\sigma^2] replaces \psi = 4r\gamma_{11}^2/\sigma^2 and \nu = pq(r-1) replaces \nu = 4(r-1).
4.3. Simultaneous Estimation of Cell Means

When our problem is to estimate all pq cell means simultaneously, we need some method of combining the pq MSE's to obtain a single criterion of estimation. This mode of combining the MSE's is bound to be more or less arbitrary. We shall consider weighted averages of MSE's. This means, in effect, that we have in mind loss functions of the form

    c\sum_{i=1}^{p}\sum_{j=1}^{q} w_{ij}(T_{ij} - \mu_{ij})^2,   (4.3.1)

where c > 0 may depend on the parameters 1/, w_{ij} > 0 and \sum_i\sum_j w_{ij} = 1. The weight w_{ij} will then indicate the importance of the cell (i,j) in the whole set. In a real-world situation, it might be the proportion, among all users of the factors A and B, of those who use the combination (A_i B_j). The risk function corresponding to (4.3.1) is the weighted average of MSE's (to be denoted by AMSE),

    AMSE = \sum_i\sum_j w_{ij} E(T_{ij} - \mu_{ij})^2,   (4.3.2)

except for the multiplier c. Here T_{ij} is the estimator of \mu_{ij} under a given estimation procedure.

1/ In the sequel c will depend on \sigma (see Section 5.5) alone.

The minimization of the AMSE may then be taken as the basis of a choice between rival estimation procedures. For the least squares procedure under the full model, T_{ij} = \bar{y}_{ij} = \tilde{\mu}_{ij}. On the other hand, for the least squares procedure under the no-interaction model,

    T_{ij} = \bar{y} + (\bar{y}_{i.} - \bar{y}) + (\bar{y}_{.j} - \bar{y}) = \hat{\mu}_{ij}.

Now

    AMSE(\tilde{\mu}) = \sum_i\sum_j w_{ij}\,\sigma^2/r = \sigma^2/r,   (4.3.3)

while

    AMSE(\hat{\mu}) = \frac{\sigma^2}{r}\Big[1 - \frac{(p-1)(q-1)}{pq}\Big] + \sum_i\sum_j w_{ij}\gamma_{ij}^2.   (4.3.4)

Formulae (4.3.3) and (4.3.4) suggest the following TE procedure 1/:

(1) First, test the hypothesis

    H_0: \sum_i\sum_j w_{ij}\gamma_{ij}^2 \le \frac{\sigma^2}{r}\cdot\frac{(p-1)(q-1)}{pq}

against the alternative

    H_1: \sum_i\sum_j w_{ij}\gamma_{ij}^2 > \frac{\sigma^2}{r}\cdot\frac{(p-1)(q-1)}{pq}.

(2) In case significance is found in step (1), estimate each \mu_{ij} by the corresponding \tilde{\mu}_{ij}; otherwise estimate each \mu_{ij} by the corresponding \hat{\mu}_{ij}.

1/ In this work we have only considered the choice between neglecting the interactions for all cell means and including the interactions for all cell means. A question for future research would be how to allow for cases where interactions may be neglected for some cell means and included for others.
However, a satisfactory test for H_0 against H_1 (for unequal weights) is not available. Even the heuristic test, viz., the one based on the statistic

    pqr\sum_i\sum_j w_{ij} z_{ij}^2 \big/ s^2,

presents a difficult distribution problem, especially because the distribution will not depend solely on a single non-centrality parameter; nor does the situation seem such as would make it meaningless to be too fastidious on this point. In step (1) we shall, therefore, be concerned with a test for

    H_0': r\sum_i\sum_j \gamma_{ij}^2/\sigma^2 \le (p-1)(q-1)

against

    H_1': r\sum_i\sum_j \gamma_{ij}^2/\sigma^2 > (p-1)(q-1).   (4.3.6)

In this way, we shall make the preliminary test independent, as it were, of the choice of the weighting system.

Note that H_0' and H_1' correspond to the traditional ANOVA test for interactions, except for the aspect of material significance. (In traditional ANOVA we test the hypothesis r\sum\sum\gamma_{ij}^2/\sigma^2 = 0 against the alternative r\sum\sum\gamma_{ij}^2/\sigma^2 > 0.) Although we shall apply this simplification to the test procedure in step (1), we shall not apply it to the loss function to be used in assessing the AMSE of the estimators used in step (2); there we shall still allow the weights attached to the different cells to be unequal.

As we shall see, the properties (the bias and AMSE) of the estimators T_{ij} will then depend on two parameters,

    \psi = r\sum_i\sum_j \gamma_{ij}^2/\sigma^2   and   \psi^* = pqr\sum_i\sum_j w_{ij}\gamma_{ij}^2/\sigma^2.   (4.3.8)
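The simultaneous TE procedure itself (test once, then estimate every cell) is easy to mechanize. The sketch below is ours and is only illustrative; it takes the critical value of the interaction ratio s_1^2/s^2 as a parameter, which is obtained from the material-significance test developed in Section 4.5.

```python
# Illustrative sketch of the simultaneous TE procedure for a p x q factorial:
# use the full-model cell means if the interaction mean square is materially
# significant, and the additive (no-interaction) fit otherwise.
import numpy as np

def te_cell_means(y, crit):
    """y: array of shape (p, q, r); crit: upper alpha-point of s1^2/s^2 (see Section 4.5)."""
    p, q, r = y.shape
    ybar = y.mean(axis=2)                              # \bar{y}_ij
    g = ybar.mean()                                    # \bar{y}
    row = ybar.mean(axis=1, keepdims=True) - g         # \hat{alpha}_i
    col = ybar.mean(axis=0, keepdims=True) - g         # \hat{beta}_j
    additive = g + row + col                           # \hat{mu}_ij
    z = ybar - additive                                # \hat{gamma}_ij
    s1_sq = r * (z ** 2).sum() / ((p - 1) * (q - 1))   # interaction mean square
    s_sq = ((y - ybar[..., None]) ** 2).sum() / (p * q * (r - 1))   # error mean square
    return ybar if s1_sq / s_sq > crit else additive
```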
4.4.
Some Relations between
j
l.J l.J
\jI and \jI*
Obviously,
Since
'" >
- r max
..
1.,J
2
2
1'.l.J./ cr
we then have
(4.4.1)
\jI ~ \jI*/pq.
In case either p or q is 2, the inequality
(4.4.1) can be improved upon;
for then there will be two pairs (i,j) for which
we shall have
From
(i)
\jI
>
\jI*/pq
1':l.J.'s
are a maximum, and
(4.4.2)
(4.4.1) it follows that:
(4.4.3)
if \jI • 0, then \jI* • 0;
and
i f \jI* ~
(ii)
00,
then \jI
(4.4.4)
~ 00 •
Also
where
- •
w
E Ew .. /pq
i
and
2'
I'
j
l.J
2
•
= l/pq
E E I'ij/pq .
62
We have, therefore, the following rough interpretation:
If the weights are all equal (. l/pq) , then
w*. W.
If generally
2
larger weights go with larger 7 'S , then W* > W. On the other hand,
1J
if generally larger weights go with smaller 7~J'S, then W* < W.
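These relations are easy to verify numerically. The following sketch (not from the thesis) checks the covariance identity and inequality (4.4.1) for an arbitrary, hypothetical set of weights and interactions.

```python
# Sketch: numerical check of the psi / psi* relations of Section 4.4
# for hypothetical w_ij and gamma_ij.
import numpy as np

rng = np.random.default_rng(0)
p, q, r, sigma2 = 3, 4, 3, 1.0

# Hypothetical interactions: double-centred effects (rows and columns sum to 0).
g = rng.normal(size=(p, q))
gamma = g - g.mean(axis=0) - g.mean(axis=1, keepdims=True) + g.mean()
# Hypothetical weights summing to one.
w = rng.random((p, q)); w /= w.sum()

psi = r * np.sum(gamma**2) / sigma2                     # (4.3.7)
psi_star = p * q * r * np.sum(w * gamma**2) / sigma2    # (4.3.8)
wbar, g2bar = w.mean(), (gamma**2).mean()
cov_form = psi + p * q * r / sigma2 * np.sum((w - wbar) * (gamma**2 - g2bar))

print(np.isclose(psi_star, cov_form))   # covariance identity of Section 4.4
print(psi >= psi_star / (p * q))        # inequality (4.4.1)
```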
4.5. Preliminary Test Procedure

In choosing a satisfactory test for H'₀ against H'₁, our purpose is, of course, the minimization of the AMSE (see equation (4.3.2)) uniformly over the parameter space. However, for the same reasons as those mentioned in Section 2.2, we shall be using a test that is best from the point of view of power rather than of AMSE.

Referring to Lehmann (1959), we see that invariance considerations demand that a test for H'₀ should be based on the statistic

    s₁²/s² ,   where   s₁² = r Σ_i Σ_j z²_ij / ν₁ .

The UMP invariant test of level α has the rejection region

    s₁²/s² > C ,

where C is so chosen that the probability of rejection is α at the boundary ψ = (p−1)(q−1) of H'₀ (vide Lehmann (1959) and Hodges and Lehmann (1954)).

Now s₁²/s² has the non-central F distribution with ν₁ and ν d.f. and with non-centrality parameter ψ. Let us denote this statistic by F(ν₁, ν; ψ), or simply F.

As in Chapter 2, it will be convenient to deal with the statistic

    x = ν₁F/(ν + ν₁F) ,

say, rather than with F, and with

    x_α = ν₁F_α/(ν + ν₁F_α) ,

say, rather than with F_α. Since the probability element of F(ν₁, ν; ψ) is a Poisson mixture of central F densities,

    Σ_{m=0}^{∞} e^{−ψ/2} (ψ/2)^m / m! · f_{ν₁+2m, ν}(F) dF    (4.5.6)

(see, e.g., Graybill (1961)), that of x is

    Σ_{m=0}^{∞} e^{−ψ/2} (ψ/2)^m / m! · x^{ν₁/2+m−1}(1−x)^{ν/2−1} / B(ν₁/2+m, ν/2) dx .    (4.5.7)

Hence x_α is given by

    Σ_{m=0}^{∞} e^{−ν₁/2} (ν₁/2)^m / m! · I_{x_α}(ν₁/2 + m, ν/2) = 1 − α ,

i.e., by

    Q_{ν₁/2}(ν₁/2, ν/2; x_α) = 1 − α ,    (4.5.8)

using a notation introduced in Section 2.4.
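Equation (4.5.8) can be solved for x_α by a simple one-dimensional search, since Q is increasing in x. The following is a minimal sketch (not from the thesis), taking Q_c(a, b; x) = Σ_m e^{−a} a^m/m! I_x(c+m, b) as in the Section 2.4 notation and using the regularized incomplete beta function from SciPy.

```python
# Sketch: solving (4.5.8) for x_alpha by bisection.
import math
from scipy.special import betainc

def Q(c, a, b, x, terms=400):
    """Poisson(a) mixture of incomplete beta functions I_x(c+m, b)."""
    total, w = 0.0, math.exp(-a)
    for m in range(terms):
        total += w * betainc(c + m, b, x)
        w *= a / (m + 1)
    return total

def x_alpha(p, q, r, alpha, tol=1e-10):
    nu1, nu = (p - 1) * (q - 1), p * q * (r - 1)
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # Q is increasing in x; we need Q(nu1/2, nu/2; x_alpha) = 1 - alpha.
        if Q(nu1 / 2, nu1 / 2, nu / 2, mid) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(x_alpha(3, 3, 3, 0.25), 3))   # Table 5.1.1 gives .396 for this case
```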
4.6. Distribution of the Estimator of a Cell Mean

Considering any of the pq cell means, let us study the nature of the distribution of its estimator, the estimation being based on the results of the preliminary test mentioned above. Without any loss of generality, we may consider the case with i = j = 1, for which the cell mean is μ₁₁ and the estimator is T₁₁. Note that T₁₁ is defined by:

    T₁₁ = w₁₁           if F ≤ F_α ,
        = w₁₁ + z₁₁     if F > F_α ,    (4.6.1)

where w₁₁ and z₁₁ are as given by (4.1.1). The probability element of T₁₁ is then

    g₁(t | F ≤ F_α) Pr(F ≤ F_α) dt + g₂(t | F > F_α) Pr(F > F_α) dt ,    (4.6.2)

where g₁ and g₂ are the conditional density functions of w₁₁ and w₁₁ + z₁₁ respectively.

Now w₁₁ and z₁₁ are independently distributed, the former as

    N(μ₁₁ − γ₁₁ , (σ²/r)[1 − (p−1)(q−1)/pq])

and the latter as

    N(γ₁₁ , (σ²/r)·(p−1)(q−1)/pq) .

As to the test statistic F = s₁²/s², we know that νs²/σ² has the (central) χ² distribution with ν d.f. and is distributed independently of w₁₁ and z₁₁. Before taking up s₁², let us state the following generalization of the Cochran-Fisher theorem, first proved by Madow (1940). We state it in the form given by Graybill (1961):
Theorem 4.6.1:1/ Let Y₁, Y₂, ..., Yₙ be independent random variables, Y_i being distributed as N(μ_i, 1), and let

    Y'Y = Σ_{i=1}^{k} Y'A_iY .

Then a necessary and sufficient condition for Y'A_iY to be distributed as a non-central χ² with n_i d.f. and non-centrality parameter λ_i, where n_i is the rank of A_i and λ_i = μ'A_iμ, and for the Y'A_iY (i = 1, 2, ..., k) to be mutually independent, is that

    Σ_{i=1}^{k} n_i = n .    (4.6.3)

It will be convenient to consider ν₁s₁²/σ² to be composed of two parts:

    u = pqr z²₁₁ / [(p−1)(q−1)σ²]    (4.6.4)

and

    U = ν₁s₁²/σ² − u .    (4.6.5)

1/ Here Y' = (Y₁, Y₂, ..., Yₙ), μ' = (μ₁, μ₂, ..., μₙ), and Y, μ are the corresponding column vectors. A_i (i = 1, 2, ..., k) is an n×n symmetric matrix.
It is known that ν₁s₁²/σ² has the non-central χ² distribution with ν₁ d.f. and non-centrality parameter

    r Σ_i Σ_j γ²_ij / σ² = ψ .

On the other hand, u has the non-central χ² distribution with 1 d.f. and non-centrality parameter

    pqr γ²₁₁ / [(p−1)(q−1)σ²] = ψ₁ (say).    (4.6.6)

By making a suitable orthogonal transformation on the variables Y_ijk (i = 1, 2, ..., p; j = 1, 2, ..., q; k = 1, 2, ..., r), we can then show from Theorem 4.6.1 that U has the non-central χ² distribution with ν₁ − 1 d.f. and non-centrality parameter

    ψ₂ = ψ − ψ₁ ,    (4.6.7)

and that U is independent of z₁₁. Also both z₁₁ and U are independent of w₁₁ and s².
Hence (4.6.2) may be written as
hl(t)
+
J~r
h2(Zll)h3(U)h4(s2)dZlldUd(s2)dt
fff
h5(t,zll)h3(U)h4(s2)dZlldU d(s2)dt
C
R
•
hl(t)
JSJ
R
+ h (t,.)dt 5
hz(Zll)h (U)h 4 (s2)dZ lldU d(s2)dt
3
fSS
h5(t,zll)h3(U)h4(s2)dZlldU d(s2)dt
R
• h (t,.)dt +
5
,fff
R
[hl(t) - h; (tlzll)]h2(Zll)h3(U)h4(s2)
dZ
ll
dUd(s2)dt
(4.6.8)
67
Here h l , h2 , h and h 4 are the density functions of wll,zll' U and s
3
2
h (t, Zll) is the j oint dens ity of t .. wll + zll and zll' h ( t , ) the
5
5
marginal density of t • w + zll' and h * the conditional density of t
l
5
given Zllj the region of integration R is defined by
and Rc is the complement of R.
By reductions similar to those in
Section 2.3 the above expression may be put in tne form:
2
-
i-
2
0
2
(21f0 /r) - exp [- (t-J..L ll ) /2r-Jdt
+
JJJ [(21f
p+q-l
pq
2
rfL
)- !
exp
(-
t-J..L ll + 711 )2/2. p+q-l
pq
2
fL}
r
R
2
n):-
~
~ exp
(-
t-J..Lll+ 7 11
)2/2
p+q-l
'pq
x
4.7. Expected Value and MSE of the Estimator of a Cell Mean, and AMSE

From the above density function, we obtain, putting

    J_k = ∫∫∫_R z₁₁^k h₂(z₁₁) h₃(U) h₄(s²) dz₁₁ dU d(s²)    (4.7.1)

(k is a positive integer),

    E(T₁₁) = μ₁₁ − J₁    (4.7.2)

and

    E(T₁₁ − μ₁₁)² = σ²/r + 2γ₁₁J₁ − J₂ .    (4.7.3)

These results are analogous to those in Chapter 2, except for the variate U in (4.7.1), which was zero in the earlier case.
Now using ψ₁, ψ₂ and ψ, as defined in equations (4.6.6), (4.6.7) and (4.3.7), we make the following reductions. Expressing the density of z₁₁ in terms of u (see (4.6.4)) and expanding it in a power series, J_k becomes a sum of integrals of the form

    ∫∫∫_R exp(−u/2) u^{m + (k−1)/2} h₃(U) h₄(s²) du dU d(s²) ,

with coefficients involving exp(−ψ₁/2)(ψ₁/2)^m/m!. For k = 1, the terms corresponding to even values of m vanish; when k = 2, the terms corresponding to odd values of m vanish. Also (cf. (4.6.7))

    h₃(U) = exp(−ψ₂/2) Σ_{n=0}^{∞} (ψ₂/2)^n / n! · g_{ν₁−1+2n}(U) ,

where g_f denotes the central χ² density with f d.f. Substituting this expansion, putting U' = u + U, and integrating term by term,1/ each of the resulting integrals reduces, in the manner of Section 2.4, to an expression in the function G and, after integration over s², to the function Q defined in (2.4.5); the Poisson weights in ψ₁ and ψ₂ combine into weights in ψ = ψ₁ + ψ₂. The final results are

    J₁ = γ₁₁ Q_{ν₁/2 + 1}(ψ/2, ν/2; x_α)    (4.7.7)

and

    J₂ = [(p−1)(q−1)/pq](σ²/r) Q_{ν₁/2 + 1}(ψ/2, ν/2; x_α) + γ²₁₁ Q_{ν₁/2 + 2}(ψ/2, ν/2; x_α) .    (4.7.8)

1/ The validity of the change of the order of integration and summation follows from Theorem 27B in Halmos (1950).
Substituting these in (4.7.2) and (4.7.3), we have

    E_ψ(T₁₁) = μ₁₁ − γ₁₁ Q_{ν₁/2 + 1}(ψ/2, ν/2; x_α)    (4.7.9)

and

    E_ψ(T₁₁ − μ₁₁)² = (σ²/r) {[1 − ((p−1)(q−1)/pq) Q_{ν₁/2 + 1}(ψ/2, ν/2; x_α)]
                      + (rγ²₁₁/σ²)[2Q_{ν₁/2 + 1}(ψ/2, ν/2; x_α) − Q_{ν₁/2 + 2}(ψ/2, ν/2; x_α)]} .    (4.7.10)
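The bias (4.7.9) and MSE (4.7.10) are readily evaluated once the Q function is available. The following sketch (not from the thesis) assumes, as above, that Q_c(a, b; x) is the Poisson mixture Σ_m e^{−a} a^m/m! I_x(c+m, b) of Section 2.4.

```python
# Sketch: bias and MSE of the TE estimator of a cell mean, (4.7.9)-(4.7.10).
import math
from scipy.special import betainc

def Q(c, a, b, x, terms=400):
    total, w = 0.0, math.exp(-a)        # Poisson(a) weights, updated recursively
    for m in range(terms):
        total += w * betainc(c + m, b, x)
        w *= a / (m + 1)
    return total

def bias_and_mse_T11(p, q, r, sigma2, gamma11, psi, x_a):
    nu1, nu = (p - 1) * (q - 1), p * q * (r - 1)
    q1 = Q(nu1 / 2 + 1, psi / 2, nu / 2, x_a)
    q2 = Q(nu1 / 2 + 2, psi / 2, nu / 2, x_a)
    bias = -gamma11 * q1
    mse = (sigma2 / r) * ((1 - nu1 / (p * q) * q1)
                          + (r * gamma11**2 / sigma2) * (2 * q1 - q2))
    return bias, mse

# Hypothetical 3x3 example with r = 3, sigma^2 = 1, x_a taken from Table 5.1.1.
print(bias_and_mse_T11(3, 3, 3, 1.0, gamma11=0.4, psi=2.4, x_a=0.396))
```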
The expectation and MSE of the estimator of any other cell mean have similar expressions. Generally, for any μ_ij (i = 1, 2, ..., p; j = 1, 2, ..., q), whose estimator is T_ij, we have

    E_ψ(T_ij) = μ_ij − γ_ij Q_{ν₁/2 + 1}(ψ/2, ν/2; x_α) ,

with an MSE given by (4.7.10) with γ₁₁ replaced by γ_ij, and hence

    AMSE_ψ = Σ_{i=1}^{p} Σ_{j=1}^{q} w_ij E_ψ(T_ij − μ_ij)²
           = (σ²/r) {[1 − ((p−1)(q−1)/pq) Q_{ν₁/2 + 1}(ψ/2, ν/2; x_α)]
             + (ψ*/pq)[2Q_{ν₁/2 + 1}(ψ/2, ν/2; x_α) − Q_{ν₁/2 + 2}(ψ/2, ν/2; x_α)]} ,    (4.7.11)

where ψ* is as defined in (4.3.8).

4.8. General Observations
Note that expressions (4.7.9) and (4.7.11) are similar to the corresponding expressions in Chapter 2, viz., (2.4.9) and (2.4.10). The points of difference are that now we have Q_{ν₁/2+1} and Q_{ν₁/2+2} instead of Q_{3/2} and Q_{5/2}; (p−1)(q−1)/pq instead of 1/4; and ψ* as well as ψ. In fact, the expressions in Chapter 2 may be looked upon as special forms of (4.7.9) and (4.7.11), for in the 2x2 case

    (p−1)(q−1)/pq = 1/4 ,   ν₁ = 1   and   ψ* = ψ .

Hence the observations made in Section 2.6 have a straightforward generalization to the present situation. Thus for 0 < α < 1, the bias satisfies

    0 < |E_ψ(T_ij) − μ_ij| = |γ_ij| Q_{ν₁/2 + 1}(ψ/2, ν/2; x_α) < |γ_ij| ,    (4.8.1)

and the bias tends to zero as ψ tends to ∞. Also

    AMSE_ψ/(σ²/r) − {[1 − (p−1)(q−1)/pq] + ψ*/pq}
        = ((p−1)(q−1)/pq)[1 − Q_{ν₁/2+1}] − (ψ*/pq)[1 − 2Q_{ν₁/2+1} + Q_{ν₁/2+2}]
        ≥ 0 ,   if 0 ≤ ψ* ≤ (p−1)(q−1) ,    (4.8.2)

and

    AMSE_ψ/(σ²/r) − 1
        = (1/pq)[ψ* − (p−1)(q−1)] Q_{ν₁/2+1} + (ψ*/pq)[Q_{ν₁/2+1} − Q_{ν₁/2+2}]
        ≥ 0 ,   if ψ* ≥ (p−1)(q−1) ,    (4.8.3)

where the arguments (ψ/2, ν/2; x_α) of the Q functions have been suppressed. This means (cf. equations (4.3.3) and (4.3.4)) that in the domain 0 ≤ ψ* ≤ (p−1)(q−1) the set of estimators T_ij is necessarily worse than the set of estimators μ̃_ij, while in the domain ψ* > (p−1)(q−1) the set of estimators T_ij is necessarily worse than the set of estimators μ̂_ij.

For ψ = 0,

    AMSE₀ = (σ²/r)[1 − ((p−1)(q−1)/pq) Q_{ν₁/2 + 1}(0, ν/2; x_α)] .    (4.8.4)

Because of the properties of the Q function (see Section 2.5), this means that the ratio at ψ = ψ* = 0 is smaller, the larger the value of x_α. However, with varying ψ (but ψ* = 0), for a given x_α, the ratio increases as ψ increases.
4.9. An Alternative Approach

For the simultaneous estimation of all cell means, we have considered above the minimization of the AMSE as our goal. An alternative approach would be to choose a procedure that keeps the largest of the pq MSE's

    E_ψ(T_ij − μ_ij)²   (i = 1, 2, ..., p; j = 1, 2, ..., q)

at a low level.

Now, for the estimators μ̂_ij all the MSE's are equal:

    MSE_ψ(μ̂_ij) = σ²/r    (4.9.1)

for all (i, j). On the other hand, for the estimators μ̃_ij the largest MSE is

    Max_{i,j} γ²_ij + [1 − (p−1)(q−1)/pq] σ²/r .    (4.9.2)

The expressions (4.9.1) and (4.9.2) suggest the following TE procedure:

(1) First, make a test for the hypothesis

    H₀ : Max_{i,j} γ²_ij ≤ (σ²/r)·(p−1)(q−1)/pq

against the alternative

    H₁ : Max_{i,j} γ²_ij > (σ²/r)·(p−1)(q−1)/pq .

(2) In case H₀ is rejected in step (1), take the μ̂_ij as the estimates of the μ_ij; otherwise take the μ̃_ij as the estimates.

However, here too we have the problem of developing a suitable test for H₀. Even the heuristic test given by the rejection region

    Max_{i,j} z²_ij > C

raises too difficult a distribution problem, especially so because the distribution of the statistic Max_{i,j} z²_ij is not dependent solely on the parameter Max_{i,j} γ²_ij.
5. FURTHER STUDY OF pxq CASE WITH SOME NUMERICAL RESULTS

5.1. Percentage Points of x

In the pxq case, with the preliminary test given in Section 4.5, the study of the bias for a single cell mean will go along the same lines as for the 2x2 case. Of greater interest is the variation of the AMSE with varying p and q, varying ψ and ψ*, and also with varying r and α.

For the numerical part of the work, we have considered two pairs of values of p and q, viz., p=q=3 and p=q=5, two values of r, viz., r=3 and r=5, and three non-trivial levels of significance, viz., α=.25, .50 and .75. The critical values of the test statistic x corresponding to α=.25, .50 and .75, for the chosen values of p, q and r, are shown in Table 5.1.1. Here, again, the two trivial values of α, viz., α=0 and α=1, have not been considered, since the corresponding percentage points are

    x₀ = 1    (5.1.1)

and

    x₁ = 0 ,    (5.1.2)

irrespective of p, q and r (r ≥ 2).

5.2. Bias of T_ij as a Proportion of γ_ij

From (4.7.9) the estimator T_ij of the cell mean μ_ij has expectation

    E(T_ij) = μ_ij − γ_ij Q_{ν₁/2 + 1}(ψ/2, ν/2; x_α) .
Table 5.1.1. Upper α-points of the statistic x

    Size of experiment          Level of significance (α)
                                 .25      .50      .75
    p=q=3   r=3                 .396     .290     .193
            r=5                 .238     .167     .107
    p=q=5   r=3                 .446     .386     .328
            r=5                 .281     .238     .198
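The entries of Table 5.1.1 can be checked independently through the non-central F distribution, since x = ν₁F/(ν + ν₁F) and the critical point is computed at the boundary ψ = (p−1)(q−1) of H'₀. The following sketch (not from the thesis) uses SciPy's ncf distribution for this purpose.

```python
# Sketch: checking Table 5.1.1 via the noncentral F distribution.
from scipy.stats import ncf

for (p, q, r) in [(3, 3, 3), (3, 3, 5), (5, 5, 3), (5, 5, 5)]:
    nu1, nu = (p - 1) * (q - 1), p * q * (r - 1)
    row = []
    for alpha in (0.25, 0.50, 0.75):
        f = ncf.ppf(1 - alpha, nu1, nu, nc=nu1)   # upper alpha-point of F(nu1, nu; nu1)
        row.append(round(nu1 * f / (nu + nu1 * f), 3))
    print(p, q, r, row)   # e.g. p=q=3, r=3 should give about .396, .290, .193
```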
Hence Q_{ν₁/2 + 1} is the proportion of γ_ij that constitutes the bias of T_ij. We present the values of this relative bias,

    γ_ij⁻¹ E(μ_ij − T_ij) = Q_{ν₁/2 + 1}(ψ/2, ν/2; x_α)

(which is the same for all i, j), for our chosen values of p, q, r and α in Tables 5.2.1 and 5.2.2. Note that for α = 0 the bias of T_ij is −γ_ij, and for α = 1 the bias is zero, irrespective of p, q and r.

As in the 2x2 case, here also we have chosen eleven values of ψ. Note that a ψ for the 2x2 case is not comparable to the same ψ-value for, say, the 3x3 case. However, the values ψ/(p−1)(q−1) would seem to be comparable (cf. the formula for the expectation of the interaction mean square; also the null hypothesis is ψ ≤ 1 (see (2.2.1)) in the 2x2 case, as opposed to the null hypothesis ψ ≤ (p−1)(q−1) (see (4.3.5) and (4.3.7)) in the pxq case). We have, therefore, multiplied the ψ-values for the 2x2 case by 4 and 16, respectively, to obtain comparable ψ-values for the 3x3 and 5x5 cases.
Table 5.2.1. Values of γ_ij⁻¹ E(μ_ij − T_ij)  (p=q=3)

              r = 3                         r = 5
    ψ       α=.25   α=.50   α=.75        α=.25   α=.50   α=.75
    0       .876    .661    .359         .888    .673    .363
    0.8     .830    .589    .296         .840    .596    .296
    1.6     .783    .522    .243         .788    .524    .241
    2.4     .734    .460    .199         .735    .457    .195
    4.0     .635    .352    .131         .626    .343    .126
    5.6     .540    .265    .086         .522    .251    .080
    8.0     .410    .168    .044         .382    .153    .040
    12.0    .243    .074    .014         .209    .062    .012
    16.0    .135    .031    .004         .105    .024    .003
    24.0    .035    .005    .000         .022    .003    .000
    40.0    .002    .000    .000         .001    .000    .000
Table 5.2.2. Values of γ_ij⁻¹ E(μ_ij − T_ij)  (p=q=5)

              r = 3                         r = 5
    ψ       α=.25   α=.50   α=.75        α=.25   α=.50   α=.75
    0       .987    .938    .804         .992    .955    .837
    3.2     .963    .865    .664         .971    .886    .695
    6.4     .921    .767    .522         .931    .787    .544
    9.6     .861    .655    .391         .867    .667    .402
    16.0    .697    .429    .195         .653    .412    .190
    22.4    .508    .247    .085         .360    .195    .072
    32.0    .264    .089    .020         .075    .036    .011
    48.0    .059    .011    .001         .001    .001    .000
    64.0    .009    .001    .000         .000    .000    .000
    96.0    .000    .000    .000         .000    .000    .000
    160.0   .000    .000    .000         .000    .000    .000
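The columns of these tables are values of Q_{ν₁/2+1}(ψ/2, ν/2; x_α) over the ψ grid. The following brief sketch (not from the thesis) recomputes one column; the small Q helper is repeated here so that the snippet is self-contained, and the critical value x_α = .396 is taken from Table 5.1.1.

```python
# Sketch: relative bias Q_{nu1/2+1}(psi/2, nu/2; x_alpha), cf. Table 5.2.1, col. 1.
import math
from scipy.special import betainc

def Q(c, a, b, x, terms=400):
    total, w = 0.0, math.exp(-a)
    for m in range(terms):
        total += w * betainc(c + m, b, x)
        w *= a / (m + 1)
    return total

p, q, r, x_a = 3, 3, 3, 0.396               # alpha = .25 case of Table 5.1.1
nu1, nu = (p - 1) * (q - 1), p * q * (r - 1)
for psi in (0, 0.8, 1.6, 2.4, 4.0, 5.6, 8.0, 12.0, 16.0, 24.0, 40.0):
    print(psi, round(Q(nu1 / 2 + 1, psi / 2, nu / 2, x_a), 3))
```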
On examining these tables, we find that the observations we made in connection with the bias in Section 3.3 hold here as well. Thus the bias decreases as ψ increases. The change in the bias for changing r is only slight. On the other hand, a change in the level of significance effects a substantial change in the bias. The magnitude of the relative bias seems to be dependent on the numbers of factor levels, p and q. Thus, while for p=q=3 the relative bias corresponding to α = .75 may be said to be only moderate, for p=q=5 even with such a high level of significance the relative bias remains quite high, at least towards the beginning of the range of ψ-values. It would thus seem that, if keeping the relative bias under control is one of our objectives, then α should be increased progressively with p and q.
5.3. Relative AMSE: AMSE/(σ²/r)

As we stated earlier, our main goal is to minimize the average mean square error (AMSE) when we are trying to estimate all pq cell means simultaneously. Here we shall study the variation of the AMSE, or rather of the ratio (see equation (4.7.11))

    AMSE/(σ²/r) = 1 − ((p−1)(q−1)/pq) Q_{ν₁/2 + 1}(ψ/2, ν/2; x_α)
                  + (ψ*/pq)[2Q_{ν₁/2 + 1}(ψ/2, ν/2; x_α) − Q_{ν₁/2 + 2}(ψ/2, ν/2; x_α)] .    (5.3.1)

The values of this ratio corresponding to our chosen values of p, q, r and α are shown in Tables 5.3.1 - 5.3.4. Note that the AMSE depends on

    ψ* = pqr Σ_i Σ_j w_ij γ²_ij / σ²

as well as on

    ψ = r Σ_i Σ_j γ²_ij / σ² .

Hence we need, for each set of p, q, r and α values, a two-way table, with ψ varying between the columns, say, and ψ* varying between the rows. For ψ* we have taken the same eleven values as for ψ. Not all the 11 × 11 combinations are possible, however, as shown in Section 4.4. The cells corresponding to the inadmissible combinations have been left blank in the tables.1/
Besides α = .25, .50 and .75, the two trivial values, α = 0 and α = 1, may also be taken into account. However, for α = 0, x₀ = 1 and the Q functions in (5.3.1) equal unity, so that

    AMSE = [{1 − (p−1)(q−1)/pq} + ψ*/pq] σ²/r ,

irrespective of ψ and r; while for α = 1, x₁ = 0 and the Q functions vanish, so that

    AMSE = σ²/r

for all p, q, r, ψ and ψ*.

1/ In Tables 5.3.1 - 5.3.4 the columns for ψ = 40.0 are omitted for lack of space and because only in two columns are some values different from 1.000. These columns correspond to p=q=3, α=.25, and even there the largest difference is .009 for r=3 and .004 for r=5.
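The ratio (5.3.1) is a simple function of ψ, ψ* and the Q function. The following sketch (not from the thesis) evaluates it for arbitrary arguments and checks one corner entry of Table 5.3.1; the Q helper is restated for self-containedness.

```python
# Sketch: the relative AMSE (5.3.1) as a function of psi and psi*.
import math
from scipy.special import betainc

def Q(c, a, b, x, terms=400):
    total, w = 0.0, math.exp(-a)
    for m in range(terms):
        total += w * betainc(c + m, b, x)
        w *= a / (m + 1)
    return total

def amse_ratio(p, q, r, psi, psi_star, x_a):
    nu1, nu = (p - 1) * (q - 1), p * q * (r - 1)
    q1 = Q(nu1 / 2 + 1, psi / 2, nu / 2, x_a)
    q2 = Q(nu1 / 2 + 2, psi / 2, nu / 2, x_a)
    return 1 - nu1 / (p * q) * q1 + psi_star / (p * q) * (2 * q1 - q2)

# Corner entry of Table 5.3.1 (p=q=3, r=3, alpha=.25): psi = psi* = 0.
print(round(amse_ratio(3, 3, 3, 0.0, 0.0, 0.396), 3))   # about .611
```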
Table 5.3.1. Values of the ratio AMSE/(σ²/r)  (p=q=3, r=3)
[Two-way table: ψ* in rows, ψ in columns, one panel each for α = .25, .50 and .75.]

Table 5.3.2. Values of the ratio AMSE/(σ²/r)  (p=q=3, r=5)
[Two-way table: ψ* in rows, ψ in columns, one panel each for α = .25, .50 and .75.]

Table 5.3.3. Values of the ratio AMSE/(σ²/r)  (p=q=5, r=3)
[Two-way table: ψ* in rows, ψ in columns, one panel each for α = .25, .50 and .75.]

Table 5.3.4. Values of the ratio AMSE/(σ²/r)  (p=q=5, r=5)
[Two-way table: ψ* in rows, ψ in columns, one panel each for α = .25, .50 and .75.]
5.4. Comments on the Values of AMSE

Let us now examine the values tabulated in Section 5.3 to see how the AMSE is affected by changing r, α, ψ and ψ*.

First, consider the variation of the AMSE with ψ and ψ*, α (0 < α < 1) and r being kept fixed. If ψ is kept fixed and ψ* allowed to vary, the AMSE increases monotonically with ψ*. This, of course, is apparent from equation (4.7.11) as well. On the other hand, if ψ* is kept fixed and ψ allowed to vary, then the behavior of the AMSE is a bit irregular. It is seen that for small ψ*, the AMSE increases monotonically with ψ and tends to unity; for large ψ*, the AMSE decreases monotonically as ψ increases; while the tables seem to indicate that for some intermediate ψ*-values, as ψ increases the AMSE increases for a time, reaches a maximum and then tends to decrease. Corresponding to Theorem 3.5.1, here also we have the following theorem, which can be proved using Lemma 3.5.1:

Theorem 5.4.1. r AMSE(T|α)/σ² → 1 as ψ or ψ* → ∞, provided α > 0.

Proof: That r AMSE/σ² → 1 as ψ → ∞ is obtained directly from Lemma 3.5.1. On the other hand, if ψ* → ∞, then ψ → ∞ (cf. equation (4.4.4)), and from (4.4.1)

    ψ* Q_{ν₁/2 + 1}(ψ/2, ν/2; x_α) ≤ (pqψ) Q_{ν₁/2 + 1}(ψ/2, ν/2; x_α) → 0

for α > 0, so that, because of (4.7.11), r AMSE/σ² → 1.

Secondly, as was true for the 2x2 case, the change in r AMSE/σ² corresponding to a change in r is rather negligible.

Lastly, for fixed r, ψ and ψ*, the AMSE is highly sensitive to a change in α. The general effect of a higher level of significance is to raise the value of the AMSE toward the lower part of the range of ψ*-values, i.e., for ψ* ≤ (p−1)(q−1), and to reduce the value of the AMSE toward the other part of the range, i.e., for ψ* > (p−1)(q−1).

If one uses equal weights, then ψ* = ψ, and the behavior of the ratio r AMSE/σ² can be seen from the principal diagonals (marked for convenience) of the above tables. It will be found that in this particular case the ratio behaves very much like the quantity r MSE/σ² for a 2x2 experiment (see Section 3.5).
5.5. Optimal Choice of α

Here, too, some suggestions may be made as to the choice of a suitable level of significance for the preliminary test. One may again consider the four approaches we talked about in Section 3.6:

(a) Bayes Approach: Note that since the ratio AMSE/(σ²/r) is dependent on both ψ and ψ*, to use this approach we must start with a joint prior distribution of ψ and ψ*. We do not attempt to say anything more than the following, because the choice of the prior distribution is bound to be arbitrary: It is apparent from Tables 5.3.1 - 5.3.4 that if one's prior knowledge indicates that ψ* is very likely to be less than (p−1)(q−1), then a relatively small value of α should be used for the preliminary test; while in case the prior information indicates that ψ* is not so likely to be small, then a high value, perhaps one near to unity, should be selected.

(b) Minimax Approach: In the absence of any prior knowledge about ψ and ψ*, one might like to choose α in such a way that

    Max_{ψ, ψ*} r AMSE/σ²

is a minimum. The result (4.8.3) implies that this minimax level of significance is α = 1, so that according to this approach one should choose the estimators μ̂_ij, irrespective of the sample observations. A remark similar to the one made at the end of Subsection (b) of Section 3.6 pertains to this result.
(c) Minimizing the L₁-norm of the Regret Function: This approach will be discussed for the case ψ = ψ*. The regret function analogous to (3.6.6) is now

    s(ψ, α) = r AMSE_ψ(T|α)/σ² − Min_α r AMSE_ψ(T|α)/σ² ,

and it equals

    r AMSE_ψ(T|α)/σ² − [{1 − (p−1)(q−1)/pq} + ψ/pq]   for ψ ≤ (p−1)(q−1) ,
    r AMSE_ψ(T|α)/σ² − 1                               for ψ ≥ (p−1)(q−1) .

As in the 2x2 case, this approach leads to the choice of α = 1 again.

(d) Minimax Regret Approach: We shall take the special case of equal weights, when ψ = ψ*. The result (4.8.2) shows that for any ψ ≤ (p−1)(q−1)

    Inf_α r AMSE_ψ(T|α)/σ² = {1 − (p−1)(q−1)/pq} + ψ/pq ,

which value is attained in case α = 0. The result (4.8.3) shows that for any ψ > (p−1)(q−1)

    Inf_α r AMSE_ψ(T|α)/σ² = 1 ,

which value is attained in case α = 1. Table 5.3.5 lists the numerical values of these expressions for p=q=3 and p=q=5 and the same eleven ψ-values as were used in Tables 5.3.1 - 5.3.4.
Table 5.3.5. Values of Inf_α r AMSE_ψ(T|α)/σ²

        p=q=3                     p=q=5
    ψ         value           ψ         value
    0         .556            0         .360
    0.8       .644            3.2       .488
    1.6       .733            6.4       .616
    2.4       .823            9.6       .744
    4.0       1.000           16.0      1.000
    5.6       1.000           22.4      1.000
    8.0       1.000           32.0      1.000
    12.0      1.000           48.0      1.000
    16.0      1.000           64.0      1.000
    24.0      1.000           96.0      1.000
    40.0      1.000           160.0     1.000
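In the equal-weights case the infimum has the simple closed form given above, so Table 5.3.5 can be reproduced directly. A minimal sketch (not from the thesis):

```python
# Sketch: Inf over alpha of r*AMSE/sigma^2 for equal weights (psi* = psi),
# from (4.8.2)-(4.8.3); reproduces Table 5.3.5.
def inf_ratio(p, q, psi):
    nu1 = (p - 1) * (q - 1)
    if psi <= nu1:                        # infimum attained at alpha = 0
        return (1 - nu1 / (p * q)) + psi / (p * q)
    return 1.0                            # infimum attained at alpha = 1

for psi in (0, 0.8, 1.6, 2.4, 4.0):
    print(psi, round(inf_ratio(3, 3, psi), 3))   # .556, .644, .733, .823, 1.000
```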
Considering the differences between the entries in the principal diagonals of Tables 5.3.1 - 5.3.4 and the corresponding entries in Table 5.3.5, we find that the level of significance minimizing

    Max_ψ s(ψ, α)

is some value between .25 and .50 (closer to .50). On a comparison of this result with the corresponding result for the 2x2 case in Section 3.6, it would seem that the minimax regret α is at least approximately independent of p and q.
A fortunate aspect, in the case of equal weights, of this choice of α between .25 and .50 (nearer to .50) is that not only is the maximum regret minimized, but also the resulting maximum is brought to a low level, so that a sufficient degree of control over the AMSE throughout the range of ψ-values is obtained. (The situation is similar to that in the 2x2 case; see Section 3.6.) By looking at the tables, one can say that such a choice would be useful even when unequal weights are used, provided larger weights go with smaller interactions. But in case ψ* is much larger than ψ, such an α will make the AMSE too high. In a practical situation one is not likely to know the relative magnitudes of ψ and ψ*. The proper course then would be to control the AMSE throughout the ψ-ψ* table by choosing a suitably high value of α; evidently, in this case, the larger ψ* is relative to ψ, the higher the required value of α will be.
Let us now convert the values of α, related to our test for

    H₀ : ψ ≤ (p−1)(q−1)

against

    H₁ : ψ > (p−1)(q−1) ,

into levels of significance to be used in corresponding ANOVA tests for

    H₀ : ψ = 0

against

    H₁ : ψ > 0 .

The level of significance of the ANOVA test, corresponding to an α used in our test for material significance, is

    1 − Q_{ν₁/2}(0, ν/2; x_α) = 1 − I_{x_α}(ν₁/2, ν/2) .

These levels of significance are shown in Table 5.5.1 for the set of values of p, q and r considered in this chapter.

Table 5.5.1. Levels of significance for ANOVA tests corresponding to α = .25, .50 and .75

    Size of experiment          α = .25    α = .50    α = .75
    p=q=3   r=3                 .05        .17        .40
            r=5                 .04        .15        .38
    p=q=5   r=3                 .007       .04        .13
            r=5                 .004       .02        .10
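Since Q with first argument 0 reduces to a single incomplete beta term, the conversion is one line of code. A brief sketch (not from the thesis):

```python
# Sketch: converting a material-significance level alpha (via its x_alpha)
# into the level of the usual ANOVA test, 1 - I_{x_alpha}(nu1/2, nu/2).
from scipy.special import betainc

def anova_level(p, q, r, x_a):
    nu1, nu = (p - 1) * (q - 1), p * q * (r - 1)
    return 1 - betainc(nu1 / 2, nu / 2, x_a)

# First cell of Table 5.5.1: p=q=3, r=3, alpha=.25 (x_a = .396) gives about .05.
print(round(anova_level(3, 3, 3, 0.396), 3))
```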
We may conclude that for the equal-weights case our preliminary test may be conducted as an ANOVA test at about the .1 level for a 3x3 experiment and the .02 level for a 5x5 experiment.
However, when unequal weights are used, an ANOVA test even at as high a level as that suggested by Anderson (1960) may not keep the AMSE under proper control. This situation arises presumably because the weights are not being taken into account in the preliminary test procedure. The development of a better test procedure is thus called for to deal with the case of unequal weights.
6. SOME DECISION-THEORETIC RESULTS

6.1. Introduction

In the course of this work, we have at various times thought about the question how TE procedures, in general, would fit into a decision-theoretic framework, and whether some concepts of that theory (other than those discussed in previous chapters) might be helpful in constructing TE procedures. Since we feel that our particular problem might benefit from such an approach, we have collected some of our thoughts in this chapter.

6.2. Fixed-sample Decision Problem

Although the theory of statistical decisions can be treated in a more general set-up, we shall here be concerned with samples of a fixed size, so that we have to deal only with problems connected with terminal decisions (vide Wald (1950)).

We have a random variable (possibly a vector) x, taking values in a space X, which is the underlying sample space with Borel field A. On (X, A) is defined P, a family of probability measures. In most situations,

    P = {P_θ , θ ∈ Θ} ,

where Θ is a parameter space. Together with Θ, we have also a space D of possible decisions regarding the probability measures in P. A typical member of D will be denoted by d. There is also a loss function L such that L(θ, d) is the loss incurred through taking the decision d when θ is the true parameter.

A decision rule (function) δ is a measurable function from X to D. Thus δ is a rule for making a decision on the basis of an observation on x. The choice between any two rival decision functions is made on the basis of the loss function L, or rather the risk function r, where

    r(θ, δ) = ∫_X L(θ, δ(x)) dP_θ(x) .

This choice may be made from various standpoints, as illustrated by the following definitions:
Definition 6.2.1. A decision function δ₀ is said to be admissible if there exists no other decision rule δ such that

    r(θ, δ) ≤ r(θ, δ₀) ,   all θ ∈ Θ ,

with strict inequality for at least one θ.

Definition 6.2.2. Let F be a probability measure on (Θ, B), where B is a σ-field of subsets of Θ. Let the average risk relative to F of a rule δ be denoted by r*(F, δ), i.e., let

    r*(F, δ) = ∫_Θ r(θ, δ) dF(θ) .

Then a decision rule δ_F is said to be a Bayes rule relative to F if

    r*(F, δ_F) ≤ r*(F, δ)   for all δ.

Definition 6.2.3. A decision function δ₀ is said to be minimax if

    sup_θ r(θ, δ₀) ≤ sup_θ r(θ, δ)   for all δ.

Definition 6.2.4. A decision function δ₀ is said to be minimax regret (or most stringent) if

    sup_θ [r(θ, δ₀) − inf_δ r(θ, δ)] ≤ sup_θ [r(θ, δ) − inf_δ r(θ, δ)]   for all δ.
For easy reference, we now state some theorems from Wald (1950).

Theorem 6.2.1. Suppose, for every θ ∈ Θ, P_θ is absolutely continuous w.r.t. a σ-finite measure λ on (X, A), and denote the corresponding density function by f_θ(x). Assume further that: (1) the functions L(θ, δ(x)) and f_θ(x) are measurable (C), where C is the smallest σ-field generated by the sets A×B, with A ∈ A and B ∈ B; (2) the loss function is non-negative: L(θ, d) ≥ 0. If there exists a rule δ₀ satisfying these conditions and for which

    ∫_Θ L(θ, δ₀(x)) f_θ(x) dF(θ) ≤ ∫_Θ L(θ, d) f_θ(x) dF(θ) ,   all d ∈ D ,

or, equivalently,

    ∫_Θ L(θ, δ₀(x)) dF_x(θ) ≤ ∫_Θ L(θ, d) dF_x(θ) ,   all d ∈ D ,

F_x being the posterior probability measure on (Θ, B) relative to F, then δ₀ is a Bayes rule relative to F.
Definition 6.2.5. Two decision rules, δ₁ and δ₂, are said to be λ-equivalent if δ₁(x) = δ₂(x) a.e. [λ], where λ has the same meaning as in Theorem 6.2.1. They are said to be risk-equivalent if r(θ, δ₁) = r(θ, δ₂) for all θ ∈ Θ. (Note that λ-equivalence implies risk-equivalence.)

Definition 6.2.6. Consider, e.g., a Bayes rule δ_F relative to F. δ_F is said to be unique except for λ-equivalence if any other Bayes rule relative to F is λ-equivalent to δ_F.

Theorem 6.2.2. A Bayes rule relative to a prior distribution F is admissible if it is unique except for λ-equivalence.

Theorem 6.2.3. Let δ₀ be a decision rule such that there exists a prior distribution F relative to which δ₀ is a Bayes rule; then if

    r*(F, δ₀) ≥ sup_θ r(θ, δ₀) ,

δ₀ will be a minimax decision rule.

Corollary 6.2.3a. If δ_F is a Bayes rule relative to some prior distribution F, such that the risk r(θ, δ_F) is constant over Θ, then δ_F is a minimax decision rule.

Theorem 6.2.4. Suppose that there is a sequence of prior distributions F_k (k = 1, 2, ...) and that δ_k is a Bayes decision rule relative to F_k. If there exists a decision rule δ₀ for which

    sup_θ r(θ, δ₀) ≤ lim_{k→∞} r*(F_k, δ_k) ,

then δ₀ is a minimax decision rule.
6.3. Decision-theoretic Formulation of a TE Procedure

The problem of estimation of a parameter after a test of significance may be considered in the framework of decision theory. Suppose the element of the parameter vector θ that we are interested in is θ₁. From other considerations two rival estimators, T₁ and T₂, are available to us. Having taken a random observation x ∈ X, we want to decide which of T₁(x) and T₂(x) is to be taken as the estimate of θ₁. We are thus interested in the class Δ of decision functions (with values in the space of θ₁), a typical member of which is δ defined by

    δ(x) = [1 − φ(x)] T₁(x) + φ(x) T₂(x) ,    (6.3.1)

where φ(x) is assumed to be a measurable function taking values 0 or 1.

The fact that we are using small MSE as the criterion of a good estimator implies that we have the following squared-error loss:

    L(θ, δ) = c(θ) [(1 − φ(x))(T₁(x) − θ₁)² + φ(x)(T₂(x) − θ₁)²] ,    (6.3.2)

where c(θ) > 0 is independent of θ₁ but may depend on other elements of θ. The risk function is given by

    r(θ, δ) = E_θ L(θ, δ(x)) ,

which is the MSE of δ at θ, except for the multiplier c(θ).

The question might very well be asked why we are taking this restricted class Δ of decision rules (estimators) instead of taking the class of all estimators. The purpose is to set up a framework in which the question raised in Section 2.2 can be discussed. Finding the best possible function φ(x) for (6.3.1) (if a 'best' function exists in some sense or other) amounts to finding the optimum partition of the sample space for a TE procedure. Since our objective is to control the MSE of the estimator, the notions of power and level of significance of the preliminary test are clearly irrelevant.
6.4. Bayes Rules for Simple TE Problems

Let x = (x₁, x₂, ..., xₙ) be a random sample of size n from N(μ, σ²), where μ is unknown and is to be estimated. For estimating μ on the basis of this sample, we may say that we have two available estimators, T₁ = 0 and T₂ = x̄. Now

    MSE(T₁) = μ²   and   MSE(T₂) = σ²/n ,

so that if μ² ≤ σ²/n one would prefer T₁, and if μ² > σ²/n one would prefer T₂. A TE procedure would be:

(1) Test the hypothesis μ² ≤ σ²/n against the alternative μ² > σ²/n.

(2) If significance is obtained in step (1), take T₂(x) = x̄ as the estimate; otherwise take T₁(x) = 0.

In the above notation, we now have

    δ(x) = [1 − φ(x)]·0 + φ(x)·x̄    (6.4.1)

and

    L(θ, δ) = (n/σ²)[(1 − φ(x))μ² + φ(x)(x̄ − μ)²] .    (6.4.2)
As the prior distribution let us take the conjugate prior distribution corresponding to N(μ, σ²). Some arguments in support of conjugate priors have been given by Raiffa and Schlaifer (1961), Chapter 3. One mathematically quite cogent argument is that in this way we can work within a closed system (in the sense that the posterior distributions are of the same type as the prior distributions).

Case 1: σ² Known

Here Θ = {μ : −∞ < μ < ∞}, and we need consider the prior distribution of μ only. The conjugate prior distribution of μ may be put in the form

    dF(μ) = (2πσ²/n')^{−1/2} exp[−n'(μ − m')²/2σ²] dμ ,   −∞ < m' < ∞ ,  n' > 0    (6.4.3)

(vide Raiffa and Schlaifer (1961), p. 54). The posterior distribution of μ given x is

    dF_x(μ) = (2πσ²/n'')^{−1/2} exp[−n''(μ − m'')²/2σ²] dμ ,

where

    n'' = n' + n   and   m'' = (n'm' + n x̄)/n'' .    (6.4.4)
Theorem 6.2.1 implies that a Bayes solution is defined by:

    δ_F(x) = 0   if  ∫ μ² dF_x(μ) ≤ ∫ (x̄ − μ)² dF_x(μ) ,
           = x̄   otherwise;

i.e.,

    δ_F(x) = 0   if  2x̄m'' ≤ x̄² ,  i.e., if  (n'−n)x̄² − 2n'm'x̄ ≥ 0 ,
           = x̄   if  2x̄m'' > x̄² ,  i.e., if  (n'−n)x̄² − 2n'm'x̄ < 0 ;

i.e.,

    δ_F(x) = 0   if  x̄ ≥ 0 and (n'−n)x̄ − 2n'm' ≥ 0 ,  or  x̄ ≤ 0 and (n'−n)x̄ − 2n'm' ≤ 0 ,
           = x̄   if  x̄ > 0 and (n'−n)x̄ − 2n'm' < 0 ,  or  x̄ < 0 and (n'−n)x̄ − 2n'm' > 0 .    (6.4.5)
We have nine different situations according to the values of n', n and m', and for each of these situations δ_F(x) = x̄ in those subsets of the sample space which are listed in the last column:

    (1)  n' > n  and  m' > 0 :   0 < x̄ < 2n'm'/(n'−n)
    (2)  n' > n  and  m' < 0 :   2n'm'/(n'−n) < x̄ < 0
    (3)  n' > n  and  m' = 0 :   the empty set ∅
    (4)  n' < n  and  m' > 0 :   x̄ < −2n'm'/(n−n')  or  x̄ > 0
    (5)  n' < n  and  m' < 0 :   x̄ < 0  or  x̄ > −2n'm'/(n−n')
    (6)  n' < n  and  m' = 0 :   x̄ ≠ 0
    (7)  n' = n  and  m' > 0 :   x̄ > 0
    (8)  n' = n  and  m' < 0 :   x̄ < 0
    (9)  n' = n  and  m' = 0 :   any measurable subset of X (including ∅ and X)
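The rule (6.4.5) is simple to apply once the posterior mean m'' has been formed. The following is a minimal sketch (not from the thesis); the sample and the prior constants n' and m' are hypothetical, and the rule itself does not depend on σ².

```python
# Sketch: the Bayes TE rule (6.4.5) for estimating a normal mean by 0 or x-bar,
# under a conjugate N(m', sigma^2/n') prior.
import numpy as np

def bayes_TE_estimate(x, n_prime, m_prime):
    """Return 0 or the sample mean, whichever has the smaller posterior
    expected squared error."""
    n = len(x)
    xbar = float(np.mean(x))
    m_dp = (n_prime * m_prime + n * xbar) / (n_prime + n)   # posterior mean (6.4.4)
    # choose 0 iff its posterior expected loss m''^2 is no larger than
    # that of x-bar, (x-bar - m'')^2, i.e. iff 2*xbar*m'' <= xbar^2
    return 0.0 if 2 * xbar * m_dp <= xbar**2 else xbar

rng = np.random.default_rng(1)
sample = rng.normal(0.3, 1.0, size=10)
print(bayes_TE_estimate(sample, n_prime=4, m_prime=0.0))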
Some features of these results deserve special attention. Thus, if the prior information regarding μ is comparatively precise (n' > n), we would choose x̄ over 0 on only a small part of the real line. On the other hand, if this information is comparatively diffuse (n' < n), we would choose x̄ over 0 on the greater part of the real line. When we take any particular case in each group, the result would again seem to support common sense. E.g., take case (1) above. When μ has a positive prior mean m' and a prior variance smaller than that of x̄, a negative x̄ will generally underestimate μ, while if x̄ is much larger than m' it will be too much of an overestimate, compared to 0.

Note that, for the same problem, a TE procedure such as those discussed in the previous chapters would lead to the choice of 0 when nx̄²/σ² ≤ χ²_α and to the choice of x̄ otherwise (where χ²_α is the upper α-point of a non-central χ² with 1 d.f. and non-centrality parameter nμ²/σ² = 1).

Case 2: σ² Unknown

Here Θ = {(μ, σ²) : −∞ < μ < ∞ , σ² > 0}.
Any conjugate prior distribution has the following normal-gamma form (see Raiffa and Schlaifer (1961), p. 55):

    dF(μ, σ²) = C' (2πσ²/n')^{−1/2} exp[−n'(μ − m')²/2σ²] exp[−ν'v'/2σ²] (σ²)^{−(ν'/2 + 1)} dμ d(σ²) ,
        −∞ < m' < ∞ ;  n', ν', v' > 0 ,    (6.4.6)

and the corresponding posterior distribution is

    dF_x(μ, σ²) = C'' (2πσ²/n'')^{−1/2} exp[−n''(μ − m'')²/2σ²] exp[−ν''v''/2σ²] (σ²)^{−(ν''/2 + 1)} dμ d(σ²) ,

where

    n'' = n' + n ,   ν'' = ν' + ν + 1 ,   m'' = (n'm' + nx̄)/n'' ,
    v'' = [(ν'v' + n'm'²) + (νv + nx̄²) − n''m''²]/ν'' ,

and

    ν = n − 1 ,   v = Σ (x_i − x̄)²/ν .

Theorem 6.2.1 implies that a Bayes solution is defined by:

    δ_F(x) = 0   if  ∫ μ² dF'_x(μ) ≤ ∫ (x̄ − μ)² dF'_x(μ) ,
           = x̄   otherwise.

Here F'_x(μ) is the posterior marginal distribution of μ, and it has density proportional to

    [1 + n''(μ − m'')² / (ν''v'')]^{−(ν''+1)/2} .

Since this distribution has mean m'', the above Bayes solution comes out to be

    δ_F(x) = 0   if  2x̄m'' ≤ x̄² ,
           = x̄   if  2x̄m'' > x̄² ,

and thus is the same as the Bayes solution (6.4.5) for the case of known σ².

Note that in both cases the Bayes solutions are unique except for λ-equivalence (λ = Lebesgue measure) and hence, from Theorem 6.2.2, are admissible.
6.5. Bayes Rules for Simultaneous Estimation of Cell Means in a Factorial Experiment

Let us now take up for consideration the problem of simultaneous estimation of the cell means of a pxq factorial experiment. Using the set-up and notation of Chapter 4, we have as estimators of μ the two rival sets μ̂ and μ̃.1/ We are looking for a decision rule which will, on the basis of a set of random observations x_ijk (i = 1, ..., p; j = 1, ..., q; k = 1, ..., r), estimate μ according to the estimator μ̃ on one part of the sample space and according to the estimator μ̂ on the remaining part.

We are thus considering the class of decision rules Δ of which a typical member is a vector δ defined by

    δ(x) = [1 − φ(x)] μ̃ + φ(x) μ̂ ,    (6.5.1)

where φ(x) = 0 or 1.

Suppose, as in Chapter 4, we are employing low AMSE as a criterion of good estimators. We then have a loss function of the type:

    L(θ, δ) = c(θ) Σ_i Σ_j w_ij [(1 − φ(x))(μ̃_ij − μ_ij)² + φ(x)(μ̂_ij − μ_ij)²] ,    (6.5.2)

where c(θ) > 0 is independent of the μ_ij's but may depend on the other element of θ, viz., σ². We shall here take c(θ) = r/σ². The risk function is

    r(θ, δ) = E_θ L(θ, δ(x)) ,    (6.5.3)

and is the AMSE of δ, except for the multiplier c(θ).

The conjugate prior density of μ, in case σ² is known, is (cf. Raiffa and Schlaifer (1961), p. 56)

    (2πσ²/r')^{−pq/2} exp[−r' Σ_i Σ_j (μ_ij − m'_ij)²/2σ²] ,   −∞ < m'_ij < ∞ ,  r' > 0 ;    (6.5.4)

in case σ² is unknown, a normal-gamma prior analogous to (6.4.6) may be taken. As in Section 6.4, in either case the Bayes solution would be the same. It leads to the choice of δ = μ̂ or δ = μ̃, according as

    Σ_i Σ_j w_ij ∫ (μ̂_ij − μ_ij)² dF_x(μ)  ≤  or  >  Σ_i Σ_j w_ij ∫ (μ̃_ij − μ_ij)² dF_x(μ) ,    (6.5.5)

i.e. (writing m''_ij = (r'm'_ij + rȲ_ij.)/(r' + r) for the posterior means), according as

    Σ_i Σ_j w_ij z_ij [(r + r') z_ij − 2r'(Ȳ_ij. − m'_ij)]  ≥  or  <  0 .    (6.5.6)

This follows from Theorem 6.2.1.

If equal weights are used for all cell means, i.e., if w_ij = 1/pq, then the above takes on a simpler form because Σ_i Σ_j z_ij Ȳ_ij. = Σ_i Σ_j z²_ij. One would then choose μ̂ or μ̃ according as

    (r' − r) Σ_i Σ_j z²_ij − 2r' Σ_i Σ_j m'_ij z_ij    (6.5.7)

is non-positive or positive.

Here again these Bayes rules are unique except for λ-equivalence (λ = Lebesgue measure) and hence are admissible.

1/ Here μ' = (μ₁₁, μ₁₂, ..., μ₁q, μ₂₁, ..., μ_pq), and μ is the corresponding column vector. Similarly for μ̂, μ̃, etc.
6.6. Minimax Decision Rules

Let us now study the nature of minimax decision rules for the problems discussed in Sections 6.4 and 6.5. It seems that in general the minimax estimators are the conventional (unbiased and minimum-variance) estimators.

Thus for the problem of Section 6.4 the minimax estimator in the class Δ is x̄. For consider a conjugate prior distribution with n' < n and m' = 0, i.e., case (6) of Section 6.4. As we have seen, for such a prior distribution δ(x) = x̄ is a Bayes solution. And the fact that r(θ, x̄) = constant (i.e., unity) implies, by virtue of Corollary 6.2.3a, that x̄ is a minimax estimator in the class Δ. Being a Bayes rule unique except for λ-equivalence, it is also admissible.

For the problem of simultaneous estimation of all pq cell means in a pxq factorial experiment, discussed in Section 6.5, which is also the main topic of our investigation, we can easily show that the usual estimators μ̂ constitute a set of minimax estimators in the class Δ in case the weights w_ij are all equal. For in that case we find from (6.5.7) that, for the conjugate prior distribution with m'_ij = 0 and r' < r, δ(x) = μ̂ is a Bayes decision rule.1/ Also, for this rule r(θ, δ) = constant (i.e., unity), which implies, by virtue of Corollary 6.2.3a, that μ̂ is minimax. (We have not used our Bayes estimators corresponding to r' > r, m'_ij = 0, since they have a risk that increases linearly with ψ* and hence violate the condition of Theorem 6.2.3.)

For the general case of possibly unequal weights, too, it is conjectured that δ = μ̂ is minimax in the class Δ. From (6.5.6), it is seen that for the sequence of prior distributions (F_k) for which m'_ij = 0 (all i, j) and r'_k → 0, the Bayes rules (6.5.6) also tend to δ = μ̂. It seems plausible that the nice properties of normal distributions will ensure that the sequence of average risks r*(F_k, δ_k) for these Bayes solutions will also tend to the average risk of μ̂, viz., unity. This being the same as r(θ, μ̂) for all θ ∈ Θ, Theorem 6.2.4 would imply that μ̂ is minimax.

1/ The difference between these results and the results in Subsection (b) of Section 3.6 and Subsection (b) of Section 5.5 is that the functions φ in the earlier results were restricted to UMP invariant tests, whereas in the class Δ there is no such restriction on φ.
7. SUMMARY AND CONCLUSIONS

7.1. Summary

It has been our purpose in this investigation to study the consequences (in terms of biases and mean square errors (MSE's) of estimators) of estimating the cell means of a factorial experiment according to either one of two different formulas, depending on the results of a preliminary test for interactions, whereby the preliminary test and the final estimation procedure are carried out on the same set of data. Such procedures combining preliminary testing and subsequent estimation will be called test-estimation (TE) procedures; the resulting estimators, TE estimators.

In Chapter 1 the general ideas were introduced, and it was pointed out that we would confine our attention to two-factor experiments with an equal number of replications per cell. It was assumed that for our experiments Eisenhart's Model I is appropriate and that the usual normality assumptions are valid. Related literature was also cited in this chapter.

The special case of factors with two levels each was considered in Chapters 2 and 3. In Chapter 2 the distribution of the TE estimator of a cell mean corresponding to the UMP invariant test for interactions was obtained, and expressions were given, in terms of a function Q, for its bias and MSE. In Chapter 3 numerical values of 'the relative bias' and 'the relative MSE' were obtained. Different approaches to the choice of a suitable significance level for the preliminary test were investigated and various theorems proved, showing that two of these approaches lead to a degenerate TE estimator, viz., the least squares estimator. An alternative formula for the MSE, which, at least for smaller numbers of replications, will cut down substantially on the numerical work, was also derived.

Chapter 4 is concerned with the general case of pxq factorial experiments (p > 2 and/or q > 2). Two modes (see Sections 4.3 and 4.9) of combining the MSE's of the individual estimators of the pq cell means, and thereby obtaining a single criterion for the goodness of the estimators, were discussed. For a special case of one of these modes, viz., taking the average of the MSE's (AMSE) with equal weights, the distribution, the bias and the MSE of the estimator of any particular cell mean were obtained. The formulas were found to be closely analogous to those in Chapter 2. In Chapter 5 we reported numerical results for two special cases, viz., p = q = 3 and p = q = 5. The magnitude of 'the relative bias' was examined numerically as a function of the non-centrality parameter ψ. The AMSE was studied both for equal weights and for unequal weights. It appeared that in the case of unequal weights a second parameter, ψ*, had to be introduced along with ψ. The question of what would be a suitable significance level for the preliminary test was investigated for this more general case, and generalizations of the theorems of Chapter 3 were presented.

Chapter 6 explored the possibility of studying TE procedures in the framework of decision theory. Here our findings have been tentative, but they are expected to help future work on the choice of a good preliminary test procedure. The test procedures that we proposed in Chapters 2 and 4 are based on the notions of significance level and power, and hence are somewhat unsatisfactory, because these notions seem irrelevant when our ultimate purpose is to control the MSE's or AMSE of the estimators.
7.2. Conclusions

Now supposing the preliminary test suggested in Chapter 2 or Chapter 4 is used in a TE procedure, the MSE's of the estimators can be controlled by a proper choice of r, the number of replications, and α, the level of significance of the test. The value of r will in general be determined by the resources available to the experimenter, and there is not much of a choice in this respect. As regards α, the choice will depend on the context of the estimation problem.

If we have some prior information regarding the likely magnitude of the non-centrality parameters ψ and ψ*, we may utilize this information to choose a suitable α. Thus, if it is known that ψ* is very likely to be less than ν₁ = (p−1)(q−1), then a small value of α should be chosen; while in case indications are that ψ* is not so likely to be less than ν₁, then a rather high value, perhaps near unity, ought to be selected.

In the absence of any such prior knowledge, a minimax approach might be adopted. This approach leads to the choice of the usual least squares estimators, μ̂_ij, and so does the approach of minimizing the L₁-norm of the regret function in the equal-weights case. However, we set out to find alternatives to these estimators, and so these approaches do not seem helpful to us.

Another approach which would seem quite appropriate in the absence of any prior knowledge about the variability of ψ and ψ* would be to choose α in such a manner as to minimize the maximum regret. For the 2x2 case, and also for the pxq case when equal weights are used, this approach leads to the choice of an α between .25 and .50 (nearer .50). (These α-values refer to a preliminary test for material significance; for conversion into significance levels for customary ANOVA tests, see Tables 3.6.1 and 5.5.1.)

However, in the pxq case, when unequal weights are used, an α of this type does not effect a satisfactory degree of control over the AMSE for values of ψ* that are too large in comparison to ψ (see Tables 5.3.1-5.3.4). It appears that in such a situation α should be progressively increased as p and/or q increase. Thus while an α = .75 might be suitable for the 3x3 case, an α = .90 might be required for the 5x5 case. On the other hand, the use of such a high value of α will offset any gain that might accrue from the use of a TE procedure for values of ψ* that are small compared to ψ. Presumably, the development of a better preliminary test procedure is called for to deal with the case of unequal weights.
7.3. Suggestions for Further Research

In our investigation, we have been concerned with two factors only. For dealing with more than two factors, there is a simple, straightforward generalization of our procedure. Thus (whatever the number of factors) we may like to use one type of estimates if the interactions of the highest order are 'materially significant' and another type of estimates otherwise. Consider, e.g., an experiment with three factors A, B and C, having a, b and c levels respectively, with r replications for each of the abc treatment combinations. Here the model analogous to (2.2) is

    Y_ijkl = μ + α_i + β_j + γ_k + (αβ)_ij + (αγ)_ik + (βγ)_jk + (αβγ)_ijk + e_ijkl ,    (7.3.1)

where

    μ = general effect;
    α_i, β_j, γ_k = main effects;
    (αβ)_ij, (αγ)_ik, (βγ)_jk = interaction effects of 1st order;
    (αβγ)_ijk = interaction effects of 2nd order.

Let us write the corresponding sample mean as

    Ȳ_ijk. = m + a_i + b_j + c_k + (ab)_ij + (ac)_ik + (bc)_jk + (abc)_ijk ,

where the components are the least squares estimates of the corresponding parts of (7.3.1). Taking low average mean square error as the criterion of a good set of estimators, we may choose as estimates of the μ_ijk

    m + a_i + b_j + c_k + (ab)_ij + (ac)_ik + (bc)_jk + (abc)_ijk = Ȳ_ijk.

if

    r Σ_i Σ_j Σ_k w_ijk (αβγ)²_ijk > (a−1)(b−1)(c−1) σ²/(abc) ,

and

    m + a_i + b_j + c_k + (ab)_ij + (ac)_ik + (bc)_jk

otherwise. The corresponding TE procedure being similar to those discussed in Chapters 2 and 4, our theoretical findings in those chapters may be expected to be valid here, too.
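As a rough illustration of this generalized choice, the sketch below (not from the thesis) operationalizes the criterion with the estimated second-order interactions in place of the unknown (αβγ)_ijk and a known σ²; a proper TE procedure would of course replace this plug-in comparison by a preliminary test, and the helper names here are hypothetical.

```python
# Sketch: a plug-in version of the three-factor TE choice sketched above.
# Assumptions: abg_hat holds the least squares estimates (abc)_ijk, the
# weights w sum to one, and sigma2 is treated as known.
import numpy as np

def choose_estimates(cell_means, abg_hat, w, sigma2, r):
    """cell_means: a x b x c array of Y-bar_ijk.; abg_hat: estimated
    second-order interactions; returns the chosen set of cell-mean estimates."""
    a, b, c = cell_means.shape
    threshold = (a - 1) * (b - 1) * (c - 1) * sigma2 / (a * b * c)
    if r * np.sum(w * abg_hat**2) > threshold:
        return cell_means                 # keep the highest-order interactions
    return cell_means - abg_hat           # drop (abc)_ijk from each cell mean
```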
However, in case there are more than two factors, one might want to have a choice between more than two types of estimates. Thus for the case of three factors, we might like to choose among

    m + a_i + b_j + c_k ,
    m + a_i + b_j + c_k + (ab)_ij + (ac)_ik + (bc)_jk ,   and
    m + a_i + b_j + c_k + (ab)_ij + (ac)_ik + (bc)_jk + (abc)_ijk ,

depending on the outcome of a preliminary test concerning their AMSE's. A TE procedure for this problem will presumably need more complicated mathematics and, indeed, more complicated statistical thinking, for the preliminary test now takes the form of a multiple decision problem.

The case of non-balanced factorial experiments, where the numbers of replications for different combinations are unequal and possibly disproportionate, has not been considered by us. Although our findings, e.g., regarding the optimal choice of significance level, are expected to hold broadly in this context also, a detailed mathematical treatment ought to be made.

Also, a more extensive decision-theoretic treatment of our problem than was made in Chapter 6 would seem to be in order. Even in the more elementary treatment of Chapters 2-5, a different family of tests, which should aim at keeping under control the MSE or AMSE, needs to be devised.

In the pxq case, we should try to find a better way of handling the case of possibly unequal weights for the AMSE. If in the definition of the AMSE unequal weights are attached to the cells, a redefinition of main effects and interactions might also be appropriate.

A generalization of Huntsberger's (1955) method of estimation to the case of factorial experiments may be studied. This is expected to lead to a greater overall control of the MSE (or AMSE) here as well.
LIST OF REFERENCES

Anderson, R. L. 1960. Some remarks on the design and analysis of factorial experiments. Contributions to Probability and Statistics (ed. Olkin et al.). Stanford University Press, Stanford, Calif.

Asano, C. 1960. Estimations after preliminary tests of significance and their applications to biometrical researches. Bull. Math. Stat. 2: 1-24.

Bancroft, T. A. 1944. On biases in estimation due to the use of preliminary tests of significance. Ann. Math. Stat. 15: 190-204.

Bancroft, T. A. 1964. Analysis and inference for incompletely specified models involving the use of preliminary tests of significance. Biometrics 20: 427-442.

Bennett, B. M. 1952. Estimation of means on the basis of preliminary tests of significance. Ann. Inst. Stat. Math. 4: 31-43.

Bennett, B. M. 1956. On the use of preliminary tests in certain statistical procedures. Ann. Inst. Stat. Math. 8: 45-47.

Defense Systems Dept., General Electric Co. 1962. Tables of the Individual and Cumulative Terms of Poisson Distribution. D. Van Nostrand Co., Inc., Princeton, N.J.

Graybill, F. A. 1961. An Introduction to Linear Statistical Models, Vol. I. McGraw-Hill Book Co., Inc., New York, N.Y.

Halmos, P. R. 1950. Measure Theory. D. Van Nostrand Co., Inc., Princeton, N.J.

Hodges, J. L. and Lehmann, E. L. 1954. Testing the approximate validity of statistical hypotheses. Jour. Royal Stat. Soc., Ser. B 16: 261-268.

Huntsberger, D. V. 1955. A generalization of a preliminary test procedure for pooling data. Ann. Math. Stat. 26: 734-743.

Jordan, C. 1962. Calculus of Finite Differences. Chelsea Publishing Co., New York, N.Y.

Kitagawa, T. 1950. Successive processes of statistical inferences. Mem. Fac. Sci., Kyushu University, Ser. A 12: 139-180.

Kitagawa, T. 1963. Estimation after preliminary test of significance. University of California Publications in Statistics 1(4): 147-186.

Larson, H. J. and Bancroft, T. A. 1963. Sequential model building for prediction in regression analysis, I. Ann. Math. Stat. 34: 462-479.

Lehmann, E. L. 1959. Testing Statistical Hypotheses. John Wiley and Sons, Inc., New York, N.Y.

Madow, W. 1940. The distribution of quadratic forms in non-central normal random variables. Ann. Math. Stat. 11: 100-101.

Mosteller, F. 1948. On pooling data. Jour. Amer. Stat. Assoc. 43: 231-242.

Pearson, K. (Ed.). 1934. Tables of the Incomplete Beta Function. Cambridge University Press, Cambridge, U.K.

Raiffa, H. and Schlaifer, R. 1961. Applied Statistical Decision Theory. Harvard University, Boston, Mass.

Rao, C. R. 1952. Advanced Statistical Methods in Biometric Research. John Wiley and Sons, Inc., New York, N.Y.

Wald, A. 1950. Statistical Decision Functions. John Wiley and Sons, Inc., New York, N.Y.