Brogan, Donna and J. Sedransk (1971). "Estimating one mean of a bivariate normal distribution using a preliminary test of significance and a two-stage sampling scheme."

ESTIMATING ONE MEAN OF A BIVARIATE NORMAL DISTRIBUTION
USING A PRELIMINARY TEST OF SIGNIFICANCE AND A TWO STAGE
SAMPLING SCHEME

By

D. R. Brogan and J. Sedransk

Department of Biostatistics, U.N.C.
and
Department of Statistics, University of Wisconsin

Institute of Statistics Mimeo Series No. 752

June 1971
Estimating One Mean of a Bivariate Normal Distribution Using a
Preliminary Test of Significance and a Two Stage Sampling Scheme
by D. R. Brogan and J. Sedransk
ABSTRACT

Let $(X,Y)$ have a bivariate normal distribution with unknown mean vector $(\mu_X,\mu_Y)$ and known covariance matrix $\Sigma$. It is desired to estimate $\mu_Y$. A first-stage sample is obtained on $(X,Y)$, $X$ only, and $Y$ only. A preliminary test of $H_0\colon \mu_Y=\mu_X$ is performed, the result of which specifies a second-stage sample. The estimator of $\mu_Y$ is either a regression estimator or a pooled estimator which pools estimators of $\mu_X$ and $\mu_Y$. The bias and mean square error of this estimation procedure are derived, and a numerical example is discussed.
Estimating One Mean of a Bivariate Normal Distribution Using a
Preliminary Test of Significance and a Two Stage Sampling Scheme

by D. R. Brogan and J. Sedransk*

*D. R. Brogan is Assistant Professor, Department of Biostatistics, University of North Carolina School of Public Health. J. Sedransk is Associate Professor, Department of Statistics, University of Wisconsin. This research was partially supported by National Institutes of Health Biometry Training Grant 5T1GM34, National Institute of Mental Health Training Grant MH10373, and National Center for Educational Statistics (U.S. Office of Education) Contract Number OEC-3002041-2041.

1. Statement of the Problem and Examples

Let the random vector $(X,Y)'$ follow the bivariate normal distribution with mean vector $E(X,Y)' = (\mu_X,\mu_Y)'$ and known covariance matrix $\Sigma$, where

$$\Sigma = \begin{pmatrix} \sigma_X^2 & \rho\sigma_X\sigma_Y \\ \rho\sigma_X\sigma_Y & \sigma_Y^2 \end{pmatrix}. \tag{1.1}$$

It is desired to estimate $\mu_Y$ when there is evidence that perhaps $\mu_Y=\mu_X$. There are available $n>0$ bivariate observations on $(X,Y)$, an additional $n_X>0$ independent observations on $X$, and an additional $n_Y>0$ independent observations on $Y$. Using these observations, the null hypothesis $H_0\colon \mu_Y=\mu_X$ is tested against the alternative hypothesis $H_1\colon \mu_Y\neq\mu_X$. After this preliminary test, a second stage of sampling is carried out on one, two, or three of the following random variables: $(X,Y)$, $X$ only, and $Y$ only. The estimator of $\mu_Y$, using data from the first and second stages of sampling, depends upon the acceptance or rejection of $H_0$. This general estimator reduces in special cases to estimators proposed by other authors who have considered the "preliminary test" approach to
estimating $\mu_Y$. In this paper, the bias and mean square error of the proposed estimator are derived.
A situation where such an estimation scheme would be appropriate is the estimation of the average systolic blood pressure (SBP) of a hospital population. SBP is available on each admitted patient's hospital record. However, it is well known that SBP varies within each person depending upon time of day, general level of excitement, and so on. Hence, theoretically, measurements of SBP should be taken on a random sample of admitted patients under a set of standard conditions. Then, one could easily estimate $\mu_Y$, where $Y$ is SBP measured under some set of standard conditions. However, since measurements on $Y$ are difficult (and expensive) to obtain, it would be hoped that measurements on $X$, SBP measured under non-standard conditions, for a relatively large sample of patients might be used in conjunction with measurements on $Y$ for a relatively small sample of patients in order to estimate $\mu_Y$. In this example, it may be that $|\mu_Y-\mu_X|$ is small; almost certainly, though, $\sigma_X^2$ would be larger than $\sigma_Y^2$.

A random sample on $X$ at the first stage could be collected from past hospital records where, it is assumed, SBP was measured under non-standard conditions. An independent bivariate random sample on $(X,Y)$ could be obtained by measuring SBP under standard conditions on a sample of patients, as well as by using the usual SBP from the hospital records of the same patients. A further, independent, random sample on $Y$ at the first stage (though not necessary) could possibly come from a research project done on some of the hospital population where SBP was purposely taken under standard conditions and the SBP under non-standard conditions is not available. Of course, it is necessary to take caution that these three samples do, indeed, come from the same population. Further sampling can be done at the second stage under several options (see Section 3).
Letting $C_X$, $C_Y$, and $C_{XY}$ be the respective per unit costs of measuring SBP under non-standard, standard, and both conditions, it is obvious in this example that $C_X < C_Y < C_{XY} < C_X + C_Y$. Hence, observations on $X$ may be "preferred to" observations on $Y$ if $X$ can be used effectively to estimate $\mu_Y$. This type of cost configuration is typical of many situations in which one might wish to apply the preliminary test approach discussed in this paper. That is, one wishes to estimate $\mu_Y$, but an observation on $X$ is less costly than an observation on $Y$, and there is some prior evidence that $\mu_Y=\mu_X$.
Another example is that of estimating the average volume of trees in a forest. For any given tree it is possible to measure $Y$, the actual volume. This, however, is difficult and expensive. A much cheaper method which may provide a good estimate of the volume is to measure the height $H$ of the tree and the diameter $D$ at a specified height off the ground. Then the volume is estimated by $X = kD^2H$. If $|\mu_Y-\mu_X|$ is small, then measurements of $X$ could be pooled with measurements of $Y$ in order to estimate $\mu_Y$.
Still another example is the post-enumeration surveys which the Census Bureau conducts to check on the adequacy of coverage and content in the decennial census of population and housing [Bailar, 8]. For each of several geographic areas, there could be available the population count or the attribute data from the post-enumeration survey (i.e., $Y$) and from the original census (i.e., $X$). These observations can possibly be combined in order to estimate the total population or the average value of some attribute for these areas. It would be hoped that the combined estimate would be more accurate than the census figures alone.
These three examples illustrate the following considerations: (1) a pooled estimator seems appropriate if $|\mu_Y-\mu_X|$ is small; (2) observations on $X$ instead of $Y$ are desirable from a cost viewpoint; and (3) a correlation between $X$ and $Y$ may allow utilization of information on $X$ even though $|\mu_Y-\mu_X|$ is not small.
2. Previous Investigations of Related Problems

Some of the authors who have studied parametric estimation problems by using pooling procedures after a preliminary test of significance are Mosteller [17], Bennett [10, 11], Kitagawa [15], Bancroft [9], Asano [3, 4, 5], Kale and Bancroft [14], Han and Bancroft [12], Asano and Sugimura [7], and Huntsberger [13]. Asano and Sato [6] and Sato [19] have considered two bivariate populations and two multivariate populations, respectively. Tamura [20, 21] has considered non-parametric estimation after a preliminary test of significance.

This study differs from these other investigations in three respects. First, there is no published research where a two-stage sampling procedure has been considered in conjunction with a preliminary test of significance. The two-stage estimation procedures discussed, for example, by Yen [22] and Arnold and Al-Bayyati [2] use information from the first stage to determine the sampling plan at the second stage, but a preliminary test of significance is not used to make this determination. The two-stage sampling scheme is useful because, for a given budget (assuming $C_X < C_Y \leq C_{XY}$), it may be advantageous to do some additional sampling after the preliminary test is done. For example, if $H_0\colon \mu_Y=\mu_X$ is accepted, then a large sample where only $X$ is measured is reasonable for the second stage. However, if $H_0\colon \mu_Y=\mu_X$ is rejected, then a small sample where only $Y$ is measured is probably more feasible for the second stage. In addition, the method proposed here allows a bivariate sample on $(X,Y)$ at the second stage.
Second, there are no published results where both independent and dependent sampling can be included at each stage. Thus, given a budget and cost function, in the procedure proposed here one can, at least theoretically, determine the optimal allocation of resources (1) between sampling at the first and second stages, and (2) among bivariate and univariate (both $X$ and $Y$) sampling at each stage.

Third, every investigator except Kitagawa [15] and Mehta and Gurland [16] has considered the random variables $X$ and $Y$ to be independent. Mehta and Gurland [16], however, are concerned with testing hypotheses about a bivariate normal population, one of which is the null hypothesis that $\rho=0$. Kitagawa [15], on the other hand, considers $\rho\neq 0$, where $\rho$ is unknown, although his investigation is specialized by having only a one-stage bivariate sample of size $n$ with $\sigma_X^2=\sigma_Y^2=\sigma^2$. In many prospective applications of the preliminary test approach (such as the SBP example) it is to be expected that $\sigma_X^2\neq\sigma_Y^2$. Further, it is important to extend the results available for "pooling means" to include the numerous applications where $\rho$ is not necessarily zero, and where random samples of sizes $n>0$, $n_X>0$, and $n_Y>0$ as described in Section 1 are selected.
Thus, to avoid having to use approximate distribution theory, $\Sigma$ has been assumed to be known in this investigation. A small Monte Carlo study has been carried out to determine the effect on the bias and mean square error of the estimator of $\mu_Y$ from estimating the components of $\Sigma$ [Ruhl, 18].
Finally, it may also be noted that (1) those authors considering $\sigma_X^2\neq\sigma_Y^2$ [e.g., 10] assume that $\sigma_X^2$ and $\sigma_Y^2$ are known; and (2) some of the authors [14, 17] who take $\sigma_X^2=\sigma_Y^2=\sigma^2$ assume, for simplicity, that $\sigma^2$ is known.

Special cases of the sampling procedure considered in this investigation are the same as those studied by many of the above authors, except for those investigations where $\sigma_X^2=\sigma_Y^2=\sigma^2$ and/or $\rho$ are assumed to be unknown.
3. The Sampling Procedure, Some Notation, and Some Special Cases

A random sample $(X_1,Y_1),\ldots,(X_n,Y_n)$ is selected from the bivariate normal distribution. In addition, an independent random sample of $n_X$ observations is taken on $X$, and a random sample of $n_Y$ observations is taken on $Y$. In the first-stage sample, thus, there are $(n+n_X)$ observations on $X$ and $(n+n_Y)$ observations on $Y$. Note that only $X_i$ and $Y_i$ are correlated, $(i=1,\ldots,n)$, where $(X_i,Y_i)$ denotes the $i$-th element in the bivariate sample.
At this point a preliminary test of the null hypothesis $H_0\colon \mu_Y=\mu_X$ versus the alternative hypothesis $H_1\colon \mu_Y\neq\mu_X$ is done using the sample data from the first stage. On the basis of the preliminary test, $H_0$ is either accepted or rejected.

A second-stage sample is then taken, again allowing a bivariate sample on $(X,Y)$ and two independent samples, one each on $X$ and $Y$. If $H_0$ is accepted, the size of the bivariate sample will be $n_0$, and the sizes of the independent samples on $X$ and $Y$ will be $n_{0X}$ and $n_{0Y}$, respectively. Similarly, if $H_0$ is rejected, the sizes of the samples at the second stage will be $n_1$, $n_{1X}$, and $n_{1Y}$.

Thus, the notation for this sampling procedure can be summarized as follows. Let $\bar{X}_n$ and $\bar{Y}_n$ be the sample means on $X$ and $Y$ from the bivariate sample at the first stage. Likewise, let $\bar{X}_{n_X}$ and $\bar{Y}_{n_Y}$ be the sample means from the two independent samples at the first stage. Analogously, if $H_0$ is accepted, let the respective sample means be denoted by $\bar{X}_{n_0}$, $\bar{Y}_{n_0}$, $\bar{X}_{n_{0X}}$, and $\bar{Y}_{n_{0Y}}$. If $H_0$ is rejected, the sample means will be $\bar{X}_{n_1}$, $\bar{Y}_{n_1}$, $\bar{X}_{n_{1X}}$, and $\bar{Y}_{n_{1Y}}$. The notation and sampling scheme are illustrated in Figure 1.
Figure 1

Two Stage Sampling Procedure and Resultant Estimator of $\mu_Y$

[Figure 1 diagrams the first-stage sample means $(\bar{X}_n,\bar{Y}_n)$, $\bar{X}_{n_X}$, and $\bar{Y}_{n_Y}$; the preliminary test; the second-stage sample means under acceptance and under rejection of $H_0$; and the resultant estimator of $\mu_Y$ in each branch.]
Note that only one of the two possible second-stage samples is realized. However, in determining the bias and mean square error of the resultant estimator, the values of the sample sizes of both second-stage possibilities must be considered. Thus, in using such an approach, values of $n$, $n_X$, $n_Y$, $n_0$, $n_{0X}$, $n_{0Y}$, $n_1$, $n_{1X}$, and $n_{1Y}$ would be fixed in advance of sampling, but only one of the two sets $(n_1, n_{1X}, n_{1Y})$ and $(n_0, n_{0X}, n_{0Y})$ would be realized.
This sampling procedure includes several possibilities. By taking $n=\rho=0$, one has the one-stage sampling scheme considered by several authors [10, 14, 17]. For applications of the type illustrated by the SBP example, one would typically have $n>0$, $n_X>0$, and $n_Y=0$. At the second stage of sampling one might take $n_0=n_{0Y}=0$ and $n_{0X}>0$ if $H_0$ is accepted, whereas one might choose $n_1=n_{1X}=0$ and $n_{1Y}>0$ if $H_0$ is rejected. However, the sampling procedure is completely general in that it includes the possibility of both dependent and independent sampling at each stage, while at the second stage a different procedure may be followed depending on whether $H_0$ is accepted or rejected.
4. The Preliminary Test Statistic

Before defining the preliminary test statistic, some additional notation is introduced. First, define $N_X = n + n_X + n_1 + n_{1X}$. Then sample means such as $\bar{X}_{n+n_X}$ and $\bar{X}_{N_X}$ are defined as

$$\bar{X}_{n+n_X} = (n+n_X)^{-1}\left(n\bar{X}_n + n_X\bar{X}_{n_X}\right) \tag{4.1}$$

and

$$\bar{X}_{N_X} = N_X^{-1}\left(n\bar{X}_n + n_X\bar{X}_{n_X} + n_1\bar{X}_{n_1} + n_{1X}\bar{X}_{n_{1X}}\right). \tag{4.2}$$

Given the sampling scheme discussed in Section 3, a preliminary test of $H_0\colon \mu_Y=\mu_X$ versus $H_1\colon \mu_Y\neq\mu_X$ is made using the test statistic

$$Z = \bar{Y}_{n+n_Y} - \bar{X}_{n+n_X}. \tag{4.3}$$

Under $H_0$, $Z$ is normally distributed with a mean of zero and variance $\sigma_Z^2$, where

$$\sigma_Z^2 = \frac{\sigma_Y^2}{n+n_Y} + \frac{\sigma_X^2}{n+n_X} - \frac{2n\rho\sigma_X\sigma_Y}{(n+n_X)(n+n_Y)}. \tag{4.4}$$

If the correlation $\rho$ is near one, then better power on the preliminary test might be obtained by using the test statistic $Z'$ instead of $Z$, where $Z' = \bar{Y}_n - \bar{X}_n$, with variance $\sigma_{Z'}^2$. This is because $\sigma_Z^2 < \sigma_{Z'}^2$ if, and only if,

$$\rho < \frac{\sigma_X^2\,n_X(n+n_Y) + \sigma_Y^2\,n_Y(n+n_X)}{2\sigma_X\sigma_Y\left(nn_X + nn_Y + n_Xn_Y\right)}. \tag{4.5}$$

Obviously, inequality (4.5) is satisfied if $\rho<0$. In general, inequality (4.5) is satisfied unless $\rho$ is close to one. Since, for most applications, the correlation will be zero or moderately positive, $Z$ as defined in (4.3) will be used as the preliminary test statistic in the estimation of $\mu_Y$.

Let $\xi_\alpha$ be the critical value with Type I error equal to $\alpha$ for the test of $H_0\colon \mu_Y=\mu_X$ versus $H_1\colon \mu_Y\neq\mu_X$, using the $N(0,1)$ distribution with probability density function $\phi(t)$ and cumulative distribution function $\Phi(t)$. That is, $\Phi(\xi_\alpha) - \Phi(-\xi_\alpha) = 1-\alpha$. Hence, $H_0$ will be rejected whenever $|Z| > \xi_\alpha\sigma_Z$.
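As a concrete numerical illustration of the test in (4.3) and (4.4), the following sketch computes $Z$, $\sigma_Z^2$, and the accept/reject decision from first-stage data. This is a minimal sketch, not the authors' program; the function name and the sample-data layout are illustrative assumptions.

```python
import math

def preliminary_test(x_biv, y_biv, x_only, y_only, sx, sy, rho, xi_alpha):
    """Two-sided preliminary test of H0: mu_Y = mu_X using the first-stage
    statistic Z = Ybar_{n+nY} - Xbar_{n+nX} of (4.3), with sigma_X = sx,
    sigma_Y = sy known.  Returns (Z, Var(Z), reject?)."""
    n, nx, ny = len(x_biv), len(x_only), len(y_only)
    xbar = (sum(x_biv) + sum(x_only)) / (n + nx)   # pooled X mean, as in (4.1)
    ybar = (sum(y_biv) + sum(y_only)) / (n + ny)   # pooled Y mean
    z = ybar - xbar
    # variance of Z under H0, eq. (4.4)
    var_z = sy**2/(n+ny) + sx**2/(n+nx) - 2*n*rho*sx*sy/((n+nx)*(n+ny))
    reject = abs(z) > xi_alpha * math.sqrt(var_z)
    return z, var_z, reject
```

The bivariate pairs enter both pooled means, which is what produces the covariance term in (4.4) when $\rho\neq 0$.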
" of
The General Estimator ~Y
~Y
Let case 1 be defined by the following conditions:
Py~~X'
n>O,
nx~O,
I
I
~I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
f
I
-10-
ny,>0, n1 >0,
n1~>0, n1Y~0,
nO=nOX=nOY=O.
Assuming these sample sizes to be pre-
determined and not dependent upon a preliminary test of significance, the
maximum likelihood estimator of
~Y
under these conditions is
(5.1)
where
(n+n )
1
(5.2)
g2 =cr~[1...,p2k]
and
--(5.3)
A
is unbiased and is a weighted average of the mean Y +n
and a
tty 1Y
"
adjusts
Yn+n
on
the
basis
The
regression
estimator
regression estimator of ~y.
Note that
~1
.
XN
/1
of the difference between
and X +n. Note, also, that the variances of
X
n 1
-1
-1
Y +n
and the regression estimator are gl and g2 ' respectively. Hence, ~1
tty 1Y
.
is a weighted average:o~~two unbiased, statistically independent estimators of
A
~,
with each estimator weighted inversely proportional to its variance.
Let case 0 be defined by the conditions $\mu_Y=\mu_X$, $n>0$, $n_X>0$, $n_Y>0$, $n_0\geq 0$, $n_{0X}>0$, $n_{0Y}>0$, and $n_1=n_{1X}=n_{1Y}=0$. Assuming, again, these sample sizes to be predetermined and not dependent upon a preliminary test of significance, the maximum likelihood estimator of $\mu_Y$ is

$$\hat{\mu}_{Y0} = \left[\sum_{i=1}^{4}h_i\right]^{-1}\left[h_1\bar{X}_{n+n_0} + h_2\bar{Y}_{n+n_0} + h_3\bar{X}_{n_X+n_{0X}} + h_4\bar{Y}_{n_Y+n_{0Y}}\right], \tag{5.4}$$

where

$$h_1 = \frac{n+n_0}{1-\rho^2}\left[\frac{1}{\sigma_X^2}-\frac{\rho}{\sigma_X\sigma_Y}\right], \quad h_2 = \frac{n+n_0}{1-\rho^2}\left[\frac{1}{\sigma_Y^2}-\frac{\rho}{\sigma_X\sigma_Y}\right], \quad h_3 = \frac{n_X+n_{0X}}{\sigma_X^2}, \quad h_4 = \frac{n_Y+n_{0Y}}{\sigma_Y^2}. \tag{5.5}$$

Note that $\hat{\mu}_{Y0}$ is a weighted average of four unbiased estimators of $\mu_Y$ and hence is unbiased. The weights $h_i$, $i=1,\ldots,4$, as defined in (5.5), minimize the variance of $\hat{\mu}_{Y0}$.
These two maximum likelihood estimators suggest the definition of the estimator of $\mu_Y$ under the possibilities of accepting or rejecting $H_0$. If $H_0$ is accepted, then the estimator $\hat{\mu}_Y$ of $\mu_Y$ is defined as $\hat{\mu}_{Y0}$, where

$$\hat{\mu}_{Y0} = c_1\bar{X}_{n+n_0} + c_2\bar{Y}_{n+n_0} + c_3\bar{X}_{n_X+n_{0X}} + c_4\bar{Y}_{n_Y+n_{0Y}}, \qquad \sum_{i=1}^{4}c_i = 1. \tag{5.6}$$

If $H_0$ is rejected, then the estimator $\hat{\mu}_Y$ of $\mu_Y$ is defined as $\hat{\mu}_{Y1}$, where

$$\hat{\mu}_{Y1} = w_1\bar{Y}_{n_Y+n_{1Y}} + w_2\left[\bar{Y}_{n+n_1} + \beta\left(\bar{X}_{N_X}-\bar{X}_{n+n_1}\right)\right], \qquad w_1+w_2 = 1. \tag{5.7}$$

For the derivation of the bias and mean square error of $\hat{\mu}_Y$, $w_i$, $i=1,2$, and $c_i$, $i=1,2,3,4$, are assumed to be arbitrary known constants such that (5.6) and (5.7) are satisfied. Also, the regression coefficient $\beta$ in (5.7) is assumed to be an arbitrary known constant.

Reasonable choices for $w_i$ and $c_i$ would be $w_i = g_i/(g_1+g_2)$ and $c_i = h_i/[\sum_{i=1}^{4}h_i]$. Likewise, a reasonable choice for $\beta$ would be $\rho\sigma_Y/\sigma_X$, as suggested by (5.1). However, these choices will not necessarily minimize the variance or mean square error of $\hat{\mu}_Y$, even though they do minimize the variances of the maximum likelihood estimators from which $\hat{\mu}_Y$ was defined. Hence, for generality, the bias and mean square error are derived for $w_i$, $c_i$, and $\beta$ being known constants.
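For the one-stage special case used later in Section 9 ($n_0=n_{0X}=n_{0Y}=n_1=n_{1X}=n_{1Y}=0$), the two component estimators with the weights suggested above can be sketched as follows. This is a minimal sketch under that special case, not the authors' program; the function names and argument layout are illustrative assumptions.

```python
def pooled_estimate(xb, yb, xo, yo, sx, sy, rho):
    """Pooled estimator (5.4) with the variance-minimizing weights h_i of (5.5),
    specialized to no second-stage sampling (n0 = n0X = n0Y = 0)."""
    n, nx, ny = len(xb), len(xo), len(yo)
    h1 = n/(1 - rho**2) * (1/sx**2 - rho/(sx*sy))
    h2 = n/(1 - rho**2) * (1/sy**2 - rho/(sx*sy))
    h3 = nx/sx**2
    h4 = ny/sy**2
    num = h1*sum(xb)/n + h2*sum(yb)/n + h3*sum(xo)/nx + h4*sum(yo)/ny
    return num / (h1 + h2 + h3 + h4)

def regression_estimate(xb, yb, xo, yo, sx, sy, rho):
    """Regression-type estimator (5.1) with the weights g_i of (5.2)-(5.3)
    and beta = rho*sy/sx, specialized to n1 = n1X = n1Y = 0."""
    n, nx, ny = len(xb), len(xo), len(yo)
    k = nx/(n + nx)                    # eq. (5.3) in this special case
    g1 = ny/sy**2
    g2 = n/(sy**2 * (1 - rho**2 * k))
    xbar_all = (sum(xb) + sum(xo))/(n + nx)
    reg = sum(yb)/n + rho*sy/sx * (xbar_all - sum(xb)/n)
    return (g1*sum(yo)/ny + g2*reg) / (g1 + g2)
```

With $\rho=0$ the regression estimator collapses to the simple mean of all $Y$ observations, and the pooled estimator to the precision-weighted average of the $X$ and $Y$ means, matching the special cases noted in Section 9.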
6. Bias of $\hat{\mu}_Y$

The expectation of $\hat{\mu}_Y$ is defined as

$$E(\hat{\mu}_Y) = E\!\left[\hat{\mu}_{Y1}\,\middle|\,|Z|>\xi_\alpha\sigma_Z\right]\Pr\!\left[|Z|>\xi_\alpha\sigma_Z\right] + E\!\left[\hat{\mu}_{Y0}\,\middle|\,|Z|<\xi_\alpha\sigma_Z\right]\Pr\!\left[|Z|<\xi_\alpha\sigma_Z\right], \tag{6.1}$$

where all terms have been previously defined in (4.3), (4.4), (5.6), and (5.7). This can be written as

$$E(\hat{\mu}_Y) = \int_{|z|>\xi_\alpha\sigma_Z} E\!\left[\hat{\mu}_{Y1}\,\middle|\,Z=z\right]h(z)\,dz + \int_{-\xi_\alpha\sigma_Z}^{\xi_\alpha\sigma_Z} E\!\left[\hat{\mu}_{Y0}\,\middle|\,Z=z\right]h(z)\,dz, \tag{6.2}$$

where $h(z)$ is the density function of $Z$, and $Z$ is normally distributed with mean $\mu_Z = \mu_Y-\mu_X$ and variance $\sigma_Z^2$ as defined in (4.4). If now $A$ and $Z$ are bivariate normal random variables, where $A$ has unconditional expectation $\mu_A$, Anderson [1] shows that

$$E(A\,|\,Z=z) = \mu_A + \sigma_Z^{-2}(z-\mu_Z)\,\mathrm{Cov}(A,Z). \tag{6.3}$$

Using (6.3) to obtain the conditional expectations in (6.2), $E(\hat{\mu}_Y)$ is obtained as in (6.4),
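The selection argument behind (6.2)-(6.4) can be spot-checked by simulation: integrating (6.3) over the rejection region gives $E[A\mid|Z|>c] = \mu_A + \sigma_Z^{-2}\mathrm{Cov}(A,Z)\,E[Z-\mu_Z\mid|Z|>c]$. The following Monte Carlo sketch checks this identity; all parameter values below are arbitrary choices for illustration.

```python
import math
import random

# Draw (A, Z) jointly normal and compare both sides of the identity
# E[A | |Z| > c] = mu_a + (Cov(A,Z)/Var(Z)) * E[Z - mu_z | |Z| > c].
random.seed(1)
mu_a, mu_z, sa, sz, r = 2.0, 0.5, 1.0, 1.5, 0.6   # illustrative values
c = 1.0
sel_a = sel_z = 0.0
m = 0
for _ in range(200_000):
    u1, u2 = random.gauss(0, 1), random.gauss(0, 1)
    z = mu_z + sz*u1
    a = mu_a + sa*(r*u1 + math.sqrt(1 - r*r)*u2)   # Corr(A, Z) = r
    if abs(z) > c:                                  # "rejection" region
        sel_a += a
        sel_z += z - mu_z
        m += 1
lhs = sel_a/m                                       # empirical E[A | |Z| > c]
rhs = mu_a + (r*sa/sz) * (sel_z/m)                  # Cov(A,Z)/Var(Z) = r*sa/sz
assert abs(lhs - rhs) < 0.02
```

The two sides agree up to Monte Carlo error because the residual of $A$ after regressing on $Z$ is independent of any event defined through $Z$ alone, which is exactly what lets (6.3) be integrated region by region in (6.2).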
where

$$H_1 = c_1\mathrm{Cov}(\bar{X}_{n+n_0},Z) + c_2\mathrm{Cov}(\bar{Y}_{n+n_0},Z) + c_3\mathrm{Cov}(\bar{X}_{n_X+n_{0X}},Z) + c_4\mathrm{Cov}(\bar{Y}_{n_Y+n_{0Y}},Z) \tag{6.5}$$

and

$$H_2 = w_1\mathrm{Cov}(\bar{Y}_{n_Y+n_{1Y}},Z) + w_2\left[\mathrm{Cov}(\bar{Y}_{n+n_1},Z) + \beta\left\{\mathrm{Cov}(\bar{X}_{N_X},Z)-\mathrm{Cov}(\bar{X}_{n+n_1},Z)\right\}\right], \tag{6.6}$$

and $\delta$, a standardized measure of the difference between $\mu_Y$ and $\mu_X$, is defined as

$$\delta = (\mu_Y-\mu_X)/\sigma_Z. \tag{6.7}$$

The covariances which appear in $H_1$ and $H_2$ are easily derived as

$$\mathrm{Cov}(\bar{X}_{n+n_0},Z) = \frac{n}{n+n_0}\left[\frac{\rho\sigma_X\sigma_Y}{n+n_Y}-\frac{\sigma_X^2}{n+n_X}\right], \tag{6.8}$$

$$\mathrm{Cov}(\bar{Y}_{n+n_0},Z) = \frac{n}{n+n_0}\left[\frac{\sigma_Y^2}{n+n_Y}-\frac{\rho\sigma_X\sigma_Y}{n+n_X}\right], \tag{6.9}$$

$$\mathrm{Cov}(\bar{X}_{n_X+n_{0X}},Z) = \frac{-n_X\sigma_X^2}{(n_X+n_{0X})(n+n_X)}, \tag{6.10}$$

$$\mathrm{Cov}(\bar{Y}_{n_Y+n_{0Y}},Z) = \frac{n_Y\sigma_Y^2}{(n_Y+n_{0Y})(n+n_Y)}, \tag{6.11}$$

$$\mathrm{Cov}(\bar{Y}_{n_Y+n_{1Y}},Z) = \frac{n_Y\sigma_Y^2}{(n_Y+n_{1Y})(n+n_Y)}, \tag{6.12}$$

$$\mathrm{Cov}(\bar{X}_{n+n_1},Z) = \frac{n}{n+n_1}\left[\frac{\rho\sigma_X\sigma_Y}{n+n_Y}-\frac{\sigma_X^2}{n+n_X}\right], \tag{6.13}$$

$$\mathrm{Cov}(\bar{Y}_{n+n_1},Z) = \frac{n}{n+n_1}\left[\frac{\sigma_Y^2}{n+n_Y}-\frac{\rho\sigma_X\sigma_Y}{n+n_X}\right], \tag{6.14}$$

$$\mathrm{Cov}(\bar{X}_{n_X+n_{1X}},Z) = \frac{-n_X\sigma_X^2}{(n_X+n_{1X})(n+n_X)}. \tag{6.15}$$
The bias is immediately obtained from (6.4) and can be considered as a function of $\delta$, since all other terms in (6.4) are known constants. Denoting the bias as $B(\delta)$, then

$$B(\delta) = -(c_1+c_3)\,\delta\sigma_Z\left[\Phi(\xi_\alpha-\delta)-\Phi(-\xi_\alpha-\delta)\right] + (H_1-H_2)\,\sigma_Z^{-1}\left[\phi(\xi_\alpha+\delta)-\phi(\xi_\alpha-\delta)\right], \tag{6.16}$$

where $\phi(t)$ and $\Phi(t)$ are defined in Section 4.

Even in this very general form, it is possible to demonstrate some properties of $B(\delta)$. First, the bias is zero if $\delta=0$. Also, it is necessary only to consider the behavior of $B(\delta)$ for $\delta>0$, since $B(-\delta) = -B(\delta)$. Furthermore, it can be shown by using l'Hospital's Rule that $\lim_{\delta\to\infty} B(\delta) = 0$. The expression for $B(\delta)$ will simplify for some choices of the weights $w_i$ and $c_i$, if some of the nine possible sample sizes are taken to be zero, or if $\rho=0$.
7. Mean Square Error of $\hat{\mu}_Y$

The mean square error of $\hat{\mu}_Y$ is derived by first finding $E(\hat{\mu}_Y^2)$ and then using $\mathrm{MSE}(\hat{\mu}_Y) = E(\hat{\mu}_Y^2) - 2\mu_Y E(\hat{\mu}_Y) + \mu_Y^2$. Starting the derivation similarly to that of $E(\hat{\mu}_Y)$ yields

$$E(\hat{\mu}_Y^2) = E\!\left[\hat{\mu}_{Y1}^2\,\middle|\,|Z|>\xi_\alpha\sigma_Z\right]\Pr\!\left[|Z|>\xi_\alpha\sigma_Z\right] + E\!\left[\hat{\mu}_{Y0}^2\,\middle|\,|Z|<\xi_\alpha\sigma_Z\right]\Pr\!\left[|Z|<\xi_\alpha\sigma_Z\right]. \tag{7.1}$$

If $A$ and $Z$ follow a bivariate normal distribution with unconditional means $\mu_A$ and $\mu_Z$, respectively, then it follows from standard multivariate normal theory that

$$E(A^2\,|\,Z=z) = \mu_A^2 + \frac{2\mu_A(z-\mu_Z)\,\mathrm{Cov}(A,Z)}{\sigma_Z^2} + \mathrm{Var}(A) + \frac{\mathrm{Cov}^2(A,Z)}{\sigma_Z^2}\left[\frac{(z-\mu_Z)^2}{\sigma_Z^2}-1\right]. \tag{7.2}$$

Letting $A=\hat{\mu}_{Y0}$ and then $A=\hat{\mu}_{Y1}$, and using (7.1) and (7.2), gives, after substantial algebra, the mean square error as a function of $\delta$, i.e.,

$$\mathrm{MSE}(\delta) = V_1 + \int_{-\xi_\alpha-\delta}^{\xi_\alpha-\delta}\left[V_0-V_1+\delta^2\sigma_Z^2(c_1+c_3)^2 - 2\delta(c_1+c_3)tH_1 + (t^2-1)(H_1^2-H_2^2)\sigma_Z^{-2}\right]\phi(t)\,dt, \tag{7.3}$$

where $H_1$ and $H_2$ are defined in (6.5) and (6.6),

$$V_1 = \mathrm{Var}(\hat{\mu}_{Y1}) = \frac{w_1^2\sigma_Y^2}{n_Y+n_{1Y}} + \frac{w_2^2}{n+n_1}\left[\sigma_Y^2 + \beta k\sigma_X(\beta\sigma_X-2\rho\sigma_Y)\right], \tag{7.4}$$

and

$$V_0 = \mathrm{Var}(\hat{\mu}_{Y0}) = \frac{c_1^2\sigma_X^2 + c_2^2\sigma_Y^2 + 2c_1c_2\rho\sigma_X\sigma_Y}{n+n_0} + \frac{c_3^2\sigma_X^2}{n_X+n_{0X}} + \frac{c_4^2\sigma_Y^2}{n_Y+n_{0Y}}. \tag{7.5}$$
Note that $V_0$ and $V_1$ are the unconditional variances of $\hat{\mu}_{Y0}$ and $\hat{\mu}_{Y1}$, respectively.

A few properties of $\mathrm{MSE}(\delta)$ can be ascertained in this general form. First, the mean square error is a symmetric function of $\delta$, i.e., $\mathrm{MSE}(-\delta) = \mathrm{MSE}(\delta)$. Hence, it is necessary only to investigate the behavior of $\mathrm{MSE}(\delta)$ for $\delta\geq 0$. Second, it can be shown that $\lim_{\delta\to\infty}\mathrm{MSE}(\delta) = V_1$, the unconditional variance of $\hat{\mu}_{Y1}$. Third, $\mathrm{MSE}(0)$ is equal to the unconditional variance of $\hat{\mu}_{Y1}$, i.e., $V_1$, plus another term which can be either positive or negative.

The expression for $\mathrm{MSE}(\delta)$ in (7.3) can be integrated and written, as in (6.16), as a function of $\phi(x)$ and $\Phi(x)$, where $x$ takes the value $(\xi_\alpha-\delta)$ or $(-\xi_\alpha-\delta)$. One of us has written a computer program to evaluate the bias and mean square error as given in equations (7.3) and (6.16). What is typically done in studies of preliminary test procedures is to evaluate the bias and mean square error for various values of $\alpha$ and $\Delta$ or $\delta$, for some given sample sizes and variance-covariance matrix. This paper presents the additional problem, however, of determining values for the weights $w_i$ and $c_i$ and the regression coefficient $\beta$.
8. Choice of the Weights $w_i$ and $c_i$ and the Regression Coefficient $\beta$

It is theoretically possible to choose the $w_i$, $c_i$, and $\beta$ so that the mean square error as given in (7.3) is minimized. If this is pursued, however, three things become evident. First, the solution for $w_1$, $w_2$, and $\beta$ is independent of the solution for the $c_i$, which simplifies matters considerably. Secondly, however, the solution for $w_2$ and $\beta$ involves two simultaneous equations with terms of order three, such as $\beta^2w_2$, $\beta w_2^2$, etc. Third, the solutions for $w_i$, $c_i$, and $\beta$ will all be functions of $\delta$, which, of course, is unknown. Hence, it does not appear feasible to choose the weights $w_i$, $c_i$, and $\beta$ so that the mean square error as given in (7.3) is minimized with respect to these parameters.
I
I
It
I
I
I
I
I
I
I
I
I
I
I
I
I
I
.'1
I
-17-
A logical choice for the regression coefficient B is B = (Xly/a as suggested
x
by the maximum likelihood estimator in (5.1). If this is done, VI of equation
(7.4) simplifies to
(8.1)
where k is defined in (5.3).
Also, H in equation (6.6) reduces to
2
w2na~(1-p2k)
H2= (n~l)(n+ny)
Now, with
B defined
(1-w2)a~~
+
B=
"--_.
as
(Uy+nly)(n~)
(8.2)
pay/ax and with VI and H defined as in (8.1)
2
and (8.2), MSE(o)
in (7.3) can be minimized with respect to the w. and c .•
. 1 . 1 .
This
will produce one linear equation in w • Using the method of LaGrange multipliers
2
4
to find the c which minimize MSE(o) subject to the restriction l c.=l leads
i
i=l 1.
to five simultaneous linear equations in five unknowns, i.e. c ' c ' c ' c ' and
l
2
3
4
A, where A is the LaGrange multiplier.
solved in the usual manner.
~
A
i
can be
However, the solutions for both the w. and c. are
still a function of 0, which is unknown.
o by 0
These equations for wi and c
1.
1.
It is possible, of course, to estimate
;f
A
from the sample data and, hence, have the c. and w. be functions of O.
1.
1.
However, the formulas given in this paper for bias and mean square erro: of
A.
~
would no longer be appropriate.
As a solution to this dilemma, the logical choices for the wi and c i are
4
. -1
-1
wi = gi(gl+g2) , i=1,2, and c i = hie l hi] , i=1, ••• ,4, where gi and hi are
i=l
defined in (5.2) and (5.5). Recall that these weights minimize the unconditional
variances of Uyl and
PyO '
respectively.
Using these weights yields HI = 0, and
$V_0$ reduces to $V_0^* = [\sum_{i=1}^{4}h_i]^{-1}$. If, in addition, $\beta = \rho\sigma_Y/\sigma_X$, then $V_1$ reduces to $V_1^* = (g_1+g_2)^{-1}$ and $H_2$ reduces to

$$H_2^* = (g_1+g_2)^{-1}. \tag{8.3}$$

Using these values for $\beta$ and for the $w_i$ and $c_i$ yields the bias and mean square error as

$$B^*(\delta) = -\int_{-\xi_\alpha-\delta}^{\xi_\alpha-\delta}\left[(c_1+c_3)\,\delta\sigma_Z + H_2^*t\,\sigma_Z^{-1}\right]\phi(t)\,dt \tag{8.4}$$

and

$$\mathrm{MSE}^*(\delta) = V_1^* + \int_{-\xi_\alpha-\delta}^{\xi_\alpha-\delta}\left[V_0^*-V_1^*+\delta^2\sigma_Z^2(c_1+c_3)^2 - (t^2-1)H_2^{*2}\sigma_Z^{-2}\right]\phi(t)\,dt. \tag{8.5}$$

Even with these simplifications, however, it is still difficult to tell how the bias and mean square error will behave for various values of $\alpha$, $\rho$, etc. Hence, numerical evaluations are necessary for any further analysis of any of the particular sampling plans which this estimation procedure encompasses.
9. A Numerical Example

As an indication of the effect of $\alpha$, $\delta$, and $\rho$ upon the bias and the mean square error of the procedure, a simple numerical example is given in this section. This example assumes no second-stage sampling, i.e., $n_0=n_{0X}=n_{0Y}=n_1=n_{1X}=n_{1Y}=0$, while $n>0$, $n_X>0$, and $n_Y>0$.

Using the weights $w_i$, $i=1,2$, and $c_i$, $i=1,\ldots,4$, as suggested in Section 8, the pooled estimator, which is used whenever $H_0\colon \mu_Y=\mu_X$ is accepted, becomes

$$\hat{\mu}_{Y0} = \left[\sum_{i=1}^{4}h_i\right]^{-1}\left[h_1\bar{X}_n + h_2\bar{Y}_n + h_3\bar{X}_{n_X} + h_4\bar{Y}_{n_Y}\right], \tag{9.1}$$

where

$$h_1 = \frac{n}{1-\rho^2}\left[\frac{1}{\sigma_X^2}-\frac{\rho}{\sigma_X\sigma_Y}\right], \quad h_2 = \frac{n}{1-\rho^2}\left[\frac{1}{\sigma_Y^2}-\frac{\rho}{\sigma_X\sigma_Y}\right], \quad h_3 = \frac{n_X}{\sigma_X^2}, \quad h_4 = \frac{n_Y}{\sigma_Y^2}. \tag{9.2}$$

If, in addition, $\rho=0$, then the pooled estimator in (9.1) reduces to a simple weighted average of $\bar{X}_{n+n_X}$ and $\bar{Y}_{n+n_Y}$, i.e.,

$$\hat{\mu}_{Y0} = \left[\frac{n+n_X}{\sigma_X^2}+\frac{n+n_Y}{\sigma_Y^2}\right]^{-1}\left[\frac{(n+n_X)\bar{X}_{n+n_X}}{\sigma_X^2}+\frac{(n+n_Y)\bar{Y}_{n+n_Y}}{\sigma_Y^2}\right], \qquad \rho=0. \tag{9.3}$$
For this example, the regression estimator, which is used whenever $H_0\colon \mu_Y=\mu_X$ is rejected, becomes

$$\hat{\mu}_{Y1} = (g_1+g_2)^{-1}\left[g_1\bar{Y}_{n_Y} + g_2\left\{\bar{Y}_n + \frac{\rho\sigma_Y}{\sigma_X}\left(\bar{X}_{n+n_X}-\bar{X}_n\right)\right\}\right], \tag{9.4}$$

where

$$g_1 = \frac{n_Y}{\sigma_Y^2} \qquad\text{and}\qquad g_2 = \frac{n}{\sigma_Y^2\left[1-\rho^2n_X/(n+n_X)\right]}. \tag{9.5}$$
If, in addition, $\rho=0$, then the regression estimator simply reduces to $\bar{Y}_{n+n_Y}$, i.e., the unpooled estimator of $\mu_Y$, which uses none of the available information regarding $\mu_X$. Furthermore, this example is for the following specified parameters: $\sigma_X^2=25$, $\sigma_Y^2=16$, $n=15$, $n_X=30$, and $n_Y=10$. The values of $\delta$, $B(\delta)$, and $\mathrm{MSE}(\delta)$ have been computed for 3 values of $\alpha$ (.50, .25, .10), 7 values of $\rho$ (−.50, −.25, 0, .25, .33, .50, .67), and 8 values of $\Delta$ (0, .8, 1.6, 2.4, 3.2, 4.0, 4.8, 5.6). These results are in Tables 1 through 3.
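The tabled values can also be spot-checked by direct simulation of the procedure. The sketch below is not the authors' program; the function name, argument layout, replication count, and seed are illustrative assumptions. It draws first-stage samples under the example's parameters, applies the preliminary test with a supplied critical value $\xi_\alpha$, uses (9.1) on acceptance and (9.4) on rejection, and reports the empirical bias and mean square error of the resulting estimator of $\mu_Y$.

```python
import math
import random

def simulate(xi_alpha, rho, mu_x, mu_y, reps=20000, seed=7):
    """Monte Carlo bias and MSE of the one-stage preliminary-test procedure
    with sigma_X^2=25, sigma_Y^2=16, n=15, n_X=30, n_Y=10."""
    sx, sy = 5.0, 4.0
    n, nx, ny = 15, 30, 10
    var_z = sy*sy/(n+ny) + sx*sx/(n+nx) - 2*n*rho*sx*sy/((n+nx)*(n+ny))
    rng = random.Random(seed)
    bias_sum = mse_sum = 0.0
    for _ in range(reps):
        xb, yb = [], []
        for _ in range(n):                       # bivariate first-stage sample
            u1, u2 = rng.gauss(0, 1), rng.gauss(0, 1)
            xb.append(mu_x + sx*u1)
            yb.append(mu_y + sy*(rho*u1 + math.sqrt(1 - rho*rho)*u2))
        xo = [mu_x + sx*rng.gauss(0, 1) for _ in range(nx)]   # X-only sample
        yo = [mu_y + sy*rng.gauss(0, 1) for _ in range(ny)]   # Y-only sample
        z = (sum(yb)+sum(yo))/(n+ny) - (sum(xb)+sum(xo))/(n+nx)
        if abs(z) > xi_alpha*math.sqrt(var_z):   # H0 rejected: regression (9.4)
            k = nx/(n+nx)
            g1, g2 = ny/sy**2, n/(sy**2*(1 - rho**2*k))
            reg = sum(yb)/n + rho*sy/sx*((sum(xb)+sum(xo))/(n+nx) - sum(xb)/n)
            est = (g1*sum(yo)/ny + g2*reg)/(g1 + g2)
        else:                                    # H0 accepted: pooled (9.1)
            h1 = n/(1 - rho**2)*(1/sx**2 - rho/(sx*sy))
            h2 = n/(1 - rho**2)*(1/sy**2 - rho/(sx*sy))
            h3, h4 = nx/sx**2, ny/sy**2
            est = (h1*sum(xb)/n + h2*sum(yb)/n + h3*sum(xo)/nx
                   + h4*sum(yo)/ny)/(h1 + h2 + h3 + h4)
        bias_sum += est - mu_y
        mse_sum += (est - mu_y)**2
    return bias_sum/reps, mse_sum/reps
```

For instance, `simulate(0.674, 0.25, 0.0, 0.0)` corresponds to $\alpha=.50$, $\rho=.25$, $\Delta=0$; the empirical bias there should be near zero, consistent with $B(0)=0$.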
To give a reference point for mean square error when reading Tables 1 through 3, Table 4 gives, for 7 values of $\rho$, the variance of the regression estimator in (9.4), the variance of the pooled estimator in (9.1), and the variance of the unpooled estimator $\bar{Y}_{n+n_Y}$. Recall that the unpooled estimator and the regression estimator are unbiased for all values of $\Delta = \mu_Y-\mu_X$, whereas the pooled estimator is unbiased if, and only if, $\Delta=0$. Table 4 shows that the pooled estimator has the smallest variance, and the variance increases as $\rho$ increases. This happens because the additional information on $\mu_X$ provided by the bivariate sample becomes less useful for estimating $\mu_Y$ as $\rho$ increases. The variance of the regression estimator is maximum when $\rho=0$, and decreases as $|\rho|$ increases. The unpooled estimator has the largest variance, and is the same as the regression estimator when $\rho=0$.

If it is known that $\Delta=0$, then the obvious estimator for $\mu_Y$ is the pooled estimator. If one does not know the value of $\Delta$, but knows that $\Delta\neq 0$, then the regression estimator should be used to estimate $\mu_Y$. By making a preliminary test of significance, one expects to use the pooled estimator whenever $\Delta\approx 0$ or $\delta\approx 0$, and hence reduce the mean square error below the value of the variance (also mean square error) of the regression estimator.
Consider first the effect of $\delta$ on $-B(\delta)$ for all $\alpha$ and for all $\rho$. Tables 1 through 3 show that for $\delta=0$, $B(\delta)=0$, as noted previously in Section 6. As $\delta$ increases beyond 0, $-B(\delta)$ increases monotonically to a maximum at a value of $\delta$ around 1.3 or 1.4. For $\delta$ increasing beyond 1.3 or 1.4, $-B(\delta)$ approaches zero asymptotically, as stated in Section 6.

The effect of $\alpha$ upon $-B(\delta)$ can be seen by noting that the maximum value of $-B(\delta)$ increases as $\alpha$ gets smaller. E.g., for $\rho=.25$, the maximum values of $-B(\delta)$ for $\alpha=.50$, $\alpha=.25$, and $\alpha=.10$ are, approximately, .03, .10, and .25, respectively. This relationship holds for all values of $\rho$. In addition, as $\alpha$ increases, $-B(\delta)$ attains its asymptotic value of zero for smaller values of $\delta$, i.e., it approaches zero more rapidly. These properties occur because, as $\alpha$ gets smaller, the probability of making a Type II error increases, and the Type II error then results in a biased estimator of $\mu_Y$. These effects of $\delta$ and $\alpha$ upon $B(\delta)$ are the same as those found by other investigators who considered only the special case when $\rho=0$.
The effect of $\rho$ upon $-B(\delta)$ is more difficult to indicate directly, because $\rho$ affects $-B(\delta)$ in at least two ways. First, equations (4.4) and (6.7) show that as $\rho$ increases, $\delta$ increases. Hence, an increase in $\rho$ produces the same effects as an increase in $\delta$. Second, equations (9.2) and (9.5) show that $\rho$, as a known parameter, is a component of the weights used in the weighted average estimators in (9.1) and (9.4). Hence, $\rho$ is a component in the expressions for $B(\delta)$ and $\mathrm{MSE}(\delta)$ other than via $\delta$.

Table 1 illustrates the interaction of these two factors. For $\Delta=.8$, as $\rho$ increases from −.50 to .67, $-B(\delta)$ begins at .03, decreases to .02, and then increases to .05. Now, if $\rho$ were having no effect over and above its effect through $\delta$, then $-B(\delta)$ should steadily increase as $\rho$ increases, because $\delta$ is increasing from .66 to .87. This isn't true, however, since $-B(\delta)$ decreases until $\rho$ attains some point in the interval $[0,\tfrac{1}{2})$. For $\Delta=1.6$ in Table 1, one would expect $-B(\delta)$ to decrease as $\rho$ increases if $\rho$ were having its only effect through $\delta$. However, $-B(\delta)$ decreases until $\rho$ reaches a point in $[0,\tfrac{1}{2})$, and then it begins to increase. The same general behavior is seen for $\Delta=2.4$, although the minimum value of $-B(\delta)$ appears to occur for $\rho\in[\tfrac{1}{4},\tfrac{1}{2}]$. For $\Delta\geq 3.2$ in Table 1, an increase in $\rho$ produces a decrease in $-B(\delta)$, most likely primarily through the influence of $\delta$. Similar patterns are seen in Tables 2 and 3, except that the value of $\Delta$ for which an increase in $\rho$ always produces a decrease in $-B(\delta)$ gets smaller as $\alpha$ gets smaller (i.e., $\Delta=3.2$, 2.4, and 1.6 in Tables 1, 2, and 3, respectively). In general, it appears from this example that an increase in $\rho$ will cause the same effects as an increase in $\delta$, with the following exceptions:

1) For $\delta$ in the range of, approximately, 0 to .7, an increase in $\rho$ produces a decrease in $-B(\delta)$ rather than an increase.

2) For $\delta$ in the range of, approximately, 1.6 to 1.8, an increase in $\rho$ produces an increase in $-B(\delta)$ rather than a decrease.
Consider now the effect of $\delta$ on $\mathrm{MSE}(\delta)$ for any given value of $\rho$ and $\alpha$. Looking down any column of Tables 1, 2, or 3, it can be seen that the minimum $\mathrm{MSE}(\delta)$ occurs at $\delta=0$. This minimum value of $\mathrm{MSE}(\delta)$ is less than the variance of the regression estimator, but greater than the variance of the pooled estimator. As $\delta$ increases beyond 0, $\mathrm{MSE}(\delta)$ increases monotonically until it reaches the value of the variance of the regression estimator. This occurs approximately around $\delta=.8$. As $\delta$ increases beyond .8, $\mathrm{MSE}(\delta)$ increases monotonically until it attains its maximum value for $\delta$ approximately equal to 2. As $\delta$ increases beyond 2, $\mathrm{MSE}(\delta)$ decreases monotonically toward a limiting value, which is the variance of the regression estimator. Hence, the pooling procedure yields maximum mean square error around $\delta=2$, with $\mathrm{MSE}(\delta)$ approaching from above the variance of the regression estimator for $\delta>2$ and $\mathrm{MSE}(\delta)$ less than the variance of the regression estimator for $\delta<.8$ (approximately).

The effect of $\alpha$ upon $\mathrm{MSE}(\delta)$ can be seen by noting that, for fixed $\rho$, the minimum value of $\mathrm{MSE}(\delta)$ decreases as $\alpha$ decreases. E.g., for $\rho=.25$, the minimum value of $\mathrm{MSE}(\delta)$ is .60, .54, and .45 for $\alpha=.50$, .25, and .10, respectively. Also, the maximum value of $\mathrm{MSE}(\delta)$ increases as $\alpha$ decreases. E.g., for $\rho=.25$, the maximum values of $\mathrm{MSE}(\delta)$ are (approximately) .65, .74, and .90 for $\alpha=.50$, .25, and .10, respectively. In addition, it can be noted from Tables 1 through 3 that, as $\alpha$ decreases, $\mathrm{MSE}(\delta)$ approaches its asymptotic value more slowly. Hence, a smaller value of $\alpha$ will result in a larger reduction in $\mathrm{MSE}(\delta)$ if $\delta$ is small (approximately less than .8), but, on the other hand, will result in a larger increase in $\mathrm{MSE}(\delta)$ if $\delta$ is moderate (approximately equal to 2). These effects of $\delta$ and $\alpha$ upon $\mathrm{MSE}(\delta)$ are the same as those found by other investigators for the special cases where $\rho=0$.
...,-.
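The ρ = 0 special case just mentioned is simple enough to simulate directly. The sketch below is a minimal Monte Carlo of that case only (one-stage sampling, known variances): a preliminary z-test of μ_Y = μ_X, followed by a precision-weighted pool of the two sample means if H0 is accepted and the unpooled mean otherwise. The helper name mse_pretest is invented for illustration, and the setup is a reconstruction from the example's constants (σ_Y² = 16, σ_X² = 25, n = 15, n_X = 30, n_Y = 10), not the paper's formulas for ρ ≠ 0.

```python
import numpy as np
from statistics import NormalDist

# Monte Carlo sketch of the rho = 0 special case (one-stage sampling,
# known variances).  Ybar pools all n + n_Y = 25 Y-observations and
# Xbar all n + n_X = 45 X-observations, so Var(Ybar) = 16/25 and
# Var(Xbar) = 25/45.  H0: mu_Y = mu_X is tested at level alpha; mu_Y is
# then estimated by a precision-weighted pool if H0 is accepted and by
# Ybar alone otherwise.
def mse_pretest(delta, alpha=0.50, reps=400_000, seed=1):
    vy, vx = 16 / 25, 25 / 45
    sd = (vy + vx) ** 0.5                  # sd of Ybar - Xbar
    crit = NormalDist().inv_cdf(1 - alpha / 2) * sd
    rng = np.random.default_rng(seed)
    mu_y = delta * sd                      # so (mu_Y - mu_X)/sd = delta
    ybar = rng.normal(mu_y, vy ** 0.5, reps)
    xbar = rng.normal(0.0, vx ** 0.5, reps)
    accept = np.abs(ybar - xbar) <= crit   # preliminary test of H0
    wy = (1 / vy) / (1 / vy + 1 / vx)      # precision weight on Ybar
    est = np.where(accept, wy * ybar + (1 - wy) * xbar, ybar)
    return float(np.mean((est - mu_y) ** 2))

# MSE sits below Var(Ybar) = .6400 at delta = 0, peaks at moderate
# delta, and returns to .6400 as delta grows.
for d in (0.0, 1.5, 6.0):
    print(d, round(mse_pretest(d), 3))
```

At δ = 0 and α = .50 this reproduces, up to simulation error, the .6156 entry of the ρ = 0 column of Table 1, and it exhibits the minimum-at-zero, peak-near-δ = 2, return-to-asymptote pattern described above.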
ρ can affect MSE(δ) either via δ or through its influence on the weights
in the weighted estimators. The general effect upon MSE(δ) of increasing ρ
from -.50 to .67, as illustrated in Tables 1 through 3, is to first increase
MSE(δ) to some maximum value and then to decrease MSE(δ). For any given value of
Δ, Table 1 shows that the maximum value for MSE(δ) is attained for ρ approximately
equal to zero, whereas in Tables 2 and 3 the maximum value for MSE(δ) is attained
for some value of ρ in the interval (-.25, .25). Hence, in general, it appears
that an increase in the absolute value of ρ will decrease MSE(δ), although this
relationship between ρ and MSE(δ) is definitely not symmetric about the point ρ = 0.
10. Some General Conclusions

The bias and mean square error of a two-stage sampling scheme which involves
a preliminary test of significance have been derived. Numerical investigation
of some one-stage examples only indicates that in this procedure α and δ have an
effect on B(δ) and MSE(δ) which is similar to that reported by other authors
who have considered similar procedures. In addition, it appears that an increase
in the absolute value of ρ will generally decrease MSE(δ). The relationship
between ρ and B(δ) is not obvious, but it appears that ρ influences B(δ) primarily
through the effect of ρ upon δ.
Table 1

Values of δ, -B(δ), and MSE(δ) for various ρ and Δ,
where σ_Y² = 16, σ_X² = 25, n = 15, n_X = 30, n_Y = 10, α = .50

  Δ             ρ=-.50    ρ=-.25    ρ=.00     ρ=.25     ρ=.33     ρ=.50     ρ=.67
  0    δ        0.0000    0.0000    0.0000    0.0000    0.0000    0.0000    0.0000
       -B(δ)    0.0000    0.0000    0.0000    0.0000    0.0000    0.0000    0.0000
       MSE(δ)   0.5379    0.5955    0.6156    0.5997    0.5860    0.5444    0.4832
  .8   δ        0.6616    0.6940    0.7317    0.7762    0.7929    0.8301    0.8730
       -B(δ)    0.0309    0.0263    0.0239    0.0263    0.0287    0.0372    0.0512
       MSE(δ)   0.5674    0.6220    0.6404    0.6267    0.6149    0.5792    0.5244
  1.6  δ        1.3232    1.3880    1.4633    1.5524    1.5858    1.6601    1.7459
       -B(δ)    0.0342    0.0273    0.0230    0.0232    0.0246    0.0297    0.0375
       MSE(δ)   0.6083    0.6556    0.6685    0.6532    0.6418    0.6078    0.5527
  2.4  δ        1.9847    2.0819    2.1950    2.3286    2.3787    2.4902    2.6189
       -B(δ)    0.0192    0.0137    0.0101    0.0089    0.0089    0.0096    0.0104
       MSE(δ)   0.6096    0.6527    0.6623    0.6432    0.6301    0.5912    0.5300
  3.2  δ        2.6463    2.7759    2.9266    3.1049    3.1717    3.3202    3.4919
       -B(δ)    0.0064    0.0039    0.0024    0.0017    0.0016    0.0015    0.0013
       MSE(δ)   0.5895    0.6354    0.6475    0.6290    0.6156    0.5756    0.5140
  4.0  δ        3.3079    3.4699    3.6583    3.8811    3.9646    4.1503    4.3648
       -B(δ)    0.0013    0.0007    0.0003    0.0002    0.0002    0.0001    0.0001
       MSE(δ)   0.5763    0.6263    0.6413    0.6244    0.6113    0.5718    0.5110
  4.8  δ        3.9695    4.1639    4.3899    4.6573    4.7575    4.9803    5.2378
       -B(δ)    0.0002    0.0001    0.0000    0.0000    0.0000    0.0000    0.0000
       MSE(δ)   0.5722    0.6241    0.6401    0.6238    0.6108    0.5714    0.5108
  5.6  δ        4.6311    4.8578    5.1216    5.4335    5.5504    5.8104    6.1107
       -B(δ)    0.0000    0.0000    0.0000    0.0000    0.0000    0.0000    0.0000
       MSE(δ)   0.5715    0.6238    0.6400    0.6237    0.6107    0.5714    0.5108
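The ρ = 0 column of δ values above can be checked directly under a natural assumption about the standardization (the defining formula for δ appears earlier in the paper and is not repeated here): with ρ = 0, the difference of the two pooled sample means has variance σ_Y²/(n+n_Y) + σ_X²/(n+n_X) = 16/25 + 25/45, and dividing Δ by the square root of this quantity reproduces the tabulated δ to four decimals.

```python
# Assumed standardization (see lead-in): delta = Delta / sd, where
# sd^2 = Var(Ybar) + Var(Xbar) = 16/(15+10) + 25/(15+30) at rho = 0.
sd = (16 / 25 + 25 / 45) ** 0.5  # about 1.0934

# (Delta, delta) pairs from the rho = 0 column of Table 1.
for Delta, tab in [(0.8, 0.7317), (1.6, 1.4633), (2.4, 2.1950),
                   (3.2, 2.9266), (4.0, 3.6583), (4.8, 4.3899), (5.6, 5.1216)]:
    assert abs(Delta / sd - tab) < 5e-4, (Delta, tab)
print("rho = 0 delta column of Table 1 reproduced")
```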
Table 2

Values of δ, -B(δ), and MSE(δ) for various ρ and Δ,
where σ_Y² = 16, σ_X² = 25, n = 15, n_X = 30, n_Y = 10, α = .25

  Δ             ρ=-.50    ρ=-.25    ρ=.00     ρ=.25     ρ=.33     ρ=.50     ρ=.67
  0    δ        0.0000    0.0000    0.0000    0.0000    0.0000    0.0000    0.0000
       -B(δ)    0.0000    0.0000    0.0000    0.0000    0.0000    0.0000    0.0000
       MSE(δ)   0.4620    0.5205    0.5454    0.5385    0.5288    0.4974    0.4501
  .8   δ        0.6616    0.6940    0.7317    0.7762    0.7929    0.8301    0.8730
       -B(δ)    0.1041    0.0996    0.0966    0.0978    0.0999    0.1076    0.1202
       MSE(δ)   0.5515    0.6110    0.6355    0.6285    0.6190    0.5811    0.5368
  1.6  δ        1.3232    1.3880    1.4633    1.5524    1.5858    1.6601    1.7459
       -B(δ)    0.1273    0.1150    0.1044    0.0984    0.0980    0.1002    0.1058
       MSE(δ)   0.6942    0.7445    0.7566    0.7370    0.7234    0.6835    0.6194
  2.4  δ        1.9847    2.0819    2.1950    2.3286    2.3787    2.4902    2.6189
       -B(δ)    0.0829    0.0680    0.0552    0.0459    0.0437    0.0406    0.0384
       MSE(δ)   0.7235    0.7566    0.7520    0.7172    0.6989    0.6497    0.5771
  3.2  δ        2.6463    2.7759    2.9266    3.1049    3.1717    3.3202    3.4919
       -B(δ)    0.0335    0.0240    0.0166    0.0115    0.0102    0.0082    0.0065
       MSE(δ)   0.6602    0.6910    0.6880    0.6568    0.6399    0.5936    0.5265
  4.0  δ        3.3079    3.4699    3.6583    3.8811    3.9646    4.1503    4.3648
       -B(δ)    0.0088    0.0053    0.0029    0.0016    0.0013    0.0008    0.0005
       MSE(δ)   0.6016    0.6428    0.6510    0.6296    0.6155    0.5744    0.5124
  4.8  δ        3.9695    4.1639    4.3899    4.6573    4.7575    4.9803    5.2378
       -B(δ)    0.0015    0.0007    0.0003    0.0001    0.0001    0.0000    0.0000
       MSE(δ)   0.5777    0.6269    0.6414    0.6243    0.6111    0.5716    0.5109
  5.6  δ        4.6311    4.8578    5.1216    5.4335    5.5504    5.8104    6.1107
       -B(δ)    0.0002    0.0001    0.0000    0.0000    0.0000    0.0000    0.0000
       MSE(δ)   0.5723    0.6241    0.6401    0.6238    0.6108    0.5714    0.5108
Table 4

Variances of three estimators of μ_Y for various values of ρ,
where σ_Y² = 16, σ_X² = 25, n = 15, n_X = 30, n_Y = 10

                        Estimator
  ρ       Regression (9.4)    Pooled (9.1)    Unpooled Ȳ_{n+n_Y}
  -.50         .5714              .2051             .6400
  -.25         .6237              .2587             .6400
   .00         .6400              .2974             .6400
   .25         .6237              .3263             .6400
   .33         .6107              .3342             .6400
   .50         .5714              .3478             .6400
   .67         .5108              .3581             .6400
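Two features of Table 4 can be verified from the example's constants alone (a hedged check; the formulas for the regression and pooled variances at ρ ≠ 0 are given earlier in the paper and are not reproduced here). The unpooled variance is simply σ_Y²/(n+n_Y) = 16/25 = .6400 for every ρ, and at ρ = 0 the pooled estimator is the precision-weighted combination of the two sample means, with variance (1/v_Y + 1/v_X)⁻¹:

```python
v_y = 16 / (15 + 10)   # Var(Ybar) from all n + n_Y = 25 Y-observations
v_x = 25 / (15 + 30)   # Var(Xbar) from all n + n_X = 45 X-observations

assert abs(v_y - 0.6400) < 5e-5                      # "Unpooled" column
assert abs(1 / (1 / v_y + 1 / v_x) - 0.2974) < 5e-5  # "Pooled" at rho = 0
print("Unpooled and rho = 0 pooled variances agree with Table 4")
```

Note also that at ρ = 0 the regression-estimator variance in Table 4 equals the unpooled variance .6400, as it must when X carries no information about Y.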
REFERENCES

[1]  Anderson, T. W., An Introduction to Multivariate Statistical Analysis, New York: John Wiley and Sons, Inc., 1958.

[2]  Arnold, J. C. and H. A. Al-Bayyati, "On Double-Stage Estimation of the Mean Using Prior Knowledge," Biometrics, 26 (1970) 787-800.

[3]  Asano, Chooichiro, "A Note on Modified Two-Sample-Theoretical Estimation of Biological Assay," Bulletin of Mathematical Statistics, 9 (1960) 41-56.

[4]  Asano, Chooichiro, "Estimations after Preliminary Test of Significance and Their Applications to Biometrical Researches," Bulletin of Mathematical Statistics, 9 (1960) 1-23.

[5]  Asano, Chooichiro, "Some Considerations on the Combination of Estimates from Different Biological Assays," Bulletin of Mathematical Statistics, 10 (1961) 17-32.

[6]  Asano, Chooichiro and Sokuro Sato, "A Bivariate Analogue of Pooling of Data," Bulletin of Mathematical Statistics, 10 (1962) 39-59.

[7]  Asano, Chooichiro and Masahiko Sugimura, "Some Considerations on Estimation of Population Variance Due to the Use of Pooling Data," Bulletin of Mathematical Statistics, 10 (1961) 33-44.

[8]  Bailar, Barbara, "Recent Research in Reinterview Procedures," Journal of the American Statistical Association, 63 (1968) 41-63.

[9]  Bancroft, T. A., "Analysis and Inference for Incompletely Specified Models Involving the Use of Preliminary Test(s) of Significance," Biometrics, 20 (1964) 427-442.

[10] Bennett, B. M., "Estimation of Means on the Basis of Preliminary Tests of Significance," Institute of Statistical Mathematics Annals, 4 (1952) 31-43.

[11] Bennett, B. M., "On the Use of Preliminary Tests in Certain Statistical Procedures," Institute of Statistical Mathematics Annals, 8 (1956) 45-52.

[12] Han, C. P. and T. A. Bancroft, "On Pooling Means When Variance is Unknown," Journal of the American Statistical Association, 63 (1968) 1333-1342.

[13] Huntsberger, D. V., "A Generalization of a Preliminary Testing Procedure for Pooling Data," Annals of Mathematical Statistics, 26 (1955) 734-743.

[14] Kale, B. K. and T. A. Bancroft, "Inference for Some Incompletely Specified Models Involving Normal Approximations to Discrete Data," Biometrics, 23 (1967) 335-348.

[15] Kitagawa, T., "Estimation After Preliminary Tests of Significance," University of California Publications in Statistics, 3 (1963) 147-186.

[16] Mehta, J. S. and John Gurland, "Testing Equality of Means in the Presence of Correlation," Biometrika, 56 (1969) 119-126.

[17] Mosteller, F., "On Pooling Data," Journal of the American Statistical Association, 43 (1948) 231-242.

[18] Ruhl, D. J. B., Preliminary Test Procedures and Bayesian Procedures for Pooling Correlated Data, unpublished Ph.D. dissertation, Ames, Iowa: Iowa State University, 1967.

[19] Sato, Sokuro, "A Multivariate Analogue of Pooling of Data," Bulletin of Mathematical Statistics, 10 (1962) 61-76.

[20] Tamura, Ryoji, "Nonparametric Inferences with a Preliminary Test," Bulletin of Mathematical Statistics, 11 (1965) 39-61.

[21] Tamura, Ryoji, "Some Estimate Procedures with a Nonparametric Preliminary Test I," Bulletin of Mathematical Statistics, 11 (1965) 63-71.

[22] Yen, Elizabeth, "On Two-Stage Non-Parametric Estimation," Annals of Mathematical Statistics, 35 (1964) 1099-1114.