Journal of Statistical Planning and Inference 69 (1998) 209-227

Multiple inverse sampling in post-stratification

Kuang-Chao Chang (a), Jeng-Fu Liu (a), Chien-Pai Han (b,*)

(a) Department of Statistics, Fu Jen Catholic University, Taipei, Taiwan, ROC
(b) Department of Mathematics, University of Texas at Arlington, Box 19408, Arlington, TX 76019-0408, USA

Received 14 October 1996; accepted 25 August 1997
Abstract

In sample surveys, post-stratification is often used when the stratum to which a unit belongs cannot be identified in advance of the survey. If the sample size is large, post-stratification is usually as effective as ordinary stratification with proportional allocation. For small samples, however, no generally accepted theory or technique has been developed. One of the main difficulties is the possibility of obtaining zero sample sizes in some strata. In this paper, we overcome this difficulty by employing a sampling scheme, referred to as multiple inverse sampling, which ensures that a specified number of observations is sampled from each stratum. A Monte Carlo simulation is carried out to compare the estimator obtained from multiple inverse sampling with some existing estimators. The estimator under multiple inverse sampling is superior in the sense that it is unbiased and its variance does not depend on the values of the stratum means in the population. © 1998 Elsevier Science B.V. All rights reserved.

AMS classification: 62D05

Keywords: Double-inverse sampling; Multiple inverse sampling; Post-stratification; Stopping time
1. Introduction
Post-stratification, or stratification after selection of the sample, is a sampling procedure used when we would like to stratify on a key variable but are unable to place the sampling units into their correct strata until after the sample is selected. Personal characteristics such as age, sex, race and educational level are common examples. The procedure consists of the following three steps: (1) take a simple random sample; (2) classify the sample into strata; (3) use the classified data to estimate the unknown population parameter by the usual method of stratified random sampling.
* Corresponding author.
0378-3758/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved.
PII S0378-3758(97)00157-2
The post-stratification procedure is almost as precise as the usual proportional stratified sampling, provided that (i) each stratum weight is known, and (ii) the sample is reasonably large, say ≥ 20 in every stratum (see Cochran, 1977, or Scheaffer et al., 1990). In the case of a small sample, however, post-stratification may run into the following problems: (a) What can we do if the embarrassing situation of empty strata in the sample occurs? (b) In that situation, is it worthwhile to keep taking samples sequentially until every stratum is covered? Problem (a) has been investigated by Doss et al. (1979). Before we discuss their work, let us first introduce the notation used in this paper:
    N      population size
    n      sample size
    μ      population mean
    σ²     population variance
    y_i    value obtained for the ith unit in the sample
    L      number of strata
    ȳ      sample mean
    s²     sample variance
    ŷ_p    estimator of μ by post-stratification
    N_h    hth stratum size
    n_h    sample size in the hth stratum
    μ_h    population mean of the hth stratum
    σ_h²   variance of the hth stratum
    y_hj   value of the jth unit in the subsample classified into the hth stratum
    W_h    weight of the hth stratum
    ȳ_h    sample mean obtained from the subsample classified into the hth stratum
    s_h²   sample variance obtained from the subsample classified into the hth stratum
Using the above notation, we have the following formulae:

    ȳ = (1/n) Σ_{i=1}^{n} y_i,
    ŷ_p = Σ_{h=1}^{L} W_h ȳ_h,   W_h = N_h/N,
    ȳ_h = (1/n_h) Σ_{j=1}^{n_h} y_hj,   h = 1, 2, …, L,

where ȳ and ŷ_p are the estimators of μ obtained from simple random sampling and post-stratification (n_h ≥ 1, h = 1, 2, …, L), respectively. It should be noted that the sample size n of the first-stage simple random sampling is a fixed number, while the classified sample sizes n_h thereafter are random variables. In the case of small samples, a commonly used estimation procedure for μ is to collapse the empty post-strata, if there are any, with neighboring strata. Let ȳ_c(L) be the estimator of μ under this procedure. Then, when L = 2, we have

    ȳ_c(2) = ŷ_p   if n_1 n_2 ≠ 0,
             ȳ     otherwise.                                              (1.1)
When L ≥ 3, we assume the following prior information throughout this paper, so that 'neighboring strata' are easy to define:

(A1) μ_1 ≤ μ_2 ≤ ⋯ ≤ μ_L.
(A2) μ_h − μ_{h−1} ≤ μ_{h+1} − μ_h,  h = 2, …, L − 1.

The above assumptions are realistic in practical survey sampling. For instance, in estimating the average income of a population, we may classify the income levels as the lowest, the second lowest, medium and the highest, which correspond to the ordered strata in the population; the differences among lower income levels are usually less than those among higher income levels. Under assumptions (A1) and (A2), the hth stratum is the 'neighboring stratum' of the (h + 1)th stratum, h = 1, …, L − 1. Thus, when L = 3, we have
    ȳ_c(3) = ŷ_p                          if n_1 n_2 n_3 ≠ 0,
             ȳ_h                          if n_h = n, h = 1, 2, 3,
             (W_1 + W_2) ȳ_2 + W_3 ȳ_3     if n_1 = 0 and n_2 n_3 ≠ 0,      (1.2)
             (W_1 + W_2) ȳ_1 + W_3 ȳ_3     if n_2 = 0 and n_1 n_3 ≠ 0,
             W_1 ȳ_1 + (W_2 + W_3) ȳ_2     if n_3 = 0 and n_1 n_2 ≠ 0.
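The collapsing rule above is mechanical enough to state in code. The following Python sketch implements Eq. (1.2) for L = 3; the function name and the list-of-lists data layout are our own, not part of the paper.

```python
def y_c3(samples, W):
    """Collapsed post-stratification estimator y_c(3) of Eq. (1.2).
    samples: three lists of observations, one per stratum (a list may
    be empty); W: the three known stratum weights."""
    means = [sum(s) / len(s) if s else None for s in samples]
    empty = [h for h in range(3) if not samples[h]]
    if not empty:                        # all strata sampled: use y_p
        return sum(W[h] * means[h] for h in range(3))
    if len(empty) == 2:                  # whole sample fell in one stratum
        h = ({0, 1, 2} - set(empty)).pop()
        return means[h]
    h = empty[0]                         # exactly one empty stratum
    if h == 0:                           # collapse stratum 1 into stratum 2
        return (W[0] + W[1]) * means[1] + W[2] * means[2]
    if h == 1:                           # collapse stratum 2 into stratum 1
        return (W[0] + W[1]) * means[0] + W[2] * means[2]
    return W[0] * means[0] + (W[1] + W[2]) * means[1]  # collapse 3 into 2
```

For example, with an empty first stratum the estimator puts weight W_1 + W_2 on the stratum-2 mean, exactly as in the third case of Eq. (1.2).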
When L = 4, we have

    ȳ_c(4) = ŷ_p                                   if Π_{h=1}^{4} n_h ≠ 0,
             ȳ_h                                   if n_h = n, h = 1, …, 4,
             (W_1 + W_2) ȳ_2 + W_3 ȳ_3 + W_4 ȳ_4     if n_1 = 0 and n_2 n_3 n_4 ≠ 0,
             (W_1 + W_2) ȳ_1 + W_3 ȳ_3 + W_4 ȳ_4     if n_2 = 0 and n_1 n_3 n_4 ≠ 0,
             W_1 ȳ_1 + (W_2 + W_3) ȳ_2 + W_4 ȳ_4     if n_3 = 0 and n_1 n_2 n_4 ≠ 0,
             W_1 ȳ_1 + W_2 ȳ_2 + (W_3 + W_4) ȳ_3     if n_4 = 0 and n_1 n_2 n_3 ≠ 0,      (1.3)
             (1 − W_4) ȳ_3 + W_4 ȳ_4                if n_1 = n_2 = 0 and n_3 n_4 ≠ 0,
             (1 − W_4) ȳ_2 + W_4 ȳ_4                if n_1 = n_3 = 0 and n_2 n_4 ≠ 0,
             (W_1 + W_2) ȳ_2 + (W_3 + W_4) ȳ_3       if n_1 = n_4 = 0 and n_2 n_3 ≠ 0,
             (1 − W_4) ȳ_1 + W_4 ȳ_4                if n_2 = n_3 = 0 and n_1 n_4 ≠ 0,
             (W_1 + W_2) ȳ_1 + (W_3 + W_4) ȳ_3       if n_2 = n_4 = 0 and n_1 n_3 ≠ 0,
             W_1 ȳ_1 + (1 − W_1) ȳ_2                if n_3 = n_4 = 0 and n_1 n_2 ≠ 0.
When L ≥ 5, the formulae for ȳ_c(L) can be obtained in a similar way, but they are tedious, so we will not display them.
Another estimator of μ employing post-stratification is the one developed by Doss et al. (1979):

    ŷ_d = [Σ_{h=1}^{L} a_h W_h ȳ_h / E(a_h)] / [Σ_{h=1}^{L} a_h W_h / E(a_h)],      (1.4)

where

    a_h = 1 if at least one unit of the sample of size n is in stratum h,
          0 otherwise,

and

    E(a_h) = 1 − C(N − N_h, n)/C(N, n)   if N is finite,
             1 − (1 − W_h)^n            if N is infinite,

where C(a, b) denotes the binomial coefficient 'a choose b'.
When all stratum weights are equal, it can be proved that ŷ_d is unbiased (see Doss et al., 1979) and that ŷ_d coincides with ȳ_c(L); otherwise ŷ_d is biased.
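For reference, Eq. (1.4) with the infinite-population form E(a_h) = 1 − (1 − W_h)^n can be sketched in Python as follows (the function name and data layout are ours, not from the paper):

```python
def y_d(samples, W, n):
    """Doss et al. (1979) estimator of Eq. (1.4), infinite-population
    case, so E(a_h) = 1 - (1 - W_h)^n.  samples: per-stratum lists of
    observations from a simple random sample of size n."""
    L = len(W)
    a = [1 if samples[h] else 0 for h in range(L)]   # indicator a_h
    Ea = [1 - (1 - W[h]) ** n for h in range(L)]     # E(a_h)
    num = sum(W[h] * (sum(samples[h]) / len(samples[h])) / Ea[h]
              for h in range(L) if a[h])
    den = sum(a[h] * W[h] / Ea[h] for h in range(L))
    return num / den
```

When W_1 = W_2 the weights a_h W_h / E(a_h) are symmetric, which is the case in which ŷ_d is unbiased and coincides with ȳ_c(L).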
In this paper, we propose a procedure to avoid problem (a) and investigate question (b) by considering a sequential-type estimator of μ, which is introduced in Section 2. The new estimator is compared with ȳ_c(L) and ŷ_d in a Monte Carlo study. The simulations in Sections 4 and 5 show that the sequential-type estimator has smaller mean-squared error than the other estimators when the differences among the stratum means are large, since the mean-squared error of the sequential-type estimator does not depend on these differences.
2. Sequential-type estimator of μ based on multiple inverse sampling
Consider a population (finite or infinite) consisting of L strata with known stratum weights W_1, …, W_L. If the population is finite, we assume that N is large and n/N is very small, so that the finite population correction can be ignored. Given an initial small sample size n and a specified minimum subsample size m_h for each stratum, h = 1, …, L, we stop sampling if n_h ≥ m_h for all h, where n_h is the number of observations of the initial sample in the hth stratum. Otherwise, we keep taking samples until the specified minimum subsample size is attained in each stratum. Let the final sample size for the hth stratum be n_hs, h = 1, …, L; then the final total sample size n_s = Σ_{h=1}^{L} n_hs is a random variable. The subscript s stands for 'sequential'. When the final sample is determined, we estimate μ by the usual stratified sample mean. Thus, our sequential-type estimator of μ is given as

    ŷ_s = Σ_{h=1}^{L} W_h ȳ_hs,                              (2.1)

where ȳ_hs = (1/n_hs) Σ_{j=1}^{n_hs} y_hj.

This type of sequential sampling scheme was called double-inverse sampling when L = 2 by Tu and Han (1982). As a natural extension of the idea, we will use the term multiple inverse sampling for the general situation that L ≥ 2.
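The scheme just described can be condensed into a few lines of Python (a minimal sketch under our own naming; `draw(h)` stands for whatever mechanism produces an observation from stratum h):

```python
import random

def multiple_inverse_sample(n, mins, W, draw, rng):
    """One run of the multiple inverse sampling scheme of Section 2
    (a sketch; names are ours).  n: initial sample size; mins[h]: the
    minimum subsample size m_h; W: stratum weights; draw(h): one
    observation from stratum h; rng: a random.Random instance."""
    L = len(W)
    samples = [[] for _ in range(L)]
    def one_unit():
        u, h, acc = rng.random(), 0, W[0]
        while u > acc:               # classify a Uniform(0,1) draw by W
            h += 1
            acc += W[h]
        samples[h].append(draw(h))
    for _ in range(n):               # initial simple random sample
        one_unit()
    while any(len(samples[h]) < mins[h] for h in range(L)):
        one_unit()                   # continue until every m_h is met
    return samples

def y_s(samples, W):
    """Sequential-type estimator (2.1): stratified mean of the final
    subsamples."""
    return sum(W[h] * sum(s) / len(s) for h, s in enumerate(samples))
```

With `draw(h)` returning the stratum label itself, ȳ_hs = h exactly and y_s returns Σ W_h h, which makes the sketch easy to sanity-check.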
K.-C Chang et al. I Journal o f Statistical Planning and Inference 69 (1998) 209-227
213
A desirable property of ŷ_s is unbiasedness, which can be shown as follows:

    E(ŷ_s) = Σ_{h=1}^{L} W_h E(ȳ_hs)
           = Σ_{h=1}^{L} W_h E[E(ȳ_hs | n_hs)]
           = Σ_{h=1}^{L} W_h μ_h = μ.

Another good property of ŷ_s is that its variance does not depend on the values of the L stratum means μ_h, because

    V(ŷ_s) = E[V(ŷ_s | n_1s, …, n_Ls)] + V[E(ŷ_s | n_1s, …, n_Ls)]
           = Σ_{h=1}^{L} W_h² σ_h² E(1/n_hs)                 (2.2)

(the second term vanishes since E(ŷ_s | n_1s, …, n_Ls) = Σ W_h μ_h = μ), where E(1/n_hs) depends only on the stratum weights W_h. The computation of E(1/n_hs) is nontrivial. We use the Taylor series approximation (see Thompson, 1992, p. 111) to obtain

    E(1/n_hs) ≈ 1/E(n_hs) + V(n_hs)/[E(n_hs)]³.              (2.3)

Combining Eqs. (2.2) and (2.3), we can approximate V(ŷ_s) by

    V(ŷ_s) ≈ Σ_{h=1}^{L} W_h² σ_h² E(n_hs²)/[E(n_hs)]³.      (2.4)

If m_h > 1 for all h, V(ŷ_s) can be estimated as

    V̂(ŷ_s) = Σ_{h=1}^{L} W_h² σ̂_h² E(n_hs²)/[E(n_hs)]³,

where

    σ̂_h² = [1/(n_hs − 1)] Σ_{j=1}^{n_hs} (y_hj − ȳ_hs)²,   h = 1, …, L.
In order to answer question (b) in Section 1, we need to compare the estimators ŷ_s, ŷ_d and ȳ_c(L). Apart from the consideration of sampling cost, the comparison should be made on the basis of equal sample sizes. However, the equal-sample-size requirement cannot be satisfied, since n_s ≥ n. A remedy for this difficulty is to substitute the expectation of n_s for n in the comparison. Since E(n_s) may not be an integer, we will use [E(n_s)] + 1 as the common sample size for the estimators ŷ_d and ȳ_c(L), where [·] denotes the integer part, and we denote this common sample size by n_c. In the next section, we compute E(n_hs) and E(n_hs²) in Eq. (2.4) for the case of two strata, i.e., double inverse sampling. The computation of E(n_s) is nontrivial if L > 2, and we will discuss it separately in Section 5. Before we find E(n_s), we first give the following lemma that relates E(n_s) and E(n_hs).
Lemma 2.1.

    E(n_hs) = W_h E(n_s),   h = 1, …, L.                     (2.5)

Proof. For each i = 1, 2, … and h = 1, …, L, let A_ih be the event that y_i belongs to the hth stratum and let X_ih be the indicator function of the event A_ih, i.e., X_ih = 1_{A_ih}. Note that, for each h, X_1h, X_2h, … are i.i.d. random variables adapted to the increasing sigma-fields 𝒜_k, k ≥ 1, where 𝒜_k = σ{X_ih : i = 1, …, k, h = 1, …, L}. Also, it is not difficult to see that n_s is a proper 𝒜_k-stopping time for which E(n_s) < ∞, and that n_hs can be expressed as

    n_hs = Σ_{i=1}^{n_s} X_ih.

Now, it follows from Wald's lemma (see, for example, Woodroofe, 1982, Theorem 1.3, p. 8) that

    E(n_hs) = E(X_1h) E(n_s) = W_h E(n_s).

This completes the proof.  □
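Lemma 2.1 is easy to verify by simulation. The sketch below (our own helper, not from the paper) runs the sampling scheme repeatedly and compares the simulated E(n_hs) with the values W_h E(n_s) predicted by Wald's lemma:

```python
import random

def check_lemma21(n, mins, W, reps=20000, seed=1):
    """Monte Carlo check of Lemma 2.1, E(n_hs) = W_h E(n_s) (a sketch).
    Returns the simulated E(n_hs) and the predicted W_h * E(n_s)."""
    rng = random.Random(seed)
    L = len(W)
    tot = [0] * L
    for _ in range(reps):
        counts = [0] * L
        def one():
            u, h, acc = rng.random(), 0, W[0]
            while u > acc:           # classify one draw by the weights
                h += 1
                acc += W[h]
            counts[h] += 1
        for _ in range(n):
            one()
        while any(counts[h] < mins[h] for h in range(L)):
            one()                    # inverse-sampling continuation
        for h in range(L):
            tot[h] += counts[h]
    E_nhs = [t / reps for t in tot]
    E_ns = sum(E_nhs)
    return E_nhs, [W[h] * E_ns for h in range(L)]
```

Even for quite asymmetric weights the two sets of values agree up to Monte Carlo noise.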
Intuitively, the estimator ŷ_s based on multiple inverse sampling, as we proposed, should behave somewhat like a stratified sampling estimator under proportional allocation.
3. Double inverse sampling

In this section, we derive formulae to compute E(n_hs) and E(n_hs²) for populations having two strata. We begin with the following lemma:
Lemma 3.1. If L = 2 and m_h ≥ 1, h = 1, 2, then

    E(n_s) = n + Σ_{h=1}^{2} Σ_{i=0}^{m_h−1} C(n, i) (m_h − i) W_h^{i−1} (1 − W_h)^{n−i}.    (3.1)
Proof. Without loss of generality, we assume that n ≥ m_1 + m_2. Let B_ih be the event that n_h = i, and let z_ih be the negative binomial random variable with probability mass function (p.m.f.)

    f(z_ih) = C(z_ih − 1, m_h − i − 1) W_h^{m_h − i} (1 − W_h)^{z_ih − m_h + i},
    z_ih = m_h − i, m_h − i + 1, …;   h = 1, 2;   i = 0, …, m_h − 1.

Then

    E(n_s) = n · P(n_1 ≥ m_1 and n_2 ≥ m_2) + Σ_{h=1}^{2} Σ_{i=0}^{m_h−1} E(n_s | B_ih) P(B_ih)
           = n [1 − Σ_{h=1}^{2} Σ_{i=0}^{m_h−1} P(B_ih)] + Σ_{h=1}^{2} Σ_{i=0}^{m_h−1} [n + E(z_ih)] P(B_ih),

which reduces to Eq. (3.1), since E(z_ih) = (m_h − i)/W_h and P(B_ih) = C(n, i) W_h^i (1 − W_h)^{n−i}.  □
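Eq. (3.1) is straightforward to evaluate numerically. The following sketch (our transcription of the formula) reproduces the common sample sizes n_c = [E(n_s)] + 1 quoted later in the table headlines, e.g. n_c = 11 for n = 5, m_1 = m_2 = 2, W_1 = 0.8 (Table 2) and n_c = 42 for n = 20, m_1 = 3, m_2 = 2, W_1 = 0.95 (Table 3):

```python
from math import comb

def E_ns_two_strata(n, m1, m2, W1):
    """Exact E(n_s) for L = 2 from Eq. (3.1) (our transcription)."""
    W2 = 1 - W1
    total = n
    for (mh, Wh) in ((m1, W1), (m2, W2)):
        for i in range(mh):
            total += comb(n, i) * (mh - i) * Wh ** (i - 1) * (1 - Wh) ** (n - i)
    return total
```

For m_1 = m_2 = 1 the double sum collapses to the special case (3.3) given below.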
Thus, when L = 2, the term E(n_hs) in Eq. (2.4) can be calculated by using Eqs. (2.5) and (3.1). The term E(n_hs²) in Eq. (2.4), when L = 2, is given in the next lemma.
Lemma 3.2. If L = 2 and m_h ≥ 1, h = 1, 2, then, with m = m_1 + m_2,

    E(n_hs²) = Σ_{i=0}^{m_h} (m_h² − i²) C(n, i) W_h^i (1 − W_h)^{n−i} + n W_h (1 − W_h) + (n W_h)²
             + Σ_{i=n−m+m_h+1}^{n} [(m − m_h − n + i) W_h / (1 − W_h)²]
               × [2i(1 − W_h) + 1 + (m − m_h − n + i) W_h] C(n, i) W_h^i (1 − W_h)^{n−i}.    (3.2)
Proof. Assume that n ≥ m_1 + m_2 = m and let X_h be the binomial random variable with parameters n and W_h, so that E(X_h²) = n W_h (1 − W_h) + (n W_h)². Next, let U_ih = z_ih − (m − m_h − n + i), where z_ih is the negative binomial random variable with parameters 1 − W_h and m − m_h − n + i, i.e., with p.m.f.

    f(z_ih) = C(z_ih − 1, m − m_h − n + i − 1) (1 − W_h)^{m−m_h−n+i} W_h^{z_ih − (m−m_h−n+i)},
    z_ih = m − m_h − n + i, …;   i = n − m + m_h + 1, …, n;   h = 1, 2.

Then

    E(U_ih²) = V(U_ih) + [E(U_ih)]²
             = (m − m_h − n + i) W_h / (1 − W_h)² + [(m − m_h − n + i) W_h / (1 − W_h)]².
Now,

    E(n_hs²) = Σ_{i=0}^{n} E(n_hs² | n_h = i) P(n_h = i)
             = Σ_{i=0}^{m_h} m_h² P(n_h = i) + Σ_{i=m_h+1}^{n−m+m_h} i² P(n_h = i)
               + Σ_{i=n−m+m_h+1}^{n} E[(i + U_ih)²] P(n_h = i)
             = Σ_{i=0}^{m_h} (m_h² − i²) P(n_h = i) + Σ_{i=0}^{n} i² P(n_h = i)
               + Σ_{i=n−m+m_h+1}^{n} [2i E(U_ih) + E(U_ih²)] P(n_h = i)
             = Σ_{i=0}^{m_h} (m_h² − i²) C(n, i) W_h^i (1 − W_h)^{n−i} + E(X_h²)
               + Σ_{i=n−m+m_h+1}^{n} [(m − m_h − n + i) W_h / (1 − W_h)²]
                 × [2i(1 − W_h) + 1 + (m − m_h − n + i) W_h] C(n, i) W_h^i (1 − W_h)^{n−i},

which reduces to Eq. (3.2).  □
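Eq. (3.2) can likewise be evaluated directly; the sketch below is our transcription. As a check, for n = 2, m_1 = m_2 = 1 and W_1 = 0.5, a direct enumeration of the scheme gives E(n_1s²) = 3.5, which the formula reproduces.

```python
from math import comb

def E_nhs_sq(n, m1, m2, W1, h):
    """E(n_hs^2) for L = 2 from Eq. (3.2) (our transcription); h is 1 or 2."""
    m = m1 + m2
    mh, Wh = (m1, W1) if h == 1 else (m2, 1 - W1)
    p = lambda i: comb(n, i) * Wh ** i * (1 - Wh) ** (n - i)  # P(n_h = i)
    total = sum((mh ** 2 - i ** 2) * p(i) for i in range(mh + 1))
    total += n * Wh * (1 - Wh) + (n * Wh) ** 2                # E(X_h^2)
    for i in range(n - m + mh + 1, n + 1):
        r = m - mh - n + i
        total += (r * Wh / (1 - Wh) ** 2) * (2 * i * (1 - Wh) + 1 + r * Wh) * p(i)
    return total
```

Together with E(n_hs) from Eqs. (2.5) and (3.1), this gives all the ingredients of the approximation (2.4).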
In the next section, we will use formula (3.1) to compute the common sample size n_c for ȳ_c(L) and ŷ_d in the simulation. As a special case, when m_1 = m_2 = 1, Eq. (3.1) takes the simple form

    E(n_s) = n + W_2^n/W_1 + W_1^n/W_2.                      (3.3)
4. Comparison and simulation

In this section, we use mean-squared errors (MSE) to compare the estimators ŷ_s, ŷ_d and ȳ_c(L). For the time being, we only consider the case of two strata with negligible finite population correction. The case with more than two strata is more complicated, and we treat it in Section 5.

The MSE of ŷ_d with two strata is

    MSE(ŷ_d) = V(ŷ_d) + BIAS²(ŷ_d),                          (4.1)
where

    V(ŷ_d) = V[E(ŷ_d | n_1, n_2)] + E[V(ŷ_d | n_1, n_2)]
           = [E(b_1) E(b_2) − E(b_1 b_2)] (μ_1 − μ_2)²
             + E(b_1²/n_1 | n_1 ≥ 1) E(a_1) σ_1² + E(b_2²/n_2 | n_2 ≥ 1) E(a_2) σ_2²,    (4.2)

with b_1 = (a_1 W_1/E(a_1)) / [a_1 W_1/E(a_1) + a_2 W_2/E(a_2)], b_2 = (a_2 W_2/E(a_2)) / [a_1 W_1/E(a_1) + a_2 W_2/E(a_2)], E(a_1) = 1 − W_2^n and E(a_2) = 1 − W_1^n. The expectations of b_1, b_2, etc. in Eq. (4.2) can be calculated as follows:

    E(b_1) = W_1 E(a_2) E[ a_1 / (a_1 W_1 E(a_2) + a_2 W_2 E(a_1)) ]
           = W_1 (1 − W_1^n)(1 − W_1^n − W_2^n) / (1 − W_1^{n+1} − W_2^{n+1}) + W_1^n,

and similarly,

    E(b_2) = W_2 (1 − W_2^n)(1 − W_1^n − W_2^n) / (1 − W_1^{n+1} − W_2^{n+1}) + W_2^n.

Next,

    E(b_1 b_2) = W_1 W_2 E(a_1) E(a_2) E[ a_1 a_2 / (a_1 W_1 E(a_2) + a_2 W_2 E(a_1))² ]
               = W_1 W_2 (1 − W_1^n)(1 − W_2^n)(1 − W_1^n − W_2^n) / (1 − W_1^{n+1} − W_2^{n+1})²,

    E(b_1²/n_1 | n_1 ≥ 1) E(a_1)
        = W_1^n/n + [W_1 (1 − W_1^n) / (1 − W_1^{n+1} − W_2^{n+1})]² Σ_{n_1=1}^{n−1} C(n, n_1) W_1^{n_1} W_2^{n−n_1} / n_1,

and similarly,

    E(b_2²/n_2 | n_2 ≥ 1) E(a_2)
        = W_2^n/n + [W_2 (1 − W_2^n) / (1 − W_1^{n+1} − W_2^{n+1})]² Σ_{n_2=1}^{n−1} C(n, n_2) W_2^{n_2} W_1^{n−n_2} / n_2.

From Eq. (4.1), we see that MSE(ŷ_d) depends on the stratum mean difference |μ_1 − μ_2|. The BIAS²(ŷ_d) in Eq. (4.1) is usually small relative to V(ŷ_d). A detailed discussion of BIAS(ŷ_d) is given in Doss et al. (1979).
The MSE of ȳ_c(2) with two strata is

    MSE[ȳ_c(2)] = P(n_1 n_2 ≠ 0) V(ŷ_p | n_1 n_2 ≠ 0) + P(n_1 n_2 = 0) MSE(ȳ | n_1 n_2 = 0)
                = Σ_{n_1=1}^{n−1} C(n, n_1) W_1^{n_1} W_2^{n−n_1} [W_1² σ_1²/n_1 + W_2² σ_2²/(n − n_1)]
                  + (W_1^n W_2² + W_1² W_2^n)(μ_1 − μ_2)² + (1/n)(W_1^n σ_1² + W_2^n σ_2²),    (4.3)

which also depends on |μ_1 − μ_2|.
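Formula (4.3) is simple to evaluate; the sketch below is our transcription. At the Table 1 configuration n_c = 3, W_1 = 0.5, μ_1 = 0, μ_2 = 10, σ_1² = σ_2² = 1, it returns 6.6146, matching the tabulated value 6.61.

```python
from math import comb

def mse_yc2(n, W1, mu1, mu2, s1sq, s2sq):
    """Exact MSE of the collapsed estimator y_c(2), Eq. (4.3)
    (our transcription of the formula)."""
    W2 = 1 - W1
    mse = sum(comb(n, n1) * W1 ** n1 * W2 ** (n - n1)
              * (W1 ** 2 * s1sq / n1 + W2 ** 2 * s2sq / (n - n1))
              for n1 in range(1, n))                  # both strata sampled
    mse += (W1 ** n * W2 ** 2 + W1 ** 2 * W2 ** n) * (mu1 - mu2) ** 2  # bias part
    mse += (W1 ** n * s1sq + W2 ** n * s2sq) / n      # one stratum empty
    return mse
```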
We see that MSE(ŷ_d) and MSE[ȳ_c(2)] depend on |μ_1 − μ_2|, whereas MSE(ŷ_s) does not. Hence, if we have prior information that |μ_1 − μ_2| is large and σ_1² and σ_2² are relatively small, then our best choice for estimating μ will be ŷ_s.
In order to verify the discussion above, we compare the mean-squared errors of ŷ_s, ŷ_d and ȳ_c(2) by Monte Carlo simulation. In the simulation, we first specify an initial sample size n and two minimum subsample sizes m_1 and m_2 such that n ≥ m_1 + m_2. Then, we compute E(n_s) by Lemma 3.1 and use n_c = [E(n_s)] + 1 as the common sample size for ŷ_d and ȳ_c(2). The subsample sizes of strata 1 and 2, which are random, will be denoted by n_1c and n_2c, respectively. Our simulation consists of the following steps:

Step 1: Using the random number generator subroutine DRNUN from IMSL (International Mathematical & Statistical Library), we generate a random sample of size n_c from the Uniform(0, 1) distribution. The number of observations that are less than the known weight W_1 is the subsample size n_1c. The subsample size for stratum 2 is n_2c = n_c − n_1c. Thus, both n_1c and n_2c are random.
Step 2: Given the stratum means μ_1, μ_2 and stratum variances σ_1², σ_2², we generate two independent subsamples of sizes n_1c and n_2c, each from a specified continuous distribution, by using various IMSL subroutines (e.g., DRNNOA for the normal distribution, DRNEXP for the exponential distribution, etc.).

Step 3: From the generated data in Step 2, we compute the estimators ŷ_d and ȳ_c(2).

Step 4: Fixing the values of W_1, W_2, μ_1, μ_2, σ_1², σ_2² and n_c (hence μ = W_1 μ_1 + W_2 μ_2 is also fixed), we repeat the previous three steps 2000 times.

Step 5: Now, we have obtained 2000 values for each estimator, say ŷ_d(i) and ȳ_c(2)(i), i = 1, 2, …, 2000. The Monte Carlo estimate of the mean-squared error of ŷ_d is

    M̂SE(ŷ_d) = (1/2000) Σ_{i=1}^{2000} [ŷ_d(i) − μ]².
The Monte Carlo estimate of the mean-squared error of ȳ_c(2) is calculated in the same way.

Step 6: The computation of the Monte Carlo estimate of MSE(ŷ_s) is basically the same as the above procedure, except that
(a) in Step 1, we start with the initial sample size n. If both subsample sizes are at least the specified minimum subsample sizes, we stop and use ŷ_p. Otherwise, we keep generating Uniform(0, 1) random observations until both the minimum subsample sizes m_1 and m_2 are attained. Then, the final sample size n_s is determined.
(b) in Step 2, the subsample sizes n_1c and n_2c are replaced by n_1s and n_2s, respectively, where n_1s ≥ m_1 and n_2s ≥ m_2.
(c) in Step 4, the sample size n_c is replaced by n_s, which may differ in each of the 2000 repetitions because it is random.
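The whole loop for ŷ_s can be condensed into a short self-contained sketch (ours, with Python's generators in place of the IMSL routines; the strata are taken normal):

```python
import random

def mc_mse_ys(n, mins, W, mus, sigmas, reps=20000, seed=7):
    """Monte Carlo estimate of MSE(y_s) under multiple inverse
    sampling, mirroring Steps 1-6 (a sketch under our own naming)."""
    rng = random.Random(seed)
    L = len(W)
    mu = sum(W[h] * mus[h] for h in range(L))
    sq = 0.0
    for _ in range(reps):
        counts, sums = [0] * L, [0.0] * L
        def one():
            u, h, acc = rng.random(), 0, W[0]
            while u > acc:           # classify a Uniform(0,1) draw by W
                h += 1
                acc += W[h]
            counts[h] += 1
            sums[h] += rng.gauss(mus[h], sigmas[h])
        for _ in range(n):
            one()
        while any(counts[h] < mins[h] for h in range(L)):
            one()                    # inverse-sampling continuation
        ys = sum(W[h] * sums[h] / counts[h] for h in range(L))
        sq += (ys - mu) ** 2
    return sq / reps
```

Because ŷ_s is unbiased and its variance does not involve the μ_h, the returned value is unchanged (up to Monte Carlo noise) when μ_1 and μ_2 are shifted, which is the behaviour Tables 1-3 exhibit.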
The Monte Carlo simulation described above is carried out under various parameter values:

    n = 2(1)30,
    W_1 = 0.05(0.05)0.95 (W_2 = 1 − W_1),
    μ_1, μ_2 = 0(±1)±200,
    σ_1, σ_2 = 1(1)10 (hence σ_1², σ_2² = 1, 4, 9, …, 100).

All possible combinations of the above parameter values, together with various strata distributions (e.g., normal, uniform, exponential, Weibull, etc.), are considered in the simulation. In the following, we present a few typical results for the case that both strata are normally distributed. Tables 1-3 give the Monte Carlo estimates, denoted by M̂SE, of the MSEs of ŷ_s, ŷ_d and ȳ_c(2) under different parameter values. In Table 1, n = 2, W_1 = 0.5 and m_1 = m_2 = 1; hence n_c = 3 by Eq. (3.3). In Table 2, n = 5, W_1 = 0.8, m_1 = m_2 = 2; thus n_c = 11 by Eq. (3.1). In Table 3, n = 20, W_1 = 0.95, m_1 = 3 and m_2 = 2; hence n_c = 42. In all tables, MSE(ŷ_s) in the 4th column represents the MSE of ŷ_s computed from the approximation formula (2.4), and MSE[ȳ_c(2)] in the 6th column represents the MSE of ȳ_c(2) computed from formula (4.3). We see that the values of MSE and M̂SE are close in all tables, which confirms the consistency between our simulation and analytical results. We can also see that M̂SE(ŷ_s), apart from some negligible simulation error, is the same for given values of n, W_1, W_2, σ_1² and σ_2², regardless of the values of μ_1 and μ_2. This means that the value of MSE(ŷ_s) is independent of the values of μ_1 and μ_2; this independence does not hold for the other estimators in the tables, which agrees with our analytical conclusion in this section. MSE(ŷ_s) is much smaller than the MSEs of the other estimators when |μ_1 − μ_2| is large and σ_1², σ_2² are relatively small. This phenomenon also holds for other distributions, which are not presented here to save space.
One concern in using ŷ_s is that the sample size n_s may become very large. The maximum value of n_s in the 2000 repetitions, denoted by max{n_s}, is given in the
Table 1
Mean-squared errors of estimators
(L = 2, n = 2, m_1 = m_2 = 1, W_1 = 0.5, n_c = 3, μ_1 = 0, max{n_s} = 12)

μ_2  σ_1²  σ_2²  MSE(ŷ_s)  M̂SE(ŷ_s)  MSE[ȳ_c(2)]  M̂SE[ȳ_c(2)]  MSE(ŷ_d)  M̂SE(ŷ_d)
10   1     1     0.48      0.40      6.61         6.61         6.61      6.61
10   1     9     2.41      2.19      8.07         8.10         8.07      8.10
10   1     25    6.26      5.43      10.99        11.04        10.99     11.04
10   9     1     2.41      2.12      8.07         7.96         8.07      7.96
10   9     9     4.33      3.91      9.53         9.49         9.53      9.49
10   9     25    8.19      7.11      12.45        12.46        12.45     12.46
10   25    1     6.26      5.38      10.99        10.79        10.99     10.79
10   25    9     8.19      7.08      12.45        12.35        12.45     12.35
10   25    25    12.04     10.94     15.36        15.36        15.36     15.36
30   1     1     0.48      0.42      56.61        56.89        56.61     56.89
30   1     9     2.41      2.17      58.07        58.42        58.07     58.42
30   1     25    6.26      5.29      60.99        61.40        60.99     61.40
30   9     1     2.41      2.04      58.07        57.95        58.07     57.95
30   9     9     4.33      4.03      59.53        59.52        59.53     59.52
30   9     25    8.19      6.86      62.45        62.54        62.45     62.54
30   25    1     6.26      5.90      60.99        60.49        60.99     60.49
30   25    9     8.19      7.19      62.45        62.10        62.45     62.10
30   25    25    12.04     10.36     65.36        65.15        65.36     65.15
Table 2
Mean-squared errors of estimators
(L = 2, n = 5, m_1 = m_2 = 2, W_1 = 0.8, n_c = 11, μ_1 = −10, max{n_s} = 43)

μ_2  σ_1²  σ_2²  MSE(ŷ_s)  M̂SE(ŷ_s)  MSE[ȳ_c(2)]  M̂SE[ȳ_c(2)]  M̂SE(ŷ_d)
10   1     1     0.14      0.14      1.47         1.55         1.63
10   1     9     0.30      0.31      1.63         1.70         1.83
10   1     25    0.61      0.61      1.95         2.01         2.02
10   9     1     1.09      1.12      2.09         2.12         2.24
10   9     9     1.25      1.29      2.25         2.28         2.29
10   9     25    1.56      1.61      2.57         2.60         2.79
10   25    1     2.99      3.07      3.33         3.31         3.38
10   25    9     3.15      3.25      3.49         3.48         3.63
10   25    25    3.47      3.57      3.81         3.80         4.09
30   1     1     0.14      0.15      5.59         5.94         5.77
30   1     9     0.30      0.30      5.75         6.10         6.22
30   1     25    0.61      0.61      6.07         6.40         6.69
30   9     1     1.09      1.10      6.21         6.48         6.74
30   9     9     1.25      1.28      6.37         6.64         6.86
30   9     25    1.56      1.60      6.69         6.95         7.54
30   25    1     2.99      3.04      7.45         7.62         7.85
30   25    9     3.15      3.21      7.61         7.79         8.34
30   25    25    3.47      3.58      7.93         8.11         8.43
headline of each table as a reference, so that users have an idea of the possible extreme sample size.

When W_1 = W_2, ŷ_d becomes unbiased and, therefore, MSE(ŷ_d) = V(ŷ_d). The 8th column in Table 1 gives the MSE of ŷ_d calculated from Eq. (4.1), denoted by MSE(ŷ_d),
Table 3
Mean-squared errors of estimators
(L = 2, n = 20, m_1 = 3, m_2 = 2, W_1 = 0.95, n_c = 42, μ_1 = 20, max{n_s} = 224)

μ_2  σ_1²  σ_2²  MSE(ŷ_s)  M̂SE(ŷ_s)  MSE[ȳ_c(2)]  M̂SE[ȳ_c(2)]  M̂SE(ŷ_d)
50   1     1     0.03      0.03      0.30         0.30         0.33
50   1     9     0.04      0.04      0.30         0.28         0.32
50   1     25    0.06      0.06      0.31         0.30         0.34
50   9     1     0.29      0.28      0.47         0.45         0.48
50   9     9     0.30      0.29      0.48         0.46         0.49
50   9     25    0.32      0.31      0.50         0.48         0.51
50   25    1     0.81      0.76      0.84         0.81         0.83
50   25    9     0.82      0.77      0.85         0.82         0.85
50   25    25    0.84      0.80      0.86         0.84         0.87
200  1     1     0.03      0.03      9.91         9.83         11.06
200  1     9     0.04      0.04      9.43         9.15         10.26
200  1     25    0.06      0.06      9.45         9.17         10.28
200  9     1     0.29      0.27      9.60         9.25         10.34
200  9     9     0.30      0.29      9.61         9.26         10.36
200  9     25    0.32      0.31      9.63         9.28         10.39
200  25    1     0.81      0.80      9.97         9.55         10.63
200  25    9     0.82      0.82      9.98         9.56         10.65
200  25    25    0.84      0.83      10.00        9.58         10.68
for this special situation. We can also see that MSE[ȳ_c(2)] coincides with MSE(ŷ_d) in Table 1 because W_1 = W_2.
5. Populations with more than two strata
We now consider the case L ≥ 3 for comparing ŷ_s, ŷ_d and ȳ_c(L). The exact formula for computing E(n_s) is given for the case L = 3 and m_1 = m_2 = m_3 = 1. We have the following lemma:

Lemma 5.1. If L = 3 and m_h = 1, h = 1, 2, 3, then
    E(n_s) = n + (1/W_1)[(W_2 + W_3)^n − W_2^n − W_3^n]
               + (1/W_2)[(W_1 + W_3)^n − W_1^n − W_3^n]
               + (1/W_3)[(W_1 + W_2)^n − W_1^n − W_2^n]
               + [W_1^n/(1 − W_1)] (1 + W_2/W_3 + W_3/W_2)
               + [W_2^n/(1 − W_2)] (1 + W_1/W_3 + W_3/W_1)
               + [W_3^n/(1 − W_3)] (1 + W_1/W_2 + W_2/W_1).            (5.1)
Proof. Define the following seven mutually exclusive and exhaustive events:

    A_1 = [n_1 n_2 n_3 ≠ 0],
    A_2h = [n_h = 0, n_i ≠ 0 for i ≠ h],   h = 1, 2, 3,
    A_3h = [n_h = n],                       h = 1, 2, 3.
Then we have

    E(n_s) = E(n_s | A_1) P(A_1) + Σ_{h=1}^{3} E(n_s | A_2h) P(A_2h) + Σ_{h=1}^{3} E(n_s | A_3h) P(A_3h).    (5.2)

It is easy to see that

    E(n_s | A_1) = n,                                        (5.3)
    E(n_s | A_2h) = n + 1/W_h,   h = 1, 2, 3.                (5.4)

To compute E(n_s | A_3h), typically E(n_s | A_31), let B_h be the event that the (n + 1)th observation falls into the hth stratum, h = 1, 2, 3; then we have

    E(n_s | A_31) = Σ_{h=1}^{3} E(n_s | A_31 ∩ B_h) P(B_h | A_31)
                  = [E(n_s | A_31) + 1] W_1 + [n + 1 + E(Z_3)] W_2 + [n + 1 + E(Z_2)] W_3,

where Z_2 and Z_3 are geometrically distributed random variables with parameters W_2 and W_3, respectively, so that E(Z_2) = 1/W_2 and E(Z_3) = 1/W_3. Solving the above equation for E(n_s | A_31), we obtain

    E(n_s | A_31) = n + [1/(1 − W_1)] (1 + W_2/W_3 + W_3/W_2).    (5.5)

Similarly, we can get E(n_s | A_32) and E(n_s | A_33). Also, the probabilities of the events A_2h and A_3h can be computed directly, e.g.,

    P(A_21) = (W_2 + W_3)^n − W_2^n − W_3^n,                 (5.6)
    P(A_31) = W_1^n.                                         (5.7)

Combining Eqs. (5.2)-(5.7), we obtain Eq. (5.1). This completes the proof.  □
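Eq. (5.1) is again easy to evaluate; the sketch below (our transcription) reproduces the common sample size n_c = [E(n_s)] + 1 = 14 quoted in the headline of Table 4 (n = 10, W = (0.6, 0.3, 0.1)):

```python
def E_ns_three_strata(n, W):
    """Exact E(n_s) for L = 3, m_h = 1, from Eq. (5.1) (our transcription)."""
    W1, W2, W3 = W
    e = n
    e += ((W2 + W3) ** n - W2 ** n - W3 ** n) / W1   # events A_21
    e += ((W1 + W3) ** n - W1 ** n - W3 ** n) / W2   # events A_22
    e += ((W1 + W2) ** n - W1 ** n - W2 ** n) / W3   # events A_23
    e += W1 ** n / (1 - W1) * (1 + W2 / W3 + W3 / W2)   # event A_31
    e += W2 ** n / (1 - W2) * (1 + W1 / W3 + W3 / W1)   # event A_32
    e += W3 ** n / (1 - W3) * (1 + W1 / W2 + W2 / W1)   # event A_33
    return e
```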
When L ≥ 4, or when L = 3 and m_h > 1 for some h, we compute E(n_s) by simulation, again with 2000 repetitions, for the comparison of ŷ_s, ŷ_d and ȳ_c(L).

Now, let us consider the MSE. When L = 3, the exact formula for MSE[ȳ_c(3)] is given in the next lemma.
Lemma 5.2. When L = 3, the MSE of ȳ_c(3) in Eq. (1.2) is

    MSE[ȳ_c(3)] = Σ_{k_1=1}^{n−2} Σ_{k_2=1}^{n−k_1−1} [n!/(k_1! k_2! (n − k_1 − k_2)!)] W_1^{k_1} W_2^{k_2} W_3^{n−k_1−k_2}
                    × [W_1² σ_1²/k_1 + W_2² σ_2²/k_2 + W_3² σ_3²/(n − k_1 − k_2)]
                  + Σ_{h=1}^{3} W_h^n [σ_h²/n + (μ_h − μ)²]
                  + Σ_{k_2=1}^{n−1} C(n, k_2) W_2^{k_2} W_3^{n−k_2}
                    × {(W_1 + W_2)² σ_2²/k_2 + W_3² σ_3²/(n − k_2) + [(W_1 + W_2) μ_2 + W_3 μ_3 − μ]²}
                  + Σ_{k_1=1}^{n−1} C(n, k_1) W_1^{k_1} W_3^{n−k_1}
                    × {(W_1 + W_2)² σ_1²/k_1 + W_3² σ_3²/(n − k_1) + [(W_1 + W_2) μ_1 + W_3 μ_3 − μ]²}
                  + Σ_{k_1=1}^{n−1} C(n, k_1) W_1^{k_1} W_2^{n−k_1}
                    × {W_1² σ_1²/k_1 + (W_2 + W_3)² σ_2²/(n − k_1) + [W_1 μ_1 + (W_2 + W_3) μ_2 − μ]²}.    (5.8)
Proof.

    MSE[ȳ_c(3)] = E{[ȳ_c(3) − μ]²}
                = E[(ŷ_p − μ)² | n_1 n_2 n_3 ≠ 0] P(n_1 n_2 n_3 ≠ 0)
                  + Σ_{h=1}^{3} E[(ȳ_h − μ)² | n_h = n] P(n_h = n)
                  + E{[(W_1 + W_2) ȳ_2 + W_3 ȳ_3 − μ]² | n_1 = 0 and n_2 n_3 ≠ 0} P(n_1 = 0 and n_2 n_3 ≠ 0)
                  + E{[(W_1 + W_2) ȳ_1 + W_3 ȳ_3 − μ]² | n_2 = 0 and n_1 n_3 ≠ 0} P(n_2 = 0 and n_1 n_3 ≠ 0)
                  + E{[W_1 ȳ_1 + (W_2 + W_3) ȳ_2 − μ]² | n_3 = 0 and n_1 n_2 ≠ 0} P(n_3 = 0 and n_1 n_2 ≠ 0).

Write the first term as a sum over n_1 = k_1 and n_2 = k_2 with

    P(n_1 = k_1, n_2 = k_2) = [n!/(k_1! k_2! (n − k_1 − k_2)!)] W_1^{k_1} W_2^{k_2} W_3^{n−k_1−k_2},

and the last three terms as sums over the sizes of the nonempty strata, e.g. P(n_1 = 0 and n_2 = k_2) = C(n, k_2) W_2^{k_2} W_3^{n−k_2}. It remains to evaluate the conditional expectations. For example,

    E{[(W_1 + W_2) ȳ_2 + W_3 ȳ_3 − μ]² | n_1 = 0 and n_2 = k_2}
      = (W_1 + W_2)² (σ_2²/k_2 + μ_2²) + 2 W_3 (W_1 + W_2) μ_2 μ_3
        + W_3² (σ_3²/(n − k_2) + μ_3²) − 2μ [(W_1 + W_2) μ_2 + W_3 μ_3] + μ²
      = (W_1 + W_2)² σ_2²/k_2 + W_3² σ_3²/(n − k_2) + [(W_1 + W_2) μ_2 + W_3 μ_3 − μ]².

The terms

    E{[(W_1 + W_2) ȳ_1 + W_3 ȳ_3 − μ]² | n_1 = k_1 and n_2 = 0}

and

    E{[W_1 ȳ_1 + (W_2 + W_3) ȳ_2 − μ]² | n_1 = k_1 and n_3 = 0}

can be handled in a similar way; also,

    E[(ŷ_p − μ)² | n_1 = k_1, n_2 = k_2] = W_1² σ_1²/k_1 + W_2² σ_2²/k_2 + W_3² σ_3²/(n − k_1 − k_2),
    E[(ȳ_h − μ)² | n_h = n] = σ_h²/n + (μ_h − μ)².

Substituting these into the decomposition above gives Eq. (5.8).  □
From Eq. (5.8), we see that MSE[ȳ_c(3)] is an increasing function of the stratum mean differences |μ_1 − μ_2| and |μ_2 − μ_3|. This undesirable property of ȳ_c(L) also holds when L ≥ 4. However, we will not derive the exact formulae for ȳ_c(L) when L ≥ 4, as they are tedious.

When L ≥ 3, the MSE of ŷ_d also depends on the stratum mean differences, which can be seen from the formula for V(ŷ_d) in Doss et al. (1979). In the following, we present two typical simulation results for the three-strata case and one result for the four-strata case with normal distributions in Tables 4-6, respectively. The parameter values are given in the headline of each table. The Monte Carlo estimate of the MSE of each estimator is denoted by M̂SE, as in Section 4. The 6th column in Tables 4 and 5, and the 8th column in Table 6, headed MSE(ŷ_s), give the MSE of ŷ_s computed from the exact formula (2.2), in which the term E(1/n_hs) is also calculated by simulation with 2000 repetitions. We see that the values of MSE(ŷ_s) and M̂SE(ŷ_s) are close, which confirms the consistency between our simulation and the exact formula (2.2). The 7th column in Tables 4 and 5, headed MSE[ȳ_c(3)], gives the MSE of ȳ_c(3) computed from the exact formula (5.8). We see that the values of MSE[ȳ_c(3)] and M̂SE[ȳ_c(3)] are also close. It is also apparent that the value of MSE(ŷ_s) does not depend on the values of μ_1, …, μ_L. Again, MSE(ŷ_s) is much smaller than M̂SE[ȳ_c(L)] and M̂SE(ŷ_d) when the differences among μ_1, …, μ_L are large and σ_1², …, σ_L² are relatively small, e.g., when μ_1 = 10, μ_2 = 30, μ_3 = 80, μ_4 = 200 and σ_1² = ⋯ = σ_4² = 25 in Table 6.
Table 4
Mean-squared errors of estimators
(L = 3, n = 10, m_1 = m_2 = m_3 = 1, W_1 = 0.6, W_2 = 0.3, W_3 = 0.1, n_c = 14, μ_1 = 10, max{n_s} = 63)

μ_2  μ_3  σ_1²  σ_2²  σ_3²  MSE(ŷ_s)  M̂SE(ŷ_s)  MSE[ȳ_c(3)]  M̂SE[ȳ_c(3)]  M̂SE(ŷ_d)
30   80   1     1     1     0.10      0.10      6.13         6.28         11.63
30   80   1     25    81    1.60      1.57      7.29         7.54         13.63
30   80   1     81    25    3.04      2.99      8.69         9.09         14.28
30   80   25    25    25    2.52      2.51      8.08         8.41         13.80
30   80   25    1     81    2.17      2.21      7.65         7.83         13.40
30   80   25    81    1     4.23      4.21      9.66         10.11        15.30
30   80   81    25    81    6.24      6.24      10.94        11.34        16.91
30   80   81    81    25    7.68      7.63      12.34        12.88        18.54
30   80   81    81    81    8.16      8.12      12.64        13.16        18.56
50   150  1     1     1     0.10      0.10      24.27        24.81        46.15
50   150  1     25    81    1.60      1.58      25.43        26.22        47.71
50   150  1     81    25    3.04      3.01      26.84        27.92        49.18
50   150  25    25    25    2.52      2.50      26.23        27.09        48.55
50   150  25    1     81    2.17      2.20      25.80        26.37        47.93
50   150  25    81    1     4.23      4.20      27.81        28.94        50.27
50   150  81    25    81    6.24      6.23      29.08        30.03        51.69
50   150  81    81    25    7.68      7.66      30.49        31.71        53.14
50   150  81    81    81    8.16      8.15      30.79        31.99        53.54
Table 5
Mean-squared errors of estimators
(L = 3, n = 20, m_1 = 2, m_2 = 3, m_3 = 2, W_1 = 0.15, W_2 = 0.75, W_3 = 0.10, n_c = 27, μ_1 = 10, max{n_s} = 92)

μ_2  μ_3  σ_1²  σ_2²  σ_3²  MSE(ŷ_s)  M̂SE(ŷ_s)  MSE[ȳ_c(3)]  M̂SE[ȳ_c(3)]  M̂SE(ŷ_d)
40   80   1     1     1     0.04      0.04      1.21         1.34         1.81
40   80   1     25    81    1.17      1.20      2.25         2.45         2.97
40   80   1     81    25    2.80      2.88      3.61         3.83         4.30
40   80   25    25    25    1.12      1.16      2.17         2.35         2.83
40   80   25    1     81    0.55      0.56      1.73         1.87         2.40
40   80   25    81    1     2.87      2.98      3.67         3.85         4.30
40   80   81    25    81    1.76      1.82      2.81         3.00         3.53
40   80   81    81    25    3.38      3.52      4.17         4.36         4.84
40   80   81    81    81    3.62      3.76      4.42         4.64         5.15
60   150  1     1     1     0.04      0.04      5.39         5.94         7.75
60   150  1     25    81    1.17      1.18      6.43         7.11         9.01
60   150  1     81    25    2.80      2.86      7.79         8.54         10.39
60   150  25    25    25    1.12      1.17      6.36         6.99         8.83
60   150  25    1     81    0.55      0.57      5.91         6.47         8.35
60   150  25    81    1     2.87      2.95      7.86         8.53         10.35
60   150  81    25    81    1.76      1.81      7.00         7.63         9.52
60   150  81    81    25    3.38      3.48      8.36         9.04         10.88
60   150  81    81    81    3.62      3.75      8.60         9.33         11.21
Table 6
Mean-squared errors of estimators
(L = 4, n = 30, m_1 = ⋯ = m_4 = 2, W_1 = 0.30, W_2 = 0.45, W_3 = 0.15, W_4 = 0.10, n_c = 33, μ_1 = 10, max{n_s} = 90)

μ_2  μ_3  μ_4  σ_1²  σ_2²  σ_3²  σ_4²  MSE(ŷ_s)  M̂SE(ŷ_s)  M̂SE[ȳ_c(4)]  M̂SE(ŷ_d)
20   40   70   1     1     1     1     0.03      0.03      0.31         0.74
20   40   70   1     25    25    81    0.81      0.79      1.05         1.50
20   40   70   1     81    81    25    1.76      1.70      1.92         2.33
20   40   70   25    25    25    25    0.86      0.86      1.09         1.52
20   40   70   25    1     81    81    1.02      1.02      1.30         1.73
20   40   70   25    81    81    1     1.92      1.88      2.08         2.48
20   40   70   81    25    25    81    1.65      1.71      1.86         2.30
20   40   70   81    81    25    25    2.28      2.31      2.42         2.84
20   40   70   81    81    81    81    2.79      2.80      2.94         3.35
30   80   200  1     1     1     1     0.03      0.03      4.13         7.97
30   80   200  1     25    25    81    0.81      0.79      4.82         8.68
30   80   200  1     81    81    25    1.76      1.72      5.64         9.45
30   80   200  25    25    25    25    0.86      0.86      4.85         8.68
30   80   200  25    1     81    81    1.02      1.05      5.06         8.89
30   80   200  25    81    81    1     1.92      1.90      5.79         9.58
30   80   200  81    25    25    81    1.65      1.69      5.61         9.46
30   80   200  81    81    25    25    2.28      2.30      6.15         9.97
30   80   200  81    81    81    81    2.79      2.82      6.64         10.45
6. Conclusions

In this paper, we use the multiple inverse sampling scheme for small-sample post-stratification. The estimator ŷ_s under this scheme is better than other existing estimators in the sense that MSE(ŷ_s) is independent of the values of the stratum means. Finally, we would like to point out that the double inverse sampling in Sections 3 and 4 is somewhat different from that in Tu and Han (1982), in that our initial sample size n is at least 2, whereas n starts from zero in theirs.
Acknowledgements
We would like to thank a referee for very helpful comments and suggestions.
References
Cochran, W.G., 1977. Sampling Techniques, 3rd ed. Wiley, New York.
Doss, D.C., Hartley, H.O., Somayajulu, G.R., 1979. An exact small sample theory for post-stratification. J.
Statist. Plann. Inference 3, 235-248.
Scheaffer, R.L., Mendenhall, W., Ott, L., 1990. Elementary Survey Sampling. Duxbury Press, Boston.
Thompson, S.K., 1992. Sampling. Wiley, New York.
Tu, C.T., Han, C.P., 1982. Discriminant analysis based on binary and continuous variables. J. Amer. Statist. Assoc., 447-454.
Woodroofe, M., 1982. Nonlinear Renewal Theory in Sequential Analysis. Society for Industrial and Applied
Mathematics, Philadelphia.