LIMIT THEOREMS FOR AN EXTENDED COUPON COLLECTOR'S PROBLEM
AND FOR SUCCESSIVE SUBSAMPLING WITH VARYING PROBABILITIES*

by

Pranab Kumar Sen
Department of Biostatistics
University of North Carolina, Chapel Hill

Institute of Statistics Mimeo Series No. 1235

June 1979

LIMIT THEOREMS FOR AN EXTENDED COUPON COLLECTOR'S PROBLEM AND FOR
SUCCESSIVE SUBSAMPLING WITH VARYING PROBABILITIES*

PRANAB KUMAR SEN
University of North Carolina, Chapel Hill, N.C.
ABSTRACT

Asymptotic normality as well as some weak invariance principles for bonus sums and waiting times in an extended coupon collector's problem are considered and incorporated in the study of the asymptotic distribution theory of estimators of (finite) population totals in successive sub-sampling (or multistage sampling) with varying probabilities (without replacement). Some applications of these theorems are also considered.
AMS Subject Classification Nos: 60F05, 62E20.

Key Words & Phrases: Asymptotic normality, bonus sum, extended coupon collector's problem, invariance principles, successive sub-sampling, tightness, varying probabilities, waiting times.

*Work partially supported by the National Heart, Lung and Blood Institute, Contract No. NIH-NHLBI-71-2243 from the National Institutes of Health.
1. INTRODUCTION

Let $a_1, \ldots, a_N$ be the variate values of the $N$ items of a finite population and let $p_1, \ldots, p_N$ be positive numbers such that $\sum_{s=1}^N p_s = 1$. Consider a successive sampling scheme where items are sampled one after the other (without replacement) in such a way that at each draw the probability of drawing item $s$ is proportional to $p_s$ if item $s$ has not already appeared in the earlier draws. According to the usual terminology [viz., Hájek (1964) and Sukhatme and Sukhatme (1970)], we term this successive sampling with varying probabilities (without replacement) (SSVPWR). Let $\pi(s, n)$ be the probability that item $s$ is included in a sample of size $n$ in this SSVPWR scheme, and let $S(n) = \{s:\ \text{item}\ s \in \text{sample of size}\ n\}$. Then, for the estimation of the population total $T = \sum_{s=1}^N a_s$, the Horvitz-Thompson estimator is

$$T_n = \sum_{s=1}^{N} w_{ns}\, a_s \big/ \pi(s, n), \eqno(1.1)$$

where

$$w_{ns} = \begin{cases} 1, & \text{item}\ s \in \text{sample of size}\ n; \\ 0, & \text{otherwise,} \end{cases} \qquad \text{for}\ 1 \le s \le N. \eqno(1.2)$$
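As a concrete illustration (ours, not the paper's: the function names and toy parameter values below are hypothetical), the following Python/NumPy sketch draws one SSVPWR sample, approximates the inclusion probabilities $\pi(s, n)$ by Monte Carlo repetition (they have no simple closed form), and evaluates the estimator (1.1).

```python
# Illustrative sketch: one SSVPWR draw and the Horvitz-Thompson estimator (1.1).
import numpy as np

rng = np.random.default_rng(1979)

def ssvpwr_sample(p, n, rng):
    """Draw n items without replacement; at each draw, item s is picked with
    probability proportional to p[s] among the items not yet drawn."""
    remaining = np.arange(len(p))
    sample = []
    for _ in range(n):
        w = p[remaining] / p[remaining].sum()
        pick = rng.choice(len(remaining), p=w)
        sample.append(remaining[pick])
        remaining = np.delete(remaining, pick)
    return np.array(sample)

N, n = 50, 15
a = rng.uniform(1.0, 3.0, size=N)                 # variate values a_1, ..., a_N
p = rng.uniform(0.5, 1.5, size=N)
p /= p.sum()                                      # drawing probabilities, sum to 1

reps = 2000                                       # Monte Carlo estimate of pi(s, n)
incl = np.zeros(N)
for _ in range(reps):
    incl[ssvpwr_sample(p, n, rng)] += 1
pi_hat = incl / reps

S_n = ssvpwr_sample(p, n, rng)                    # one realized sample S(n)
T_hat = (a[S_n] / pi_hat[S_n]).sum()              # Horvitz-Thompson estimator (1.1)
print(T_hat, a.sum())                             # estimate vs. population total T
```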
The varying probability structure introduces certain complications in the study of the asymptotic distribution theory of $T_n$. Rosén (1970, 1972) considered an alternative approach (through the coupon collector's problem) and provided some deeper results in this context. Consider a sequence $\{J_k,\ k \ge 1\}$ of independent and identically distributed random variables (i.i.d.r.v.), where

$$P\{J_k = s\} = p_s \qquad \text{for}\ s = 1, \ldots, N\ \text{and}\ k \ge 1. \eqno(1.3)$$
Let then

$$Y_{nk} = \begin{cases} a_{J_k}\big/\pi(J_k, n), & J_k \ne J_r,\ \forall\, r < k, \\ 0, & \text{otherwise,} \end{cases} \qquad k \ge 1; \eqno(1.4)$$

$$\nu_k = \inf\{m:\ \text{number of distinct}\ J_1, \ldots, J_m = k\}, \qquad k \ge 1. \eqno(1.5)$$

Then, Rosén (1970, 1972) has shown that

$$T_n \overset{\mathcal{D}}{=} \sum_{k=1}^{\nu_n} Y_{nk} = B_{n\nu_n}, \eqno(1.6)$$

where $\overset{\mathcal{D}}{=}$ stands for the equality of distributions and, for every $m \ge 1$, $B_{nm} = \sum_{k=1}^m Y_{nk}$ is the bonus sum at the $m$th stage in a coupon collector's problem with the set $\{(a^*_{ns}, p_s),\ 1 \le s \le N\}$, where $a^*_{ns} = a_s/\pi(s, n)$, $1 \le s \le N$.
Thus, the asymptotic normality of (randomly stopped) bonus sums provides the same for $T_n$. A similar treatment holds for many other related estimators. Recently, Sen (1979) has formulated a martingale approach to this problem and established some invariance principles under no extra regularity conditions.
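A minimal sketch of the representation (1.6), reusing `a`, `p`, `n` and `pi_hat` from the previous block: the coupon collector is run until $n$ distinct coupons appear, and first occurrences collect the scores $a^*_{ns}$. The Monte Carlo law of the returned bonus sum should agree with that of $T_n$.

```python
# Sketch of Rosen's representation (1.6): i.i.d. labels J_1, J_2, ... with
# P{J_k = s} = p_s are drawn until n distinct labels have appeared (stage
# nu_n); each first occurrence of s adds the score a*_{ns} = a_s / pi(s, n).
def bonus_sum_at_nu_n(a_star, p, n, rng):
    seen, bonus = set(), 0.0
    while len(seen) < n:
        s = rng.choice(len(p), p=p)      # coupon J_k
        if s not in seen:                # first occurrence: collect the bonus
            seen.add(s)
            bonus += a_star[s]
    return bonus

a_star = a / pi_hat                      # scores a*_{ns} of the bonus-sum problem
B = [bonus_sum_at_nu_n(a_star, p, n, rng) for _ in range(2000)]
print(np.mean(B), np.var(B))             # should match the law of T_n in (1.6)
```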
Sub-sampling (or multi-stage sampling) is often adopted in practice and has a great variety of applications [viz., Cochran (1963, Ch. 10)]. Here, each of the $N$ items of the population (called the primary units) is composed of a number of smaller units (sub-units), and a common practice is to select first a sample of $n$ primary units and then to use subsamples of sub-units in each of the selected primary units. Suppose that the $s$th primary unit has $M_s$ sub-units with variate values $b_{sj}$, $j = 1, \ldots, M_s$, so that

$$a_s = b_{s1} + \cdots + b_{sM_s} \qquad \text{for}\ s = 1, \ldots, N. \eqno(1.7)$$

For each $s$, we conceive of a set $\{p_{sj},\ 1 \le j \le M_s\}$ of positive numbers (such that $\sum_{j=1}^{M_s} p_{sj} = 1$) and consider a SSVPWR scheme, where $m_s$ (out of $M_s$) sub-units are chosen. Then, an analogous estimator of $a_s$ is
$$\hat a_s = \sum_{j=1}^{M_s} w^*_{sj}\, b_{sj} \big/ \pi^*_s(j, m_s), \eqno(1.8)$$

where

$$w^*_{sj} = \begin{cases} 1, & j\text{th sub-unit} \in \text{sub-sample of}\ m_s\ \text{from item}\ s, \\ 0, & \text{otherwise,} \end{cases} \qquad \text{for}\ j = 1, \ldots, M_s, \eqno(1.9)$$

and $\pi^*_s(j, m)$ is the probability that the $j$th sub-unit belongs to the sub-sample of $m$ sub-units from item $s$, for $1 \le j \le M_s$ and $1 \le s \le N$. From (1.1) and (1.8), one may consider the following estimator of $T$:

$$\hat T_n = \sum_{s \in S(n)} \hat a_s \big/ \pi(s, n). \eqno(1.10)$$
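The two-stage construction (1.8)-(1.10) can be sketched the same way. In the toy setup below (our illustration; the sub-unit values and sizes are fabricated for the example), each selected primary unit is sub-sampled by SSVPWR, its total is estimated as in (1.8), and the results are combined as in (1.10).

```python
# Two-stage sketch for (1.8)-(1.10); the sub-unit structure is a toy example.
# Sub-unit inclusion probabilities pi*_s(j, m) are Monte Carlo approximations.
def ht_estimate(values, probs, m, rng, reps=1000):
    """HT estimate (1.8) of sum(values) from one SSVPWR sample of size m."""
    incl = np.zeros(len(values))
    for _ in range(reps):
        incl[ssvpwr_sample(probs, m, rng)] += 1
    pi_star = incl / reps                        # approximate pi*_s(j, m)
    sub = ssvpwr_sample(probs, m, rng)           # the realized sub-sample
    return (values[sub] / pi_star[sub]).sum()

M_s = rng.integers(4, 9, size=N)                 # unit s has M_s sub-units
b = [rng.dirichlet(np.ones(int(M))) * a[s] for s, M in enumerate(M_s)]

a_hat = np.zeros(N)
for s in S_n:                                    # only the selected primary units
    p_sub = np.ones(int(M_s[s])) / M_s[s]        # equal sub-unit probabilities
    a_hat[s] = ht_estimate(b[s], p_sub, m=3, rng=rng)

T_hat_sub = (a_hat[S_n] / pi_hat[S_n]).sum()     # the estimator (1.10)
print(T_hat_sub, a.sum())
```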
The procedure can be extended to the multi-stage case in a natural way. The object of the present investigation is to study the asymptotic distribution theory of estimators of $T$ in successive sub-sampling (SSS)VPWR schemes. Note that for each $s$, analogous to (1.5) and (1.6), one can define a stopping number $\nu^{(s)}_{m_s}$ and bonus sums $B^*_{s,m}$, $m \ge 1$, such that $\hat a_s \overset{\mathcal{D}}{=} B^*_{s,\nu^{(s)}_{m_s}}$, for every $s\ (= 1, \ldots, N)$. But the multitude of the stopping numbers $\{\nu_n,\ \nu^{(s)}_{m_s},\ s \in S(n)\}$ introduces complications in a direct extension of the Rosén approach to SSSVPWR. On the other hand, the martingale approach of Sen (1979) can more readily be extended to subsampling schemes, and this will be systematically explored here.

In Section 2, we start with an extended coupon collector's problem and present some invariance principles relating to bonus sums and waiting times; the proofs of these theorems are considered in Section 3. Section 4 deals with the limit theorems for SSSVPWR. The concluding section is devoted to some general remarks and specific applications.
2. INVARIANCE PRINCIPLES FOR AN EXTENDED COUPON COLLECTOR'S PROBLEM

Let us consider a sequence $\{\Pi_N,\ N \ge 1\}$ of coupon collector's situations

$$\Pi_N = \{(a_N(s), p_N(s)),\ 1 \le s \le N\}, \eqno(2.1)$$

where the $a_N(s)$ are real numbers and the $p_N(s)$ are positive numbers with $\sum_{s=1}^N p_N(s) = 1$, for all $N \ge 1$. Let us now consider a triangular array $\{X_N(s),\ 1 \le s \le N;\ N \ge 1\}$ of row-wise independent random variables, where $X_N(s)$ is defined on a probability space $(\Omega_{Ns}, \mathcal{B}_{Ns}, P_{Ns})$, for $s = 1, \ldots, N$. [Typically, $X_N(s)$ is an estimator of $a_N(s)$ with a distribution depending on $s$.] Then, an extended coupon collector's situation is defined by letting

$$\Pi^*_N = \{(X_N(s), p_N(s)),\ 1 \le s \le N\}. \eqno(2.2)$$

Also, let $\{J_{Nk},\ k \ge 1\}$ be a (double) sequence of (row-wise) i.i.d.r.v., where

$$P\{J_{Nk} = s\} = p_N(s), \qquad \text{for}\ 1 \le s \le N\ \text{and}\ k \ge 1. \eqno(2.3)$$

Let then

$$Y^*_{Nk} = \begin{cases} X_N(J_{Nk}), & \text{if}\ J_{Nk} \ne J_{Nr},\ \forall\, r < k, \\ 0, & \text{otherwise,} \end{cases} \qquad k \ge 1; \eqno(2.4)$$

$$Z^*_{Nn} = \sum_{k=1}^{n} Y^*_{Nk}, \quad n \ge 1; \qquad Z^*_{N0} = Y^*_{N0} = 0. \eqno(2.5)$$

Also, if the $X_N(s)$ are nonnegative r.v., then $Z^*_{Nn}$ is nondecreasing in $n$, and we let

$$U_N(t) = \min\{n:\ Z^*_{Nn} \ge t\}, \qquad t \in \mathbb{R}^+ = [0, \infty). \eqno(2.6)$$
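A sketch of the extended situation (2.2)-(2.6), continuing the toy example: the fixed scores are replaced by independent random scores $X_N(s)$, drawn once (here, a hypothetical unbiased noisy version of $a_N(s)$) and held fixed while the coupons arrive. Nonnegativity of the $X_N(s)$, needed for the waiting time $U_N(t)$, holds with high probability for these toy values.

```python
# Extended coupon collector (2.2)-(2.6): random scores X_N(s) are drawn once
# and held fixed; Z*_{Nn} accrues X_N(s) at the first occurrence of coupon s,
# and U_N(t) is the first n with Z*_{Nn} >= t (X_N(s) >= 0 assumed there).
def bonus_path_and_waiting_time(X, p, n_max, t, rng):
    """Return (Z*_{N1}, ..., Z*_{N n_max}) and U_N(t), or None if not reached."""
    seen, Z, path, U = set(), 0.0, [], None
    for k in range(1, n_max + 1):
        s = rng.choice(len(p), p=p)      # coupon J_{Nk}
        if s not in seen:                # Y*_{Nk} = X_N(s) on first occurrence
            seen.add(s)
            Z += X[s]
        path.append(Z)
        if U is None and Z >= t:
            U = k
    return np.array(path), U

X = a + rng.normal(0.0, 0.2, size=N)     # hypothetical noisy estimates of a_N(s)
path, U = bonus_path_and_waiting_time(X, p, n_max=200, t=0.5 * a.sum(), rng=rng)
print(path[-1], U)                       # final bonus sum and waiting time
```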
$Z^*_{Nn}$ is termed the bonus sum after $n$ coupons in the situation $\Pi^*_N$, and $U_N(t)$ is the waiting time to obtain the bonus sum $t$. Let us denote by

$$a^0_N(s) = EX_N(s), \quad a^{(2)}_N(s) = EX^2_N(s) \quad \text{and} \quad \sigma^2_{Ns} = V(X_N(s)) = a^{(2)}_N(s) - \{a^0_N(s)\}^2, \eqno(2.7)$$

for $s = 1, \ldots, N$. As in Rosén (1969), we assume that

$$\sup_N\Big\{\max_{1\le s\le N} N p_N(s)\Big\} \le M_1 < \infty, \eqno(2.8)$$

$$\lim_{N\to\infty}\Big\{\max_{1\le s\le N} |a^0_N(s)|\Big/\Big\{\sum_{s=1}^{N} a^{(2)}_N(s)\Big\}^{1/2}\Big\} = 0, \eqno(2.9)$$

$$\liminf_{N\to\infty}\Big\{\Big(\sum_{s=1}^{N} a^{(2)}_N(s)\, p_N(s)\Big)\Big/\Big(\sum_{s=1}^{N} a^{(2)}_N(s)\big/N\Big)\Big\} \ge M_2 > 0. \eqno(2.10)$$

Further, without any loss of generality, we may set

$$N^{-1}\sum_{s=1}^{N} a^{(2)}_N(s) \to 1 \qquad \text{as}\ N \to \infty. \eqno(2.11)$$

Then, we assume that for every $\epsilon > 0$,

$$\lim_{N\to\infty}\Big\{N^{-1}\sum_{s=1}^{N} E\big[(X_N(s) - a^0_N(s))^2\, I\big(|X_N(s) - a^0_N(s)| > \epsilon N^{1/2}\big)\big]\Big\} = 0. \eqno(2.12)$$

Note that by (2.9), (2.11) and (2.12), for every $\epsilon > 0$,

$$\limsup_{N\to\infty}\Big\{N^{-1}\sum_{s=1}^{N} E\big[\big|X^2_N(s) - a^{(2)}_N(s)\big|\, I\big(|X^2_N(s) - a^{(2)}_N(s)| > \epsilon N\big)\big]\Big\} = 0. \eqno(2.13)$$

Further, for (2.12) or (2.13) to hold, a slightly more stringent (but more easily verifiable) condition is that for some $\delta > 0$,

$$\sup_N\Big\{N^{-1}\sum_{s=1}^{N} E\big|X_N(s) - a^0_N(s)\big|^{2+\delta}\Big\} \le M_3 < \infty. \eqno(2.14)$$
Let us also denote by

$$\phi^0_{Nn} = \sum_{s=1}^{N} a^0_N(s)\big(1 - e^{-np_N(s)}\big), \qquad n \ge 0; \eqno(2.15)$$

$$d^2_{Nn} = \sum_{s=1}^{N} a^{(2)}_N(s)\, e^{-np_N(s)}\big(1 - e^{-np_N(s)}\big) - n\Big[\sum_{s=1}^{N} a^0_N(s)\, p_N(s)\, e^{-np_N(s)}\Big]^2, \qquad n \ge 0; \eqno(2.16)$$

$$e^2_{Nn} = \sum_{s=1}^{N} \sigma^2_{Ns}\big(1 - e^{-np_N(s)}\big)^2, \qquad n \ge 0; \eqno(2.17)$$

$$d^{*2}_{Nn} = d^2_{Nn} + e^2_{Nn}, \qquad n \ge 0. \eqno(2.18)$$

Note that by (2.11) and (2.12),

$$\limsup_{N\to\infty}\big\{N^{-1} e^2_{Nn}\big\} < \infty, \qquad \text{for every}\ n. \eqno(2.19)$$

Also, it follows from Sen (1979) [viz., his (2.12) through (2.14)] that if $n$ increases with $N$ in such a way that

$$0 < \liminf_{N\to\infty}\,(n/N) \le \limsup_{N\to\infty}\,(n/N) < \infty, \eqno(2.20)$$

then

$$0 < \liminf_{N\to\infty}\, n^{-1} d^2_{Nn} \le \limsup_{N\to\infty}\, n^{-1} d^2_{Nn} < \infty. \eqno(2.21)$$
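Numerically, the centering and scaling quantities of Theorem 2.1 below are direct sums; the helper sketched here implements (2.15)-(2.18) as reconstructed above (so it inherits any uncertainty in that reconstruction), continuing the toy example.

```python
# Centering/scaling of Theorem 2.1 under the reconstructed (2.15)-(2.18).
def centering_and_scale(a0, a2, p, n):
    """Return phi0_Nn, d2_Nn, e2_Nn; the squared scale of Theorem 2.1 is d2 + e2."""
    q = np.exp(-n * p)                                              # e^{-n p_N(s)}
    phi0 = (a0 * (1.0 - q)).sum()                                   # (2.15)
    d2 = (a2 * q * (1.0 - q)).sum() - n * (a0 * p * q).sum() ** 2   # (2.16)
    e2 = ((a2 - a0 ** 2) * (1.0 - q) ** 2).sum()                    # (2.17)
    return phi0, d2, e2

a0 = a                         # E X_N(s) for the noisy scores used above
a2 = a ** 2 + 0.2 ** 2         # E X_N(s)^2 (variance 0.2^2 added)
phi0, d2, e2 = centering_and_scale(a0, a2, p, n=100)
print(phi0, d2 + e2)           # centre phi0_Nn and squared scale d*_Nn^2
```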
Then, we have the following.

Theorem 2.1. Under (2.8), (2.9), (2.10), (2.12) and (2.20), as $N \to \infty$, $(Z^*_{Nn} - \phi^0_{Nn})/d^*_{Nn}$ has asymptotically a standard normal distribution.

Actually, Theorem 2.1 can be extended to an invariance principle for $\{Z^*_{Nn} - \phi^0_{Nn}\}$ as follows. For an arbitrary $T$ ($0 < T < \infty$), let $\Lambda = [0, T]$ and, for every $N$, consider a sample process $W_N = \{W_N(x),\ x \in \Lambda\}$ by letting

$$W_N(x) = N^{-1/2}\big\{Z^*_{N[Nx]} - \phi^0_{N[Nx]}\big\}, \qquad x \in \Lambda \eqno(2.22)$$

(where $[s]$ denotes the largest integer $\le s$), so that $W_N$ belongs to the space $D[\Lambda]$, endowed with the Skorokhod $J_1$-topology.

Theorem 2.2. Under (2.8), (2.9), (2.10) and (2.12), the finite dimensional distributions (f.d.d.) of $\{W_N\}$ are all asymptotically Gaussian and $\{W_N\}$ is tight.
Let us now consider the case of the waiting time $\{U_N(t)\}$ in (2.6). Note that, by definition, for all $x, t > 0$,

$$P\{U_N(t) > x\} = P\big\{Z^*_{N[x]} < t\big\}, \eqno(2.23)$$

and hence, Theorem 2.1 yields the asymptotic normality of the standardized form of $U_N(t)$ by some routine steps. Similar results hold for the f.d.d.'s. Further, $U_N(t)$ is monotone in $t$, and hence, the tightness part also follows. Hence, in Section 3, we proceed to prove only Theorems 2.1 and 2.2.
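The duality (2.23) is a path-by-path identity, so the empirical frequencies of the two events agree exactly; the following check (reusing the sketch given after (2.6)) is a sanity test of the bookkeeping, not of the limit theory.

```python
# Sanity check of (2.23): {U_N(t) > x} and {Z*_{N[x]} < t} are the same event.
x_cut, t_cut = 40, 0.3 * a.sum()
hits_U = hits_Z = 0
for _ in range(500):
    path, U = bonus_path_and_waiting_time(X, p, n_max=200, t=t_cut, rng=rng)
    hits_U += (U is None) or (U > x_cut)     # waiting time exceeds x
    hits_Z += path[x_cut - 1] < t_cut        # bonus sum after x coupons below t
print(hits_U / 500, hits_Z / 500)            # equal by construction
```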
3. PROOFS OF THEOREMS 2.1 AND 2.2

Before we proceed to prove these theorems, we consider some preliminary results which will be needed in the sequel. Let

$$\phi_{Nn} = \sum_{s=1}^{N} X_N(s)\big(1 - e^{-np_N(s)}\big), \qquad n \ge 0; \eqno(3.1)$$

$$c^2_{Nn} = \sum_{s=1}^{N} X^2_N(s)\, e^{-np_N(s)}\big(1 - e^{-np_N(s)}\big) - n\Big[\sum_{s=1}^{N} X_N(s)\, p_N(s)\, e^{-np_N(s)}\Big]^2, \qquad n \ge 0. \eqno(3.2)$$

Then, we have the following.

Lemma 3.1. Under the hypothesis of Theorem 2.1,

$$c^2_{Nn}\big/d^2_{Nn} \to 1, \quad \text{in probability, as}\ N \to \infty. \eqno(3.3)$$

Proof: By virtue of (2.16), (2.20), (2.21) and (3.2), it suffices to show that

$$\sum_{s=1}^{N}\big(X_N(s) - a^0_N(s)\big)\, p_N(s)\, e^{-np_N(s)} \overset{P}{\to} 0, \quad \text{as}\ N \to \infty, \eqno(3.4)$$

$$N^{-1}\sum_{s=1}^{N}\big(X^2_N(s) - a^{(2)}_N(s)\big)\, e^{-np_N(s)}\big(1 - e^{-np_N(s)}\big) \overset{P}{\to} 0, \quad \text{as}\ N \to \infty. \eqno(3.5)$$

We shall only prove (3.5), as the proof of (3.4) follows on parallel lines. Let
$$\xi_{Ns} = N^{-1}\big(X^2_N(s) - a^{(2)}_N(s)\big)\, e^{-np_N(s)}\big(1 - e^{-np_N(s)}\big), \qquad s = 1, \ldots, N. \eqno(3.6)$$

Then, by (2.7), $E\xi_{Ns} = 0$, $1 \le s \le N$, while, by (2.12) [or (2.13)], for every $\epsilon > 0$,

$$\sum_{s=1}^{N} E\big\{|\xi_{Ns}|\, I(|\xi_{Ns}| > \epsilon)\big\} \to 0, \quad \text{as}\ N \to \infty, \eqno(3.7)$$

and hence,

$$\sum_{s=1}^{N} E\big\{\xi_{Ns}\, I(|\xi_{Ns}| \le \epsilon)\big\} \to 0, \quad \text{as}\ N \to \infty. \eqno(3.8)$$

Further, (3.7) insures that for every $\epsilon > 0$,

$$\sum_{s=1}^{N} P\big\{|\xi_{Ns}| > \epsilon\big\} \to 0, \quad \text{as}\ N \to \infty. \eqno(3.9)$$

Finally, using (2.11) (at the ultimate step), for every $\epsilon > 0$,

$$\sum_{s=1}^{N}\Big[E\big\{\xi^2_{Ns}\, I(|\xi_{Ns}| \le \epsilon)\big\} - \big(E\{\xi_{Ns}\, I(|\xi_{Ns}| \le \epsilon)\}\big)^2\Big] \le \sum_{s=1}^{N} E\big\{\xi^2_{Ns}\, I(|\xi_{Ns}| \le \epsilon)\big\} \le \epsilon\sum_{s=1}^{N} E\big\{|\xi_{Ns}|\, I(|\xi_{Ns}| \le \epsilon)\big\} \le \epsilon\sum_{s=1}^{N} E|\xi_{Ns}| \le 2\epsilon N^{-1}\sum_{s=1}^{N} a^{(2)}_N(s)\, e^{-np_N(s)}\big(1 - e^{-np_N(s)}\big) \le 2\epsilon N^{-1}\sum_{s=1}^{N} a^{(2)}_N(s) \sim 2\epsilon, \eqno(3.10)$$

and this can be made arbitrarily small by choosing $\epsilon$ small. Hence, by an appeal to the Degenerate Convergence Theorem [viz., Loève (1963, p. 317)], we conclude that $\sum_{s=1}^{N}\xi_{Ns} \overset{P}{\to} 0$ as $N \to \infty$, which proves (3.5). Q.E.D.
Let $\mathbf{X}_N = (X_N(1), \ldots, X_N(N))'$ and let $\mathcal{F}_N$ be the $\sigma$-field generated by $\mathbf{X}_N$.

Lemma 3.2. Under the hypothesis of Theorem 2.1, the conditional distribution of $(Z^*_{Nn} - \phi_{Nn})/c_{Nn}$ (given $\mathcal{F}_N$) is asymptotically (in probability) normal with mean 0 and unit variance.

Proof: By (2.4), (2.5) and (3.1), given $\mathcal{F}_N$, we have a coupon collector's situation $\{(X_N(s), p_N(s)),\ 1 \le s \le N\}$, and $(Z^*_{Nn} - \phi_{Nn})/c_{Nn}$ corresponds to the standardized form of the bonus sum. The martingale approach developed in the proof of Theorem 3.1 of Sen (1979) works out equally well (given $\mathcal{F}_N$) when the $a_N(s)$ are replaced by the $X_N(s)$, while Lemma 3.1 insures that $n^{-1}c^2_{Nn}$ is bounded away from 0 and $\infty$, in probability. Hence, the proof follows directly along the lines of the proof of Theorem 3.1 of Sen (1979), and therefore the details are omitted.

By virtue of Lemmas 3.1 and 3.2, we have for every real $t$,

$$E\big\{\exp\big(it(Z^*_{Nn} - \phi_{Nn})/d_{Nn}\big)\,\big|\,\mathcal{F}_N\big\} = \exp\big(-\tfrac{1}{2}t^2\big) + o_p(1), \quad \text{as}\ N \to \infty. \eqno(3.11)$$
Lemma 3.3. Under the hypothesis of Theorem 2.1, $N^{-1/2}(\phi_{Nn} - \phi^0_{Nn})$ has asymptotically a normal distribution.

Proof: By (2.15) and (3.1), we obtain that

$$N^{-1/2}\big[\phi_{Nn} - \phi^0_{Nn}\big] = N^{-1/2}\sum_{s=1}^{N}\big[X_N(s) - a^0_N(s)\big]\big(1 - e^{-np_N(s)}\big) = \sum_{s=1}^{N}\xi^*_{Ns}, \quad \text{say,} \eqno(3.12)$$

where the $\xi^*_{Ns}$ are independent r.v.'s with $E\xi^*_{Ns} = 0$, $1 \le s \le N$. Also, by (2.12),

$$\sum_{s=1}^{N} E\big\{\xi^{*2}_{Ns}\, I(|\xi^*_{Ns}| > \epsilon)\big\} \to 0, \quad \text{as}\ N \to \infty\ (\forall\, \epsilon > 0). \eqno(3.13)$$

Now, (3.13) insures that for every $\epsilon > 0$, as $N \to \infty$,

$$\sum_{s=1}^{N} P\big\{|\xi^*_{Ns}| > \epsilon\big\} \to 0 \quad \text{and} \quad \sum_{s=1}^{N}\big|E\big\{\xi^*_{Ns}\, I(|\xi^*_{Ns}| > \epsilon)\big\}\big| \to 0, \eqno(3.14)$$

and hence,

$$\Big|\sum_{s=1}^{N} E\big\{\xi^*_{Ns}\, I(|\xi^*_{Ns}| \le \epsilon)\big\}\Big| = \Big|\sum_{s=1}^{N} E\big\{\xi^*_{Ns}\, I(|\xi^*_{Ns}| > \epsilon)\big\}\Big| \to 0. \eqno(3.15)$$

Similarly,

$$\sum_{s=1}^{N}\big(E\big\{\xi^*_{Ns}\, I(|\xi^*_{Ns}| \le \epsilon)\big\}\big)^2 \le \sum_{s=1}^{N} E\big\{\xi^{*2}_{Ns}\big\}\, P\big\{|\xi^*_{Ns}| > \epsilon\big\} \le \Big(\max_{1\le s\le N} P\{|\xi^*_{Ns}| > \epsilon\}\Big)\, N^{-1} e^2_{Nn} \to 0, \eqno(3.16)$$

by (2.19) and (3.14). Finally, by (2.17), (2.19), (3.2) and (3.3),

$$\Big[\sum_{s=1}^{N} E\big\{\xi^{*2}_{Ns}\, I(|\xi^*_{Ns}| \le \epsilon)\big\} - N^{-1} e^2_{Nn}\Big] \to 0, \quad \text{as}\ N \to \infty. \eqno(3.17)$$

Hence, by an appeal to the Normal Convergence Theorem [viz., Loève (1963, p. 316)], the desired result follows. Q.E.D.
We are now in a position to prove Theorems 2.1 and 2.2. Note that for any real $t$,

$$E\big\{\exp\big(itN^{-1/2}\{Z^*_{Nn} - \phi^0_{Nn}\}\big)\big\} = E\Big\{\exp\big(itN^{-1/2}(\phi_{Nn} - \phi^0_{Nn})\big)\, E\big[\exp\big(itN^{-1/2}(Z^*_{Nn} - \phi_{Nn})\big)\,\big|\,\mathcal{F}_N\big]\Big\}. \eqno(3.18)$$

Hence, Theorem 2.1 follows directly from (3.11), Lemma 3.3 and (3.18), along with (2.18) and (2.19)-(2.21).
Let us proceed on to the proof of Theorem 2.2. For any $m\ (\ge 1)$ and $\{n_{1N}, \ldots, n_{mN}\}$ satisfying (2.20), given $\mathcal{F}_N$, the conditional distribution of $\{N^{-1/2}(Z^*_{Nn_{jN}} - \phi_{Nn_{jN}}),\ 1 \le j \le m\}$ can be shown to be asymptotically (in probability) multinormal by using the proof of Lemma 3.2 and the results in Section 4 of Sen (1979). Hence, to study the convergence of f.d.d.'s of $\{W_N\}$, it remains to study the asymptotic distribution of $\{N^{-1/2}(\phi_{Nn_{jN}} - \phi^0_{Nn_{jN}}),\ 1 \le j \le m\}$. Note that for any $\boldsymbol{\lambda} = (\lambda_1, \ldots, \lambda_m)$ and $m\ (\ge 2)$, by (2.15) and (3.1),

$$\sum_{j=1}^{m}\lambda_j\, N^{-1/2}\big(\phi_{Nn_{jN}} - \phi^0_{Nn_{jN}}\big) = N^{-1/2}\sum_{s=1}^{N}\big[X_N(s) - a^0_N(s)\big]\sum_{j=1}^{m}\lambda_j\big(1 - e^{-n_{jN}p_N(s)}\big), \eqno(3.19)$$

and hence, the proof of Lemma 3.3 may be virtually repeated to establish the asymptotic normality of the left-hand side of (3.19). This yields the asymptotic multinormality of $\{N^{-1/2}(\phi_{Nn_{jN}} - \phi^0_{Nn_{jN}}),\ 1 \le j \le m\}$, and by a direct extension of (3.18) to the vector case, the desired multinormality of $\{N^{-1/2}(Z^*_{Nn_{jN}} - \phi^0_{Nn_{jN}}),\ 1 \le j \le m\}$ follows. Thus, it remains only to study the tightness of $\{W_N\}$. Towards this, we write
$$W_N(x) = N^{-1/2}\big\{Z^*_{N[Nx]} - \phi_{N[Nx]}\big\} + N^{-1/2}\big\{\phi_{N[Nx]} - \phi^0_{N[Nx]}\big\} = W^{(1)}_N(x) + W^{(2)}_N(x), \quad \text{say, for}\ x \in \Lambda. \eqno(3.20)$$

Also, we define the modulus of continuity $w_\delta(x)$, $\delta > 0$, by

$$w_\delta(x) = \sup\big\{|x(t) - x(s)|:\ |t - s| \le \delta;\ s, t \in \Lambda\big\}. \eqno(3.21)$$

Then, using our Lemma 3.1 and adapting the conditional probability law, given $\mathcal{F}_N$, we obtain on proceeding as in Section 4 of Sen (1979) that for every $\epsilon > 0$ and $\eta > 0$, there exist a $\delta > 0$, a sequence $\{E^{(N)}\}$ of subsets of the sample spaces of the $\{\mathbf{X}_N\}$ and a positive integer $N_0$, such that for $N \ge N_0$,

$$P\big\{w_\delta(W^{(1)}_N) > \epsilon\,\big|\,\mathcal{F}_N\big\} < \tfrac{1}{2}\eta, \qquad \forall\, \mathbf{X}_N \in E^{(N)}, \eqno(3.22)$$

$$P\big\{\mathbf{X}_N \notin E^{(N)}\big\} < \tfrac{1}{2}\eta. \eqno(3.23)$$

Then, by (3.22) and (3.23), for $N \ge N_0$,

$$P\big\{w_\delta(W^{(1)}_N) > \epsilon\big\} < \tfrac{1}{2}\eta\big(1 - \tfrac{1}{2}\eta\big) + \tfrac{1}{2}\eta < \eta. \eqno(3.24)$$

Thus, it remains only to show that

$$P\big\{w_\delta(W^{(2)}_N) > \epsilon\big\} < \eta, \qquad \forall\, N \ge N_0. \eqno(3.25)$$
For this, we let

$$W^0_N(x) = N^{-1/2}\big\{\phi_{N,Nx} - \phi^0_{N,Nx}\big\}, \qquad x \in \Lambda, \eqno(3.26)$$

where we note that (2.15) and (3.1) are properly defined for any real $n$, not necessarily an integer. Then, by (3.20) and (3.26),

$$\sup\big\{|W^{(2)}_N(x) - W^0_N(x)|:\ x \in \Lambda\big\} = \sup_{x\in\Lambda}\Big\{N^{-1/2}\Big|\sum_{s=1}^{N}\big(X_N(s) - a^0_N(s)\big)\big(e^{-Nxp_N(s)} - e^{-[Nx]p_N(s)}\big)\Big|\Big\} \le N^{1/2}\Big\{\max_{1\le s\le N}\,\sup_{0\le u\le 1}\big(1 - e^{-up_N(s)}\big)\Big\}\Big\{N^{-1}\sum_{s=1}^{N}\big|X_N(s) - a^0_N(s)\big|\Big\}, \eqno(3.27)$$

as $1 - e^{-a} \le a$, $\forall\, a \ge 0$, and by (2.8), $p_N(s) \le N^{-1}M_1$, $\forall\, 1 \le s \le N$. On the other hand, repeating the proof of Lemma 3.1, it can be shown that $N^{-1}\sum_{s=1}^{N}|X_N(s) - a^0_N(s)|$ is $O_p(1)$ (as $N \to \infty$), so that the right-hand side of (3.27) is $O_p(N^{-1/2})$. Hence, to prove (3.25), it suffices to show that

$$P\big\{w_\delta(W^0_N) > \epsilon\big\} < \eta, \qquad \forall\, N \ge N_0. \eqno(3.28)$$

Note that by the definition of $W^0_N$ in (3.26) and the fact that $1 - e^{-a} \le a$, $\forall\, a \ge 0$, we have for every $y \ge x$,

$$E\big[W^0_N(y) - W^0_N(x)\big]^2 = N^{-1}\sum_{s=1}^{N}\sigma^2_{Ns}\, e^{-2Nxp_N(s)}\big(1 - e^{-N(y-x)p_N(s)}\big)^2 \le (y - x)^2\Big\{\max_{1\le s\le N} Np_N(s)\Big\}^2\Big\{N^{-1}\sum_{s=1}^{N}\sigma^2_{Ns}\Big\} \le M^2_1(y - x)^2\Big\{N^{-1}\sum_{s=1}^{N} a^{(2)}_N(s)\Big\} \quad [\text{by}\ (2.7)\text{-}(2.8)] \sim M^2_1(y - x)^2, \quad \text{by}\ (2.11). \eqno(3.29)$$

Therefore, (3.28) follows from (3.29) and Theorem 12.3 of Billingsley (1968, p. 95). Hence, the proof of Theorem 2.2 is complete. Q.E.D.
4. LIMIT THEOREMS FOR SSSVPWR

We are primarily interested here in the limiting distributions of $T_{Nn}$ in (1.10) and other related estimators. In the asymptotic situation (where we let $N \to \infty$), we assume that $n$, the number of primary units in the sample, also increases, satisfying (2.20), while the $m_s$, $s \in S(n)$, may or may not be large. In this context, we may also allow the sampling scheme for the sub-units to be rather arbitrary (not necessarily a SSVPWR), while we assume that the primary units are sampled in accordance with a SSVPWR.

To incorporate the asymptotic situation, we introduce the following modifications of the notations introduced in Section 1. For every $N$, we denote by $\{(a_N(1), p_N(1)), \ldots, (a_N(N), p_N(N))\}$ the scheme relating to the SSVPWR of the primary units. Similarly, we rewrite (1.7) as

$$a_N(s) = b_{Ns1} + \cdots + b_{NsM_s}, \qquad s = 1, \ldots, N \eqno(4.1)$$

(where the $M_s$ may also depend on $N$) and, in (1.1), replacing the $p_s$ by the $p_N(s)$, we define $\pi_N(s, n)$ and $S_N(n)$ in an analogous way. Based on the sub-sample of $m_s$ sub-units from the $s$th primary unit, let $\hat a_N(s)$ be a suitable estimator of $a_N(s)$ [not necessarily the one in (1.8)], for $s \in S_N(n)$, and consider the estimator (of $T_N = \sum_{s=1}^{N} a_N(s)$)

$$T_{Nn} = \sum_{s \in S_N(n)} \hat a_N(s)\big/\pi_N(s, n). \eqno(4.2)$$

We intend to study the limiting behavior of $T_{Nn}$ [when $N \to \infty$ and $n$ satisfies (2.20)]. We let

$$a^0_N(s) = E\hat a_N(s), \quad \sigma^2_{Ns} = V(\hat a_N(s)), \quad s = 1, \ldots, N, \quad \text{and} \quad T^0_N = \sum_{s=1}^{N} a^0_N(s), \eqno(4.3)$$

and assume that (2.8)-(2.10) hold.
In addition, as in Rosén (1972), we assume that

$$\sup_N\Big\{\max_{1\le s\le N} p_N(s)\Big/\min_{1\le s\le N} p_N(s)\Big\} < \infty. \eqno(4.4)$$

Also, for every $N$, we consider a nondecreasing function $t_N = \{t_N(x):\ 0 \le x < N\}$ by letting

$$N - x = \sum_{s=1}^{N} e^{-p_N(s)t_N(x)}, \qquad 0 \le x < N. \eqno(4.5)$$

Also, for $n \le N$, let

$$\sigma^2_{Nn} = \Big\{\sum_{s=1}^{N}\frac{(a^0_N(s))^2\big(1 - \pi_N(s, n)\big)}{\pi_N(s, n)} - t_N(n)\Big[\sum_{s=1}^{N} a^0_N(s)\, p_N(s)\,\frac{1 - \pi_N(s, n)}{\pi_N(s, n)}\Big]^2\Big\} + \sum_{s=1}^{N}\frac{\sigma^2_{Ns}}{\pi_N(s, n)}. \eqno(4.6)$$

Finally, we assume that for every $\epsilon > 0$,

$$\lim_{N\to\infty}\Big\{\max_{1\le s\le N} E\big(\big[\hat a_N(s) - a^0_N(s)\big]^2\, I\big(|\hat a_N(s) - a^0_N(s)| > \epsilon N^{1/2}\big)\big)\Big\} = 0. \eqno(4.7)$$
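Since (4.5) defines $t_N(x)$ only implicitly, a one-dimensional root search suffices in practice; the sketch below uses plain bisection (our illustration, continuing the toy example; any monotone solver would do).

```python
# Bisection for the time transform t_N(x) of (4.5):
# solve sum_s exp(-p_N(s) t) = N - x for t (0 <= x < N).
def t_N(x, p, tol=1e-10):
    f = lambda t: np.exp(-p * t).sum() - (len(p) - x)   # decreasing in t
    lo, hi = 0.0, 1.0
    while f(hi) > 0.0:                  # grow hi until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * (1.0 + hi):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) > 0.0 else (lo, mid)
    return 0.5 * (lo + hi)

t_n = t_N(float(n), p)
print(n, t_n)      # t_N(n) >= n: the expected epoch of the n-th distinct coupon
```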
Then, we have the following.

Theorem 4.1. Under (2.8)-(2.10), (2.20), (4.4) and (4.7), as $N \to \infty$,

$$\big(T_{Nn} - T^0_N\big)\big/\sigma_{Nn} \overset{\mathcal{D}}{\to} \mathcal{N}(0, 1). \eqno(4.8)$$

If, in addition, $N^{-1/2}(T_N - T^0_N) \to 0$, then $T^0_N$ may be replaced by $T_N$ in (4.8).

Before we proceed to prove the theorem, we note that by (4.5),

$$t_N(0) = 0, \qquad \frac{d}{dx}\,t_N(x) = \Big[\sum_{s=1}^{N} p_N(s)\, e^{-p_N(s)t_N(x)}\Big]^{-1} > 0, \eqno(4.9)$$

$$\frac{d^2}{dx^2}\,t_N(x) = \Big\{\sum_{s=1}^{N} p^2_N(s)\, e^{-p_N(s)t_N(x)}\Big\}\Big/\Big\{\sum_{s=1}^{N} p_N(s)\, e^{-p_N(s)t_N(x)}\Big\}^3 > 0, \eqno(4.10)$$

for every $x \in [0, N)$, so that, for every $N$, $t_N(x)$ is convex in $x$. Also, define $\{\nu_{Nk},\ k \ge 1\}$ as in (1.5), where for the $J_k$ in (1.3), we replace the $p_s$ by the $p_N(s)$, $1 \le s \le N$. Then, if we let

$$g^2_{Nn} = \Big\{\sum_{s=1}^{N} e^{-p_N(s)t_N(n)}\big(1 - e^{-p_N(s)t_N(n)}\big)\Big\}\Big/\Big[\sum_{s=1}^{N} p_N(s)\, e^{-p_N(s)t_N(n)}\Big]^2, \eqno(4.11)$$

for $n \le N$, then, by Theorem 4 of Rosén (1970), under the hypothesis of Theorem 4.1 [namely, (2.8), (2.20) and (4.4)],

$$\big(\nu_{Nn} - t_N(n)\big)\big/g_{Nn} \overset{\mathcal{D}}{\to} \mathcal{N}(0, 1). \eqno(4.12)$$

Since $t_N(n) \ge n$, $\forall\, n \le N$, while, by (4.11), $g^2_{Nn} = O(N)$ when (2.20) holds, we obtain that

$$\nu_{Nn}\big/t_N(n) \overset{P}{\to} 1, \quad \text{as}\ N \to \infty. \eqno(4.13)$$
Finally, it follows from Rosén (1972) that, under the hypothesis of Theorem 4.1,

$$N^{1/2}\max_{1\le s\le N}\Big|\pi_N(s, n) - \big(1 - e^{-p_N(s)t_N(n)}\big)\Big| \to 0, \quad \text{as}\ N \to \infty, \eqno(4.14)$$

while, under (2.8), (2.20) and (4.4),

$$0 < \liminf_{N\to\infty}\Big\{\min_{1\le s\le N} p_N(s)\,t_N(n)\Big\} \le \limsup_{N\to\infty}\Big\{\max_{1\le s\le N} p_N(s)\,t_N(n)\Big\} < \infty, \eqno(4.15)$$

so that, by (4.14) and (4.15), under (2.8), (2.20) and (4.4),

$$\liminf_{N\to\infty}\Big\{\min_{1\le s\le N}\pi_N(s, n)\Big\} > 0. \eqno(4.16)$$

Note that (4.7) and (4.16) insure that for every $\epsilon > 0$,

$$\lim_{N\to\infty}\Big\{\max_{1\le s\le N} E\big(\big[\hat a_N(s) - a^0_N(s)\big]^2\, I\big(|\hat a_N(s) - a^0_N(s)| > \epsilon N^{1/2}\pi_N(s, n)\big)\big)\big/\pi^2_N(s, n)\Big\} = 0. \eqno(4.17)$$
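The approximation (4.14) is easy to probe numerically: continuing the toy example, the Monte Carlo estimate `pi_hat` of $\pi_N(s, n)$ (itself carrying simulation error of order $1/\sqrt{\text{reps}}$) can be compared with $1 - e^{-p_N(s)t_N(n)}$.

```python
# Probe of (4.14): inclusion probabilities vs. their exponential approximation.
pi_approx = 1.0 - np.exp(-p * t_n)          # 1 - exp(-p_N(s) t_N(n))
print(np.abs(pi_hat - pi_approx).max())     # small already for moderate N
```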
With these preliminary results, let us now consider the proof of Theorem 4.1. We write [parallel to (2.4)], for a given $n$ satisfying (2.20),

$$X^{(n)}_N(s) = \hat a_N(s)\big/\pi_N(s, n), \qquad s = 1, \ldots, N. \eqno(4.18)$$

Then, by virtue of (4.2), $T_{Nn} = \sum_{s \in S_N(n)} X^{(n)}_N(s)$, and, parallel to (1.6), we claim that

$$T_{Nn} \overset{\mathcal{D}}{=} Z^{(n)}_{N\nu_{Nn}}, \eqno(4.19)$$

where the $Z^{(n)}_{Nm}$ are defined by (2.4)-(2.5) [$X_N(s)$ being replaced by $X^{(n)}_N(s)$, $1 \le s \le N$] and $\nu_{Nn}$ is defined as above. Hence, to prove (4.8), it suffices to show that

$$\big(Z^{(n)}_{N\nu_{Nn}} - T^0_N\big)\big/\sigma_{Nn} \overset{\mathcal{D}}{\to} \mathcal{N}(0, 1). \eqno(4.20)$$

At this stage, we use (4.13), (4.17) and Theorem 2.1, and conclude that for $\{n^*\}$ satisfying (2.20),

$$\big(Z^{(n)}_{Nn^*} - \phi^{0(n)}_{Nn^*}\big)\big/d^{*(n)}_{Nn^*} \overset{\mathcal{D}}{\to} \mathcal{N}(0, 1), \eqno(4.21)$$

where [by (2.15), (2.16) and (4.18)]

$$\phi^{0(n)}_{Nn^*} = \sum_{s=1}^{N} a^0_N(s)\big(1 - e^{-n^*p_N(s)}\big)\big/\pi_N(s, n), \eqno(4.22)$$

$$\big(d^{*(n)}_{Nn^*}\big)^2 = \sum_{s=1}^{N}\frac{(a^0_N(s))^2 + \sigma^2_{Ns}}{\pi^2_N(s, n)}\, e^{-n^*p_N(s)}\big(1 - e^{-n^*p_N(s)}\big) - n^*\Big[\sum_{s=1}^{N}\frac{a^0_N(s)}{\pi_N(s, n)}\, p_N(s)\, e^{-n^*p_N(s)}\Big]^2 + \sum_{s=1}^{N}\frac{\sigma^2_{Ns}}{\pi^2_N(s, n)}\big(1 - e^{-n^*p_N(s)}\big)^2. \eqno(4.23)$$

By (4.6), (4.14), (4.22) and (4.23), it follows by some standard steps that

$$\phi^{0(n)}_{Nt_N(n)} = T^0_N + o(N^{1/2}), \qquad d^{*(n)}_{Nt_N(n)}\big/\sigma_{Nn} \to 1, \quad \text{as}\ N \to \infty. \eqno(4.24)$$

Hence, by (4.13), (4.21) and (4.24), to prove (4.20), it suffices to show that for every $\epsilon > 0$ and $\eta > 0$, there exist a $\delta_0 > 0$ and an $N_0$, such that

$$P\Big\{\max_{n':\,|n' - t_N(n)| \le \delta_0 N}\, N^{-1/2}\big|Z^{(n)}_{Nn'} - \phi^{0(n)}_{Nn'} - Z^{(n)}_{Nt_N(n)} + \phi^{0(n)}_{Nt_N(n)}\big| > \epsilon\Big\} < \eta, \eqno(4.25)$$

for every $N \ge N_0$, and this follows directly from the tightness part of Theorem 2.2. Since $\sigma_{Nn} = O(N^{1/2})$, $N^{-1/2}(T_N - T^0_N) \to 0$ insures that $(T_N - T^0_N)/d^{*(n)}_{Nt_N(n)} \to 0$ as $N \to \infty$. Hence, the proof of the theorem is complete.
Next, we proceed to extend Theorem 4.1 to an invariance principle for $\{T_{Nn}\}$. For an arbitrary $c$ ($0 < c < 1$), let $I_c = [c, 1]$ and, for every $N$, consider a sample process $\xi_N = \{\xi_N(t),\ t \in I_c\}$ by letting

$$\xi_N(t) = N^{-1/2}\big(T_{N[Nt]} - T^0_N\big), \qquad t \in I_c. \eqno(4.26)$$

Then, $\xi_N$ belongs to the space $D[I_c]$, and we have the following.

Theorem 4.2. Under the hypothesis of Theorem 4.1, $\xi_N$ converges weakly to a Gaussian function on $I_c$, for every $0 < c < 1$.

Proof: We need to show that (i) the f.d.d.'s of $\{\xi_N\}$ are asymptotically Gaussian and (ii) $\{\xi_N\}$ is tight. Note that, parallel to (4.19), we have

$$\big\{T_{Nn},\ n \le N\big\} \overset{\mathcal{D}}{=} \big\{Z^{(n)}_{N\nu_{Nn}},\ n \le N\big\}. \eqno(4.27)$$

Also, for finitely many $n$, say $n_1, \ldots, n_b$, $b \ge 1$, satisfying (2.20), (4.13) holds, as does (4.25) for $n = n_j$, $1 \le j \le b$. Hence, the proof of Theorem 4.1 can be directly extended to cover this case. Therefore, the details of (i) are omitted.
To establish the tightness, we again make use of (4.27), and, thereby, it suffices to show that (a)

$$N^{-1/2}\big(Z^{([Nc])}_{N\nu_{N[Nc]}} - T^0_N\big) \quad \text{is}\ O_p(1), \eqno(4.28)$$

and (b) that for every $\epsilon > 0$ and $\eta > 0$, there exist a $\delta$ ($0 < \delta < 1 - c$) and an $N_0$, such that

$$P\Big\{\sup\Big[N^{-1/2}\big|Z^{(n)}_{N\nu_{Nn}} - Z^{(n')}_{N\nu_{Nn'}}\big|:\ c \le \frac{n}{N} < \frac{n'}{N} \le \frac{n}{N} + \delta \le 1\Big] > \epsilon\Big\} < \eta, \eqno(4.29)$$

for every $N \ge N_0$. Now, for every $c > 0$, $[Nc]$ satisfies (2.20), and the proof of (4.28) follows from the asymptotic normality in Theorem 4.1, along with the fact that, by (4.6), $\limsup_{N\to\infty} N^{-1}\sigma^2_{N[Nc]} < \infty$, $\forall\, c > 0$. Hence, we need to verify (4.29) only.
Note that, for every $N$, $\nu_{Nk}$ is nondecreasing in $k\ (\le N)$, while, by (4.24), $T^0_N = \phi^{0(n)}_{Nt_N(n)} + o(N^{1/2})$, for every $n$ satisfying (2.20). Hence, to prove (4.29), it suffices to show that for every $n$: $cN \le n < N$,

$$P\Big\{\max\Big[N^{-1/2}\big|Z^{(n)}_{Nk} - \phi^{0(n)}_{Nk} - Z^{(n')}_{Nq} + \phi^{0(n')}_{Nq}\big|:\ n \le n' \le (n + \delta N)\wedge N,\ |k - t_N(n)| < \delta N,\ |q - t_N(n')| < \delta N\Big] > \epsilon\Big\} < \eta\delta, \eqno(4.30)$$

for every $N \ge N_0$, where $N_0$ and $\delta$ are defined as in (4.29). For this purpose, we consider a two-dimensional time-parameter stochastic process $\zeta_N = \{\zeta_N(t_1, t_2):\ (t_1, t_2) \in I^2_c\}$, by letting

$$\zeta_N(t_1, t_2) = N^{-1/2}\Big[Z^{([Nt_1])}_{Nt_N([Nt_2])} - \phi^{0([Nt_1])}_{Nt_N([Nt_2])}\Big], \qquad (t_1, t_2) \in I^2_c. \eqno(4.31)$$

Note that the tightness of $\zeta_N$ would insure (4.30), and hence, we complete the proof of the theorem by establishing the tightness of $\zeta_N$. By letting $n = [Nt_1]$, $m = [Nt'_1]$, $k = t_N([Nt_2])$ and $q = t_N([Nt'_2])$, where $t_1 > t'_1$ and $t_2 > t'_2$ (so that $n \ge m$ and $k \ge q$), we obtain that

$$\zeta_N\big([t'_1, t_1] \times [t'_2, t_2]\big) = \zeta_N(t_1, t_2) - \zeta_N(t_1, t'_2) - \zeta_N(t'_1, t_2) + \zeta_N(t'_1, t'_2) = N^{-1/2}\Big\{\big(Z^{(n)}_{Nk} - Z^{(n)}_{Nq} - Z^{(m)}_{Nk} + Z^{(m)}_{Nq}\big) - \big(\phi^{0(n)}_{Nk} - \phi^{0(n)}_{Nq} - \phi^{0(m)}_{Nk} + \phi^{0(m)}_{Nq}\big)\Big\}, \eqno(4.32)$$

where the $Z^{(n)}_{Nk}$ are defined by (2.4)-(2.5) [and (4.18), with the $\hat a_N(s)$ modified accordingly]. Note that for $n \ge m$, $k \ge q$,

$$Z^{(n)}_{Nk} - Z^{(n)}_{Nq} - Z^{(m)}_{Nk} + Z^{(m)}_{Nq} = \sum_{j=q+1}^{k}\big[Y^{(n)}_{Nj} - Y^{(m)}_{Nj}\big], \eqno(4.33)$$

where

$$Y^{(n)}_{Nj} - Y^{(m)}_{Nj} = \hat a_N(J_{Nj})\big[\pi^{-1}_N(J_{Nj}, n) - \pi^{-1}_N(J_{Nj}, m)\big]\, I\big(J_{Nj} \ne J_{Nr},\ \forall\, r < j\big). \eqno(4.34)$$

A similar equation holds for the $\phi^{0(n)}_{Nk}$. As such, using (4.18), (4.33) and (4.34), we obtain by some standard steps that

$$E\big[\zeta_N\big([t'_1, t_1] \times [t'_2, t_2]\big)\big]^2 \le M^*\,(t_1 - t'_1)^2\big\{(k - q)/N\big\}, \eqno(4.35)$$

where $M^*$ is a finite, positive quantity. Thus, if we let $t_1 - t'_1 = t_2 - t'_2 = \delta > 0$, the right-hand side of (4.35) is $O(\delta^3)$ [as, by (4.9)-(4.10), $(k - q)/N = O(t_2 - t'_2)$], and the tightness of $\zeta_N$ follows by an appeal to Theorem 3 of Bickel and Wichura (1971), after noting that the Lebesgue measure $\lambda(B_N)$ of the block $B_N = [t'_1, t_1] \times [t'_2, t_2]$ is $\delta^2$, so that $\delta^3 = [\lambda(B_N)]^{3/2}$. Q.E.D.
5. SOME CONCLUDING REMARKS

In the developments in Section 4, we have treated the estimators $\hat a_N(s)$ in a rather arbitrary manner. In particular, if these are unbiased estimators (of the $a_N(s)$), as is usually the case, then, in (4.3), $a^0_N(s) = a_N(s)$, so that $T^0_N = T_N$. Moreover, we have not imposed any restriction on the sub-sample sizes $\{m_s\}$, apart from the uniform integrability condition in (4.7). If these $m_s$ are all large, then the $\sigma^2_{Ns}$ will be small, so that the second term on the right-hand side of (4.6) can be neglected. In this limiting case, the variance $\sigma^2_{Nn}$ in (4.6) corresponds to the variance in the usual SSVPWR of the primary units, where one observes the $a_N(s)$. Thus, if $n$, the number of primary units in a SSVPWR, is large and if the $m_s$ are also so, then sub-sampling does not lead to any asymptotic increase in the relative variance of the estimators ($T_{Nn}$ and $\hat T_{Nn}$) of $T_N$, though, in practical problems, $T_{Nn}$ is more adaptable, because it does not presuppose the knowledge of the values of the $\{a_N(s)\}$.

Theorem 4.1 provides a theoretical justification of the asymptotic normality of $T_{Nn}$, which has been occasionally used in sample survey theory without any proper justification. Theorem 4.2 can be used to extend this asymptotic normality to the case where the sample size ($n$) is itself a random variable. Further, it may also be used for repeated significance testing for $T_N$ or in some sequential inference procedures for the same.
REFERENCES

[1] BICKEL, P.J. and WICHURA, M.J. (1971). Convergence criteria for multiparameter stochastic processes and some applications. Ann. Math. Statist. 42, 1656-1670.

[2] BILLINGSLEY, P. (1968). Convergence of Probability Measures. John Wiley, New York.

[3] COCHRAN, W.G. (1963). Sampling Techniques (Second Edition). John Wiley, New York.

[4] HÁJEK, J. (1964). Asymptotic theory of rejective sampling with varying probabilities from a finite population. Ann. Math. Statist. 35, 1491-1523.

[5] LOÈVE, M. (1963). Probability Theory. Van Nostrand, Princeton, N.J.

[6] ROSÉN, B. (1970). On the coupon collector's waiting time. Ann. Math. Statist. 41, 1952-1969.

[7] ROSÉN, B. (1972a). Asymptotic theory for successive sampling with varying probabilities without replacement, I. Ann. Math. Statist. 43, 401-425.

[8] ROSÉN, B. (1972b). Asymptotic theory for successive sampling with varying probabilities without replacement, II. Ann. Math. Statist. 43, 748-776.

[9] SEN, P.K. (1979). Invariance principles for the coupon collector's problem: a martingale approach. Ann. Statist. 7, 372-380.

[10] SUKHATME, P.V. and SUKHATME, B.V. (1970). Sampling Theory of Surveys with Applications (Second Edition). Asia Publ. House, London.