
(#2018)
ON STOPPING TIMES FOR FIXED-WIDTH CONFIDENCE BANDS
John I. Crowell and P. K. Sen
Department of Statistics
University of North Carolina
Chapel Hill, NC 27514
Keywords and Phrases: confidence regions; bootstrap.
ABSTRACT

Analogous to the usual stopping time for fixed-width confidence interval estimation of a real-valued parameter $\theta(F)$, a stopping time for fixed-width confidence band estimation of a statistical functional process $R_F$ is considered. The proposed confidence band is shown to have the correct asymptotic coverage probability as the band width goes to zero. Sufficient conditions for asymptotic efficiency and normality of the stopping time are given. Some implications of using the stopping time in conjunction with a bootstrap confidence band procedure are investigated. An example illustrates how essentially the same stopping time may be applicable when estimating a confidence region for a statistical function $R_F$ defined on $\mathbb{R}^p$, $p \ge 2$. Stopping times for the construction of variable width confidence bands are also considered.
1. INTRODUCTION
Let $X_1, \ldots, X_n$ be independent and identically distributed random variables having distribution function F. Suppose $R_F = \{ R_F(t),\ t \in I \}$ for some $I \subseteq \mathbb{R}$ is an unknown functional process of F. There often exists a natural estimator $R_n(\cdot) = R_n(\cdot\,; X_1, \ldots, X_n)$ of $R_F(\cdot)$ and an associated process

    $r_n(\cdot) = n^{1/2}\{ R_n(\cdot) - R_F(\cdot) \}$

on I. Several authors have exploited weak convergence properties of $r_n$ to construct large sample confidence bands for the functional process $R_F$.

A typical approach assumes that $r_n \xrightarrow{w} Y_F$, where $Y_F$ is a Gaussian process on the interval I such that $P\{ \sup_{t \in I} |Y_F(t)| < \infty \} = 1$. (Throughout this paper $\xrightarrow{w}$ will be used to denote weak convergence (equivalence in distribution) of processes, while $\xrightarrow{D}$ will denote weak convergence of random variables.) Define

    $G_F(x) = P\{ \sup_{t \in I} |Y_F(t)| \le x \}, \quad -\infty < x < \infty.$
For a desired confidence level of $1-\alpha$, $0 < \alpha < 1$, let $c = \inf\{ x : G_F(x) \ge 1-\alpha \}$. If $c_n$ is a weakly consistent estimate of c, then a sensible large sample confidence band for $R_F$ is given by

    $\{ R_n(t) \pm c_n n^{-1/2},\ t \in I \},$                                (1.1)

for then

    $P\{ R_n(t) - c_n n^{-1/2} \le R_F(t) \le R_n(t) + c_n n^{-1/2},\ t \in I \} \to 1-\alpha,$   (1.2)

assuming $G_F$ is continuous at c.
In some special cases $G_F$ is independent of F and explicitly known, so that one can take $c_n = c$ in (1.1). Estimation of a continuous distribution function F by the empirical distribution function is such a case, since then $G_F$ is the distribution of the supremum of the absolute value of the Brownian bridge. Because $G_F$ is usually completely unknown, bootstrap methods have been advocated for estimating c. (See, for example, Bickel and Freedman (1981) and Csörgő and Mason (1989); other work on bootstrapping empirical processes is summarized in the latter paper.)
Sometimes the covariance function of $Y_F$ may be known, at least up to its dependence on F. Then a possible approach is to use either the covariance function or an estimate thereof to simulate independent copies of a Gaussian process hopefully close in distribution to $Y_F$; taking the supremum norm of each of these copies, $G_F$ and c can be estimated empirically.
If the objective is to estimate $R_F$ on I with some uniform prespecified precision, then the idea of a fixed-width confidence band is appealing. In practice one might want

    $P\{ |R_n(t) - R_F(t)| \le d,\ t \in I \} \ge 1-\alpha,$                  (1.3)

where $d > 0$ is a constant that reflects the desired accuracy. Typically the minimum sample size n such that (1.3) holds is unknown, and some guidelines for deciding when an adequate sample size has been reached would be useful. The analogy with fixed-width confidence interval estimation of a real-valued parameter $\theta(F)$ is suggestive. Suppose $\theta_n$ is a consistent estimate of $\theta(F)$ and one would like the smallest sample size n, usually unknown, such that $P\{ |\theta_n - \theta(F)| \le d \} \ge 1-\alpha$. Assume $n^{1/2}(\theta_n - \theta(F)) \xrightarrow{D} N(0, \sigma^2)$ as $n \to \infty$ and $s_n^2$ is a strongly consistent estimator of $\sigma^2$.
For the fixed-width confidence interval problem, a number of authors have considered stopping times of the type

    $N_d' = \inf\{ n \ge n_0 : n^{-1/2} s_n \tau_{\alpha/2} \le d \},$

where $n_0$ is some initial sample size and $\tau_{\alpha/2}$ is the $(1-\alpha/2)$th quantile of the standard normal distribution. (See, for instance, Chow and Robbins (1965) and Sen (1981).) With appropriate conditions on the sequences $\{\theta_n\}$ and $\{s_n^2\}$ it has been shown that

    $\lim_{d \downarrow 0} P\{ |\theta_{N_d'} - \theta(F)| \le d \} = 1-\alpha,$

so that for sufficiently small d the confidence interval $(\theta_{N_d'} - d,\ \theta_{N_d'} + d)$ has approximately the desired coverage probability.
This paper has as its primary purpose the definition of a stopping time $N_d$ such that under certain conditions

    $\lim_{d \downarrow 0} P\{ |R_{N_d}(t) - R_F(t)| \le d,\ t \in I \} = 1-\alpha.$   (1.4)

Let $c_n$ be a strongly consistent estimate of c. For fixed $d > 0$ and some initial sample size $n_0$, define the stopping time

    $N_d = \inf\{ n \ge n_0 : n^{-1/2} c_n \le d \};$                         (1.5)

upon stopping use the confidence band

    $\{ R_{N_d}(t) \pm d,\ t \in I \}.$                                       (1.6)

Thus the rule states that one should stop when the width of the confidence band determined by (1.1) is at most 2d.
The most basic properties of $N_d$ follow directly from its definition. By the relation

    $[c_{N_d}]^2 \le N_d d^2 \le [c_{N_d - 1}]^2 + d^2,$                      (1.7)

$N_d$ is finite a.s. for fixed $d > 0$. If the quantile c of the distribution $G_F$ were known, then for small $d > 0$ a reasonable fixed sample size would be

    $n_d = \inf\{ n \ge n_0 : n^{-1/2} c \le d \},$                           (1.8)

where $n_0$ is the initial sample size that appears in (1.5). Suppose $P\{ c_n \le 0 \} = 0$ for all $n \ge n_0$. Since $n_d \sim c^2/d^2$ as $d \downarrow 0$, one then has that $N_d/n_d \to 1$ as $d \downarrow 0$ by (1.7).

It is shown in Section 2 that if one also assumes that $r_{N_d} \xrightarrow{w} Y_F$ as $d \downarrow 0$, then (1.4) holds. A simple criterion for establishing weak convergence of $r_{N_d}$ is stated. In Section 3 sufficient conditions are given for the expectation of the stopping time to be finite for fixed $d > 0$, as well as for the asymptotic efficiency and normality of $N_d$ as $d \downarrow 0$. Several bootstrap estimators of c are considered in Section 4, along with some resulting properties of the stopping time. In some applications one might want to allow confidence bands of variable width. Section 5 addresses the problem of stopping times for such modified confidence band procedures. An example in Section 6 illustrates how a stopping time of the form (1.5) can be used in the construction of a confidence region for a statistical function $R_F$ defined on $\mathbb{R}^p$, $p \ge 2$.
2. ASYMPTOTIC COVERAGE OF THE CONFIDENCE BAND
Theorem 2.1 contains the principal result of the section regarding the correct asymptotic coverage probability of the fixed-width confidence band (1.6) when $r_{N_d} \xrightarrow{w} Y_F$ as $d \downarrow 0$. Corollary 2.1 gives a single sufficient condition for the weak convergence of $r_{N_d}$ which is easily verified in some cases. A1 summarizes the principal assumptions of the introduction.

A1: With previously defined notation, $r_n \xrightarrow{w} Y_F$ on I, where $P\{ \sup_{t \in I} |Y_F(t)| < \infty \} = 1$. $G_F(c) = 1-\alpha$ and $G_F$ is continuous at c. There exists an estimator $c_n$ of c such that $c_n \to c$ a.s. as $n \to \infty$ and $P\{ c_n \le 0 \} = 0$ for all $n \ge n_0$.

The estimator $c_n$ can always be modified, without affecting its consistency, in order to meet the requirement that $P\{ c_n \le 0 \} = 0$ for all $n \ge n_0$. Tsirel'son (1975) showed that if $Y_F$ is a separable nondegenerate mean-zero Gaussian process which is almost surely bounded on I, then $G_F$ must be continuous on the half-line $(0, \infty)$. Thus the continuity condition included in A1 is generally met.
Theorem 2.1: Assume A1 and that $r_{N_d} \xrightarrow{w} Y_F$ as $d \downarrow 0$. Then (1.4) holds.

Proof: From the basic properties of $N_d$ discussed in the introduction it follows that $c_{N_d} \to c$ a.s. and $N_d^{1/2} d \to c$ a.s. as $d \downarrow 0$. Thus the result holds by (1.2). □
In practice it will of course be necessary to verify that $r_{N_d} \xrightarrow{w} Y_F$ as $d \downarrow 0$. Given the stopping time $N_d$ and that $r_n \xrightarrow{w} Y_F$ on I, we consider a single sufficient condition for weak convergence of $r_{N_d}$ based on the notion of a uniformly continuous in probability sequence. A sequence of random variables $\{ X_n,\ n \ge 1 \}$ is uniformly continuous in probability (u.c.i.p.) if given $\epsilon > 0$ there exists $\delta > 0$ such that $P\{ \max_{0 \le k \le n\delta} |X_{n+k} - X_n| \ge \epsilon \} < \epsilon$ for all $n \ge 1$. The condition considered here is as follows.

A2: For each $t \in I$, $\{ r_n(t),\ n \ge 1 \}$ is a u.c.i.p. sequence.
Corollary 2.1: Assume A1 and A2 hold. Then (1.4) holds.

Proof: By Theorem 2.1 it is sufficient to verify that the finite-dimensional distributions of $r_{N_d}$ converge weakly to those of $Y_F$. For arbitrary positive integer m, let $(a_1, \ldots, a_m)'$ be any m-vector of real numbers and $\{ t_1, \ldots, t_m \} \subseteq I$. Define $Y_n = \sum_{i=1}^m a_i r_n(t_i)$. By the weak convergence of $r_n$ and the Cramér-Wold device, $Y_n \xrightarrow{D} Y$, where $Y \sim \sum_{i=1}^m a_i Y_F(t_i)$ is a normal random variable. Anscombe's theorem is used to show that $Y_{N_d}$ has the same limiting distribution as $d \downarrow 0$. By the proof of Theorem 2.1, $N_d/d^{-2} \to c^2$ a.s. as $d \downarrow 0$. Since for each $t \in I$, $\{ r_n(t),\ n \ge 1 \}$ is a u.c.i.p. sequence, $\{ Y_n,\ n \ge 1 \}$ is also a u.c.i.p. sequence. (See Lemma 1.4 of Woodroofe (1982).) Thus by Anscombe's theorem $Y_{N_d} \xrightarrow{D} Y$ as $d \downarrow 0$, which implies that the finite-dimensional distributions of $r_{N_d}$ converge to those of $Y_F$. □
3. PROPERTIES OF THE STOPPING TIME
By making additional assumptions about the quantile estimator $c_n$, some questions about the expectation and rate of convergence of the stopping time can be answered. In the fixed-width confidence interval problem, the properties that will be considered here have been shown to hold for the stopping time $N_d'$ under a variety of conditions on the variance estimates $\{ s_n^2,\ n \ge n_0 \}$. Due to the similarity in form between $N_d'$ and $N_d$, by putting the same requirements on the sequence $\{ c_n^2,\ n \ge n_0 \}$, the same properties follow for $N_d$. Thus no proofs are given here, the necessary calculations being identical in spirit to those in the references cited. A problem is that conditions that are easily verified for variance estimators may not readily apply to quantile estimators, so discussion is restricted to those conditions which seem most relevant.

Among the basic properties established in the introduction were that $N_d < \infty$ a.s. for fixed $d > 0$ and $N_d/n_d \to 1$ a.s. as $d \downarrow 0$. Lemma 3.1 provides a condition which implies that for fixed $d > 0$ the expectation of the stopping time is finite, while as $d \downarrow 0$ the stopping time is asymptotically efficient in the sense that

    $\lim_{d \downarrow 0} n_d^{-1} E N_d = 1.$                               (3.1)

The type of arguments needed to prove Lemma 3.1 may be found in Sen (1981), pp. 284 and 287.
Lemma 3.1: Assume A1 holds and that for fixed $d > 0$

    $E\{ \sup_{n \ge n_0} c_n^2 \} < \infty.$                                 (3.2)

Then $E N_d < \infty$ and (3.1) holds.

To show that the expectation of the stopping time is finite, another approach is to establish the equivalent condition that

    $\sum_{n \ge n_0} P\{ N_d > n \} < \infty.$                               (3.3)

Similarly, an alternative condition for verifying (3.1) is that for every $\epsilon > 0$,

    $\lim_{d \downarrow 0} \sum_{n \ge n_d(1+\epsilon)} P\{ N_d > n \} = 0.$   (3.4)

(See, for example, Sen (1981), p. 288.) In Section 4 it is demonstrated that if $c_n$ is a particular type of bootstrap estimator it may be possible to verify (3.3) and (3.4).
In the fixed-width confidence interval case one may sometimes deduce a limiting distribution for $N_d'$ as $d \downarrow 0$ when $s_n^2$ is asymptotically normal. Similarly, in the present setting asymptotic normality of $c_n^2$ may provide a rate of convergence for $N_d$ as $d \downarrow 0$.

Let $\{ g(n) \}$ be a sequence of positive numbers such that $g(n)$ is increasing in n, $g(n) \to \infty$ as $n \to \infty$, and

    $g(m)/g(n) \to 1$ whenever $m/n \to 1$ with $m, n \to \infty$.            (3.5)
Suppose that

    $g(n_d)\{ c_{n_d}^2/c^2 - 1 \} \xrightarrow{D} N(0, \gamma^2)$ as $d \downarrow 0$,   (3.6)

where $\gamma$ is a finite positive constant. Further assume that

    $\{ g(n)[c_n^2/c^2 - 1],\ n \ge n_0 \}$ is a u.c.i.p. sequence.           (3.7)

Together with A1 these three conditions yield the following result, similar in derivation to Theorem 10.2.2 of Sen (1981).

Lemma 3.2: Assume A1, (3.5), (3.6), and (3.7) hold. Then

    $g(n_d)\{ N_d/n_d - 1 \} \xrightarrow{D} N(0, \gamma^2)$ as $d \downarrow 0$.
4. BOOTSTRAP QUANTILE ESTIMATORS
One method that has been suggested for constructing confidence bands of the form (1.1) is bootstrapping the process $r_n$. Given $X_1, \ldots, X_n$, let $X_1^*, \ldots, X_{m_n}^*$ be conditionally independent with the distribution $F_n(x) = n^{-1} \sum_{i=1}^n \chi(X_i \le x)$, $-\infty < x < \infty$, where the bootstrap sample size $m_n \to \infty$ as $n \to \infty$. (Here $\chi$ denotes the indicator function.) Next, for $t \in I$ define $R_{m_n}^*(t) = R_{m_n}(t; X_1^*, \ldots, X_{m_n}^*)$; the bootstrapped process is then given by

    $r_{m_n}^*(\cdot) = m_n^{1/2}\{ R_{m_n}^*(\cdot) - R_n(\cdot) \}$

on I. One then typically makes the assumption that $r_{m_n}^* \xrightarrow{w} Y_F$ as $n, m_n \to \infty$, so that a natural estimator of the distribution $G_F$ is defined by

    $G_{n,F_n}(x) = P\{ \sup_{t \in I} |r_{m_n}^*(t)| \le x \mid X_1, \ldots, X_n \}, \quad -\infty < x < \infty.$

The bootstrap estimator of c, the $(1-\alpha)$th quantile of $G_F$, is then defined to be the $(1-\alpha)$th quantile of $G_{n,F_n}$.
For computational reasons the bootstrap estimate is often itself approximated by generating conditionally independent copies of $r_{m_n}^*$ as follows. For $i = 1, \ldots, M_n$ let $X_{i1}^*, \ldots, X_{i m_n}^*$ be a random sample from the distribution $F_n$, where $M_n \to \infty$ as $n \to \infty$. For $t \in I$ define $R_{m_n i}^*(t) = R_{m_n}(t; X_{i1}^*, \ldots, X_{i m_n}^*)$. Then the processes

    $r_{m_n i}^*(\cdot) = m_n^{1/2}\{ R_{m_n i}^*(\cdot) - R_n(\cdot) \}$

on I, $i = 1, \ldots, M_n$, are conditionally independent and identically distributed. Estimate $G_{n,F_n}$ empirically by

    $\hat{G}_{n,F_n}(x) = M_n^{-1} \sum_{i=1}^{M_n} \chi( \sup_{t \in I} |r_{m_n i}^*(t)| \le x ), \quad -\infty < x < \infty.$

The bootstrap estimate of c is then taken to be $c_n^* = \inf\{ x : \hat{G}_{n,F_n}(x) \ge 1-\alpha \}$. Putting $c_n = c_n^*$ in (1.5) then defines a stopping time for terminating the bootstrap procedure.
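The Monte Carlo approximation above can be sketched in a few lines of Python. This is our illustration, not the paper's: we take $R_F$ to be the distribution function itself, so that $r_n$ is the empirical process, and each replicate resamples $m_n$ points from $F_n$. Since the resample values are a subset of the original data values, the sup statistic need only be evaluated at the original order statistics, where the two step functions can differ.

```python
import bisect
import math
import random

def boot_sup_stat(data, m, rng):
    """One copy of sup_t |r*_{m,i}(t)| when R_F = F:
    sup_x m^{1/2} |F*_m(x) - F_n(x)|, evaluated at the jump points."""
    n = len(data)
    xs = sorted(data)
    star = sorted(rng.choices(data, k=m))
    sup = 0.0
    for j, x in enumerate(xs):
        f_n = (j + 1) / n
        f_star = bisect.bisect_right(star, x) / m
        sup = max(sup, abs(f_star - f_n))
    return math.sqrt(m) * sup

def bootstrap_quantile(data, alpha=0.05, m=None, M=500, seed=0):
    """Monte Carlo bootstrap estimate c*_n: the empirical (1-alpha)th
    quantile of M conditionally i.i.d. sup statistics."""
    rng = random.Random(seed)
    m = m if m is not None else len(data)
    sups = sorted(boot_sup_stat(data, m, rng) for _ in range(M))
    return sups[math.ceil((1.0 - alpha) * M) - 1]
```

For a moderately large sample from a continuous F the estimate should be near the Kolmogorov quantile $\approx 1.358$, consistent with the known-$G_F$ case of Section 1.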
Due to the conditional nature of the ordinary bootstrap estimator, the feasibility of establishing unconditional properties of the stopping time may depend on the nature of the function $R_F$ and the estimator $R_n$. Showing that the stopping time has finite expectation for fixed $d > 0$ or has the asymptotic efficiency property (3.1) by confirming that conditions such as (3.2), (3.3), or (3.4) hold may be difficult or impossible. To gain some insight into this problem we first examine a condition on the random distribution function $G_{n,F_n}$ which is sufficient for (3.3) and (3.4) to hold. While this condition will not typically be useful in practice, it does suggest how certain properties of the stopping time depend on the probability distribution of $G_{n,F_n}$ and might not automatically follow when one uses an ordinary bootstrap estimator. It is then seen that by partitioning the sample into independent groups of observations one can instead estimate a distribution analogous to $G_n = E G_{n,F_n}$ by bootstrapping; if the resulting quantile estimate is used to define the stopping time, the desired properties will hold.

Condition A1 is replaced by the following.
A3: With previously defined notation, $r_n \xrightarrow{w} Y_F$ and $r_{m_n}^* \xrightarrow{w} Y_F$ on I as $n, m_n \to \infty$, where $P\{ \sup_{t \in I} |Y_F(t)| < \infty \} = 1$. $G_F(c) = 1-\alpha$ and $G_F$ is continuous and strictly increasing in a neighborhood of c. $P\{ G_{n,F_n}(0) = 0$ for all $n \ge n_0 \} = 1$. The sequence $\{ M_n \}$ is such that $M_n^{-1} \log n = o(1)$.

A3 implies that $c_n^* \to c$ a.s. Let

    $G_n(x) = E G_{n,F_n}(x) = P\{ \sup_{t \in I} |r_{m_n}^*(t)| \le x \}, \quad -\infty < x < \infty.$

Lemmas 4.1 and 4.2 assume that $|G_{n,F_n}(x) - G_n(x)| \to 0$ completely as $n \to \infty$ in the following sense.

A4: For every $x \in \mathbb{R}$ and every $\delta > 0$,

    $\sum_{n \ge n_0} P\{ |G_{n,F_n}(x) - G_n(x)| > \delta \} < \infty.$
Lemma 4.1: Assume A3 and A4 hold. Then for fixed $d > 0$, $E N_d < \infty$.

Proof: For all $n \ge n_0$, $P\{ N_d > n \} \le P\{ n^{-1/2} c_n^* > d \}$, so that

    $\sum_{n \ge n_0} P\{ N_d > n \} \le \sum_{n \ge n_0} P\{ c_n^* > n^{1/2} d \}.$      (4.1)

Since $G_n \to G_F$, for all n sufficiently large one has that $[1 - \alpha - G_n(n^{1/2} d)] < -\delta$ for some $\delta > 0$. Thus for large n

    $P\{ c_n^* > n^{1/2} d \} = P\{ \hat{G}_{n,F_n}(n^{1/2} d) < 1-\alpha \} \le P\{ \sup_{x \in \mathbb{R}} |\hat{G}_{n,F_n}(x) - G_{n,F_n}(x)| > \delta/2 \} + P\{ |G_{n,F_n}(n^{1/2} d) - G_n(n^{1/2} d)| > \delta/2 \}.$

By a standard probability inequality for empirical distribution functions

    $P\{ \sup_{x \in \mathbb{R}} |\hat{G}_{n,F_n}(x) - G_{n,F_n}(x)| > \delta/2 \} \le C e^{-2 M_n (\delta/2)^2},$   (4.2)

where C is a constant not depending on $F_n$ or F. (See Serfling (1980), Section 2.1.3.) Hence by (4.1), A3, and A4,

    $\sum_{n \ge n_0} P\{ N_d > n \} \le \sum_{n \ge n_0} C e^{-2 M_n (\delta/2)^2} + \sum_{n \ge n_0} P\{ |G_n(n^{1/2} d) - G_{n,F_n}(n^{1/2} d)| > \delta/2 \} < \infty,$

which is equivalent to $E N_d < \infty$. □
Lemma 4.2: Assume A3 and A4 hold. Then $\lim_{d \downarrow 0} n_d^{-1} E N_d = 1$.

Proof: To show that (3.4) holds, we derive a bound for $P\{ N_d > n \}$ which is independent of d for $n \ge n_d(1+\epsilon)$. For any $\delta > 0$ and all d sufficiently small, using the fact that $n \ge c^2(1+\epsilon)/d^2$,

    $P\{ c_n^* > n^{1/2} d \} = P\{ \hat{G}_{n,F_n}(n^{1/2} d) < 1-\alpha \} \le P\{ \hat{G}_{n,F_n}((1+\epsilon)^{1/2} c) < 1-\alpha \}.$

By A3, $G_n \to G_F$ and one can choose $\delta, \delta' > 0$ such that $(1-\alpha+\delta) - G_n((1+\epsilon)^{1/2} c) < -\delta'$ for all large n. For this $\delta$ and sufficiently small d, by A4,

    $\sum_{n \ge n_d(1+\epsilon)} P\{ 1-\alpha+\delta > G_{n,F_n}((1+\epsilon)^{1/2} c) \} \le \sum_{n \ge n_d(1+\epsilon)} P\{ |G_{n,F_n}((1+\epsilon)^{1/2} c) - G_n((1+\epsilon)^{1/2} c)| > \delta' \} < \infty,$

and using (4.2),

    $\sum_{n \ge n_d(1+\epsilon)} P\{ N_d > n \} \le \sum_{n \ge n_d(1+\epsilon)} P\{ c_n^* > n^{1/2} d \} < \infty.$

Thus (3.4) holds and the result follows. □
For some statistical functions $R_F$ and corresponding estimators $R_n$, if $F_n$ is in some sense close to F then $G_{n,F_n}$ may be close to $G_n$. This suggests the following condition which is sufficient for A4.

A5: [Hadamard-Lipschitz condition of order a.] There exists a real-valued function $K(x)$ such that for all n and $x \in \mathbb{R}$,

    $|G_{n,F_n}(x) - G_n(x)| \le K(x) \| F_n - F \|^a$ for some $a > 0$,

where $\| f \|$ denotes the supremum norm of the function f.

The inequality applied in (4.2) to $\hat{G}_{n,F_n}$ and $G_{n,F_n}$ can now be applied to $F_n$ and F to show that A5 implies A4:

    $\sum_{n \ge n_0} P\{ |G_{n,F_n}(x) - G_n(x)| > \delta \} \le \sum_{n \ge n_0} P\{ \| F_n - F \| > [\delta/K(x)]^{1/a} \} \le \sum_{n \ge n_0} C e^{-2n[\delta/K(x)]^{2/a}} < \infty.$

While A5 seems a more natural condition than A4, in many situations neither of these assumptions will be verifiable. However, by modifying the procedure so that one unbiasedly estimates an unconditional distribution function analogous to $G_n$, the desired properties for the stopping time will automatically follow.
In introducing the modified bootstrap procedure some notation is redefined. Now let $\{ m_n \}$ and $\{ M_n \}$ be sequences of positive integers such that $m_n \to \infty$ and $M_n \to \infty$ as $n \to \infty$ and $M_n m_n \le n$ for all n. Divide the sample into $M_n$ groups of $m_n$ observations each; denote the observations in the i-th group by $X_{i1}, \ldots, X_{i m_n}$, $i = 1, \ldots, M_n$. One bootstrapped process will be generated from each of the $M_n$ groups of observations; for convenience the bootstrap sample size will be taken to be $m_n$. For $i = 1, \ldots, M_n$ let $X_{i1}^*, \ldots, X_{i m_n}^*$ now be a random sample drawn from the empirical distribution $F_{ni}(x) = m_n^{-1} \sum_{j=1}^{m_n} \chi( X_{ij} \le x )$, $-\infty < x < \infty$. For $t \in I$ define $R_{m_n i}(t) = R_{m_n}(t; X_{i1}, \ldots, X_{i m_n})$ and then redefine $R_{m_n i}^*(t) = R_{m_n}(t; X_{i1}^*, \ldots, X_{i m_n}^*)$. Then the modified processes

    $r_{m_n i}^*(\cdot) = m_n^{1/2}\{ R_{m_n i}^*(\cdot) - R_{m_n i}(\cdot) \}$

on I, $i = 1, \ldots, M_n$, are unconditionally independent and identically distributed. Now redefine $G_n$ by

    $G_n(x) = P\{ \sup_{t \in I} |r_{m_n 1}^*(t)| \le x \}, \quad -\infty < x < \infty.$
$G_n$ can now be estimated in an unbiased fashion by

    $\hat{G}_n(x) = M_n^{-1} \sum_{i=1}^{M_n} \chi( \sup_{t \in I} |r_{m_n i}^*(t)| \le x ), \quad -\infty < x < \infty.$

A3 must be changed slightly.

A6: With previously defined notation, $r_n \xrightarrow{w} Y_F$ and $r_{m_n 1}^* \xrightarrow{w} Y_F$ on I as $n, m_n \to \infty$, where $P\{ \sup_{t \in I} |Y_F(t)| < \infty \} = 1$. $G_F(c) = 1-\alpha$ and $G_F$ is continuous and strictly increasing in a neighborhood of c. $P\{ \hat{G}_n(0) = 0$ for all $n \ge n_0 \} = 1$. The sequence $\{ M_n \}$ is such that $M_n^{-1} \log n = o(1)$.
Let $c_n^* = \inf\{ x : G_n(x) \ge 1-\alpha \}$, for which a natural estimate is $\hat{c}_n^* = \inf\{ x : \hat{G}_n(x) \ge 1-\alpha \}$. By a standard probability inequality for deviations of sample quantiles, for any $\epsilon > 0$

    $P\{ |\hat{c}_n^* - c_n^*| > \epsilon \} \le 2 e^{-2 M_n \delta_n^2},$

where $\delta_n = \min\{ G_n(c_n^* + \epsilon) - (1-\alpha),\ (1-\alpha) - G_n(c_n^* - \epsilon) \}$. (See, for example, Serfling (1980), p. 75.) Using the fact that $G_n \to G_F$ and $G_F$ is continuous and strictly increasing in a neighborhood of c, it follows that $\liminf_{n \to \infty} \delta_n > 0$. If $M_n^{-1} \log n = o(1)$, then one has that $|\hat{c}_n^* - c_n^*| \to 0$ completely as $n \to \infty$ in the sense that for any $\epsilon > 0$, $\sum_{n=n_0}^{\infty} P\{ |\hat{c}_n^* - c_n^*| > \epsilon \} < \infty$; consequently $\hat{c}_n^* \to c$ completely and a.s. as $n \to \infty$.

Now let the stopping time $N_d$ be defined with $c_n = \hat{c}_n^*$. When A6 holds, one then has the properties indicated in Lemmas 4.1 and 4.2. This can be seen by replacing $\hat{G}_{n,F_n}$ by $\hat{G}_n$ and $G_{n,F_n}$ by the new $G_n$ in the proofs of Lemmas 4.1 and 4.2 and then making the appropriate simplifications.
The modified bootstrap algorithm results in essentially a grouped sequential procedure when used in conjunction with the stopping time (1.5). Given the sequences $\{ M_n \}$ and $\{ m_n \}$, the effective sample sizes are defined by the sequence $\{ M_n m_n \}$. Even if the sample were increased a single observation at a time, one usually would not want to repeat the bootstrap procedure until either the number of subsamples $M_n$, the subsample size $m_n$, or both had been increased. Thus the stopping time would only take values in the sequence $\{ M_n m_n \}$. This seems a disadvantageous feature.

Another point bears on the selection of $M_n$ and $m_n$ subject to the constraint $M_n m_n \le n$. The number of subsamples $M_n$ need only grow at a rate slightly faster than $\log n$ for complete convergence of $\hat{c}_n^*$ to $c_n^*$, so it seems prudent to allow the subsample size $m_n$ to increase relatively quickly to speed convergence of $c_n^*$ to c. One might let $M_n$ be of order $n^{\lambda}$ for some $\lambda$, $0 < \lambda < 1$, where typically $\lambda$ would not be greater than 1/2 and could be decreased as n becomes larger.
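The grouped procedure above can be sketched as follows (again our illustration, with $R_F$ taken to be F): the sample is split into $M_n$ groups of $m_n$ observations, one bootstrap process is drawn from each group, and the $M_n$ sup statistics are unconditionally i.i.d., so that $\hat{G}_n$ is an unbiased estimate of $G_n$ and its quantile estimates $c_n^*$.

```python
import bisect
import math
import random

def group_sup_stat(group, rng):
    """Sup statistic of r*_{m,i} = m^{1/2}(F*_{m,i} - F_{m,i}) for one group,
    evaluated at the group's order statistics (the jump points)."""
    m = len(group)
    xs = sorted(group)
    star = sorted(rng.choices(group, k=m))
    sup = 0.0
    for j, x in enumerate(xs):
        sup = max(sup, abs(bisect.bisect_right(star, x) / m - (j + 1) / m))
    return math.sqrt(m) * sup

def grouped_bootstrap_quantile(data, M, alpha=0.05, seed=0):
    """hat c*_n: the empirical (1-alpha)th quantile of the M unconditionally
    i.i.d. group sup statistics (groups of size m = n // M)."""
    rng = random.Random(seed)
    m = len(data) // M
    sups = sorted(group_sup_stat(data[i * m:(i + 1) * m], rng)
                  for i in range(M))
    return sups[math.ceil((1.0 - alpha) * M) - 1]
```

Unlike the ordinary bootstrap of the previous subsection, each of the M sup statistics here is based on only $m_n$ observations, which is the price paid for the unconditional i.i.d. structure.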
5. VARIABLE WIDTH CONFIDENCE BANDS
In some applications the need for a confidence band of variable width arises quite
naturally.
If the variance of rn(t) is large for some values of t and comparatively small for
other values, then a confidence band of constant width such as (1.1) or (1.6) will often be
unnecessarily wide for some values of t.
This suggests scaling the confidence band width at a
point t by an estimate of the standard deviation of the process at that point.
(It should be
noted that this may actually increase the width of the confidence band at some points.)
In
other applications a modified process of the form {rn(t)/q(t), tEl} is known to converge
weakly, where q is a known nonnegative weight function.
These two situations are treated
jointly here, since both lead to essentially the same modifications of the stopping time (1.5)
and the confidence band (1.6).
Let $\{ q_n \}$ be a sequence of possibly random nonnegative functions on I and q be a possibly unknown nonnegative function on I. Next suppose $J \subseteq I$ is a finite union of intervals and define the processes $\tilde{r}_n = \{ r_n(t)/q_n(t),\ t \in J \}$, $n = 1, 2, \ldots$. Assume that $\tilde{r}_n \xrightarrow{w} \tilde{Y}_F$, where $\tilde{Y}_F$ is a zero-mean Gaussian process on J such that $P\{ \sup_{t \in J} |\tilde{Y}_F(t)| < \infty \} = 1$. In the case where $r_n \xrightarrow{w} Y_F$ on I and $q(t) = [\mathrm{Var}\{ Y_F(t) \}]^{1/2}$, $J \subseteq I$ is necessarily such that q is bounded away from zero and infinity on J; if additionally $q_n$ is a uniformly consistent estimator of q on J, then $\tilde{Y}_F = \{ Y_F(t)/q(t),\ t \in J \}$. Alternatively, one might have that $q_n = q$ for all n, where q is a known weight function that provides an appropriate normalization of $r_n$.
Set

    $\tilde{G}_F(x) = P\{ \sup_{t \in J} |\tilde{Y}_F(t)| \le x \}, \quad -\infty < x < \infty,$

and suppose that $\tilde{c}$ is such that $\tilde{G}_F(\tilde{c}) = 1-\alpha$ and $\tilde{G}_F$ is continuous at $\tilde{c}$. Given a weakly consistent estimator $\tilde{c}_n$ of $\tilde{c}$, a reasonable large sample confidence band for $R_F$ on J is then given by

    $\{ R_n(t) \pm \tilde{c}_n n^{-1/2} q_n(t),\ t \in J \}.$                  (5.1)

With the preceding assumptions,

    $P\{ R_n(t) - \tilde{c}_n n^{-1/2} q_n(t) \le R_F(t) \le R_n(t) + \tilde{c}_n n^{-1/2} q_n(t),\ t \in J \} = P\{ \sup_{t \in J} |\tilde{r}_n(t)| \le \tilde{c}_n \} \to 1-\alpha.$   (5.2)
Although the goal is no longer a fixed-width confidence band, by scaling the function $q_n$ by a constant $d > 0$ a stopping time can be introduced; the constant d is then let go to zero in order to study the asymptotic coverage probability. Assume $\tilde{c}_n \to \tilde{c}$ a.s. and define the stopping time

    $\tilde{N}_d = \inf\{ n \ge n_0 : n^{-1/2} \tilde{c}_n \le d \},$

where $n_0$ is again an initial sample size. Let

    $\{ R_{\tilde{N}_d}(t) \pm d\, q_{\tilde{N}_d}(t),\ t \in J \}$            (5.3)

be the associated confidence band.

It follows from the strong consistency of $\tilde{c}_n$ that if $P\{ \tilde{c}_n \le 0 \} = 0$ for all $n \ge n_0$, then $\tilde{c}_{\tilde{N}_d} \to \tilde{c}$ a.s. as $d \downarrow 0$. By a relation like (1.7), one then has that $[\tilde{N}_d]^{1/2} d \to \tilde{c}$ as $d \downarrow 0$. Thus, with the additional condition that $\tilde{r}_{\tilde{N}_d} \xrightarrow{w} \tilde{Y}_F$ as $d \downarrow 0$,

    $\lim_{d \downarrow 0} P\{ |R_{\tilde{N}_d}(t) - R_F(t)| \le d\, q_{\tilde{N}_d}(t),\ t \in J \} = \lim_{d \downarrow 0} P\{ \sup_{t \in J} |\tilde{r}_{\tilde{N}_d}(t)| \le [\tilde{N}_d]^{1/2} d \} = 1-\alpha$   (5.4)

by (5.2). Thus one can define a meaningful stopping time for the confidence band (5.1).

As in Corollary 2.1, the additional condition that for each $t \in J$, $\{ \tilde{r}_n(t),\ n \ge 1 \}$ is a u.c.i.p. sequence is then sufficient for (5.4). Letting $\tilde{n}_d = \inf\{ n \ge n_0 : n^{-1/2} \tilde{c} \le d \}$, as for the stopping time $N_d$ one has that $\tilde{N}_d$ is finite a.s. for fixed $d > 0$ and $\tilde{N}_d/\tilde{n}_d \to 1$ a.s. as $d \downarrow 0$. Other properties of the stopping time such as asymptotic efficiency and normality follow under conditions on $\tilde{N}_d$ and $\tilde{c}_n$ identical to those considered for $N_d$ and $c_n$ in Lemmas 3.1 and 3.2.
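As an illustration of the band (5.1) (ours, not the paper's): for the empirical distribution function the pointwise variance of the limit of the empirical process is $F(t)(1-F(t))$, which suggests $q_n(t) = [F_n(t)(1-F_n(t))]^{1/2}$ on an interval J where F is bounded away from 0 and 1. The quantile estimate $\tilde{c}_n$ is taken as a given input here; its construction is not shown.

```python
import bisect
import math

def variable_width_band(sample, c_tilde, J, grid=201):
    """Band (5.1) with q_n(t) = [F_n(t)(1 - F_n(t))]^{1/2} on J = (a, b):
    returns (t, lower, upper) triples on an equally spaced grid.
    c_tilde plays the role of a consistent estimate of the (1-alpha)th
    quantile of sup_{t in J} |Y_F(t)/q(t)|."""
    xs = sorted(sample)
    n = len(xs)
    a, b = J
    band = []
    for i in range(grid):
        t = a + (b - a) * i / (grid - 1)
        f = bisect.bisect_right(xs, t) / n
        q = math.sqrt(f * (1.0 - f))
        half = c_tilde * q / math.sqrt(n)
        band.append((t, max(0.0, f - half), min(1.0, f + half)))
    return band
```

The band is widest where $F_n$ is near 1/2 and narrows toward the edges of J, in contrast to the constant-width band (1.6).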
6. EXAMPLE: EMPIRICAL PROCESSES
Let $X, X_1, \ldots, X_n$ be i.i.d. random vectors having distribution function F defined on $\mathbb{R}^p$ for some $p \ge 1$. Define the empirical distribution function by

    $F_n(x) = n^{-1} \sum_{i=1}^n c(x - X_i), \quad x \in \mathbb{R}^p,$

where $c(u)$ is equal to 1 if all p coordinates of u are nonnegative and is otherwise equal to 0. Letting $V_n(x) = n^{1/2}[ F_n(x) - F(x) ]$, the usual empirical process is then defined by $V_n = \{ V_n(x),\ x \in \mathbb{R}^p \}$. For the remainder of this section, let

    $G_F(t) = P\{ \sup_{x \in \mathbb{R}^p} |V(x)| \le t \}, \quad -\infty < t < \infty,$

whenever $V_n$ converges weakly to a zero-mean Gaussian process V on $\mathbb{R}^p$. In addition, make the assumption that $P\{ \sup_{x \in \mathbb{R}^p} |V(x)| < \infty \} = 1$, c is such that $G_F(c) = 1-\alpha$, and $G_F$ is continuous and strictly increasing in a neighborhood of c.
As is well known, if $p = 1$ then $V_n \xrightarrow{w} B \circ F$ as $n \to \infty$, where B is the Brownian bridge on [0,1] and $B \circ F(x) = B(F(x))$, $-\infty < x < \infty$. Thus when F is continuous $G_F$ is the distribution of the supremum of the absolute Brownian bridge, which is known, while for discontinuous F

    $P\{ \sup_{x \in \mathbb{R}} |B \circ F(x)| \le t \} \ge P\{ \sup_{0 \le u \le 1} |B(u)| \le t \}$ for all $t \in \mathbb{R}$,   (6.1)

so that one can give an upper bound for any quantile of $G_F$. (See Kolmogorov (1941) and Noether (1963).) Bickel and Freedman (1981) have shown that bootstrapping the univariate empirical process $V_n$ yields a strongly consistent estimate of c, the $(1-\alpha)$th quantile of $G_F$. Letting $c_n$ be the bootstrap estimate and $r_n(\cdot) = V_n(\cdot)$ on $\mathbb{R}$, define the stopping time $N_d$ by (1.5) and the associated confidence band by (1.6).

For each $x \in \mathbb{R}$ the statistic $V_n(x)$ is a normalized sum of independent random variables whose distribution has mean zero and finite variance; thus by Example 1.8 of Woodroofe (1982) the sequence $\{ V_n(x),\ n \ge 1 \}$ is u.c.i.p. for each $x \in \mathbb{R}$, and by Corollary 2.1 the stopping time (1.5) and confidence band (1.6) provide the correct asymptotic coverage probability as $d \downarrow 0$. If one instead implemented the bootstrap method of Section 4, by the arguments presented there the stopping time would have finite expectation and (3.1) would hold. Finally, modifications of the sort considered in Section 5 resulting in a variable width confidence band could be made. Thus while in the continuous case this sequential methodology is not necessary to calculate a useful confidence band for F, if F has a discrete component one can improve on the potentially conservative confidence band suggested by (6.1).
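To illustrate the last point (our sketch, not from the paper): for 0/1 data the empirical distribution function has a single jump at 0, so the sup statistic reduces to the deviation of the resample proportion, and a bootstrap quantile estimate falls well below the conservative Kolmogorov bound of roughly 1.358 implied by (6.1).

```python
import math
import random

def sup_stat_bernoulli(data, rng):
    """For 0/1 data F_n jumps only at 0, so
    sup_x n^{1/2} |F*_n(x) - F_n(x)| = n^{1/2} |p* - p_hat|,
    where p_hat and p* are the sample and resample proportions of ones."""
    n = len(data)
    p_hat = sum(data) / n
    p_star = sum(rng.choices(data, k=n)) / n
    return math.sqrt(n) * abs(p_star - p_hat)

def bootstrap_c_bernoulli(data, alpha=0.05, M=2000, seed=0):
    """Bootstrap estimate of c for a Bernoulli F (Section 4 procedure)."""
    rng = random.Random(seed)
    sups = sorted(sup_stat_bernoulli(data, rng) for _ in range(M))
    return sups[math.ceil((1.0 - alpha) * M) - 1]
```

For success probability near 0.3 the estimate is close to $1.96 \cdot [p(1-p)]^{1/2} \approx 0.9$, so the fixed-width band stops far earlier ($N_d \approx c^2/d^2$) than the conservative choice $c \approx 1.358$ would dictate.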
Now let $p \ge 2$. In this case $V_n \xrightarrow{w} V$, where V is a tied-down zero-mean Gaussian process on $\mathbb{R}^p$. The corresponding $G_F$ is usually unknown unless the p coordinates of X are independent. Again suppose c is the $(1-\alpha)$th quantile of $G_F$ and use the strongly consistent bootstrap estimator discussed by Beran (1984) to define the stopping time (1.5). Upon stopping, use the confidence region

    $\{ F_{N_d}(x) \pm d,\ x \in \mathbb{R}^p \}$

for the multivariate distribution F. It then follows that if $V_{N_d} \xrightarrow{w} V$ as $d \downarrow 0$, then

    $P\{ |F_{N_d}(x) - F(x)| \le d,\ x \in \mathbb{R}^p \} = P\{ \sup_{x \in \mathbb{R}^p} |V_{N_d}(x)| \le N_d^{1/2} d \} \to 1-\alpha$   (6.2)

as $d \downarrow 0$ by the assumptions and the inequality (1.7). As in the univariate case, for each $x \in \mathbb{R}^p$ the sequence $\{ V_n(x),\ n \ge 1 \}$ is u.c.i.p.; by an argument similar to the proof of Corollary 2.1 it follows that $V_{N_d} \xrightarrow{w} V$ as $d \downarrow 0$, so that (6.2) holds.
In this paper we have defined stopping times for several types of confidence band
procedures and considered various properties of these stopping times.
That these sequential
procedures achieve the correct coverage probability asymptotically was seen in Sections 2 and
5. The multivariate empirical process example suggests how these sequential methods can be
readily extended to confidence regions in p-dimensional Euclidean space, allowing a more
general treatment than considered here.
BIBLIOGRAPHY

Anscombe, F. (1952). Large sample theory of sequential estimation. Proc. Cambridge Philos. Soc. 48 600-607.

Beran, R. (1984). Bootstrap methods in statistics. Jber. Deutsch. Math.-Verein. 86 14-30.

Bickel, P. J. and Freedman, D. A. (1981). Some asymptotic theory for the bootstrap. Ann. Statist. 9 1196-1217.

Chow, Y. S. and Robbins, H. (1965). On the asymptotic theory of fixed-width sequential confidence intervals for the mean. Ann. Math. Statist. 36 457-462.

Csörgő, S. and Mason, D. M. (1989). Bootstrapping empirical functions. Ann. Statist. 17, in press.

Kolmogorov, A. N. (1941). Confidence limits for an unknown distribution function. Ann. Math. Statist. 12 461-463.

Noether, G. E. (1963). Note on the Kolmogorov statistic in the discrete case. Metrika 7 115-116.

Sen, P. K. (1981). Sequential Nonparametrics: Invariance Principles and Statistical Inference. John Wiley and Sons, New York.

Serfling, R. J. (1980). Approximation Theorems of Mathematical Statistics. John Wiley and Sons, New York.

Tsirel'son, J. L. (1975). The density of the distribution of the maximum of a Gaussian process. Theory Probab. Appl. 20 847-856.

Woodroofe, M. (1982). Nonlinear Renewal Theory in Sequential Analysis. SIAM, Philadelphia.