Carlstein, Edward G. "The Use of Subseries Values for Estimating the Variance of a General Statistic from a Stationary Sequence."

SUMMARY

THE USE OF SUBSERIES VALUES FOR ESTIMATING THE VARIANCE OF A GENERAL STATISTIC FROM A STATIONARY SEQUENCE

Edward G. Carlstein
University of North Carolina, Chapel Hill
Let $\{Z_i : -\infty < i < +\infty\}$ be a strictly stationary $\alpha$-mixing sequence. Let $t_n = t_n(Z_1, \ldots, Z_n)$ be a statistic computed from the data $\vec Z_n = (Z_1, \ldots, Z_n)$. Without specifying the dependence model giving rise to the sequence $\{Z_i\}$, and without specifying the marginal distribution of $Z_i$, we address the question of variance estimation for $t_n$. For estimating the variance of $t_n$ from just the available data $\vec Z_n$, we propose computing subseries values $t_m(Z_{i+1}, Z_{i+2}, \ldots, Z_{i+m})$, $0 \le i < i+m \le n$. These subseries values are used as replicates in order to model the sampling variability of the statistic $t$. In particular we use adjacent non-overlapping subseries of length $m_n$, with $m_n \to \infty$ and $m_n/n \to 0$. Our basic variance estimator is just the usual sample variance computed amongst these subseries values (after appropriate standardization). This estimator is shown to be consistent under mild integrability conditions. A simulation study is conducted, leading to the introduction of overlapping subseries and improved performance of the variance estimator.
Running Heading: Using Subseries Values to Estimate Variance.

AMS 1980 subject classifications. Primary 62G05; Secondary 60G10.

Key Words and Phrases. Variance estimation, subseries values, general statistic, dependence, $\alpha$-mixing, stationary sequence.

Supported by NSF Grant MCS-8102725.
THE USE OF SUBSERIES VALUES FOR ESTIMATING THE VARIANCE OF A GENERAL STATISTIC FROM A STATIONARY SEQUENCE

by

Edward G. Carlstein
1. Introduction.

Consider a strictly stationary sequence $\{Z_i : -\infty < i < +\infty\}$ from which we observe $\vec Z_n = (Z_1, Z_2, \ldots, Z_n)$, $n \ge 1$. A statistic $t_n = t_n(\vec Z_n)$ is computed from the observed series. In the absence of assumptions about the underlying dependence model in the sequence (e.g. autoregression), and in the absence of specific distributional assumptions about the $Z_i$'s (e.g. joint normality), we would like to be able to estimate the variance of $t_n$ from the available data $\vec Z_n$.
Most variance estimation techniques for general statistics have been aimed at iid samples, making heavy use of exchangeability in their schemes for generating replicates of $t$. This is true of the theory and intuition behind Tukey's (1958) "jackknife," Efron's (1979) "bootstrap," and Hartigan's (1969) "typical values." Recently, Freedman (1984) and Freedman and Peters (1984) have considered applying the bootstrap to a linear model with an autoregressive component, but this still assumes additive iid perturbations.
We propose computing the statistic $t$ on subseries $(Z_{i+1}, \ldots, Z_{i+m})$, $0 \le i < i+m \le n$, within the sample $\vec Z_n$, as a way of obtaining replicates of $t$ without disturbing the natural ordering in the data. Our basic variance estimator uses adjacent non-overlapping subseries of length $m_n$, s.t. $m_n \to \infty$ and $m_n/n \to 0$ as $n \to \infty$.
Section 3 gives a detailed comparison of the motivating factors behind our variance estimator and those behind the standard variance estimators for iid data. In Section 4 we establish conditions under which our estimator will be consistent in the $L_2$ sense. Parallel theory is developed in Section 5 involving only $\mathbb{P}$-consistency of the variance estimator. These consistency results are combined with the asymptotic normality results of Carlstein (1984) to obtain asymptotic normality for general statistics from $\alpha$-mixing sequences, with the limiting distribution being free of the nuisance parameter $\sigma^2$. Then simulation studies are conducted in order to investigate the finite-sample performance of the variance estimator. The results of these studies (Section 6) give insight regarding the choice of subseries length ($m_n$); they also suggest a way to use longer overlapping subseries.
2. Notation and Definitions.

Let $\{Z_i(\omega) : -\infty < i < +\infty\}$ be a strictly stationary sequence of real-valued random variables (r.v.) defined on probability space $(\Omega, F, \mathbb{P})$. Let $F_p^{+\infty}$ ($F_{-\infty}^q$ respectively) be the $\sigma$-field generated by $\{Z_p(\omega), Z_{p+1}(\omega), \ldots\}$ ($\{\ldots, Z_{q-1}(\omega), Z_q(\omega)\}$ respectively). For $N \ge 1$ denote:

$$\alpha(N) = \sup\{|\mathbb{P}\{A \cap B\} - \mathbb{P}\{A\}\mathbb{P}\{B\}| : A \in F_{-\infty}^0,\ B \in F_N^{+\infty}\},$$

and define $\alpha$-mixing to mean $\lim_{N\to\infty} \alpha(N) = 0$. This is a standard mixing condition which guarantees approximate independence between observations that are separated by a great distance (in time) (see Rosenblatt (1956)). $\alpha$-mixing is known to be satisfied by normal, double-exponential, and Cauchy AR(1) sequences (Gastwirth and Rubin (1975)), as well as by Markov sequences with finite state space (Billingsley (1968), p. 167). In fact, Gastwirth and Rubin (1975) bound the mixing coefficient $\alpha(N)$ by $C|\rho|^N$ for the normal and double-exponential AR(1) sequences, and by $C N |\rho|^N$ for the Cauchy AR(1) sequence (where $-1 < \rho < 1$ is the AR parameter).
Let $t_n(z_1, z_2, \ldots, z_n)$ be a function from $\mathbb{R}^n \to \mathbb{R}^1$, defined for each $n \ge 1$ so that $t_n(Z_1(\omega), Z_2(\omega), \ldots, Z_n(\omega))$ is $F$-measurable. We will suppress the argument $\omega$ of $Z_\cdot(\cdot)$ from here on. Denote $\vec Z_n^i = (Z_{i+1}, Z_{i+2}, \ldots, Z_{i+n})$ and $t_n^i = t_n(\vec Z_n^i)$; as a particular case: $\bar Z_n^i = \sum_{j=1}^n Z_{i+j}/n$. For $A > 0$ denote the truncation $X^A := X\,\mathbb{I}\{|X| \le A\}$ and the tail $X_A := X - X^A$.

Definition: Random variables $\{X_n\}$ will be said to be eventually uniformly integrable (e.u.i.) iff $\exists\, n_0$ s.t. $\lim_{A\to\infty}\, \sup_{n \ge n_0} \mathbb{E}\{|(X_n)_A|\} = 0$.

At times it will be convenient to use the equivalence: $\{X_n\}$ are e.u.i. iff $\lim_{A\to\infty} \overline{\lim}_{n\to\infty} \mathbb{E}\{|(X_n)_A|\} = 0$.
3. The Variance Estimator.

It would be useful to have a procedure for estimating the variance $\sigma^2$ of a general statistic $t_n^0$ using only the available data $\vec Z_n^0$. In the same spirit as Carlstein (1984), we wish to avoid making assumptions about the specific marginal distribution $F$ of $Z_0$ and about the dependence model $M$ in $\{Z_i\}$. Moreover, calculation of the theoretical variance of $t_n^0$ in terms of the parameters of $F$ and $M$ (even if they were specified) may be intractable. Hence our objective is a non-parametric variance estimator for general statistics from stationary $\alpha$-mixing sequences.
In the special case where $\{Z_i\}$ is iid, non-parametric error estimation has been addressed within the broader context of subsampling and resampling. Hartigan's (1969) "typical values" can be used to obtain confidence intervals in a very general setting, without explicitly estimating $\sigma^2$. The "bootstrap" approach (see for example Efron (1982)) may be applied for estimating virtually any characteristic of the distribution of $t_n^0$, including its variance. These techniques are based on the idea that by computing the statistic $t$ on subsamples of the data, we can gain insight about the sampling distribution of $t_n^0$. This is the intuition behind the bootstrap: The empirical c.d.f. $F_n$ from $\vec Z_n^0$ is "close" to the true c.d.f. $F$ of $Z_0$, since $\vec Z_n^0$ is a random sample from $F$. We would like to observe many replications of $t_n^0$, each based on a new sample $\vec Z_n^0$ from $F$; we are, however, stuck with but one sample $\vec Z_n^0$ from $F$. So instead of drawing many samples from $F$, we draw many "bootstrap" samples $\vec Z_n^{0*}$ from $F_n$ (with replacement). Since $F_n$ is close to $F$, the corresponding replications $t_n^{0*} = t_n(\vec Z_n^{0*})$ have an empirical distribution that, as the number of such replications becomes large, satisfactorily approximates the true sampling distribution of $t_n^0$.
When non-trivial dependence is present in $\{Z_i\}$, there is in principle nothing wrong with using the empirical c.d.f. $F_n$ to estimate the common marginal c.d.f. $F$. For example, ergodic theorems may be used to show that $F_n(t)$ converges to $F(t)$ with probability 1; and Gastwirth and Rubin (1975) have demonstrated that $(F_n(t) - F(t))\,n^{1/2}$ converges to a Gaussian process when $\{Z_i\}$ is mixing with $\alpha(k) = O(k^{-5/2})$. The problem is that the sampling distribution of $t_n^0$ depends not only on the marginal distribution $F$, but also on the dependence in $\vec Z_n^0$ (i.e. the joint distribution). Replications of $t$ computed on bootstrap samples from $F_n$ (or even from $F$) will not accurately reflect this dependence. The resultant "empirical distribution" of $t_n^{0*}$ will be of little value in approximating the true sampling distribution of $t_n^0$. In fact, the dependence structure in $\{Z_i\}$ is preserved only by those subsamples of the form $\vec Z_k^j = (Z_{j+1}, \ldots, Z_{j+k})$, $0 \le j \le n-k$, $n \ge k \ge 1$.
There are several competing considerations in designing a variance estimator based on $\{t_k^j\}$. It is clear that the performance of such an estimator will depend upon how many representative subseries values $t_k^j$ are used, how different the $t_k^j$'s are from each other, and how accurately the $t_k^j$'s model the behavior of $t_n^0$. For a particular value of $k$, one would not expect $t_k^j$ and $t_k^{j+1}$ to differ by much, especially in light of the dependence between $\vec Z_k^j$ and $\vec Z_k^{j+1}$. Hence the collection of subseries values $\{t_k^j : 0 \le j \le n-k\}$ contains a great deal of redundancy that may not contribute information about $t_n^0$'s sampling variability. The collection $\{t_k^{jk} : 0 \le j \le [n/k]-1\}$, on the other hand, contains only non-overlapping subseries values. If $k$ is growing, each $t_k^{jk}$ will eventually behave as if it were independent of all but two of the other $t_k^{jk}$'s. Furthermore, if $k$ remained fixed, a subseries value $t_k^j$ would never be able to reflect the dependencies of lag $k+1$ or greater. These arguments suggest the use of $\{t_{k_n}^{jk_n} : 0 \le j \le [n/k_n]-1\}$, with $k_n \to \infty$ as $n \to \infty$.
Within this framework it seems reasonable to consider $k_n = [\beta n]$ $(0 < \beta < 1)$, since the corresponding $t_{k_n}^{jk_n}$'s are based on subseries of the same order of magnitude as $t_n^0$ itself. Unfortunately, only about $1/\beta$ disjoint $t_{k_n}^{jk_n}$'s of this form will ever be available as representative subseries values. So an estimator based on such $t_{k_n}^{jk_n}$'s will never stabilize and home in on $\sigma^2$, even as $n \to \infty$. (Ironically, the bootstrap and typical-value methods use randomly selected subsets of the possible subsamples, since it is computationally impractical to use all the subsamples available.)
In light of these factors we propose the use of the subseries values $\{t_{m_n}^{jm_n} : 0 \le j \le [n/m_n]-1\}$, where $\{m_n : n \ge 1\}$ are positive integers s.t. $m_n \to \infty$ and $m_n/n \to 0$ as $n \to \infty$. Thus we obtain an increasing number of subseries values ($n/m_n \to \infty$), each of which is based on an ever-growing subseries ($m_n \to \infty$), and each $t_{m_n}^{jm_n}$ is becoming increasingly distant ($m_n$) from all but two of the other $t_{m_n}^{jm_n}$'s.
From this point on we will assume the following set-up: $s_n^i := s_n(\vec Z_n^i)$ is a statistic that is wholly computable from the data $\vec Z_n^i$, and does not involve any unknown parameters. $t_n^i := (s_n^i - \mathbb{E}\{s_n^i\})\,n^{1/2}$ is the correct theoretical standardization for $s_n^i$, in the sense that $\sigma^2 := \lim_{n\to\infty} V\{t_n^0\} \in (0, \infty)$. The proposed estimator for $\sigma^2$ is simply:

$$\hat\sigma_n^2 = \sum_{i=0}^{[n/m_n]-1}\left(s_{m_n}^{im_n} - \bar s_{m_n}\right)^2 m_n \Big/ [n/m_n], \qquad \text{where } \bar s_{m_n} = \sum_{i=0}^{[n/m_n]-1} s_{m_n}^{im_n} \Big/ [n/m_n].$$

This is nothing more than the usual sample variance amongst the standardized subseries values $\{m_n^{1/2}\, s_{m_n}^{jm_n} : 0 \le j \le [n/m_n]-1\}$. In Section 6 we will investigate the choice of $\{m_n\}$, and we will introduce some modifications (involving longer overlapping subseries) which enhance the performance of $\hat\sigma_n^2$.
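To make the definition concrete, here is a minimal sketch in Python/NumPy of the basic non-overlapping estimator $\hat\sigma_n^2$ (the function and variable names are ours, not the paper's; the statistic $s$ is supplied by the user):

```python
import numpy as np

def subseries_var(z, m, stat=np.mean):
    """Basic non-overlapping subseries variance estimator (a sketch).

    z    : the observed series (Z_1, ..., Z_n) as a 1-d array
    m    : subseries length m_n (in theory, m_n -> infinity and m_n/n -> 0)
    stat : the statistic s, applied to each subseries

    Estimates sigma^2 = lim_n n * Var(s_n^0).
    """
    n = len(z)
    q = n // m                                   # number of subseries, [n/m]
    # s_m^{im} for i = 0, ..., q-1: s computed on adjacent blocks of length m
    s = np.array([stat(z[i * m:(i + 1) * m]) for i in range(q)])
    # sample variance of the standardized values sqrt(m) * s_m^{im}
    return m * np.sum((s - s.mean()) ** 2) / q
```

For example, `subseries_var(z, m=int(np.sqrt(len(z))))` estimates the limiting variance of the standardized sample mean; the choice of $m_n$ is taken up in Section 6.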
4. $L_2$-Consistency.

In this section we work out some theory for subseries values. The first main result is a law of large numbers for these entities. This result is used to obtain consistency of $\hat\sigma_n^2$. Finally we arrive at an asymptotic normality result for $t_n^0$ in which the limiting distribution is free of $\sigma^2$.

Let us begin with a useful truncation lemma:
Lemma 1: Let $X$ be $F_q^{+\infty}$-measurable and $Y$ be $F_{-\infty}^p$-measurable, $q > p$. Suppose $\max\{\mathbb{E}\{X^2\}, \mathbb{E}\{Y^2\}\} \le C < \infty$. Then for any $A > 0$:

$$|\mathbb{E}\{XY\} - \mathbb{E}\{X\}\mathbb{E}\{Y\}| \le 4A^2\alpha(q-p) + 6C^{1/2}\left(\max\{\mathbb{E}\{(X_A)^2\}, \mathbb{E}\{(Y_A)^2\}\}\right)^{1/2}.$$

Proof: Using the representation $X = X^A + X_A$ (and similarly for $Y$) we see that:

$$\mathbb{E}\{XY\} - \mathbb{E}\{X\}\mathbb{E}\{Y\} = \mathrm{Cov}\{X^A, Y^A\} + \mathrm{Cov}\{X^A, Y_A\} + \mathrm{Cov}\{X_A, Y^A\} + \mathrm{Cov}\{X_A, Y_A\}. \qquad (1)$$

The first term on the right hand side (r.h.s.) of (1) is bounded above by $4A^2\alpha(q-p)$, by Theorem 17.2.1 of Ibragimov and Linnik (1971). The required bounds on the other terms follow from the Schwarz inequality. $\square$
Applying this lemma we can establish the following law of large numbers for subseries values from an $\alpha$-mixing process.

Theorem 2: Let $\{Z_i\}$ be $\alpha$-mixing and let $f_n^i = f_n(\vec Z_n^i)$ be a statistic. Let $\{m_n : n \ge 1\}$ be s.t. $m_n \to \infty$ and $m_n/n \to 0$. Define $\bar f_{m_n} = \sum_{i=0}^{[n/m_n]-1} f_{m_n}^{im_n} / [n/m_n]$.

If:
$\mathbb{E}\{f_{m_n}^0\} \to \phi \in \mathbb{R}^1$ as $n \to \infty$; (2.a)
and
the $(f_n^0)^2$ are e.u.i.; (2.b)
then:
$\bar f_{m_n} \xrightarrow{L_2} \phi$ as $n \to \infty$. (2.c)
Proof: By (2.a) it suffices to show $\lim_{n\to\infty} V\{\bar f_{m_n}\} = 0$. Now:

$$V\{\bar f_{m_n}\} \le \mathbb{E}\{(f_{m_n}^0)^2\}[n/m_n]^{-1} + 2\sum_{0\le i<j\le [n/m_n]-1} |\mathrm{Cov}\{f_{m_n}^{im_n}, f_{m_n}^{jm_n}\}|\,[n/m_n]^{-2}$$
$$\le 4\,\mathbb{E}\{(f_{m_n}^0)^2\}[n/m_n]^{-1} + 2\sum_{k=2}^{[n/m_n]-1} |\mathrm{Cov}\{f_{m_n}^0, f_{m_n}^{km_n}\}|\,([n/m_n]-k)[n/m_n]^{-2}.$$

The idea here is that the covariance between non-adjacent $f_{m_n}^{im_n}$'s is $\approx 0$, dropping off as the separation ($m_n$) increases. Although there are order $n/m_n$ of these terms, their average becomes negligible as $n \to \infty$.

Formally, we note first that (by (2.b)) the $\mathbb{E}\{(f_n^0)^2\}$ are bounded uniformly in $n \ge n_0$ by $C < \infty$. Assume now that $n$ is sufficiently large so that $m_n \ge n_0$. Then for each $k \in \{2, 3, \ldots, [n/m_n]-1\}$, Lemma 1 applies to $f_{m_n}^0$ and $f_{m_n}^{km_n}$, whose arguments are separated by at least $m_n$ terms. Hence:

$$V\{\bar f_{m_n}\} \le 4[n/m_n]^{-1}C + 4A^2\alpha(m_n) + 6C^{1/2}\left(\mathbb{E}\{((f_{m_n}^0)_A)^2\}\right)^{1/2} \quad \text{for any } A > 0.$$

Now take $\lim_{A\to\infty}\overline{\lim}_{n\to\infty}(\cdot)$ of this last expression. $\square$
Now we are ready to prove the $L_2$-consistency of $\hat\sigma_n^2$. This result follows in part from Theorem 2, since $\hat\sigma_n^2$ is essentially a mean.

Theorem 3: Let $\{Z_i\}$ be $\alpha$-mixing and let $\{m_n\}$ be s.t. $m_n \to \infty$ and $m_n/n \to 0$. Let $s_n^i$, $t_n^i$, $\sigma^2$, $\hat\sigma_n^2$ be as defined in Section 3.

If:
the $(t_n^0)^4$ are e.u.i.; (3.a)
then:
$\hat\sigma_n^2 \xrightarrow{L_2} \sigma^2$ as $n \to \infty$. (3.b)
Proof: Write

$$\hat\sigma_n^2 = \sum_{i=0}^{[n/m_n]-1} (t_{m_n}^{im_n})^2 \big/ [n/m_n] \;-\; (\bar t_{m_n})^2, \qquad \text{where } \bar t_{m_n} = \sum_{i=0}^{[n/m_n]-1} t_{m_n}^{im_n} \big/ [n/m_n].$$

Clearly we only need to show $\sum_{i=0}^{[n/m_n]-1}(t_{m_n}^{im_n})^2/[n/m_n] \xrightarrow{L_2} \sigma^2$ and $(\bar t_{m_n})^2 \xrightarrow{L_2} 0$. The former follows from Theorem 2 (applied to $f_n^i = (t_n^i)^2$, with $\phi = \sigma^2$).

In order to show $\bar t_{m_n} \xrightarrow{L_4} 0$, recall

Lemma 4 (Chung (1974), p. 97): Let $r \in (0, \infty)$, and suppose that $\{X_n\}$ are s.t. $\{|X_n|^r\}$ are uniformly integrable and $X_n \xrightarrow{\mathbb{P}} X$. Then: $X_n \xrightarrow{L_r} X$.

By (3.a), $\mathbb{E}\{(\bar t_{m_n})^4\} < \infty$ $\forall n$ s.t. $m_n \ge n_0$. And applying Theorem 2 (to $f_n^i = t_n^i$, with $\phi = 0$, since $\mathbb{E}\{t_n^0\} \equiv 0$) we have $\bar t_{m_n} \to 0$ in $\mathbb{P}$ and in $L_2$. Hence by Lemma 4 it will suffice to establish that the $(\bar t_{m_n})^4$ are e.u.i. Now $0 \le (\bar t_{m_n})^4 \le \sum_{i=0}^{[n/m_n]-1}(t_{m_n}^{im_n})^4/[n/m_n]$, and the r.h.s. is an average of identically distributed terms; so e.u.i. of the $(t_{m_n}^0)^4$, which holds by (3.a) again when $m_n \ge n_0$, implies e.u.i. of the $(\bar t_{m_n})^4$. So Lemma 4 yields the required result. $\square$
Notice that both Theorem 2 and Theorem 3 are logically independent of the question of convergence in distribution. These results give moment and integrability conditions that guarantee $L_2$-consistency of estimators based on the subseries values from an $\alpha$-mixing sequence, regardless of whether the $t_n^0$'s (or $f_n^0$'s) are converging in distribution. Furthermore, we have not constrained the mixing coefficient $\alpha$ or the subseries length $m_n$ in any way other than $\alpha(n) \to 0$, $m_n \to \infty$, $m_n/n \to 0$. In practice the $L_2$-consistency is desirable because it translates into shrinking variance and bias for the estimator.

We can now combine the variance estimation result (Theorem 3) with the distributional results of Carlstein (1984), and obtain:
Theorem 5: Let $\{Z_i\}$ be $\alpha$-mixing and let $s_n^i$, $t_n^i$, $\hat\sigma_n^2$, $\{m_n\}$ be as in Theorem 3.

If:
$\exists\, \sigma^2 \in (0,\infty)$ s.t. $\lim_{n\to\infty}(N_n/R_n)^{1/2}\,\mathrm{Cov}\{t_{N_n}^0, t_{R_n}^{M_n}\} = \sigma^2$ whenever $\{N_n\}$, $\{M_n\}$, $\{R_n\}$ are s.t. $N_n \ge M_n + R_n \ge R_n \to \infty$; (5.a)
and
$\mathbb{E}\{(t_n^0)^4\} \to 3\sigma^4$ as $n \to \infty$; (5.b)
then:
$\hat\sigma_n^2 \xrightarrow{L_2} \sigma^2$ as $n \to \infty$; (5.c)
and
$(t_{N_n}^0/\hat\sigma_{N_n},\; t_{R_n}^{M_n}/\hat\sigma_{R_n}) \xrightarrow{D} N_2(0, 0, 1, 1, \rho)$ as $n \to \infty$, (5.d)
whenever $\{N_n\}$, $\{M_n\}$, $\{R_n\}$ are s.t. $N_n \ge M_n + R_n \ge R_n \to \infty$ and $R_n/N_n \to \rho^2$.
Proof: We will begin by showing that $(t_{N_n}^0/\sigma,\; t_{R_n}^{M_n}/\sigma) \xrightarrow{D} N_2(0, 0, 1, 1, \rho)$, via Theorem 4 of Carlstein (1984). Since $\mathbb{E}\{t_n^0\} \equiv 0$, it suffices to observe that (5.b) implies that the $(t_n^0)^2$ are e.u.i.

Next we want to use Theorem 3 to conclude that (5.c) holds. In light of (5.a) with $M_n \equiv 0$ and $N_n = R_n = n$, it is enough to verify (3.a). But e.u.i. of the $(t_n^0)^4$ follows directly from (5.b) together with $t_n^0 \xrightarrow{D} N(0, \sigma^2)$ (established above). Since (5.c) gives $\hat\sigma_{N_n} \xrightarrow{\mathbb{P}} \sigma$ and $\hat\sigma_{R_n} \xrightarrow{\mathbb{P}} \sigma$, conclusion (5.d) now follows from Slutsky's theorem. $\square$

(Condition (5.b) may of course be replaced by the less specific condition: the $(t_n^0)^4$ are e.u.i.)
5. $\mathbb{P}$-Consistency.

In order to get the convergence in distribution (5.d) of Theorem 5, we really need just $\hat\sigma_n^2 \xrightarrow{\mathbb{P}} \sigma^2$. It is possible to obtain results analogous to Theorem 2 and Theorem 3 which only require integrability conditions on the moments being estimated (not on the higher moments), and which yield only convergence in $\mathbb{P}$ for the subseries estimators. The trade-off, however, is that we must explicitly relate the subseries length $m_n$ to the rate of decay in $\alpha(\cdot)$. Specifically, we use $[n/m_n]\,\alpha(m_n) \to 0$. This says essentially that if the dependence is strong (i.e. $\alpha(\cdot)$ decreases slowly), the subseries length should be large relative to $n$. This is reasonable since under strong dependence we need larger "gaps" separating non-adjacent subseries values if we want them to behave as if they were independent. Proceeding in this spirit:
Theorem 6: Let $\{Z_i\}$ be $\alpha$-mixing and let $f_n^i$, $\{m_n\}$, $\bar f_{m_n}$ be as in Theorem 2.

If:
$\mathbb{E}\{f_{m_n}^0\} \to \phi \in \mathbb{R}^1$; (6.a)
and
the $f_n^0$ are e.u.i.; (6.b)
and
$\lim_{n\to\infty} n\,\alpha(m_n)/m_n = 0$; (6.c)
then:
$\bar f_{m_n} \xrightarrow{\mathbb{P}} \phi$ as $n \to \infty$. (6.d)
qn
Denote [n/m ]
n
2m
-1
+ f
m
m
n
n
q m
+ ... + f n n)/k
-2
f
m
lP
->
n
-
n.
We write f
4m
n + f
p m
n + ... + f n n)/k
m
m
n
n
n
as f
m
n
n
1>/2.
-1
We consider f
=
- n,
-1 +
f
m
n
m
n
m
1P
_..- ->
1>/2
and
n
m
n
first.
n
jm
n
m
n
Define r.v. 's {g
j
E:
{O, 2, 4, ... , p }, n > 1} having the
n
same marginal distributions as {f
jm
n
m
n
jE {O, 2, 4,
f2
m
n
m
3m
5m
(fn+ f n+ f n
m
m
m
n
n
n
-2
f
m
n
-1
It will suffice to show that both f
m
n
j an even integer -< k
n
sup{j
Pn
'
j an odd integer < k
sup{j
where f
= kn
... , p } ,
n
n >
n,
-16jm
but s.t. {gm n : j
E
{O, 2, 4, .•. , Pn}} are independent for fixed
n
n> 1.
C9
n
1
Denote: tjJ (s) =lli{exp{isf }}, tjJ (s) =lli{exp{is gm}}
n
m
n
n
n
jm
where Y . (s) = exp{is f n/ k },
nJ
m
n
n
p /2+1
(s)) n
2m
p m
+ gm n + ... + gm n n)/kn and 8n (s)
( gm0
n
n
= lli{Y nO ()}
s
n
Now, ItjJn(s) - tjJn(s) I ~ l1JJ (s) -lli{
n
+ Illi{
IT
IT
j =0, 2,
- lE{
. .. +
... ,
j =0, 2,
... ,
p
r,
n
-L.
Y .(s)}8 (s)1 +
nJ
n
Y ,(s)}
nJ
p -2
n
IT
Y . (s)}8 (s)
n
J'=O, 2 , ... , Pn- 4 nJ
I + ...
2
IT
Y. (s)} - 8 (s) I < 16 ex (m ) p /2 ,
nJ
n
n n
j= O, 2
IlE {
by Ibragimov and Linnik (1971), p. 307, because
Iy nJ,(s) I =
-
Hence, by (6.c) it will suffice to show g
ffiu
2(j-1)rn
Put r
p /L. + 1
n
n
denote X .
nJ
g
n
n:>1.
-
that for fixed n {X ,
nJ
We will show
n
mn
r
jE{1,2, ... ,r},
.
-¢
j=l
X./r
nj
JP
---> ¢/2 .
for each
n
I
1 and
JP
n
->0.
Note
-17Also, {X
r
n1
n
lim JE{ I
} are e.u.i. by (6.b), which in turn implies that
o.
XnIl}
Now truncating X
at r we obtain
nj
n
n~
r
r
-1
n
r
n
L
j=1
X
nj
r
-1
n
+
L
(
n
r
r
j=1
r
r
n
-1
n
L
n
j=1
- JE{
X
X I})
nj
r n
n
n
+ JE!
r Xn I}
n
+
X
nj
We will show that each of the 3 terms on the r.h.s. converges to
zero in W,
using an argument similar to Chow and Teicher (1978),
pp. 125-126.
IE! r
n
xn1 } I .2.r
I
lP{ r
-1
n
r
L
n
!E{X }
n1
nX .
nJ
j=1
I + IE{
I>d
r
n Xn1 }
I
-+
0 as n
-+
00;
also
I
< W{ t X . > r
for some 1 < j < r } <
nJ - n
- - n
r
r
n
W{ IXnll2- r
r
r
n
JE{ (r
-1
n
n
} < JE{
I
n xn11 }
-+
0 as n
00;
and lastly
n
\'
L
j=1
r -1
n
L
-+
r -.
lE{(Xnl)21I{j.2.-IXnll < j + I}}
j=O
-w{IXnll2- j+l})
~
n
L U+ 1 )2(1p{! Xnl l2:-j}
j=O
-18r -1
n
L
j=l
r
.::.1
n
I
+3
+ 1.
(2)
j =1
Since {X
n1
} are e.u.i., 3C<oo
r
I
I
m{ jx
j=[A]+l
for n Gufficiently large.
r
n
f
m
n
n
II}.::. A C + r
JE{
n
I~ II} , for
any A> 0 ,
n
Substitutir.g into (2), dividing through by
and taking lim lim (.) establishes the required convergence inF of
,
-1
s.t.
A-¥X>
.
I"HOO
-2
An exactly analogous argument may be used on f
m
.
0
n
Corollary 7: Let $\{Z_i\}$ be $\alpha$-mixing and let $s_n^i$, $t_n^i$, $\sigma^2$, $\hat\sigma_n^2$, $\{m_n\}$ be as in Theorem 3.

If:
the $(t_n^0)^2$ are e.u.i.; (7.a)
and
$\lim_{n\to\infty} n\,\alpha(m_n)/m_n = 0$; (7.b)
then:
$\hat\sigma_n^2 \xrightarrow{\mathbb{P}} \sigma^2$ as $n \to \infty$. (7.c)
Proof: Write $\hat\sigma_n^2 = \sum_{i=0}^{[n/m_n]-1}(t_{m_n}^{im_n})^2/[n/m_n] - (\bar t_{m_n})^2$ as in the proof of Theorem 3. $\sum_{i=0}^{[n/m_n]-1}(t_{m_n}^{im_n})^2/[n/m_n] \xrightarrow{\mathbb{P}} \sigma^2$ follows directly from Theorem 6; so does $\bar t_{m_n} \xrightarrow{\mathbb{P}} 0$, since $\mathbb{E}\{t_n^0\} \equiv 0$. $\square$
We can finally give a version of Theorem 4 of Carlstein (1984) whose conclusion is free of $\sigma^2$ and whose moment conditions are no stronger than those in that earlier result. Of course we pay by assuming more about the relationship between mixing rate and subseries length.
Corollary 8: Let $\{Z_i\}$ be $\alpha$-mixing and let $s_n^i$, $t_n^i$, $\hat\sigma_n^2$, $\{m_n\}$ be as in Theorem 3.

If:
$\exists\, \sigma^2 \in (0,\infty)$ s.t. $\lim_{n\to\infty}(N_n/R_n)^{1/2}\,\mathrm{Cov}\{t_{N_n}^0, t_{R_n}^{M_n}\} = \sigma^2$ whenever $\{N_n\}$, $\{M_n\}$, $\{R_n\}$ are s.t. $N_n \ge M_n + R_n \ge R_n \to \infty$; (8.a)
and
the $(t_n^0)^2$ are e.u.i.; (8.b)
and
$\lim_{n\to\infty} n\,\alpha(m_n)/m_n = 0$; (8.c)
then:
$\hat\sigma_n^2 \xrightarrow{\mathbb{P}} \sigma^2$ as $n \to \infty$; (8.d)
and
$(t_{N_n}^0/\hat\sigma_{N_n},\; t_{R_n}^{M_n}/\hat\sigma_{R_n}) \xrightarrow{D} N_2(0, 0, 1, 1, \rho)$ as $n \to \infty$, (8.e)
whenever $\{N_n\}$, $\{M_n\}$, $\{R_n\}$ are s.t. $N_n \ge M_n + R_n \ge R_n \to \infty$ and $R_n/N_n \to \rho^2$.

Proof: This is an immediate consequence of Theorem 4 of Carlstein (1984) and Corollary 7 (above). $\square$
Notice that if $\alpha(n) \le C_u\, n\, \beta^n$, $0 < \beta < 1$, as in the normal, double-exponential and Cauchy autoregressive examples of Section 2, then choosing $m_n = [n^\gamma]$ $(0 < \gamma < 1)$ yields $m_n \to \infty$ and $m_n/n \to 0$, as well as

$$n\,\alpha(m_n)/m_n \le C\, m_n^{1/\gamma}\, \beta^{m_n} \to 0 \quad \text{as } n \to \infty.$$

The sample mean and sample fractile statistics are discussed as examples in Corollary 14 and Theorem 17 (respectively) of Carlstein (1984).
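In practice, Corollary 8 with $M_n \equiv 0$ and $N_n = R_n = n$ says that $t_n^0/\hat\sigma_n$ is asymptotically standard normal, so $\hat\sigma_n$ can studentize $s_n^0$ without knowledge of $\sigma^2$. Below is a hedged sketch of the resulting approximate confidence interval for $\mathbb{E}\{s_n^0\}$ (our own illustrative code, reusing `subseries_var` from Section 3; the 95% level and the choice $m_n = [n^{1/2}]$ are ours, not the paper's):

```python
import numpy as np

def subseries_ci(z, stat=np.mean, z_crit=1.96):
    """Approximate 95% confidence interval for E{s_n^0} (a sketch).

    By Corollary 8, t_n^0 / sigma_hat_n = (s_n^0 - E{s_n^0}) n^(1/2) / sigma_hat_n
    is asymptotically N(0, 1), giving s_n^0 +/- 1.96 sigma_hat_n / n^(1/2).
    """
    n = len(z)
    m = max(2, int(np.sqrt(n)))              # base subseries length m_n
    sigma_hat = np.sqrt(subseries_var(z, m, stat))
    half_width = z_crit * sigma_hat / np.sqrt(n)
    center = stat(z)
    return center - half_width, center + half_width
```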
6. Simulation Study of $\hat\sigma_n^2$.

Section 3 gave intuitive motivation for the general form of $\hat\sigma_n^2$, and Sections 4 and 5 established certain reasonable asymptotic properties of this variance estimator. In the present section we consider the finite-sample behavior of $\hat\sigma_n^2$ and the choice of $\{m_n\}$, and we suggest some modifications of $\hat\sigma_n^2$ that yield superior performance. Here the method of investigation is large-scale simulation rather than theoretical calculation.
n
im
"2
o
=
n
o-< i
L
(s
I
(t
< j < [n/m ] - 1
n
m
n
n_ s
im
o-< i
< j < [n/m ] - 1
n
m
n
n_ t
jm 2
n)
m
n
ill
n
I[n/m ] [n/m -1]
n
n
jm 2
n) I[n/m ][n/m -1]
m
n
n
n
There are
[n/m ][n/m -1]/2 squared paired differences, each contributing 2
n
n
2
im
replicates of (t
m
n)
to our estimate of 0
2
As mentioned in
n
"2
Section 3, 0 will be biased if m
n
n
2
im
is not long enough to make (t
m
n)
n
im
a good "representative."
to the bias if jm - (i+1)m
n
jm
n
Th e cross-pro d uct terms t n t
m
m
n
n
n
is not large enough to make t
will add
im
nand
m
n
jm
t
n approximately independent.
m
And we need a fair number of
n
im
(t
m
2
n)
replicates if our estimator is to be stable.
These consid-
n
"2
erations led us to define a and {m } with m ~,~ and n/m ~
n
n
n
n
and led us to impose a-mixing on the underlying sequence.
00
The
theoretical framework we arrived at was tractable, and yielded
-22encouraging results.
But these same considerations also suggest modifications to improve the performance of $\hat\sigma_n^2$ for finite $n$.

For fixed $n$, we want our subseries to be as long as possible so that $(t_{m_n}^{im_n})^2$ reflects all of the "relevant" dependence in $\{Z_i\}$. We are restrained, however, by the fact that there are not enough non-overlapping long subseries. In practice, then, it is worthwhile to consider allowing the subseries to overlap so that quality need not be sacrificed for quantity. That is, we may use subseries starting at the same intervals $\{im_n : i = 0, 1, 2, \ldots\}$, but lasting for $k_n = \ell m_n$ terms rather than just $m_n$ terms. (Here $\ell$ is a fixed positive integer.) The number of replicates available ($r_n := [n/m_n] - \ell + 1$) is virtually unchanged, but their approximate independence is undermined. On the other hand, since $\ell$ is fixed, the asymptotic properties of the estimator will still hold: now each subseries is approximately independent of all but $2\ell$ other subseries (rather than all but just 2 other subseries). In the finite-sample setting we would expect to reduce the bias of $\hat\sigma_n^2$, in so far as $(t_{k_n}^{im_n})^2$ is a better representative than $(t_{m_n}^{im_n})^2$. Yet the magnitude of the cross-product terms $t_{k_n}^{im_n} t_{k_n}^{jm_n}$ will probably be greater than that of $t_{m_n}^{im_n} t_{m_n}^{jm_n}$, especially for $j-i$ small; this could offset the reductions in bias. And furthermore, although the number of $(t_{k_n}^{im_n})^2$ replicates is nearly the same as the number of $(t_{m_n}^{im_n})^2$ replicates, the covariances between the former are likely to be larger than the covariances between the latter. Hence the estimator $\hat\sigma_n^2$ based on the $(t_{k_n}^{im_n})^2$'s would have larger variance than the version based on the $(t_{m_n}^{im_n})^2$'s. Our simulation study investigates these trade-offs.
•
ized variance estimator being proposed is:
/',2
a
im
jm 2
(sk n_ sk n) k /r (r -I),
I
n
O<i<j<r -1
-
-
n
n
n
n
n
/',2
which reduces to our old a when (
i
-i
Z,
n
But reflecting on the case
1 •
n
s
n
it is clear that the paired differences involving overlap-
n
ping subseries require special treatment:
-
im
Zk
n
n
Z
k
n
n
(
l~
z
p=im +l
we have
"m +k
J n
n
"Jm
- n
jm
for 1 ~ j - i < ( ,
Z )/k
X
p
p=im +k +1
n
n
which should be stan11
11
dardized by a factor of k /((j-i)m )
n
P
n
1:
2
if it is to be used to model
-24n~s 0
This suggests that the appropriate variance estimator (for
n
mean-like statistics)
.-
im
L
(s
0< i < j < r -1
--
is:
-
n
k
jm
n_s
n
2
n ) (IT {j - i > f}
k
n
which again reduces to the old
"2
0
n
.e
+ -:-.
J-1
II { j - i <
n) k n / r n (rn -1)
,
when I I .
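A sketch of this generalized estimator (our own code and naming; `ell` is the overlap factor $\ell$, and the weighting implements the special standardization above):

```python
import numpy as np
from itertools import combinations

def subseries_var_overlap(z, m, ell=2, stat=np.mean):
    """Generalized subseries variance estimator with overlap (a sketch).

    Subseries start at 0, m, 2m, ... and each has length k = ell * m.
    Pairs with j - i < ell overlap, so their squared difference gets the
    extra weight ell / (j - i), i.e. the k_n / ((j-i) m_n)^(1/2)
    standardization appropriate for mean-like statistics.
    """
    n = len(z)
    k = ell * m
    r = n // m - ell + 1                 # r_n = [n/m] - ell + 1 subseries
    s = np.array([stat(z[i * m:i * m + k]) for i in range(r)])
    total = 0.0
    for i, j in combinations(range(r), 2):
        w = 1.0 if j - i >= ell else ell / (j - i)
        total += w * (s[i] - s[j]) ** 2
    return k * total / (r * (r - 1))
```

With `ell=1` this reduces to the pairwise form of the basic estimator.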
To test the performance of this $\hat\sigma_n^2$, we began with a simple situation where theoretical checks can be made: $\{Z_i\}$ comes from an AR(1) sequence $Z_i = \phi Z_{i-1} + \varepsilon_i$, $\varepsilon_i \sim$ iid $N(0,1)$, with $s_n^i = \bar Z_n^i$. In this situation it is easy to show that $\sigma^2 = (1-\phi)^{-2}$. We considered weak, moderate, and strong positive dependence in $\{Z_i\}$ ($\phi = .1, .5, .9$); samples of realistic sizes for time-series analysis ($n$ = 100, 250, 500, 1000); short, medium, and long "base-lengths" for the subseries ($m_n = [\ln n]$, $[n^{1/2}]$, $[n^{3/4}]$); and subseries overlaps of 2/3, 1/2, and none (corresponding to $\ell$ = 3, 2, 1). For each combination of $(\phi, n, m_n, \ell)$, 1000 realizations of $\vec Z_n$ (and hence $\hat\sigma_n^2$) were generated. The routine generating the $\varepsilon_i$'s was adapted from the uniform random number generator of Wichmann and Hill (1982) and the inverse-normal approximation of Beasley and Springer (1977).
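A sketch of one cell of this design (our own code, reusing `subseries_var_overlap` above; we substitute NumPy's generator for the Wichmann-Hill and Beasley-Springer routines):

```python
import numpy as np

rng = np.random.default_rng(0)

def ar1(n, phi, burn=500):
    """Generate an AR(1) series Z_i = phi Z_{i-1} + eps_i, eps_i iid N(0,1)."""
    e = rng.standard_normal(n + burn)
    z = np.empty(n + burn)
    z[0] = e[0]
    for i in range(1, n + burn):
        z[i] = phi * z[i - 1] + e[i]
    return z[burn:]                  # discard burn-in, toward stationarity

# Cell: phi = .5 (so sigma^2 = (1 - .5)^(-2) = 4), n = 500,
# m_n = [n^(1/2)] = 22, ell = 2, 1000 realizations.
est = [subseries_var_overlap(ar1(500, 0.5), m=22, ell=2) for _ in range(1000)]
E, V = np.mean(est), np.var(est)
MSE = V + (E - 4.0) ** 2             # MSE = V + (bias)^2
```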
The criteria used to evaluate $\hat\sigma_n^2$ are: $\mathbb{E}\{\hat\sigma_n^2\}$, $V\{\hat\sigma_n^2\}$, and $\mathrm{MSE}\{\hat\sigma_n^2\} = V\{\hat\sigma_n^2\} + (\mathbb{E}\{\hat\sigma_n^2\} - \sigma^2)^2$ (each of these being estimated from the 1000 realizations of $\hat\sigma_n^2$); the standard deviation of $\mathbb{E}$ is estimated by $(V\{\hat\sigma_n^2\}/1000)^{1/2}$. Because the true values of $\sigma^2$ vary so dramatically ($\sigma^2 = 1.23$ for $\phi = .1$, $\sigma^2 = 100$ for $\phi = .9$), it aids comparisons across $\phi$ to consider $\mathrm{MSE}\{\hat\sigma_n^2\}^{1/2}/\sigma^2$, and also to consider $V\{\hat\sigma_n^2\}$ and $(\mathbb{E}\{\hat\sigma_n^2\} - \sigma^2)^2$ as proportions of $\mathrm{MSE}\{\hat\sigma_n^2\}$. The results are presented in Table 1.
As suggested by the theoretical results of Section 4, $\hat\sigma_n^2$ is converging to $\sigma^2$ in m.s.e. as $n$ increases, for all values of $\phi$ and all choices of $\{m_n\}$ and $\ell$.

Under weak dependence ($\phi = .1$), virtually all of the m.s.e. is due to variance, because even the short subseries are long enough to represent the relevant dependence. Comparing across $m_n = [\ln n]$, $[n^{1/2}]$, $[n^{3/4}]$ for fixed values of $n$ and $\ell$, we see the smallest variance for $m_n = [\ln n]$ and the largest variance for $m_n = [n^{3/4}]$. This is due to the large number of subseries values ($r_n$) available when $m_n$ is short, and the scarcity of subseries values when $m_n$ is long. On the other hand, for fixed $n$ and $m_n$ we see a substantial increase in variance as $\ell$ increases. This cannot be attributed to the corresponding but relatively minor decrease in $r_n$ (except perhaps in the case $m_n = [n^{3/4}]$). Rather, this effect must be from the larger covariances between the longer overlapping subseries values $(t_{k_n}^{im_n})^2$. Since variance is the name of the game here, the introduction of overlap doesn't pay off in a big way.
TABLE 1. Simulation Study of $\hat\sigma_n^2$. $\{Z_i\}$ is an AR(1) sequence with coefficient $\phi$; $\sigma^2 = \lim_{n\to\infty}(n\,V\{s_n^0\})$; $s_n^i = \bar Z_n^i$; based on 1000 realizations. For each $\phi \in \{.1, .5, .9\}$ (true $\sigma^2$ = 1.23, 4.0, 100 respectively) and each combination of $n \in \{100, 250, 500, 1000\}$, $m_n \in \{[\ln n], [n^{1/2}], [n^{3/4}]\}$, $\ell \in \{1, 2, 3\}$ ($k_n = \ell m_n$; $r_n$ = number of subseries), the table reports $\mathbb{E}$, sd$\{\mathbb{E}\}$, $V$, $V/\mathrm{MSE}$, and $\mathrm{MSE}^{1/2}/\sigma^2$; an asterisk (*) marks the best (or approximately best) estimator for a given criterion, $n$ and $\phi$. [Table 1 entries not reproduced.]
But it is worth noting that in the cases where there was some significant bias (i.e. $m_n = [\ln n]$, $\ell = 1$), doubling the subseries length ($\ell = 2$) does eliminate it. In practice one is faced with a fixed $n$, a fixed (but unknown) $\phi$, and a choice of 9 estimators ($m_n = [\ln n], [n^{1/2}], [n^{3/4}]$; $\ell = 1, 2, 3$). So the "bottom line" of this analysis is to identify which of the 9 estimators is best for each criterion (bias, variance, m.s.e.), given $n$ and $\phi$. In Table 1, an asterisk (*) indicates the best (or approximate best in the case of close races) estimator. Clearly, $m_n = [\ln n]$, $\ell = 1$ is the big winner for all sample sizes when $\phi = .1$.
Moving on to the case of moderate dependence ($\phi = .5$), we begin to see the biasing effect of insufficient subseries length. In the case of $m_n = [\ln n]$, $\ell = 1$, where the bias is most substantial, there are pronounced gains for doubling and tripling the subseries length ($\ell = 2, 3$). When $m_n = [n^{1/2}]$ and $\ell = 1$, there are again improvements for doubling, but less decidedly so for tripling. When $m_n = [n^{3/4}]$, the subseries are already so long that increasing $\ell$ is of little value. The pattern of variances is parallel to the $\phi = .1$ case: for fixed $n$, variance increases in response to fewer replications and increased overlap. Although this makes $m_n = [\ln n]$, $\ell = 1$ the best choice for minimizing variance, the bias contribution is of enough consequence to push $m_n = [\ln n]$, $\ell = 2$ ahead in m.s.e. Note that $m_n = [\ln n]$, $\ell = 3$ also beats $m_n = [\ln n]$, $\ell = 1$ for $n \ge 250$ (in terms of m.s.e.). The variances for $m_n = [n^{3/4}]$ are so large (due to the excruciatingly small sizes of $r_n$) that it seems unwise to use such estimators, in spite of their relatively good performance in terms of bias. For $m_n = [n^{1/2}]$, $\ell$ = 2 and 3, the biases are nearly as good as for $m_n = [n^{3/4}]$, but the variances and m.s.e. are much more reasonable (relative to $\sigma^4$). If one places higher priority on bias reduction, it would not be unreasonable to prefer $m_n = [n^{1/2}]$, $\ell = 2$: the estimator that minimizes variance and m.s.e. amongst those estimators that are "best" on the bias criterion. Thus there are several arguments supporting the use of overlapping subseries under moderate dependence.
When the dependence is strong ($\phi = .9$) we are embarrassed to find that, as $n$ increases, the minimum variance estimator ($m_n = [\ln n]$, $\ell = 1$) is zeroing-in on a value that is $\tfrac14$ of the true $\sigma^2$. Now it takes the mammoth subseries of $m_n = [n^{3/4}]$ to wipe out the bias portion of m.s.e. But again the variances of the $m_n = [n^{3/4}]$ estimators seem prohibitively large. The overall pattern of variances is as in the previous cases, but when bias and variance contributions are combined into m.s.e., the estimator with $m_n = [n^{1/2}]$, $\ell = 2$ is superior for all values of $n$.

On the whole, this simulation lends credence to the use of overlapping subseries, particularly $\ell = 2$ (since $\ell = 3$ seems to suffer from increased covariances more than it gains by reducing bias). And, for this range of sample sizes, $m_n = [n^{3/4}]$ yields too few replications for it to be a stable estimator.
Looking carefully at the case $\phi = .9$ (where the true value of $\sigma^2$ is 100), it appears that the gains for allowing overlap do not quite measure up to what would be expected. In principle the bias reduction should be constant for a fixed subseries length $k_n$, but here the estimator does worse when more overlap is involved. For example, when $k_n = 12$ with $\ell = 2$ the estimator has expectation approximately 42, but when $k_n = 12$ with $\ell = 3$ the expectation is 34.7. Similarly, the expectations for $k_n = 31$, $\ell = 1$; $k_n = 30$, $\ell = 2$; $k_n = 30$, $\ell = 3$ are respectively: 69.8; 61.5; 55.1. Likewise the expectations for $k_n = 22$, $\ell = 1$ and $k_n = 20$, $\ell = 2$ are: 58.6; 47.9. And those for $k_n = 62$, $\ell = 1$ and $k_n = 62$, $\ell = 2$ ($m_n = [n^{3/4}]$) are: 85.3; 64.8. In each case there is substantially more bias as $\ell$ increases.

The explanation is that when $\ell = 1$ all of the $(t_{k_n}^{im_n} - t_{k_n}^{jm_n})^2$ terms contribute 2 replicates of $(t_{k_n}^{im_n})^2$, but when $\ell \ge 2$ there are pairs with $1 \le j-i \le \ell-1$ that instead contribute 2 replicates of $(t_{(j-i)m_n}^{im_n})^2$. Being based upon shorter subseries, these latter replicates do not have the debiasing effect which was the motivation for introducing overlap.
Our special standardization of these pairs by $k_n/((j-i)m_n)^{1/2}$ makes these terms the correct order of magnitude (if $s_n^i$ is mean-like), and including them in our estimator gives us more paired differences and hence more stability. But in terms of bias we would be better off excluding them and defining:

$$\tilde\sigma_n^2 := \sum_{0\le i<j\le r_n-1}\left(s_{k_n}^{im_n} - s_{k_n}^{jm_n}\right)^2\, \mathbb{I}\{j-i \ge \ell\}\, k_n \Big/ \big((r_n-\ell)(r_n-\ell+1)\big)$$

as our variance estimator.
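A sketch of $\tilde\sigma_n^2$ (again our own code and naming, patterned on `subseries_var_overlap` above but dropping the overlapping pairs):

```python
import numpy as np
from itertools import combinations

def subseries_var_tilde(z, m, ell=2, stat=np.mean):
    """Sketch of sigma_tilde_n^2: overlapping pairs (j - i < ell) excluded.

    Fewer paired differences than the generalized sigma_hat_n^2 (hence
    more variance), but every retained pair compares non-overlapping
    full-length subseries (hence less bias under strong dependence).
    """
    n = len(z)
    k = ell * m
    r = n // m - ell + 1
    s = np.array([stat(z[i * m:i * m + k]) for i in range(r)])
    total = sum((s[i] - s[j]) ** 2
                for i, j in combinations(range(r), 2) if j - i >= ell)
    return k * total / ((r - ell) * (r - ell + 1))
```

The "safest" default singled out at the end of this section corresponds to `subseries_var_tilde(z, m=int(np.sqrt(len(z))), ell=2)`.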
Of course, if the original number of pairs ($r_n(r_n-1)/2$) is small (e.g. $m_n = [n^{3/4}]$), the reduction to $(r_n-\ell)(r_n-\ell+1)/2$ non-overlapping pairs will be disastrous. And in general we would expect $\tilde\sigma_n^2$ to have larger variance but smaller bias than $\hat\sigma_n^2$. To investigate these effects, simulations of $\tilde\sigma_n^2$ were conducted, but excluding $m_n = [n^{3/4}]$ (due to insufficient $r_n$), and excluding $\ell = 1$ (for obvious reasons) and $\ell = 3$ (because the m.s.e. of $\ell = 2$ was usually better). The results for $\phi = .1, .5, .9$ are in Table 2.
Under weak dependence ($\phi = .1$) the $\hat\sigma_n^2$ estimators were nearly unbiased, and so are the $\tilde\sigma_n^2$ estimators. (A bubble (°) indicates that for fixed $\phi$, $n$, $m_n$, $\ell$ the $\tilde\sigma_n^2$ estimator is superior to or approximately equivalent to the corresponding $\hat\sigma_n^2$ estimator. A plus (+) indicates that for fixed $\phi$ and $n$ the $\tilde\sigma_n^2$ estimator is the best or approximate best amongst all eleven $\hat\sigma_n^2$ and $\tilde\sigma_n^2$ estimators.)
TABLE 2. Simulation Study of $\tilde\sigma_n^2$. $\sigma^2 = \lim_{n\to\infty}(n\,V\{s_n^0\})$; $s_n^i = \bar Z_n^i$; $\{Z_i\}$ is an AR(1) sequence with coefficient $\phi$; based on 1000 realizations. $m_n$ = "base" subseries length; $\ell = 2$ = overlap factor; $k_n = \ell m_n$ = actual subseries length; $r_n$ = number of subseries per sample. $\mathbb{E}$, $V$, MSE are simulated estimates of $\mathbb{E}\{\tilde\sigma_n^2\}$, $V\{\tilde\sigma_n^2\}$, $\mathrm{MSE}\{\tilde\sigma_n^2\}$ respectively. ° Better than (or approximately equal to) corresponding value for $\hat\sigma_n^2$. + Best (or approximately best) for the criterion, for fixed $n$ and $\phi$, among all $\hat\sigma_n^2$ and $\tilde\sigma_n^2$ estimates. [Table 2 entries not reproduced.]
In terms of variance and m.s.e., $\tilde\sigma_n^2$ is almost as good as $\hat\sigma_n^2$. The variance of $\tilde\sigma_n^2$ is hurt more when $r_n$ is small ($m_n = [n^{1/2}]$), because then the loss of $(2r_n - \ell)(\ell - 1)/2$ overlapping pairs is relatively greater.
The story is similar for $\phi = .5$: $\tilde\sigma_n^2$ has about the same bias, variance and m.s.e. as $\hat\sigma_n^2$ did, but again $\tilde\sigma_n^2$ suffers in terms of variance when $r_n$ is small. Notice that $\tilde\sigma_n^2$ is an optimal (+) estimator in terms of m.s.e. when $m_n = [\ln n]$, and is usually optimal for bias reduction when $m_n = [n^{1/2}]$.
Turning to the case of strong dependence ($\phi = .9$), we now see significant gains in debiasing by using $\tilde\sigma_n^2$: every expectation shows an improvement relative to the corresponding entry for $\hat\sigma_n^2$, and five out of eight of these increments are in excess of 2 s.d. units. Once again the variance of $\tilde\sigma_n^2$ tends to be inferior, but its bias is so superior that $\tilde\sigma_n^2$ actually ends up with smaller m.s.e. than $\hat\sigma_n^2$ in six out of eight cases.

Overall it seems that $\tilde\sigma_n^2$ performs as well as $\hat\sigma_n^2$ in terms of bias and m.s.e., but somewhat worse in terms of variance. Moreover, whenever $\hat\sigma_n^2$ ($\ell = 2$) was optimal (*) for bias or m.s.e., $\tilde\sigma_n^2$ retained that optimality. And when the dependence is strong, $\tilde\sigma_n^2$ offers substantial gains over $\hat\sigma_n^2$ in terms of bias and m.s.e.
If one is more concerned with bias and m.s.e. than with variance, then nothing is to be lost by using $\tilde\sigma_n^2$. And if one would like "insurance" against strong dependence, then there is something to be gained in using $\tilde\sigma_n^2$.
An analogous simulation study was conducted in order to investigate the behavior of $\tilde\sigma_n^2$ when $s_n^i$ is the ratio statistic. The results here echo those for $s_n^i = \bar Z_n^i$: there are substantial gains in debiasing for using $\ell = 2$ rather than $\ell = 1$ (for fixed $\phi$, $n$, $m_n$), and this effect is more pronounced under heavier dependence. The estimator using $m_n = [\ln n]$, $\ell = 2$ minimizes m.s.e. when $\phi = .1$ and $\phi = .5$ (for all $n$); but when $\phi = .9$ the debiasing effect of long subseries is so important that $m_n = [n^{1/2}]$, $\ell = 2$ has the best m.s.e. Thus there is further support for using longer overlapping subseries in $\tilde\sigma_n^2$.
Throughout these simulations, $\ell = 2$ in particular has made noticeable improvements over $\ell = 1$, while $\ell = 3$ had unacceptably inflated variance. The choice of $\{m_n\}$ seems to hinge upon the strength of dependence in $\{Z_i\}$: when augmented by overlap ($\ell = 2$), $m_n = [\ln n]$ was quite acceptable for $\phi = .1$ and $\phi = .5$; but for $\phi = .9$ the extra length of $m_n = [n^{1/2}]$ was really necessary to control the bias. (Recall that in Section 5 our theoretical work suggested the need to relate $\{m_n\}$ to $\alpha(\cdot)$ by $(n/m_n)\,\alpha(m_n) \to 0$. This again requires longer subseries under strong dependence.)
Perhaps the "safest" and most intuitive estimator under unknown dependence would be $\tilde\sigma_n^2$ with $\ell = 2$, $m_n = [n^{1/2}]$. It gives equal priority to $r_n$ (number of replicates) and $m_n$ (base subseries length), but then beefs up the subseries length for debiasing ($k_n = \ell m_n = 2m_n$) and ignores the confounding overlapping pairs ($j-i < \ell$). And it minimized m.s.e. when $\phi = .9$, for all values of $n$, for both statistics $s_n^i$.
7. Acknowledgment.

I thank Professor John Hartigan, my thesis advisor, for his guidance on this research.
REFERENCES

Beasley, J.D. and Springer, S.G. (1977). The Percentage Points of the Normal Distribution. Appl. Stat., 26, 118-120.

Billingsley, P. (1968). Convergence of Probability Measures. John Wiley and Sons, New York.

Carlstein, E. (1984). Asymptotic Normality for a General Statistic from a Stationary Sequence. University of North Carolina Institute of Statistics, Mimeo Series #1561.

Chow, Y.S. and Teicher, H. (1978). Probability Theory. Springer-Verlag, New York.

Chung, K.L. (1974). A Course in Probability Theory. Academic Press, New York.

Efron, B. (1979). Bootstrap Methods: Another Look at the Jackknife. The Ann. of Stat., 7, 1-26.

Efron, B. (1982). The Jackknife, the Bootstrap and Other Resampling Plans. Society for Industrial and Applied Mathematics, Philadelphia.

Freedman, D. (1984). On Bootstrapping Two-stage Least-squares Estimates in Stationary Linear Models. The Ann. of Stat., 12, 827-842.

Freedman, D.A. and Peters, S.C. (1984). Bootstrapping a Regression Equation: Some Empirical Results. Jour. of the Am. Stat. Assoc., 79, 97-106.

Gastwirth, J.L. and Rubin, H. (1975). The Asymptotic Distribution Theory of the Empiric CDF for Mixing Stochastic Processes. The Ann. of Stat., 3, 809-824.

Hartigan, J.A. (1969). Using Subsample Values as Typical Values. Jour. of the Am. Stat. Assoc., 64, 1303-1317.

Ibragimov, I.A. and Linnik, Yu.V. (1971). Independent and Stationary Sequences of Random Variables. Wolters-Noordhoff Publishing, Groningen, The Netherlands.

Rosenblatt, M. (1956). A Central Limit Theorem and a Strong Mixing Condition. Proc. of the Nat. Acad. of Sc., 42, 43-47.

Tukey, J.W. (1958). Bias and Confidence in Not-quite Large Samples. The Ann. of Math. Stat., 29, 614.

Wichmann, B.A. and Hill, I.D. (1982). An Efficient and Portable Pseudo-random Number Generator. Appl. Stat., 31, 188-190.

Ed Carlstein
Dept. of Statistics
Univ. of N.C.
Phillips Hall, 039A
Chapel Hill, N.C. 27514