Random Designs for Estimating
Integrals of Stochastic Processes

by

Carol Schoenfelder

*This research was supported by the Air Force Office of Scientific Research under Grant AFOSR-75-2796.
Carol Schoenfelder. RANDOM DESIGNS FOR ESTIMATING INTEGRALS OF STOCHASTIC PROCESSES (under the direction of S. Cambanis).

The integral $I = \int_A Z(t)\phi(t)\,dt$ of a second order stochastic process $Z$ over a $d$-dimensional domain $A$ is estimated in the mean square sense by a weighted linear combination $\bar{Z}_n = \frac{1}{n}\sum_{i=1}^{n} c_n(X_{in}) Z(X_{in})$ of $n$ observations of $Z$ at random sample points $X_{1n},\ldots,X_{nn}$, which are possibly dependent random variables taking values in $A$ and are independent of $Z$. Results are obtained on the asymptotic behavior of the mean square error $E(I - \bar{Z}_n)^2$ for certain general random designs, and in particular for simple random, stratified, and systematic sampling designs. Under appropriate assumptions, an optimal simple random design for each fixed sample size, as well as an asymptotically optimal sequence of stratified designs, are obtained. The mean square errors under random, stratified, and systematic sampling designs are compared for fixed sample size. Finally a nonrandom design, median sampling, is considered.
TABLE OF CONTENTS

CHAPTER                                                            PAGE

1.  INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . .    1
    1.1. Statement of the problem . . . . . . . . . . . . . . . .    1
    1.2. Random designs - A review of the literature . . . . . . .    3
    1.3. Asymptotically optimal nonrandom designs . . . . . . . .   12
    1.4. Summary . . . . . . . . . . . . . . . . . . . . . . . . .   16
2.  GENERAL RANDOM DESIGNS . . . . . . . . . . . . . . . . . . . .   18
3.  SIMPLE RANDOM SAMPLING . . . . . . . . . . . . . . . . . . . .   31
4.  STRATIFIED SAMPLING . . . . . . . . . . . . . . . . . . . . .   49
    4.1. Introduction . . . . . . . . . . . . . . . . . . . . . .   49
    4.2. Regular sequences of partitions . . . . . . . . . . . . .   57
    4.3. Asymptotically optimal sequences of partitions . . . . .   80
    4.4. Examples . . . . . . . . . . . . . . . . . . . . . . . .   85
    4.5. Incorrect choice of covariance kernel . . . . . . . . . .   94
5.  SYSTEMATIC SAMPLING . . . . . . . . . . . . . . . . . . . . .  106
    5.1. Introduction . . . . . . . . . . . . . . . . . . . . . .  106
    5.2. Regular sequences of partitions . . . . . . . . . . . . .  113
6.  COMPARISON OF DESIGNS . . . . . . . . . . . . . . . . . . . .  123
7.  MEDIAN SAMPLING WHEN A = [0,1] . . . . . . . . . . . . . . . .  133
APPENDIX . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  144
BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . .  157
ACKNOWLEDGEMENTS
I wish first to express my sincere appreciation to Professor
Cambanis who suggested this topic and directed me in its research.
Without his invaluable support and guidance this work would not have
been feasible.
I also wish to thank Professors Baker, Gross, Leadbetter, Quade,
Smith, and Kelly for their review of and patience in reading this
manuscript.
Further I wish to express my indebtedness to my family and
friends for their support and encouragement and for making my years
here enjoyable.
Finally thanks go to Susan Stapleton for an excellent job of typing.
CHAPTER ONE

INTRODUCTION

1.1. Statement of the problem.

Let $Z$ be a measurable second-order stochastic process on $A$, a Borel subset of $R^d$, continuous in quadratic mean, with covariance $R_0$, mean $m$, and correlation

(1.1.1)    $R(s,t) = R_0(s,t) + m(s)m(t)$

such that

(1.1.2)    $\int_A R(t,t)\,dt < \infty$.

Let $R$ also denote the integral operator in $L_2(A)$ with kernel $R$, and consider $\phi \in L_2(A)$, $\phi \notin N(R^{1/2})$. Consider the following "weighted average" of the process $Z$ over $A$:

(1.1.3)    $I = \int_A Z(t)\phi(t)\,dt$.

$I$ is defined both as a quadratic mean and as a sample path integral. When $A$ is bounded and $\phi$ is chosen as $1/\mu(A)$ (where $\mu$ denotes Lebesgue measure), then $I$ is the average value of $Z$ over $A$. When $\phi$ is chosen as an eigenfunction of $R$, $I$ is the random variable corresponding to that eigenfunction in the orthogonal decomposition of $Z$.
We shall approximate $I$ linearly, using observations at random sample points, by

(1.1.4)    $\bar{Z}_n = \frac{1}{n} \sum_{i=1}^{n} c_n(X_{in}) Z(X_{in})$

where the weighting function $c_n$ is defined everywhere on $A$ and the sample points $X_{1n},\ldots,X_{nn}$ are (possibly dependent) random variables, taking values in $A$ and independent of $Z$. The set $\{X_{1n},\ldots,X_{nn}\}$, or alternatively the rule for obtaining this set, is called the random design (r.d.). (In Section 1.3 and Chapter 7 we shall also consider nonrandom designs, in which the sample points $X_{1n},\ldots,X_{nn}$ are nonrandom.)

In the following chapters we shall consider the problem of finding a function $c_n$ and a design $\{X_{in}\}_{i=1}^{n}$ to minimize the mean square error (m.s.e.) $E(I - \bar{Z}_n)^2$, for fixed $n$ or asymptotically, under each of several sampling schemes. In addition we shall consider the rate at which the m.s.e. converges to zero and the relative size of the m.s.e. for the various sampling schemes.

These sampling schemes are defined in Section 1.2, which also contains a review of the related literature. Sacks and Ylvisaker's (1966) notion of asymptotic optimality (for a particular nonrandom design problem) is described in Section 1.3 and used in subsequent chapters. Finally a brief summary of the results in Chapters 2-7 is given in Section 1.4.
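As a numerical aside (ours, not the thesis's), the role of nonuniform designs in (1.1.4) can be previewed: if the design points are i.i.d. with density $g$ and the weight is chosen so that $c\,g = \phi$, then $\bar{Z}_n$ is conditionally unbiased for $I$ given the realization. A minimal sketch on $A = [0,1]$ with an assumed fixed realization $z(t) = t^2$, $\phi \equiv 1$, and the assumed design density $g(t) = 2t$:

```python
import numpy as np

# Illustration (not from the thesis): the estimator (1.1.4) on A = [0, 1]
# for a single fixed realization z(t) = t^2, with phi = 1, so I = 1/3.
# Design: X_i i.i.d. with density g(t) = 2t, and weight c = phi / g,
# which satisfies the condition c * g = phi that emerges in Chapter 2.
rng = np.random.default_rng(0)

def z(t):                                # a fixed "realization" of the process
    return t ** 2

def zbar_n(n):
    x = np.sqrt(rng.uniform(size=n))     # X with density g(t) = 2t
    c = 1.0 / (2.0 * x)                  # c(t) = phi(t) / g(t)
    return np.mean(c * z(x))             # (1/n) sum c(X_i) Z(X_i)

I_true = 1.0 / 3.0                       # integral of t^2 over [0, 1]
estimate = zbar_n(200_000)
```

Here $c(X)z(X) = X/2$, whose mean under $g$ is exactly $1/3$, so the estimate concentrates on $I$ as $n$ grows.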
1.2. Random designs - A review of the literature.

Work on random designs has been done by Cochran (1946), Quenouille (1949), Zubrzycki (1958), and Tubilla (1975), among others. Cochran's and Quenouille's work is mainly concerned with estimation of the mean of a finite population (see Smith (1976) and the discussion following Smith's paper for a survey of the subject of sampling from finite populations) while Zubrzycki's and Tubilla's work is concerned with estimation of the mean of an infinite population.
Three types of random designs are generally considered under the added assumption that $A$ is bounded (see e.g. Zubrzycki (1958)): random, stratified, and systematic sampling. A (simple) random sampling design is a design $X_1,\ldots,X_n$ in which the $X_i$'s are i.i.d. with uniform distribution over $A$. A stratified sampling design is a design in which $X_1,\ldots,X_n$ are independent random variables with each $X_i$ distributed uniformly over $A_i$, $i=1,\ldots,n$, where the $A_i$'s form a partition of $A$. Finally, a systematic sampling design is a design where $X_1$ is uniformly distributed over $A_1$ and for each $i = 2,\ldots,n$, $X_i$ is the image of $X_1$ under the translation mapping $A_1$ onto $A_i$, where the $A_i$'s are congruent by translation and form a partition of $A$. It is assumed that $\phi = 1/\mu(A)$ and $c = 1$.
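The three designs are easy to generate numerically. A minimal sketch (ours) for $A = [0,1]$ partitioned into $n$ equal subintervals, which is an illustrative choice of partition:

```python
import numpy as np

rng = np.random.default_rng(1)

def simple_random_design(n):
    """n i.i.d. points, uniform on A = [0, 1]."""
    return np.sort(rng.uniform(size=n))

def stratified_design(n):
    """One uniform point in each cell A_i = [i/n, (i+1)/n)."""
    return (np.arange(n) + rng.uniform(size=n)) / n

def systematic_design(n):
    """One uniform point X_1 in A_1 = [0, 1/n); the remaining points are
    its translates into A_2, ..., A_n."""
    x1 = rng.uniform(0.0, 1.0 / n)
    return x1 + np.arange(n) / n

n = 10
srs = simple_random_design(n)
strat = stratified_design(n)
syst = systematic_design(n)
```

Note that the systematic design has a single source of randomness (the offset $X_1$), the stratified design has $n$ independent sources, and the simple random design imposes no structure at all.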
Then assumptions such as wide-sense stationarity are made and results are obtained on the size of the (average) mean squared error (m.s.e.)

(1.2.1)    $e_n^2 = E(\bar{Z}_n - I)^2$

for the various sampling designs. Note that with $\phi = 1/\mu(A)$ and $c = 1$, and for a given realization of the process and of the random design, $\bar{Z}_n$ may be thought of as the sample mean and $I$ as the population mean of that realization.
It had long been supposed that systematic and stratified sampling strategies are "better" than random sampling under certain circumstances for estimating a population mean, but these suppositions had no theoretical justification until the works of Madow and Madow (1944), Cochran (1946), Yates (1949), and Quenouille (1949). Cochran (1963) notes that Madow and Madow (1944), in the first of a series of papers on systematic sampling, were the first to look at systematic sampling from a theoretical viewpoint. In that paper $Z$ is taken to be a real-valued non-stochastic process on the integers $1,2,\ldots,kn$, and results are obtained on the efficiency of systematic sampling and its relative efficiency with respect to random and stratified sampling. Yates (1949) also deals with systematic sampling in detail while Cochran (1946) and Quenouille (1949) place an emphasis on comparisons of designs for stochastic $Z$.
For the moment let $Z$ be a real-valued stochastic process defined on the integers $1,2,\ldots,kn$, as in Cochran (1946); and consider, for example, a simple random sample of size $n$ from a realization of $Z$. Cochran notes that it has been shown by several authors that for a given realization of $Z$ (or when $Z$ is non-stochastic) the average of the squared errors $(\bar{Z}_n - \bar{Z}_{kn})^2$, with $I = \bar{Z}_{kn}$ the population mean, taken over all simple random samples of size $n$ from the finite population of size $kn$, is

(1.2.2)    $\frac{1}{n} \cdot \frac{kn-n}{kn-1} \cdot \frac{1}{kn} \sum_{i=1}^{kn} \big(Z(i) - \bar{Z}_{kn}\big)^2$.

Similar expressions may be obtained for systematic and stratified sampling. These expressions, however, are difficult to compare.
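Formula (1.2.2) is an exact finite-population identity, and for a toy population it can be verified by enumerating all $\binom{kn}{n}$ samples. The following check (ours, with arbitrary population values) confirms it:

```python
from itertools import combinations
from math import comb

# Verify by full enumeration that the average of (sample mean - population
# mean)^2 over all simple random samples of size n from a population of
# size kn equals expression (1.2.2).
z = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]   # arbitrary values, kn = 8
k, n = 2, 4
kn = k * n
zbar = sum(z) / kn

avg_sq_err = sum(
    (sum(s) / n - zbar) ** 2 for s in combinations(z, n)
) / comb(kn, n)

formula = (1.0 / n) * (kn - n) / (kn - 1) * sum((v - zbar) ** 2 for v in z) / kn
```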
Cochran reasons that since $Z$ may be thought of as a stochastic process, it may be assumed that $Z$ has a certain mean and covariance structure. Then the expected value of (1.2.2) may be calculated, along with the expected values of the corresponding expressions for systematic and stratified sampling, to give expressions that are easier to compare. Specifically, Cochran assumes that

(1.2.3)    $EZ(i) = \mu$,  $E(Z(i) - \mu)^2 = \sigma^2$,  $E(Z(i)-\mu)(Z(i+u)-\mu) = \rho_u \sigma^2$

where $\rho_u \geq \rho_v \geq 0$ whenever $0 \leq u \leq v$. The average variance for simple random sampling (i.e., the expected value of (1.2.2)) is then

(1.2.4)    $e_{r,n}^2 = \frac{\sigma^2}{n}\Big(1 - \frac{1}{k}\Big)\Big(1 - \frac{2}{kn(kn-1)} \sum_{u=1}^{kn-1} (kn-u)\rho_u\Big)$.
For stratified sampling, it is

(1.2.5)    $e_{st,n}^2 = \frac{\sigma^2}{n}\Big(1 - \frac{1}{k}\Big)\Big(1 - \frac{2}{k(k-1)} \sum_{u=1}^{k-1} (k-u)\rho_u\Big)$

and for systematic sampling,

(1.2.6)    $e_{sy,n}^2 = \frac{\sigma^2}{n}\Big(1 - \frac{1}{k}\Big)\Big(1 - \frac{2}{kn(k-1)} \sum_{u=1}^{kn-1} (kn-u)\rho_u + \frac{2k}{n(k-1)} \sum_{u=1}^{n-1} (n-u)\rho_{ku}\Big)$.

Here a stratified sample is formed by first forming $n$ strata $Z(1),\ldots,Z(k);\ \ldots;\ Z(kn-k+1),\ldots,Z(kn)$ and then choosing an element from each stratum randomly and independently of all others. A systematic sample is formed by first selecting at random an $i$ from $1,\ldots,k$ and then letting $Z(i), Z(k+i),\ldots,Z(kn-k+i)$ form the sample.

Under these assumptions, for each $n$,

(1.2.7)    $e_{st,n}^2 \leq e_{r,n}^2$

and if in addition

$\rho_{i-1} + \rho_{i+1} - 2\rho_i \geq 0$,  $i = 2,3,\ldots,kn-2$,

(which Cochran notes is satisfied by certain models for economic data and for forestry and land-use surveys), then

(1.2.8)    $e_{sy,n}^2 \leq e_{st,n}^2 \leq e_{r,n}^2$.
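The expressions (1.2.4) - (1.2.6), as reconstructed above, can be evaluated directly. The sketch below (ours) uses the assumed correlation $\rho_u = \rho^u$, which is nonincreasing, nonnegative, and convex in $u$, so the ordering (1.2.8) should hold; at $n = 1$ all three designs coincide and the three expressions agree:

```python
# Average variances (1.2.4) - (1.2.6) for a population of size kn,
# sample size n, correlation function rho(u), unit variance.
def e2_random(k, n, rho, sigma2=1.0):
    kn = k * n
    s = sum((kn - u) * rho(u) for u in range(1, kn))
    return sigma2 / n * (1 - 1 / k) * (1 - 2 * s / (kn * (kn - 1)))

def e2_stratified(k, n, rho, sigma2=1.0):
    s = sum((k - u) * rho(u) for u in range(1, k))
    return sigma2 / n * (1 - 1 / k) * (1 - 2 * s / (k * (k - 1)))

def e2_systematic(k, n, rho, sigma2=1.0):
    kn = k * n
    s1 = sum((kn - u) * rho(u) for u in range(1, kn))
    s2 = sum((n - u) * rho(k * u) for u in range(1, n))
    return sigma2 / n * (1 - 1 / k) * (
        1 - 2 * s1 / (kn * (k - 1)) + 2 * k * s2 / (n * (k - 1)))

rho = lambda u: 0.5 ** u          # nonincreasing and convex example
e_r = e2_random(4, 8, rho)
e_st = e2_stratified(4, 8, rho)
e_sy = e2_systematic(4, 8, rho)
```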
Quenouille (1949) extends these results in several directions. He notes that Cochran's results may be generalized to non-stationary cases by assuming

(1.2.9)    $EZ(i) = \mu_i$,  $E(Z(i)-\mu_i)^2 = \sigma_i^2$,  $\bar{\mu} = \frac{1}{kn}\sum_{i=1}^{kn}\mu_i$,  $\sum_{i=1}^{kn}(\mu_i-\bar{\mu})^2 = kn\,\sigma_\mu^2$,  $(kn-u)\rho_u = \sum_{i=1}^{kn-u}\rho_{i,i+u}$

in place of (1.2.3), and by adding the term

$\frac{1}{n}\Big(1-\frac{1}{k}\Big)\sigma_\mu^2$

to the expressions on the R.H.S. of (1.2.4) - (1.2.6). In particular, the order relations given in Cochran between average variances for each type of sampling still hold (see also Cochran (1963, pp. 219-221)).
Also, he obtains integral approximations to (1.2.4) - (1.2.6) for large $n$ under the assumption that $Z$ is a (mean-square) continuous process on some fixed finite interval from which the sample is to be drawn. For a (mean-square) continuous process $Z$ and for large $k$, Jowett (1952) also obtains an integral approximation to (1.2.6), and Williams (1956) a series approximation to (1.2.6) in negative powers of $n$.
Finally, Quenouille deals with sampling in the plane. Let $Z$ be a real-valued process on $\{(i,j):\ i=1,\ldots,k_1n_1,\ j=1,\ldots,k_2n_2\}$ such that

(1.2.10)    $EZ(i,j) = \mu$,  $E(Z(i,j)-\mu)(Z(i+u,j+v)-\mu) = \rho_{ijuv}\,\sigma^2$,  $\sum_i \sum_j \rho_{ijuv} = (k_1n_1-|u|)(k_2n_2-|v|)\,\rho_{uv}$.

The population mean of a realization of $Z$ is to be estimated by the mean of a sample of size $n_1 n_2$ from that realization. Quenouille obtains expressions for the average variance for each sampling scheme and their integral approximations for large $n_1, n_2$ if it is assumed that $Z$ is a (mean-square) continuous process on some rectangle containing $\{(i,j):\ i=1,\ldots,k_1n_1,\ j=1,\ldots,k_2n_2\}$ and that a certain sum converges.
Zubrzycki (1958) considers sampling from a realization of $Z$ where $Z$ is a stationary mean-square continuous process on some bounded domain $A$ of the plane, with mean $m$ and covariance $R(s,t)$, where $R(s,t)$ depends only on the vector $t-s$, with $\sigma^2 = R(t,t)$, and is continuous in $s$ and $t$. The integral $I$ with $\phi = 1/\mu(A)$ is to be estimated by $\bar{Z}_n$ with $c = 1$. For random, stratified, and systematic sampling expressions for the expected squared errors are obtained; in particular, under systematic sampling,

(1.2.11)    $e_{sy,n}^2 = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} R(t_i,t_j) - \frac{1}{\mu^2(A)}\int_A\int_A R(s,t)\,ds\,dt$

where $t_i \in A_i$ is the center of gravity of $A_i$, $i=1,\ldots,n$. Zubrzycki shows that

(1.2.12)    $e_{st,n}^2 \leq e_{r,n}^2$

and obtains a sufficient condition for $e_{sy,n}^2 \leq e_{st,n}^2$. Dalenius, Hajek, and Zubrzycki (1961) consider optimal placement of $\{A_{in}\}_i$ for 2-dimensional $A$, where $\{A_{in}\}_i$ consists of circles and thus does not form a partition of $A$, nor does it necessarily cover $A$.
Tubilla (1975) generalizes Zubrzycki's results to $d$ dimensions for particular $A$, $A_i$, $i = 1,\ldots,N = n^d$, and stationary $Z$, and obtains rates at which the m.s.e. (1.2.1) converges to zero as $N \to \infty$. Let $A$ be the $d$-dimensional unit hypercube and $A_{iN}$, $i=1,\ldots,N$, its $N$ disjoint sub-hypercubes with side $1/n$. Assume $Z$ has constant mean $m$ and continuous covariance $C(t-s) = \mathrm{Cov}(Z(t),Z(s))$. Then as $N \to \infty$,

(1.2.13)    $N e_{r,N}^2 \to C(0) - \int_A\int_A C(t-s)\,ds\,dt$.
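The limiting constant in (1.2.13) can be checked for $d = 1$ with the assumed covariance $C(\tau) = e^{-|\tau|}$, for which $C(0) - \int_0^1\int_0^1 C(t-s)\,ds\,dt = 1 - 2/e$ by direct integration; a crude midpoint-rule quadrature (ours) reproduces this:

```python
import numpy as np

# Check of the constant in (1.2.13) for d = 1 and the illustrative choice
# C(tau) = exp(-|tau|) on A = [0, 1]:
#   C(0) - int_A int_A C(t - s) ds dt = 1 - 2/e.
m = 1000
t = (np.arange(m) + 0.5) / m                       # midpoint-rule nodes
double_integral = np.exp(-np.abs(t[:, None] - t[None, :])).mean()
limit_constant = 1.0 - double_integral
exact = 1.0 - 2.0 / np.e
```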
9
Further assume that C possesses all partial derivatives of all orders
everywhere except at the boundaries tj=O, j=l, ••. ,d,
and that it
possesses all one-sided partial derivatives of all orders at these
boundaries.
(1. 2.14)
Then Tubilla shows that as N -+
N2/d e 2
sy,N
-+
f
L
12
~L
A j=l
~
ely (.1.)
~
dt
3t j
where
(l-t.)
J
where the sum is taken over all 2d combinations of plus and minus signs.
Further, if $C$ possesses all partial derivatives of all orders (not just one-sided ones), as $N \to \infty$,

(1.2.15)    $N^{1+2/d}\,e_{st,N}^2 \to -\frac{1}{12}\sum_{j=1}^{d} \frac{\partial^2 C}{\partial t_j^2}(0)$.

Tubilla also considers a non-random sampling scheme, midpoint sampling, in which the sample points are the centers of the hypercubes $A_i$, $i=1,\ldots,N$, and obtains

(1.2.16)    $N^{k/d}\,e_{m,N}^2 \to c_{mk}$

as $N \to \infty$, where $c_{mk}$ depends on $C$ and where $k = 2$ if all one-sided partial derivatives of $C$ exist at the boundaries $t_j = 0$, $j = 1,\ldots,d$, and $k = 4$ if all (two-sided) partial derivatives exist. Tubilla also considers the case where $C$ is isotropic; that is, $C(t) = h(|t|)$ for some function $h$.
He shows that if $h$ has a continuous second derivative (on $[0,\infty)$), then as $N \to \infty$,

(1.2.17)    $N^{1+1/d}\,e_{st,N}^2 \to -h'(0)\int_A\int_A |t-s|\,ds\,dt$

where $|t|$ denotes the Euclidean norm of $t$. Also, if $d = 2$ and $h$ has a continuous third derivative, he obtains

(1.2.18)    $N e_{sy,N}^2 \to c_{sy,iso}$,  $N^{3/2} e_{m,N}^2 \to c_{m,iso}$

as $N \to \infty$, where $c_{sy,iso}$ and $c_{m,iso}$ depend on $C$.
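The stratified rate (1.2.17) admits a closed-form check for $d = 1$ and the assumed isotropic covariance $C(t) = e^{-|t|}$, for which $-h'(0) = 1$ and $\int_0^1\int_0^1 |t-s|\,ds\,dt = 1/3$. The exact m.s.e. of the equal-cell stratified design with $c = 1$, $\phi \equiv 1$ reduces to a sum of identical per-cell integrals (our sketch):

```python
import math

# d = 1, A = [0, 1], N equal cells of width w = 1/N, one uniform point per
# cell, c = 1, phi = 1.  A direct computation gives
#   e_st^2 = sum over cells of  iint_{cell x cell} (C(0) - C(t - s)) ds dt,
# and for C(t) = exp(-|t|) the per-cell double integral is closed-form:
#   iint_{[0,w]^2} exp(-|u - v|) du dv = 2 (w - 1 + exp(-w)).
def e2_stratified_exp(N):
    w = 1.0 / N
    per_cell = w * w - 2.0 * (w - 1.0 + math.exp(-w))
    return N * per_cell

N = 200
scaled = N ** 2 * e2_stratified_exp(N)   # N^{1+1/d} e_st^2 with d = 1
target = 1.0 / 3.0                       # -h'(0) * iint |t - s| ds dt
```

Expanding the exponential shows $N^2 e_{st,N}^2 = 1/3 - 1/(12N) + O(N^{-2})$, in agreement with (1.2.17).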
Several other papers (one of the first of which is that of Hansen and Hurwitz (1949)) discuss choosing sample points with different probabilities from a finite population (probability sampling with unequal probabilities) and note that such a procedure may lower the m.s.e. of the estimate. For example, Cochran (1963, pp. 83-97) considers estimation of the mean of a finite population consisting of $L$ subpopulations via a random sample of size $n_i$ from subpopulation $i$ (of size $N_i$), $i=1,\ldots,L$. Let $\bar{Z}_i$ denote the sample mean of subpopulation $i$, $i=1,\ldots,L$. Then the population mean is estimated by

$\bar{Z}_n = \sum_{i=1}^{L} n_i \bar{Z}_i \Big/ \sum_{i=1}^{L} n_i$

and the squared error is smallest when the $n_i$'s are chosen such that

$\frac{n_i}{\sum_{j=1}^{L} n_j} = \frac{N_i S_i}{\sum_{j=1}^{L} N_j S_j}$,  $i = 1,\ldots,L$,

where $S_i^2$ is $(1 - N_i^{-1})^{-1}$ times the population variance of subpopulation $i$. Thus the optimum design concentrates its sample points where there is the most variability. We shall pursue this idea of sampling with unequal probabilities.
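The rule $n_i \propto N_i S_i$ is the classical Neyman allocation. A small numerical comparison (ours, with hypothetical strata and allocations kept continuous) against proportional and equal allocation, using the standard variance $\sum_i (N_i/N)^2 S_i^2 / n_i$ of the stratified mean up to an additive finite-population term that does not depend on the allocation:

```python
def var_of_mean(N_sizes, S, n_alloc):
    """Variance of the stratified estimate of the population mean, up to an
    additive constant (the finite-population-correction term) that is the
    same for every allocation."""
    N = sum(N_sizes)
    return sum((Ni / N) ** 2 * Si ** 2 / ni
               for Ni, Si, ni in zip(N_sizes, S, n_alloc))

def neyman_alloc(N_sizes, S, n):
    """n_i proportional to N_i * S_i (continuous allocations)."""
    total = sum(Ni * Si for Ni, Si in zip(N_sizes, S))
    return [n * Ni * Si / total for Ni, Si in zip(N_sizes, S)]

N_sizes = [1000, 300, 50]       # hypothetical subpopulation sizes
S = [1.0, 5.0, 20.0]            # hypothetical subpopulation std deviations
n = 60

v_neyman = var_of_mean(N_sizes, S, neyman_alloc(N_sizes, S, n))
v_prop = var_of_mean(N_sizes, S, [n * Ni / sum(N_sizes) for Ni in N_sizes])
v_equal = var_of_mean(N_sizes, S, [n / 3] * 3)
```

As the text observes, the optimal allocation pushes sample points toward the most variable subpopulations, here the small third stratum.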
1.3. Asymptotically optimal non-random designs.

Consider the following parameter estimation problem: Estimate $\theta$ through observation of (a realization of) the process

$Y(t) = \theta f(t) + Z(t)$

at points $T_n = \{t_{in}\}_{i=1}^{n}$, where $\theta$ is an unknown constant, $Z$ is as in Section 1.1 and has zero mean and known covariance $R$ with $A = [0,1]$, $T_n \subset [0,1]$, and $f$ is known and satisfies

(1.3.1)    $f(t) = \int_0^1 \phi(s)R(s,t)\,ds$

on $[0,1]$, where $\phi$ is continuous on $[0,1]$. It is well known that the best linear unbiased estimator of $\theta$ based on these observations is

$\hat{\theta}_{T_n} = \sum_{i,j} f(t_{in}) R^{-1}(t_{in},t_{jn}) Y(t_{jn}) \Big/ \sum_{i,j} f(t_{in}) R^{-1}(t_{in},t_{jn}) f(t_{jn})$

with variance

$\mathrm{Var}\,\hat{\theta}_{T_n} = \Big(\sum_{i,j} f(t_{in}) R^{-1}(t_{in},t_{jn}) f(t_{jn})\Big)^{-1} = \|P_{T_n} f\|_R^{-2}$

where the norm is that of $H(R)$, the reproducing kernel Hilbert space of $R$, and $P_{T_n}$ is the projection operator mapping $H(R)$ onto $sp\{R(\cdot,t),\ t \in T_n\}$ (Sacks and Ylvisaker, 1966). The problem then is to find for each $n$ an optimal $n$-point design for use in estimating $\theta$;
that is, one which minimizes the variance $\|P_{T_n} f\|_R^{-2}$ over all $n$-point designs $T_n$.

In this notation, we have also

(1.3.2)    $E\Big(I - \sum_{i=1}^{n} c_{in} Z(t_{in})\Big)^2 = \|f - g_{T_n,c}\|_R^2$

where $I$ is as in Section 1.1 with $A = [0,1]$ and where $g_{T_n,c} = \sum_i c_{in} R(\cdot,t_{in})$. For fixed $T_n$, the m.s.e. is minimized when $g_{T_n,c} = P_{T_n} f$. The problem is then to find an optimal $n$-point design in the mean square (m.s.) sense for approximating $I$; that is, one which minimizes $\|f - P_{T_n} f\|_R^2$, or equivalently maximizes $\|P_{T_n} f\|_R^2$, over all $n$-point designs $T_n$. Thus there is a close relationship between finding an optimal $n$-point design for estimating $\theta$ and finding one for approximating $I$. In both cases one wishes to find a set of $n$ distinct points $T_n \subset [0,1]$ to maximize $\|P_{T_n} f\|_R^2$ for each $n$ (Sacks and Ylvisaker, 1970b).
Optimal $n$-point designs are in general difficult to obtain (Sacks and Ylvisaker, 1970b); however, under certain conditions, designs that are optimal in some asymptotic sense are easily obtainable. Sacks and Ylvisaker (1966, 1968, 1970b) set forth the following definitions:

Definition 1.3.1. A sequence of partitions $\{T_n\}_n$ is said to be a regular sequence of partitions $RS(h)$ for some continuous density $h$ on $[0,1]$ if $t_{1n} = 0$, $t_{nn} = 1$, and $t_{in}$, $i=2,\ldots,n-1$, is the smallest $t$ s.t.

$\frac{i-1}{n-1} = \int_0^{t} h(s)\,ds$.

Definition 1.3.2. A sequence of designs $\{T_n\}_n$ is said to be an asymptotically optimal sequence of designs for estimating $\theta$ if

$\lim_{n\to\infty} \frac{\|f\|_R^2 - \|P_{T_n} f\|_R^2}{\|f\|_R^2 - \sup_n \|P_T f\|_R^2} = 1$

where for each $n$ the sup, $\sup_n$, is taken over all designs $T \subset [0,1]$ consisting of exactly $n$ points.
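Definition 1.3.1 places the design points at equal quantiles of $h$, which numerically amounts to inverting the c.d.f. of $h$. A sketch (ours), with $h(t) = 2t$ as an assumed example, for which $t_{in} = \sqrt{(i-1)/(n-1)}$:

```python
import numpy as np

def regular_sequence(h, n, grid=100_001):
    """Points 0 = t_1n < ... < t_nn = 1 with H(t_in) = (i-1)/(n-1),
    where H is the c.d.f. of the density h (trapezoidal quadrature)."""
    t = np.linspace(0.0, 1.0, grid)
    steps = 0.5 * (h(t[1:]) + h(t[:-1])) * np.diff(t)
    H = np.concatenate(([0.0], np.cumsum(steps)))
    H /= H[-1]                            # guard against round-off in H(1)
    levels = np.arange(n) / (n - 1)
    return np.interp(levels, H, t)        # inverse interpolation of H

uniform_pts = regular_sequence(lambda t: np.ones_like(t), 5)
# For h(t) = 2t:  H(t) = t^2, so t_in = sqrt((i-1)/(n-1)).
tilted_pts = regular_sequence(lambda t: 2.0 * t, 5)
```

For the uniform density the points are equally spaced; the tilted density packs points toward $t = 1$, where $h$ is large.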
Let

$R^{ij}(s,t) = \frac{\partial^{i+j}}{\partial s^i \partial t^j} R(s,t)$,  $R_{\pm}^{ij}(t,t) = \lim_{s \to t\pm} R^{ij}(s,t)$,

and consider the following set of assumptions, which are discussed at length in Sacks and Ylvisaker (1966):

(1.3.3)

A. $R$ is continuous on $[0,1] \times [0,1]$ and has continuous derivatives up to order two in the complement of the diagonal in the unit square. At the diagonal, $R$ has all right and left derivatives up to order two.

B. $\alpha(t) = R_{-}^{10}(t,t) - R_{+}^{10}(t,t)$ is continuous on $(0,1)$, $\sup \alpha(t) < \infty$, and $\inf \alpha(t) > 0$, so that $\alpha$ may be extended to a strictly positive continuous function on $[0,1]$.

C. For each $t \in [0,1]$, $R_{+}^{02}(\cdot,t) \in H(R)$ and $\sup_t \|R_{+}^{02}(\cdot,t)\|_R < \infty$.
Sacks and Ylvisaker (1966, Theorem 3.1; 1968, Theorem 3.1) prove the following theorem, which we restate in slightly different form.

Theorem 1.3.3. (Sacks and Ylvisaker (1966, 1968)). If $R$ satisfies A, B, C (1.3.3), if $\phi$, $h$, $\phi/h$ are continuous ($\phi/h = 0$ if $\phi = h = 0$), if $f$ is of the form (1.3.1), and if $\{T_n\}_n$ forms a $RS(h)$, then

(1.3.4)    $\lim_{n\to\infty} n^2 \|f - P_{T_n} f\|_R^2 = \frac{1}{12} \int_0^1 \frac{\alpha \phi^2}{h^2}\,dt$.

Further, if $h$ is chosen proportional to $(\alpha\phi^2)^{1/3}$, then the sequence of designs is asymptotically optimal and

(1.3.5)    $\lim_{n\to\infty} n^2 \|f - P_{T_n} f\|_R^2 = \frac{1}{12} \Big(\int_0^1 (\alpha\phi^2)^{1/3}\,dt\Big)^3$.

Finally, the following assumption on $\phi$ is used in Sacks and Ylvisaker (1970a, 1970b) when $Z$ possesses one or more continuous quadratic mean derivatives, and will be used in Chapter 4 to obtain results (Theorems 4.2.6, 4.3.2) similar to those in Theorem 1.3.3:

(1.3.6)    $\phi$ has at most finitely many zeroes (perhaps none), and if $\phi(z) = 0$ then in some neighborhood of $z$ there are numbers $0 < m = m(z) < M = M(z)$ and $p = p(z) > 0$ so that for any $x$ in this neighborhood, $m|z-x|^p \leq |\phi(x)| \leq M|z-x|^p$.
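Theorem 1.3.3 can be sanity-checked in the simplest case, the Wiener kernel $R(s,t) = \min(s,t)$ with $\phi \equiv 1$ and $h$ uniform, so that $\alpha \equiv 1$ and the limit in (1.3.4) is $1/12$. The RKHS facts used below for this kernel ($\|f\|_R^2 = \int_0^1 f'^2\,dt$ when $f(0) = 0$, and $P_{T_n}f$ equal to the piecewise linear interpolant of $f$ at $T_n$) are standard but are our addition, not the thesis's:

```python
import numpy as np

# R(s,t) = min(s,t): H(R) = {f : f(0) = 0}, ||f||^2 = int_0^1 f'(t)^2 dt,
# and projection onto sp{R(., t_i)} interpolates f piecewise linearly.
# With phi = 1:  f(t) = int_0^1 min(s,t) ds = t - t^2/2,  ||f||^2 = 1/3.
def sq_dist(n):
    t = np.linspace(0.0, 1.0, n)          # RS(h) with h uniform
    f = t - t ** 2 / 2.0
    # squared norm of the piecewise linear interpolant: sum (df)^2 / dt
    energy = np.sum(np.diff(f) ** 2 / np.diff(t))
    return 1.0 / 3.0 - energy             # ||f - P_T f||_R^2

n = 400
scaled = n ** 2 * sq_dist(n)              # should approach 1/12
```

An exact computation gives $\|f - P_{T_n}f\|_R^2 = 1/(12(n-1)^2)$ here, so $n^2\|f - P_{T_n}f\|_R^2 \to 1/12$, in agreement with (1.3.4).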
1.4. Summary.

In this study we do not assume that $\phi = 1/\mu(A)$, $c_n = 1$, and $X_{in}$ is uniformly distributed (over $A$ or $A_{in}$ depending on the sampling scheme), as has been the case in the available literature (cf. Section 1.2). Instead we make those assumptions on $\phi$ and $c_n$ stated in Section 1.1, and further we allow $X_{in}$ to be nonuniformly distributed. These relaxations of the assumptions on $c_n$ and the distribution of $\{X_{in},\ i=1,\ldots,n\}$ allow us to use the variability of $Z$ in the approximation of $I$. More generally, $c_n(X_{in})$ in (1.1.4) could be replaced by $c_{in}(X_{1n},X_{2n},\ldots,X_{nn})$, $i=1,\ldots,n$, in order to use the covariability of $Z$. This, however, is a much more difficult problem and will not be considered here.
In Chapter 2 we consider general sampling schemes and obtain sufficient and also necessary and sufficient conditions for the m.s.e. to converge to zero (under certain assumptions), using the results of the Appendix. In Chapter 3 (simple) random sampling schemes are considered and an optimal design is obtained for each fixed $n$ when $R$ is strictly positive definite. In Chapter 4 stratified sampling schemes are considered and an asymptotically optimal sequence of designs is obtained under certain conditions. The similarity between asymptotic optimality of stratified sampling designs for this problem and of nonrandom designs for a parameter estimation problem, as introduced by Sacks and Ylvisaker (1966), is apparent in that chapter. In Chapters 3 and 4 we also study the effect on the m.s.e. of guessing $R$ incorrectly (and using the guess to determine the design). An asymptotic expression for the m.s.e. under systematic sampling is obtained in Chapter 5. In Chapter 6 random, stratified, and systematic sampling are compared for fixed $n$. It is shown that stratified sampling is better in the m.s. sense than random sampling, and sufficient conditions are obtained under which systematic sampling is better (worse) than stratified sampling. Finally in Chapter 7 median sampling, a nonrandom sampling scheme, is considered, and the asymptotic value of the m.s.e. is found under certain assumptions and shown to be the same as that of $\|f - P_{T_n} f\|_R^2$ in (1.3.2).
CHAPTER TWO

GENERAL RANDOM DESIGNS

Before specializing to sampling schemes such as simple random, stratified, and systematic sampling (which are considered in later chapters), we first consider general sampling schemes. In this chapter we derive an expression for the (average) mean squared error (m.s.e.) $e_n^2$ for a general sampling scheme and a relationship between $c$, $\phi$, and the underlying design under the requirement that the m.s.e. $e_n^2$ converge to zero as the sample size $n$ tends to infinity. Finally, we consider sampling schemes in which the sample points are chosen independently of each other.

Let $G_{in}$ denote the marginal distribution of $X_{in}$ and $J_{ijn}$ the joint distribution of $X_{in}$ and $X_{jn}$, $i,j = 1,\ldots,n$. Then

(2.1)    $e_n^2 = \frac{1}{n^2} \sum_{i=1}^{n}\sum_{j=1}^{n} \int_A\int_A R(s,t) c_n(s) c_n(t)\,J_{ijn}(ds,dt) - \frac{2}{n} \sum_{i=1}^{n} \int_A\int_A R(s,t)\phi(s) c_n(t)\,ds\,G_{in}(dt) + \int_A\int_A R(s,t)\phi(s)\phi(t)\,ds\,dt$.

Thus the m.s.e. $e_n^2$ depends only on the marginal and 2-dimensional joint distributions of the sample points $X_{in}$, $i = 1,\ldots,n$. Let

(2.2)    $G_n(t) = \frac{1}{n}\sum_{i=1}^{n} G_{in}(t)$,  $J_n(s,t) = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} J_{ijn}(s,t)$.

Then

(2.3)    $e_n^2 = \int_A\int_A R(s,t)c_n(s)c_n(t)\,J_n(ds,dt) - 2\int_A\int_A R(s,t)\phi(s)c_n(t)\,ds\,G_n(dt) + \int_A\int_A R(s,t)\phi(s)\phi(t)\,ds\,dt$.

Note that $G_n$ and $J_n$ are distribution functions over $A$ and $A \times A$, respectively, and that $J_n$ is symmetric in its arguments with $G_n$ as its marginal distribution.
Under some reasonable asymptotic assumptions on the functions $c_n$ and the sequence of designs (which are met by most designs of interest, as will be seen in later chapters), the following theorem gives sufficient as well as necessary and sufficient conditions for the m.s.e. $e_n^2$ to tend to zero as $n$ tends to infinity.
Theorem 2.1. Assume that $J_n$ converges weakly to some distribution $J$ on $A \times A$. Thus $G_n$ converges weakly to $G$, say, the marginal distribution function of $J$. Assume that $G$ has density $g$ and that $g$ is nonzero a.e. on $A$. Further assume that $c_n$ converges to some function $c$ uniformly on compact subsets of $A$ and that $c^2 g$ is bounded a.e. Let $D_c$ be the set of discontinuities of $c$, and assume that the Lebesgue measure of $D_c$ and the $J$-measure of $D_c \times D_c$ are both zero. Finally assume that $R$ is continuous on $A \times A$ and (1.1.2) holds.

(a) If $e_n \to 0$, then for some $\psi \in N(R^{1/2})$,

(2.4)    $cg = \phi + \psi$  a.e.

If (2.4) and (2.5) hold, then $e_n \to 0$.

(b) Assume further that $R$ is strictly positive definite and that $\phi$ is non-zero a.e. on $A$. Then $e_n \to 0$ iff

(2.5)    $J(s,t) = G(s)G(t)$

and

(2.6)    $cg = \phi$  a.e.

Proof. Both (a) and (b) follow from (2.22) - (2.24) below, and so no assumptions on $N(R^{1/2})$ will be made at this time. By hypothesis and by (2.3) (see Billingsley (1968, p.34)), we have, with $A$ suppressed,

(2.7)    $\lim_{n\to\infty} e_n^2 = \int\int R(s,t)c(s)c(t)\,J(ds,dt) - 2\int\int R(s,t)\phi(s)c(t)g(t)\,ds\,dt + \int\int R(s,t)\phi(s)\phi(t)\,ds\,dt$.
This expression will be rewritten into a more useful form using the results of the Appendix. In order to use these results, however, we must show that $R$, $c$, and $J$ satisfy certain conditions.

I. We first show that $R(s,t)c(s)c(t)$ is of the form (A.1). Consider a second order stochastic process $\zeta(t)$, $t \in A$, defined on a certain probability space and with mean zero and covariance $R$; a random variable $\eta$ defined on the same probability space, independent of $\zeta$ and with mean 0 and variance 1; and a stochastic process $\eta(t) = c(t)\eta$, $t \in A$. Then $\xi(t) = \zeta(t)\eta(t)$ is a second order stochastic process with covariance $R(s,t)c(s)c(t) = K(s,t)$, say.

Now consider $H(\xi) = \overline{sp}\{\xi(t),\ t \in A\}$ with inner product $\langle \xi_1,\xi_2 \rangle = E[\xi_1\xi_2]$. Every element in $H(\xi)$ is of the form $Y\eta$ for some $Y$ in $H(\zeta)$, as follows. Consider an element $X_n$ in $sp\{\xi(t),\ t \in A\}$. Then

(2.8)    $X_n = \sum_{k=1}^{N_n} a_{n,k}\,\xi(t_{n,k}) = \Big(\sum_{k=1}^{N_n} a_{n,k}\,c(t_{n,k})\,\zeta(t_{n,k})\Big)\eta = Y_n \eta$,

say, for some $Y_n$ in $sp\{\zeta(t),\ t \in A\}$. Now consider $X \in H(\xi)$, $X = \lim X_n$, $X_n \in sp\{\xi(t),\ t \in A\}$. The $X_n$'s form a Cauchy sequence in $H(\xi)$. Since

(2.9)    $E(X_n - X_m)^2 = E[(Y_n - Y_m)^2\eta^2] = E(Y_n - Y_m)^2\,E\eta^2 = E(Y_n - Y_m)^2$,

$X_n = Y_n\eta$ forms a Cauchy sequence in $H(\xi)$ iff $Y_n$ forms a Cauchy sequence in $H(\zeta)$. Hence $X = \eta \lim Y_n = \eta Y$, as desired.
Further, every element of $R(K)$, the reproducing kernel Hilbert space of $K$, is of the form $f(t) = c(t)g(t)$ for some $g \in R(R)$. We know that every element of $R(K)$ may be written as $f(t) = E[\xi(t)X]$ for some $X \in H(\xi)$. Since, as we showed, $X = Y\eta$ for some $Y \in H(\zeta)$, we have

(2.10)    $f(t) = E[\xi(t)X] = E[\zeta(t)c(t)\eta\,Y\eta] = c(t)\,E[\zeta(t)Y]\,E\eta^2 = c(t)g(t)$

for some $g \in R(R)$, as desired. Finally, the inner product in $R(K)$ is given, for $f_1 = cg_1$ and $f_2 = cg_2$ with $g_1, g_2 \in R(R)$, by

(2.11)    $\langle f_1, f_2 \rangle_K = E(Y_1\eta\,Y_2\eta) = E(Y_1Y_2)\,E\eta^2 = E(Y_1Y_2) = \langle g_1, g_2 \rangle_R$

by (2.10). Thus if $\{g_n\}$ is a complete orthonormal set (CONS) in $R(R)$, then $\{cg_n\}$ is a CONS in $R(K)$. Since $R$ is continuous, $R(R)$ and therefore also $R(K)$ are separable. Finally, since $K$ is measurable and $R(K)$ is separable, $\xi$ has a measurable modification (Cambanis, 1975, Th. 1).
By hypothesis, $c^2 g$ is bounded a.e. and (1.1.2) holds. Let $M$ be a finite constant s.t. $(c^2g)(t) \leq M$ for a.e. $t$ in $A$. Then

(2.12)    $\int K(t,t)g(t)\,dt = \int R(t,t)c^2(t)g(t)\,dt \leq M \int R(t,t)\,dt < \infty$

and $K$ is the kernel of an integral operator in $L_2(dG)$. Also by hypothesis $g$ is nonzero a.e., which implies that $G$-measure is equivalent to Lebesgue measure. Thus by Theorems 1 and 3 and the proof of Theorem 1 in Cambanis (1973), which may easily be extended to apply to real-valued measurable stochastic processes defined on a Borel measurable subset of $d$-dimensional Euclidean space,

(2.13)    $R(s,t)c(s)c(t) = \sum_{k=1}^{\infty} \lambda_k a_k(s)a_k(t) + r(s,t)$

where $\lambda_k$ and $a_k$, $k = 1,2,\ldots$, are the nonzero (positive) eigenvalues and the corresponding eigenfunctions of the integral operator in $L_2(dG)$ with kernel $K$,

(2.14)    $\sum_{k=1}^{\infty} \lambda_k \leq \int K(t,t)g(t)\,dt < \infty$

(from the proof of Th. 1), and $r(s,t)$ is zero a.e. Thus

(2.15)    $R(s,t)c(s)c(t) = \sum_{k=1}^{\infty} \lambda_k a_k(s)a_k(t)$  in $L_2(dG \times dG)$

with (2.14) holding.
II. Next we show that if $X$ and $Y$ are random variables with marginal distribution $G$ and joint distribution $J$, then $E[f(X)f(Y)] \geq 0$ for all $f$ s.t. $E[f^2(X)] < \infty$. Indeed we have for continuous and bounded $f$,

(2.16)    $E f(X)f(Y) = \int\int f(x)f(y)\,J(dx,dy) = \lim \int\int f(x)f(y)\,J_n(dx,dy) = \lim \frac{1}{n^2}\sum_i\sum_j \int\int f(x)f(y)\,J_{ijn}(dx,dy) = \lim \frac{1}{n^2}\sum_i\sum_j E[f(X_{in})f(X_{jn})] = \lim \frac{1}{n^2}\,E\Big[\Big(\sum_i f(X_{in})\Big)^2\Big] \geq 0$.

Now consider an arbitrary $f$ in $L_2(dG)$. First of all, note that $G$-measure is equivalent to Lebesgue measure on $A$. From Rudin (1966, p.68) we have that the set of continuous functions on $A$ with compact support, and thus the set of bounded and continuous functions (on $A$), is dense in $L_2(dG)$. Thus for any $f$ in $L_2(dG)$ we may choose a sequence of bounded and continuous functions $f_m$ such that $f_m$ converges to $f$ in $L_2(dG)$. It follows that $f_m(X) \to f(X)$ and $f_m(Y) \to f(Y)$ in $L_2(\Omega,F,P)$ and thus

$E f(X)f(Y) = \lim_m E f_m(X)f_m(Y) \geq 0$

since by (2.16) $E f_m(X)f_m(Y) \geq 0$ for all $m$.
Thus by Theorem A.2 and Proposition A.3 of the Appendix,

(2.17)    $\int\int R(s,t)c(s)c(t)\,J(ds,dt) = \int_{(0,1]} r\,dQ_1(r)$

where

(2.18)    $Q_1(r) = \sum_{u \in T(r)} \int\int R(s,t)c(s)c(t)\,\xi_u(s)\xi_u(t)\,g(s)g(t)\,ds\,dt$

and the $\xi_u$ and $T(r)$ are as in the Appendix. In particular, we have that $\{\xi_u\}_{u \in T(r)}$ forms an orthonormal set in $L_2(dG)$ for each $r$; $T(1)$ may be chosen as

(2.19)    $T(1) = S(1) \cup \bigcup_{0 < r < 1} T(r)$

for some index set $S(1)$; and the sets $M_r$ with basis $\{\xi_u\}_{u \in T(r)}$, $0 < r < 1$, $M_0 = \{0\}$, and $M_1$ with basis $\{\xi_u\}_{u \in S(1)}$ are s.t. $\bigcap_{r > r'} M_r = M_{r'}$, $0 \leq r' < 1$. We also have that $Q_1(0+) = 0$. We may also write

(2.20)    $\int\int R(s,t)\phi(s)c(t)g(t)\,ds\,dt = \int_{(0,1]} r\,dQ_2(r)$,  $\int\int R(s,t)\phi(s)\phi(t)\,ds\,dt = \int_{(0,1]} r\,dQ_3(r)$

where $Q_2(r) = Q_3(r) = 0$ for $r < 1$, and for $r \geq 1$

(2.21)    $Q_2(r) = \int\int R(s,t)\phi(s)c(t)g(t)\,ds\,dt$,  $Q_3(r) = \int\int R(s,t)\phi(s)\phi(t)\,ds\,dt$.
Write $Q = Q_1 - 2Q_2 + Q_3$. Then from (2.7) and (2.17) - (2.21),

(2.22)    $\lim_{n\to\infty} e_n^2 = \int_{(0,1]} r\,dQ(r)$

where for $r < 1$

(2.23)    $Q(r) = \sum_{u \in T(r)} \int\int R(s,t)(cg\xi_u)(s)(cg\xi_u)(t)\,ds\,dt$

and for $r = 1$

(2.24)    $Q(1) = \sum_{u \in S(1)} \int\int R(s,t)(cg\xi_u)(s)(cg\xi_u)(t)\,ds\,dt + \int\int R(s,t)(cg-\phi)(s)(cg-\phi)(t)\,ds\,dt$.

Note that $c^2g$ bounded a.e. by $M$, say, $g$ a density, and $\xi_u \in L_2(dG)$ imply $cg,\ cg\xi_u \in L_2(d\mu)$. Thus the above integrals do make sense in relation to the integral operator with kernel $R$ operating on $L_2(d\mu)$.

From Proposition A.4 and from the nonnegativity of the last term on the RHS of (2.24) (a quadratic form), we have that $Q$ is a nondecreasing function of $r$. Also, since $Q_1(0+) = 0$, $Q(0+) = 0$. Thus $\lim e_n^2 = 0$ iff $Q \equiv 0$. Now $Q \equiv 0$ iff $cg - \phi$ and $cg\xi_u$, $u \in S(1),\ T(r),\ 0 < r < 1$, belong to $N(R^{1/2})$. In particular, one possible choice for the $\xi_u$'s is s.t. $\xi_u = 0$ a.e., $u \in S(1),\ T(r),\ 0 < r < 1$, implying $J(s,t) = G(s)G(t)$. This gives (a). Now if $R$ is strictly positive definite, then $N(R^{1/2}) = \{0\}$, and thus $Q \equiv 0$ iff $cg = \phi$ and $cg\xi_u = 0$ a.e., $u \in S(1),\ T(r),\ 0 < r < 1$. Finally, since $\phi$ is nonzero a.e., $cg$ is nonzero a.e., and $cg\xi_u = 0$ a.e. is equivalent to $\xi_u = 0$ a.e. Thus $J(s,t) = G(s)G(t)$. □
A direct consequence of this theorem is that when $\mu(A)$ is finite, $\phi$ and $g$ equal $1/\mu(A)$, and $R$ is strictly positive definite, the m.s.e. $e_n^2$ converges to zero only if $c$ is chosen as 1. These are the choices of $\phi$, $g$, and $c$ used by Cochran, Quenouille, Zubrzycki, and Tubilla. In addition, it is shown in Example 3.3 that the design which minimizes the m.s.e. under random sampling, as defined in Chapter 3, for $Z$ weakly stationary with $R$ strictly positive definite and $\phi = 1/\mu(A)$, yields $g = 1/\mu(A)$.

Under the hypothesis of the independence of $X_{1n},\ldots,X_{nn}$ for each $n$, we have the following, containing slightly weaker assumptions on $c$ and $g$ than in Theorem 2.1.
Theorem 2.2. Assume that $G_n$ converges weakly to some distribution function $G$ on $A$, where $G$ has density $g$; that $c_n$ converges to a function $c$ uniformly on compact subsets of $A$; and that the Lebesgue measure of the set of discontinuities of $c$ is zero. Assume also that $X_{1n},\ldots,X_{nn}$ is a set of independent random variables for each $n$. Finally assume that $R$ is continuous on $A \times A$, (1.1.2) holds, and

(2.25)    $\int_A R(t,t)c^2(t)g(t)\,dt < \infty$.
Then $e_n \to 0$ iff $cg = \phi + \psi$ a.e. for some $\psi \in N(R^{1/2})$.

Proof. Under independence,

(2.26)    $J_n(s,t) = \frac{1}{n^2}\sum_{i \neq j} G_{in}(s)G_{jn}(t) + \frac{1}{n^2}\sum_{i=1}^{n} G_{in}(\min(s,t))$

where

(2.27)    $D_n(s,t) = \frac{1}{n}\sum_{i=1}^{n} G_{in}(s)G_{in}(t)$.

Substitution of (2.26) into (2.3) then yields

(2.28)    $e_n^2 = \int\int_{A\times A} R(s,t)\big(c_n(s)G_n(ds) - \phi(s)ds\big)\big(c_n(t)G_n(dt) - \phi(t)dt\big) + \frac{1}{n}\Big\{\int_A R(t,t)c_n^2(t)\,G_n(dt) - \int\int_{A\times A} R(s,t)c_n(s)c_n(t)\,D_n(ds,dt)\Big\}$

as the m.s.e. under independence. We shall prove the Theorem directly from this expression.

We first show, with $A$ suppressed, that

(2.29)    $\int\int R(s,t)c_n(s)c_n(t)\,D_n(ds,dt) \leq \int R(t,t)c_n^2(t)\,G_n(dt)$

for all $n$. For any distribution function $G_{in}$,

$\int\int R(s,t)c_n(s)c_n(t)\,G_{in}(ds)G_{in}(dt) \leq \int R(t,t)c_n^2(t)\,G_{in}(dt)$

by the Schwarz Inequality. Thus (2.29) follows by the definitions of $D_n$ and $G_n$. Note that (Billingsley, 1968, p.34)

$\int R(t,t)c_n^2(t)\,G_n(dt) \to \int R(t,t)c^2(t)g(t)\,dt < \infty$

where the finiteness follows from (2.25). Thus

$0 \leq \limsup_{n\to\infty} \int\int R(s,t)c_n(s)c_n(t)\,D_n(ds,dt) \leq \int R(t,t)c^2(t)g(t)\,dt < \infty$,

and the second term in (2.28) tends to zero, giving

(2.30)    $e_n^2 \to \int\int R(s,t)(cg-\phi)(s)(cg-\phi)(t)\,ds\,dt$

(Billingsley, 1968, p.34), which is zero iff $cg - \phi \in N(R^{1/2})$. □
30
Two particular examples of sampling schemes in which the sample
points are chosen independently of each other are simple random and
stratified sampling.
These sampling schemes are discussed in detail
in the next two chapters.
CHAPTER THREE
SIMPLE RANDOM SAMPLING
One method of sampling in which X_{1n},...,X_{nn} are chosen independently of each other is simple random sampling (s.r.s.). For s.r.s. we assume that for each n the X_{in}, i = 1,...,n, are independent and identically distributed with density g, where g does not depend on n. (The usual assumptions that g is the uniform density over A and that A is bounded will not be made here.) Under these assumptions

(3.1)  g_{in}(s) = ḡ_n(s) = g(s),  D_n(s,t) = G(s)G(t)

for all i and n and for s,t ∈ A (where D_n is defined in (2.27)). It is also assumed that c does not depend on n. Finally, since c and g do not depend on n and in view of Theorem 2.2, it is assumed that for some ψ ∈ N(R^{1/2}),

(3.2)  cg − φ = ψ  a.e.
The m.s.e. for s.r.s. is then (via (2.28) and (3.1)-(3.2))

(3.3)  e_{r,n}² = (1/n){∫_A R(s,s)(φ+ψ)²(s)/g(s) ds − ∬_{A×A} R(s,t)φ(s)φ(t)dsdt}

where (φ+ψ)²(s)/g(s) is understood to be zero for any s such that g(s) = 0 (for by (3.2), c²g = 0 if g = 0, and c²g = (φ+ψ)²/g if g ≠ 0). The m.s.e. e_{r,n}² converges to zero at a rate n⁻¹; in fact, n e_{r,n}² equals a finite constant which is nonzero (except when R, g, and ψ are as in Proposition 3.7, in which case e_{r,n}² = 0 for all n).
We wish to minimize e_{r,n}² for fixed n and for R and φ given and fixed. The density g and the function ψ in the null space of R^{1/2} which produce this minimum (if a minimum exists) then determine, by the above definition of s.r.s., what we shall call the best random design for s.r.s. and also, by (3.2), the "best" function c for use in the formula for Z̄_n. If R is strictly positive definite, N(R^{1/2}) contains only the zero function, i.e. ψ = 0, and the problem reduces to finding a density g which minimizes the m.s.e. Such a g, if it exists, would then produce the best random design for s.r.s. and also the "best" c. We shall be concerned mainly with the strictly positive definite case.

The following proposition shows, in particular, that when R is strictly positive definite the best random design for s.r.s. is obtained when the density g is proportional to the square root of the second moment of the process Z(t)φ(t).
Proposition 3.1. For fixed ψ ∈ N(R^{1/2}) and fixed R and φ,

(3.4)  ∫_A R(t,t)(φ+ψ)²(t)/g(t) dt ≥ {∫_A R^{1/2}(t,t)|φ+ψ|(t)dt}²

for all densities g, with equality iff

(3.5)  g(t) = R^{1/2}(t,t)|φ+ψ|(t) / ∫_A R^{1/2}(s,s)|φ+ψ|(s)ds  a.e.
A
Proof. Consider the following restatement of the Schwarz Inequality for g nonnegative a.e. (and strictly positive on a set of positive measure in A) and b, g ∈ L₁(A):

{∫_A b(t)dt}² ≤ {∫_A b²(t)/g(t) dt}{∫_A g(t)dt},

with equality iff b, g are s.t. b²(t) = k g²(t) a.e. Hence when g is a density, we have

(3.6)  {∫_A b(t)dt}² ≤ ∫_A b²(t)/g(t) dt  (since g is a density),

with equality iff

(3.7)  g(t) = |b(t)| / ∫_A |b(s)|ds  a.e.

Since the L.H.S. of (3.6) does not depend on g, it represents the minimum value the R.H.S. may attain for fixed b. This minimum is attained when g satisfies (3.7). Letting b²(t) = R(t,t)(φ+ψ)²(t), t ∈ A, then gives the desired result, since

{∫_A R^{1/2}(t,t)|φ+ψ|(t)dt}² ≤ ∫_A R(t,t)dt ∫_A (φ+ψ)²(t)dt < ∞

by (1.1.2) and the Schwarz Inequality, and hence b is in L₁(A).  □
When R is strictly positive definite and g is given by (3.5),

(3.8)  e_{r,n}² = (1/n){(∫_A R^{1/2}(s,s)|φ(s)|ds)² − ∬_{A×A} R(s,t)φ(s)φ(t)dsdt}

and

(3.9)  Z̄_n = ∫_A R^{1/2}(s,s)|φ(s)|ds · (1/n) Σ_{i=1}^n Z(X_{in}) sgn(φ(X_{in})) / R^{1/2}(X_{in},X_{in}).

Note that the best design g as well as the estimator Z̄_n depend only on the values of R on the diagonal. The full structure of R enters only in the value of the m.s.e. Note further that for the best s.r.s. design, the sample points are chosen as a random sample from a density proportional to the square root of the second moment of Zφ, and in the calculation of Z̄_n the observed values at these sample points are weighted according to the inverse of the square root of the second moment of Z at these sample points.

Finally, that g as given by (3.5) produces a m.s.e. as small as, if not smaller than, that produced by the uniform density over A (what is generally used) when A is bounded and R is strictly positive definite follows from the previous proposition as well as from a direct application of the Schwarz Inequality.
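The design (3.5) and estimator (3.9) can be sketched numerically. The following is a rough illustration only (not the thesis's code); the rank-one kernel R(s,t) = a(s)a(t) with factor a(t) = 1 + t and φ = 1 is a hypothetical choice, picked because for such R the m.s.e. is in fact zero (cf. Propositions 3.7-3.8 below), so a single path is integrated exactly:

```python
import numpy as np

# Illustrative sketch: best s.r.s. design (3.5) and estimator (3.9) on
# A = [0,1] for the hypothetical rank-one process Z(t) = xi * a(t) with
# a(t) = 1 + t, so R(s,t) = a(s)a(t) and the m.s.e. is zero.
rng = np.random.default_rng(0)

a = lambda t: 1.0 + t                      # assumed factor of Z
phi = lambda t: np.ones_like(t)            # weight function, phi = 1

grid = np.linspace(0.0, 1.0, 20001)        # discrete stand-in for [0,1]
dt = grid[1] - grid[0]
w = np.abs(a(grid)) * np.abs(phi(grid))    # R^{1/2}(t,t) |phi(t)|
C = w.sum() * dt                           # int_A R^{1/2}(t,t)|phi(t)| dt
p = w / w.sum()                            # design density g of (3.5)

xi = rng.standard_normal()
Z = lambda t: xi * a(t)                    # one realization of the process

X = rng.choice(grid, size=50, p=p)         # random sample from g
Z_bar = C * np.mean(Z(X) * np.sign(phi(X)) / np.abs(a(X)))   # (3.9)
I = (Z(grid) * phi(grid)).sum() * dt       # target integral for this path
print(abs(Z_bar - I) < 1e-9)               # True: exact for rank-one R
```

The sample points are drawn from a density proportional to R^{1/2}(t,t)|φ(t)| and the observations are weighted by its inverse, exactly as described above.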
When g is the uniform density over A, the m.s.e. is

(3.10)  (1/n){μ(A) ∫_A R(s,s)φ²(s)ds − ∬_{A×A} R(s,t)φ(s)φ(t)dsdt}.

The L.H.S. of (3.8) is smaller than (3.10) since by the Schwarz Inequality

{∫_A R^{1/2}(t,t)|φ(t)|dt}² ≤ μ(A) ∫_A R(t,t)φ²(t)dt

with equality iff R(t,t)φ²(t) is constant a.e. Observe that for the Z, φ, and A considered by Tubilla (1975) and Zubrzycki (1958), among others, R(t,t)φ²(t) is constant.
Example 3.2. Wide-sense stationary Z.

Assume A is bounded, φ ∈ L₂(A), and Z has mean m and covariance R₀(s,t) = C(t−s), where C is continuous and strictly positive definite. Let σ² = C(0). Then R(s,t) = C(t−s) + m² is continuous and strictly positive definite and satisfies (1.1.2). The best s.r.s. design is given by

(3.11)  g(t) = |φ(t)| / ∫_A |φ(s)|ds

yielding

Z̄_n = ∫_A |φ(t)|dt · (1/n) Σ_{i=1}^n Z(X_{in}) sgn(φ(X_{in}))

(3.12)  n e_{r,n}² = (σ² + m²)(∫_A |φ(t)|dt)² − m²(∫_A φ(t)dt)² − ∬_{A×A} C(t−s)φ(t)φ(s)dsdt.

Notice that the optimal design as well as the estimator depend only on φ and are independent of m and C, while the error depends on all three parameters φ, m, C. If φ has constant sign on A, (3.11)-(3.12) become
(3.13)  g(t) = φ(t) / ∫_A φ(s)ds,
        Z̄_n = ∫_A φ(t)dt · (1/n) Σ_{i=1}^n Z(X_{in}),
        n e_{r,n}² = σ²(∫_A φ(t)dt)² − ∬_{A×A} C(t−s)φ(t)φ(s)dsdt,

and the error is independent of the mean m of Z.
Example 3.3. Wide-sense stationary Z, φ = 1/μ(A).

Let Z and A be as in Example 3.2 and let φ = 1/μ(A). The best design is given by

(3.14)  g(t) = 1/μ(A)

yielding

Z̄_n = (1/n) Σ_{i=1}^n Z(X_{in})

(3.15)  e_{r,n}² = (1/n){σ² − μ⁻²(A) ∬_{A×A} C(t−s)dsdt}.

This is of course the same as in (1.2.11a).

Example 3.4. Z a stationary process with a trend.
Assume A is bounded, m ∈ L₂(A), φ ∈ L₂(A), and Z has mean m(t) and covariance R₀(s,t) = C(t−s), where m and C are continuous and C is strictly positive definite. Let σ² = C(0). Then R(s,t) = C(t−s) + m(s)m(t) is continuous and strictly positive definite and satisfies (1.1.2). The best s.r.s. design is given by

(3.16)  g(t) = (m²(t) + σ²)^{1/2}|φ(t)| / Q

where

(3.17)  Q = ∫_A (m²(t) + σ²)^{1/2}|φ(t)|dt

yielding

Z̄_n = Q · (1/n) Σ_{i=1}^n Z(X_{in}) sgn(φ(X_{in})) / (m²(X_{in}) + σ²)^{1/2}

(3.18)  n e_{r,n}² = Q² − (∫_A m(t)φ(t)dt)² − ∬_{A×A} C(t−s)φ(s)φ(t)dsdt.

Notice in particular that if A = [0,a], m(t) = t (linear trend), and φ = 1/a, Q in (3.17) is given by
(3.19)  Q = (1/2){(a² + σ²)^{1/2} + (σ²/a) sinh⁻¹(a/σ)}.

Example 3.5. d-dimensional Brownian motion.

Assume A = [0,1]^d, φ = 1, and Z has mean zero and covariance R(s,t) = σ² Π_{i=1}^d min(s_i,t_i). Define Π(t) = Π_{i=1}^d t_i. The best s.r.s. design is given by

(3.20)  g(t) = (3/2)^d Π^{1/2}(t)
yielding

(3.21)  Z̄_n = (2/3)^d · (1/n) Σ_{i=1}^n Z(X_{in}) / Π^{1/2}(X_{in}),
        n e_{r,n}² = σ²{(4/9)^d − (1/3)^d}.

For d = 1, n e_{r,n}² σ⁻² = 1/9 ≈ .111; for d = 2, .086; and for d = 3, .051. It may be seen that n e_{r,n}² σ⁻² has a maximum at d = 1 and in fact is strictly decreasing in d to zero, as follows. Let a = 4/9, b = 1/3. Note that

(3.22)  aⁿ − bⁿ > aⁿ⁺¹ − bⁿ⁺¹  for all n ≥ 1

iff aⁿ(1−a) > bⁿ(1−b) for all n ≥ 1. Consider f(x) = xⁿ(1−x), 0 ≤ x ≤ 1, and note that f′(x) > 0 iff x < n/(n+1). Now b < a = 4/9 < n/(n+1) for all n ≥ 1. Thus f(b) < f(a), which implies (3.22) holds, as desired.
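The quoted values and the monotonicity in d can be confirmed directly by evaluating (4/9)^d − (1/3)^d (a quick numerical check, with σ = 1):

```python
# Check of the values following (3.21) and of the monotonicity argument
# (3.22): with sigma = 1, n * e^2_{r,n} = (4/9)**d - (1/3)**d.
vals = [(4/9)**d - (1/3)**d for d in (1, 2, 3)]
print([round(v, 3) for v in vals])               # [0.111, 0.086, 0.051]

seq = [(4/9)**d - (1/3)**d for d in range(1, 60)]
print(all(x > y for x, y in zip(seq, seq[1:])))  # True: strictly decreasing
```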
Example 3.6. Brownian motion on [0,1] with trend.

Assume A = [0,1], φ ∈ L₂([0,1]), and Z has mean m(t) and covariance R₀(s,t) = σ² min(s,t), where m is continuous on [0,1]. The best s.r.s. design is given by

(3.23)  g(t) = (m²(t) + σ²t)^{1/2}|φ(t)| / Q

where

(3.24)  Q = ∫₀¹ (m²(t) + σ²t)^{1/2}|φ(t)|dt

yielding

(3.25)  Z̄_n = Q · (1/n) Σ_{i=1}^n Z(X_{in}) sgn(φ(X_{in})) / (m²(X_{in}) + σ²X_{in})^{1/2}.

It may be of interest to note that if φ = 1 and m(t) = a√t, then (3.23), (3.25) reduce to (3.20), (3.21) with d = 1.
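The reduction just noted can be checked numerically: for m(t) = a√t the bracket (m²(t) + σ²t)^{1/2} equals (a² + σ²)^{1/2}√t, so the normalized design is independent of a and σ (a quick sketch, not from the thesis):

```python
import numpy as np

# With m(t) = a*sqrt(t), (3.23) gives g proportional to
# (a**2*t + sigma**2*t)**0.5 = (a**2 + sigma**2)**0.5 * sqrt(t),
# i.e. the d = 1 Brownian-motion design (3.20), whatever a and sigma.
t = np.linspace(0.0, 1.0, 1001)
ref = np.sqrt(t) / np.sqrt(t).sum()          # normalized sqrt(t)
checks = []
for a, sig in [(0.5, 1.0), (2.0, 0.3)]:      # hypothetical parameter pairs
    w = np.sqrt(a**2 * t + sig**2 * t)       # (m^2(t) + sigma^2 t)^{1/2}
    checks.append(np.allclose(w / w.sum(), ref))
print(all(checks))                           # True
```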
Although we shall not consider the general problem of finding a ψ ∈ N(R^{1/2}) and a density g to minimize e_{r,n}², we do note one interesting point in the following proposition. The m.s.e. under s.r.s. may take the value zero only if Z is a one-dimensional process on some subset of A with positive measure.

Proposition 3.7. We have e_{r,n}² = 0 for all (or some) n iff there exist ψ ∈ N(R^{1/2}) and some function a ∈ L₂(A), continuous on A_ψ = {s ∈ A: (φ+ψ)(s) ≠ 0}, s.t.

(a)  Z(t) = a(t)Z₀ a.s. for each t ∈ A_ψ and for some random variable Z₀ with finite second moment,

(b)  a(t)(φ+ψ)(t) is nonnegative (nonpositive) a.e. in A, and

(c)  g(t) = a(t)(φ+ψ)(t) / ∫_A a(s)(φ+ψ)(s)ds  a.e. in A.
Proof. Starting with the second term in brackets on the R.H.S. of (3.3) and proceeding to the first term by a series of inequalities, we have, suppressing A,

∬ R(s,t)φ(s)φ(t)dsdt = ∬ R(s,t)(φ+ψ)(s)(φ+ψ)(t)dsdt
  ≤ ∬ |R(s,t)| |φ+ψ|(s)|φ+ψ|(t)dsdt
  ≤ {∫ R^{1/2}(t,t)|φ+ψ|(t)dt}²
  ≤ ∫ R(t,t)(φ+ψ)²(t)/g(t) dt.

The first inequality is an equality iff

(3.26)  R(s,t)(φ+ψ)(s)(φ+ψ)(t) ≥ 0

for almost all s,t ∈ A. The second inequality follows from the Schwarz inequality, with equality holding iff for almost all t in A_ψ there is a random variable Z₀ with finite second moment and a function a s.t.

(3.27)  Z(t) = a(t)Z₀  a.s.

Since Z is mean square continuous, a has a continuous extension to A_ψ and (3.27) holds for all t in A_ψ (giving (a)). Thus

(3.28)  R(s,t) = a(s)a(t)EZ₀²

for s,t ∈ A_ψ. This with (3.26) and the observation that φ + ψ is zero on A − A_ψ gives (b). The final inequality follows from Proposition 3.1, with equality holding iff g is defined as in (c).  □
Proposition 3.8. In summary:

(a) When R is strictly positive definite, e_{r,n}² is minimized when g is chosen as

g(t) = R^{1/2}(t,t)|φ(t)| / ∫_A R^{1/2}(s,s)|φ(s)|ds  a.e. in A.

(b) When R(s,t) = a(s)a(t), e_{r,n}² takes its minimum value, zero, when ψ and g are chosen s.t.

(i)   ∫_A a(t)ψ(t)dt = 0,
(ii)  a(φ+ψ) is nonnegative (nonpositive) a.e. in A, and
(iii) g(t) = a(t)(φ+ψ)(t) / ∫_A a(s)(φ+ψ)(s)ds  a.e. in A.

Proof.
Part (a) follows directly from Proposition 3.1 and the discussion preceding Proposition 3.1. Part (b) follows from Proposition 3.7 provided we show that a function ψ satisfying the assumptions of Proposition 3.7 exists. We shall construct such a function. Let

(3.29)  ψ(t) = −φ(t) + k/a(t)  on Ã,
        ψ(t) = −φ(t)  on A − Ã,

where Ã ⊂ A is s.t. 0 < μ(Ã) < ∞ and a ≠ 0 on Ã, and k = ∫_A (φa)(s)ds / μ(Ã). We shall show that ψ ∈ N(R^{1/2}) and that a(φ+ψ) is nonnegative (nonpositive). Then

∫_A ψa = ∫_Ã (−φ + k/a)a + ∫_{A−Ã} (−φ)a = −∫_A φa + k μ(Ã) = 0,

and ψ ∈ N(R^{1/2}). Finally,

a(t)(φ+ψ)(t) = k for t ∈ Ã, and = 0 for t ∈ A − Ã,

so a(φ+ψ) is nonnegative (nonpositive), as desired.  □
It might be noted that the choice of g which minimizes e_{r,n}² when R is strictly positive definite satisfies the assumptions of Theorem 2.2. First cg ∈ L₂ since by assumption cg = φ a.e. and φ ∈ L₂(A). We have also

∫_A R(t,t)c²(t)g(t)dt = (∫_A R^{1/2}(t,t)|φ(t)|dt)² ≤ ∫_A R(t,t)dt ∫_A φ²(t)dt < ∞

by the Schwarz Inequality and (1.1.3). (Note that the Lebesgue measure μ(D_c) of the set of discontinuities of c is not necessarily zero. However, since μ(D_c) = 0 was used to show the convergence to zero of a particular term, and since that term is zero for all n under s.r.s., we need not require μ(D_c) = 0.)
Now we turn our attention to the effects of guessing R incorrectly. Note again that the best random design for s.r.s. depends only on the values of R on the diagonal and not on its values off the diagonal. Define e_{r,n}²(S|R) as the m.s.e. when the design chosen is that corresponding to S when R is the true kernel EZ(s)Z(t) of Z. Assume that both S and R are strictly positive definite and continuous and that φ is nonzero a.e. Then

(3.30)  e_{r,n}²(S|R) = (1/n){∫_A S^{1/2}(t,t)|φ(t)|dt ∫_A R(s,s)S^{−1/2}(s,s)|φ(s)|ds − ∬_{A×A} R(s,t)φ(s)φ(t)dsdt}.
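The penalty in (3.30) for designing to the wrong kernel can be probed numerically. A small sketch (the pair R, S below is a hypothetical choice, not from the thesis), confirming that the S-design can do no better than the R-design:

```python
import numpy as np

# Sketch of (3.30) on A = [0,1] with phi = 1: n * e^2_{r,n}(S|R) for the
# assumed true kernel R(s,t) = 1 + min(s,t) and guessed kernel S = 1
# (whose design is the uniform density), versus the optimal R-design.
t = np.linspace(0.0, 1.0, 801)
dt = t[1] - t[0]
R_diag = 1.0 + t                              # R(t,t)
S_diag = np.ones_like(t)                      # S(t,t)
ss, tt = np.meshgrid(t, t)
double = (1.0 + np.minimum(ss, tt)).sum() * dt * dt   # iint R(s,t) ds dt

def n_mse(design_diag):
    """n * e^2_{r,n}(design | R) as in (3.30), via Riemann sums."""
    return (np.sqrt(design_diag).sum() * dt) * \
           ((R_diag / np.sqrt(design_diag)).sum() * dt) - double

mse_S, mse_R = n_mse(S_diag), n_mse(R_diag)
print(mse_S >= mse_R)                         # True: the R-design is best
```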
Let

(3.31)  d(S,T|R) = n |e_{r,n}²(S|R) − e_{r,n}²(T|R)|.

For fixed R, d is a distance function for the equivalence classes of strictly positive definite and continuous kernels:

(3.32)  {T: T(t,t) = k S(t,t) for all t ∈ A, k > 0}.

From Proposition 3.8a we have that any two kernels belonging to the same equivalence class (3.32) produce the same best design g. Thus d is a distance function for the equivalence classes of strictly positive definite kernels producing the same best s.r.s. design. In particular, when both S and T are stationary, we always have d(S,T|R) = 0.

The following propositions consider bounds on d(S,R|R), the increase in the m.s.e. when the design chosen is that corresponding to S instead of R when R is the true kernel EZ(s)Z(t). In Proposition 3.9, we assume that R and S are s.t.

(3.33)  ∫_A S^{1/2}(t,t)|φ(t)|dt = ∫_A R^{1/2}(t,t)|φ(t)|dt

or, in other words, that R and S are normalized representatives of their respective equivalence classes. In Proposition 3.10, we make no such assumptions.
Proposition 3.9. Let R, S be strictly positive definite and continuous and satisfy (3.33), (1.1.2). Assume φ ∈ L₂(A).

(a) (Sup bounds for normalized R, S)

(i)  If R(t,t)φ(t) ∈ L₁(A), then for a constant k₁ depending on R,

d(S,R|R) ≤ k₁ sup_{t∈A} |S^{−1/2}(t,t) − R^{−1/2}(t,t)|.

(ii) If φ(t)S^{−1/2}(t,t) ∈ L₁(A), then for a constant k₂ depending on S,

d(S,R|R) ≤ k₂ sup_{t∈A} |R(t,t) − S(t,t)|.

(b) (L₂ bounds for normalized R, S)

(i)  If R(t,t)φ(t) ∈ L₂(A), then for a constant k₃ depending on R,

d(S,R|R) ≤ k₃ {∫_A (S^{−1/2}(t,t) − R^{−1/2}(t,t))²dt}^{1/2}.

(ii) If φ(t)S^{−1/2}(t,t) ∈ L₂(A), then for a constant k₄ depending on S,

d(S,R|R) ≤ k₄ {∫_A (R(t,t) − S(t,t))²dt}^{1/2}.
Proof. Let C denote the quantity on the R.H.S. (and also L.H.S.) of (3.33). Under the normalization (3.33),

(3.34)  d(S,R|R) = C {∫_A R(t,t)S^{−1/2}(t,t)|φ(t)|dt − C}.

Replace the C in braces by the R.H.S. of (3.33). Then

d(S,R|R) = C ∫_A R(t,t)|φ(t)|(S^{−1/2}(t,t) − R^{−1/2}(t,t))dt,

yielding (a i) and (b i) with k₁ = C ∫_A R(t,t)|φ(t)|dt and k₃ = C (∫_A R²(t,t)φ²(t)dt)^{1/2}. Now replace the C in braces in (3.34) by the L.H.S. of (3.33). Then

d(S,R|R) = C ∫_A |φ(t)|S^{−1/2}(t,t)(R(t,t) − S(t,t))dt,

yielding (a ii) and (b ii) with k₂ = C ∫_A |φ(t)|S^{−1/2}(t,t)dt and k₄ = C (∫_A φ²(t)S^{−1}(t,t)dt)^{1/2}.  □
Through normalization we have attempted to compare the equivalence classes containing R and S. There are, however, many ways to normalize, and it would be well to consider bounds for non-normalized kernels as well. In the following proposition we consider bounds on d for non-normalized R, S and also pointwise convergence for simple mixtures.
Proposition 3.10. Let R, S be strictly positive definite and continuous and satisfy (1.1.2). Assume φ ∈ L₂(A).

(a) (Sup bounds)

(i)  For a constant k₅ depending on R,

d(S,R|R) ≤ k₅ sup_{s,t∈A} |(R(s,s)S(t,t)/(S(s,s)R(t,t)))^{1/2} − 1|.

(ii) If φ(t), φ(t)R(t,t) ∈ L₁(A), then for a constant k₆ depending on R,

d(S,R|R) ≤ k₆ sup_{s,t∈A} |(S(t,t)/S(s,s))^{1/2} − (R(t,t)/R(s,s))^{1/2}|.

(b) (L₂ bounds)

(i)  If φ(t)R^{1/2}(t,t) ∈ L₂(A), then for a constant k₇ depending on R,

d(S,R|R) ≤ k₇ {∬ [(R(s,s)S(t,t)/(S(s,s)R(t,t)))^{1/2} − 1]² dsdt}^{1/2}.

(ii) If φ(t)R(t,t) ∈ L₂(A), then for a constant k₈ depending on R,

d(S,R|R) ≤ k₈ {∬ ((S(t,t)/S(s,s))^{1/2} − (R(t,t)/R(s,s))^{1/2})² dsdt}^{1/2}.

(c) (Pointwise convergence for simple mixtures)

If S(t,t) ≤ M R(t,t) for all t ∈ A and some M, 0 < M < ∞, then as ε → 0 we have, for some constant k₉ depending on R, S,

d((1−ε)R + εS, R|R) ≤ k₉ ε.
Proof. We have

d(S,R|R) = ∬ |φ(s)φ(t)| (R(s,s)S^{1/2}(t,t)S^{−1/2}(s,s) − R^{1/2}(t,t)R^{1/2}(s,s))dsdt
         = ∬ |φ(s)φ(t)| (R(s,s)R(t,t))^{1/2} ((R(s,s)S(t,t)/(S(s,s)R(t,t)))^{1/2} − 1)dsdt,

yielding (a i) and (b i). (a ii) and (b ii) follow from

d(S,R|R) = ∬ |φ(s)φ(t)| R(s,s) ((S(t,t)/S(s,s))^{1/2} − (R(t,t)/R(s,s))^{1/2})dsdt.

For part (c), note that since R, S are strictly positive definite and continuous, so is (1−ε)R + εS for 0 < ε < 1. Then, provided ε is also s.t. ε|S−R| < R (which is satisfied for 0 < ε < min(1, (M−1)⁻¹)), we have, using Taylor series expansions,

d((1−ε)R + εS, R|R) = ∬ |φ(s)φ(t)| R(s,s) Σ_{j=0}^∞ (½ choose j) R^{1/2−j}(t,t)(S−R)^j(t,t)ε^j Σ_{k=0}^∞ (−½ choose k) R^{−1/2−k}(s,s)(S−R)^k(s,s)ε^k dsdt − {∫ R^{1/2}(t,t)|φ(t)|dt}².

Now

Σ_{j=0}^∞ |(½ choose j)| R^{1/2−j}(t,t)|S−R|^j(t,t)ε^j = 2R^{1/2}(t,t) − (R − ε|S−R|)^{1/2}(t,t),

and if 0 < ε < b min(1, (M−1)⁻¹) for some constant b ∈ (0,1), we have ε|S−R|(t,t) ≤ b R(t,t), which implies R − ε|S−R| ≥ (1−b)R, yielding, for 0 < ε < b min(1, (M−1)⁻¹) < min(1, (M−1)⁻¹),

∫ |φ(t)| (2R^{1/2} − (R − ε|S−R|)^{1/2})(t,t)dt ≤ 2{∫ φ²(t)dt ∫ R(t,t)dt}^{1/2} + {∫ φ²(t)dt ∫ ((1+ε)R(t,t) + εS(t,t))dt}^{1/2} < ∞

and

∫ |φ(s)| R(s,s)(R − ε|S−R|)^{−1/2}(s,s)ds ≤ (1−b)^{−1/2} ∫ |φ(s)| R^{1/2}(s,s)ds ≤ {(1−b)^{−1} ∫ φ²(s)ds ∫ R(s,s)ds}^{1/2} < ∞.

Thus, using the bounded convergence theorem, we may interchange the integral and summation signs and also (since we have absolute convergence) rearrange the order of summation to give (c).  □
CHAPTER FOUR
STRATIFIED SAMPLING

4.1. Introduction.

A second method of sampling in which the X_{in}'s are chosen independently of each other is stratified sampling (st.s.). For each n, consider a partition {A_{in}}_{i=1}^n of A. For st.s., we assume that X_{1n},...,X_{nn} are independent random variables and that for each i, i = 1,...,n, X_{in} takes values in A_{in} and has density g_{in}. From (2.2) and the above assumptions, we may write

(4.1.1)  ḡ_n(t) = (1/n) g_{in}(t),  t ∈ A_{in},  i = 1,...,n,

where ḡ_n is a density over A with associated distribution function Ḡ_n.
Then from (2.28) we have

(4.1.2)  e_{st,n}² = ∬_{A×A} R(s,t)(c_nḡ_n − φ)(s)(c_nḡ_n − φ)(t)dsdt
         + (1/n){∫_A R(t,t)(c_n²ḡ_n)(t)dt − n Σ_{i=1}^n ∬_{A_{in}×A_{in}} R(s,t)(c_nḡ_n)(s)(c_nḡ_n)(t)dsdt}
         = e_{1,n} + e_{2,n}, say.
Notice in particular that here D_n concentrates its mass on ∪_{i=1}^n A_{in} × A_{in} rather than on A × A as in s.r.s. Notice also that

n Σ_{i=1}^n ∬_{A_{in}×A_{in}} R(s,t)(c_nḡ_n)(s)(c_nḡ_n)(t)dsdt ≤ n Σ_{i=1}^n ∫_{A_{in}} R(t,t)(c_n²ḡ_n)(t)dt ∫_{A_{in}} ḡ_n(s)ds = ∫_A R(t,t)(c_n²ḡ_n)(t)dt,

and thus e_{2,n} is nonnegative for all n. The first term on the R.H.S. of (4.1.2), e_{1,n}, is a quadratic form and is also nonnegative. Thus e_{st,n}² is the sum of two nonnegative terms, e_{1,n}, e_{2,n}.
First of all, we have the following proposition relating the rate at which the m.s.e. e_{st,n}² converges to zero as n → ∞ to the smoothness of R and c_n. (This will be elaborated upon in Sections 4.2-4.3 for the special case A = [0,1], R positive definite, c_nḡ_n = φ continuous.)
Proposition 4.1.1. Assume c_nḡ_n ∈ L₂(dμ). Then

(4.1.3)  e_{st,n}² ≤ e_{1,n} + e_{3,n}

where

(4.1.4)  e_{3,n} = n⁻² Σ_{i=1}^n (ess sup − ess inf)_{s,t∈A_{in}} (R(s,t)c_n(s)c_n(t)).

If R is uniformly continuous a.e. on A × A, {c_n}_n is equicontinuous a.e. on A, max_i ess diam(A_{in}) → 0 as n → ∞ (where ess diam(A_{in}) is the essential supremum of the Euclidean distance between s, t ∈ A_{in}), and ||R^{1/2}h_n||² = o(n⁻¹), where h_n = c_nḡ_n − φ and the norm is the L₂-norm, then we have as n → ∞

(4.1.5)  n e_{st,n}² → 0.

Proof.
From (4.1.2) we have

n e_{2,n} = Σ_{i=1}^n {∫_{A_{in}} R(t,t)c_n²(t)ḡ_n(t)dt − n ∬_{A_{in}×A_{in}} R(s,t)c_n(s)c_n(t)ḡ_n(s)ḡ_n(t)dsdt}
  ≤ Σ_{i=1}^n (1/n)(ess sup_{t∈A_{in}} R(t,t)c_n²(t) − ess inf_{s,t∈A_{in}} R(s,t)c_n(s)c_n(t))
  = (1/n) Σ_{i=1}^n (ess sup − ess inf)_{s,t∈A_{in}} (R(s,t)c_n(s)c_n(t))

by the positive definiteness of R. This with (4.1.2) yields (4.1.3).

The a.e. uniform continuity of R and the a.e. equicontinuity of c_n imply that R(s,t)c_n(s)c_n(t) is equicontinuous a.e. Thus for any ε > 0 there exists δ(ε) > 0 s.t. ess diam(B) < δ implies

(ess sup − ess inf)_{s,t∈B} (R(s,t)c_n(s)c_n(t)) < ε.

Thus if ess diam(A_{in}) < δ for i = 1,...,n, we have n e_{3,n} < ε. The condition on c_nḡ_n yields n e_{1,n} = o(1). Thus for all n s.t. max_i ess diam(A_{in}) < δ(ε),

n e_{st,n}² ≤ o(1) + ε,

which implies lim sup_{n→∞} n e_{st,n}² ≤ ε. Now ε may be made arbitrarily small since max_i ess diam(A_{in}) → 0, yielding (4.1.5).  □
Observe that the conclusions of this proposition do not depend on the dimension of A and may thus be thought of as yielding a lower bound on the rate at which the m.s.e. converges to zero (if indeed it does) for A of any dimension. Recall that Tubilla (1.2.15) obtains an exact rate, n^{−(1+2/d)}, of convergence of the m.s.e. to zero for A the d-dimensional unit cube and with R, c_n, A_{in} satisfying the assumptions leading to (4.1.5) plus some additional assumptions, including the weak stationarity and continuous differentiability of R. Also note that we have from (4.1.3), (4.1.4)

n e_{st,n}² ≤ n e_{1,n} + (ess sup − ess inf)_{s,t∈A} (R(s,t)c_n(s)c_n(t)).
Now if |R| is bounded a.e. on A × A by M₁, say, and if for all n, |c_n| is less than M₂, say, for a.e. t in A, and n e_{1,n} is less than M₃, say, where M₂, M₃ do not depend on n, then no matter how bad the design may be in other respects or how high or low the dimensionality of A, we have, letting M = M₃ + 2M₁M₂² < ∞,

(4.1.6)  e_{st,n}² ≤ M/n;

that is, e_{st,n}² converges to zero at least as fast as M/n. (Note in particular that n e_{1,n} = 0 if c_nḡ_n − φ ∈ N(R^{1/2}).)
We would like to choose a partition {A_{in}}_{i=1}^n, a sampling density ḡ_n, and a weighting function c_n so as to minimize e_{st,n}²; however, this seems too complicated a problem. Notice that e_{1,n} depends on c_n, ḡ_n only through their product and does not depend on the partition, while e_{2,n} depends on the partition as well as on c_n and ḡ_n and is thus the more important term. Also, e_{1,n} takes its minimum value zero when c_nḡ_n − φ ∈ N(R^{1/2}), while e_{2,n} takes its minimum value zero when c_n = 0, which is not an admissible value for c_n. Recall that, by Theorem 2.2, if c_nḡ_n converges appropriately to cg and if certain further conditions are satisfied, then the m.s.e. converges to zero iff cg − φ ∈ N(R^{1/2}). We are thus led to choose c_n for each n s.t. c_nḡ_n − φ = ψ_n ∈ N(R^{1/2}). This may be thought of as an asymptotically best choice in view of the asymptotic nature of Theorem 2.2. Then (4.1.2) becomes

(4.1.7)  e_{st,n}² = (1/n){∫_A R(t,t)(φ+ψ_n)²(t)/ḡ_n(t) dt − n Σ_{i=1}^n ∬_{A_{in}×A_{in}} R(s,t)φ(s)φ(t)dsdt}

where ψ_n ∈ N(R^{1/2}). Note the similarity between this expression and the corresponding one for random sampling (3.3) and also the dependence of this expression on ḡ_n and the partition.
Later we shall also assume that R is strictly positive definite, that is, c_nḡ_n = φ for all n. Under these assumptions the difficult problem of minimizing e_{st,n}² with respect to both c_n and ḡ_n reduces to the much easier problem of minimizing with respect to just ḡ_n, for a fixed partition of A. The following proposition indicates that under these assumptions, for a fixed partition, the optimal distribution of X_{in} over A_{in}, i = 1,...,n, does not depend on the partition. The problem of finding an optimal partition is also discussed, and an "asymptotically optimal" sequence of partitions is obtained in Sections 4.2-4.3 under certain conditions when A = [0,1]. Examples are given in Section 4.4. Finally, in Section 4.5 we discuss the increase in the m.s.e. when the design chosen is that corresponding to S, say, instead of R when R is the kernel EZ(s)Z(t).
Proposition 4.1.2. For fixed R, φ, ψ_n ∈ N(R^{1/2}), and partition {A_{in}}_i of A,

(4.1.8)  ∫_A R(t,t)(φ+ψ_n)²(t)/ḡ_n(t) dt ≥ n Σ_{i=1}^n {∫_{A_{in}} R^{1/2}(t,t)|φ+ψ_n|(t)dt}²

for all densities ḡ_n of the form (4.1.1), with equality iff

(4.1.9)  ḡ_n(t) = R^{1/2}(t,t)|φ+ψ_n|(t) / (n ∫_{A_{in}} R^{1/2}(s,s)|φ+ψ_n|(s)ds),  a.e. t ∈ A_{in},  i = 1,...,n.
Proof. From (4.1.1) we have

∫_A R(t,t)(φ+ψ_n)²(t)/ḡ_n(t) dt = n Σ_{i=1}^n ∫_{A_{in}} R(t,t)(φ+ψ_n)²(t)/g_{in}(t) dt.

Note that by hypothesis, g_{in} (whose restriction to A_{in} is a density over A_{in}) is functionally independent of g_{jn} for i ≠ j. Thus we may minimize this expression term by term. Proposition 3.1 implies

∫_{A_{in}} R(t,t)(φ+ψ_n)²(t)/g_{in}(t) dt ≥ {∫_{A_{in}} R^{1/2}(t,t)|φ+ψ_n|(t)dt}²

with equality iff

(4.1.10)  g_{in}(t) = R^{1/2}(t,t)|φ+ψ_n|(t) / ∫_{A_{in}} R^{1/2}(s,s)|φ+ψ_n|(s)ds,  a.e. t ∈ A_{in},

for each i, i = 1,...,n. Thus by definition of ḡ_n, we have (4.1.8) with equality iff (4.1.9) holds.  □
Thus when R is strictly positive definite, e_{st,n}², given by (4.1.7), is minimized for a fixed partition if the restriction of g_{in} to A_{in} is proportional on A_{in} to the square root of the second moment of Zφ, i = 1,...,n. (Recall that for s.r.s. the best design is obtained when the (common) density for each X_{in} is proportional on A to this same quantity.) When R is strictly positive definite and ḡ_n is given by (4.1.9), or equivalently, g_{in} is given by (4.1.10), we have
(4.1.11)  e_{st,n}² = Σ_{i=1}^n {(∫_{A_{in}} R^{1/2}(t,t)|φ(t)|dt)² − ∬_{A_{in}×A_{in}} R(s,t)φ(s)φ(t)dsdt}

and

(4.1.12)  Z̄_n = Σ_{i=1}^n (Z(X_{in}) sgn(φ(X_{in})) / R^{1/2}(X_{in},X_{in})) ∫_{A_{in}} R^{1/2}(t,t)|φ(t)|dt.

As with s.r.s., the best choice of g_{in}, i = 1,...,n (for a fixed partition), and the corresponding estimator Z̄_n depend only on the values of R on the diagonal. The full structure of R enters only in the value of the m.s.e.; however, the full structure of R may enter in the choice of a best partition. Note that the distribution of X_{in} over A_{in}, i = 1,...,n, has density proportional to the square root of the second moment of Zφ, and in the calculation of Z̄_n the observed value of Z at X_{in} is weighted by the integral over A_{in} of the square root of the second moment of Zφ divided by the square root of the second moment of Z at X_{in}, i = 1,...,n.
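As an illustration of (4.1.10)-(4.1.12), the following sketch (assumptions, not the thesis's code: A = [0,1], Brownian motion with R(t,t) = t, φ = 1, equal-length strata) samples each X_{in} from a density proportional to √t within its stratum and applies the stratified weights. For a path of the rank-one form z(t) = c√t the estimator returns ∫₀¹ z = 2c/3 exactly, whatever the draws:

```python
import numpy as np

# Sketch of the best stratified design (4.1.9)-(4.1.12) on A = [0,1] for
# Brownian motion, R(t,t) = t, phi = 1, with the equal partition
# A_i = [(i-1)/n, i/n].  Within A_i the density is proportional to sqrt(t)
# and the observation at X_i is weighted by int_{A_i} sqrt(t) dt / sqrt(X_i).
rng = np.random.default_rng(1)

def stratified_estimate(z, n):
    est = 0.0
    for i in range(n):
        lo, hi = i / n, (i + 1) / n
        # sample X_i from density prop. to sqrt(t) on [lo, hi] by inversion:
        # CDF prop. to t^{3/2}, so X = (lo^{3/2} + U (hi^{3/2}-lo^{3/2}))^{2/3}
        u = rng.random()
        x = (lo**1.5 + u * (hi**1.5 - lo**1.5)) ** (2.0 / 3.0)
        w = (2.0 / 3.0) * (hi**1.5 - lo**1.5)   # int_{A_i} sqrt(t) dt
        est += z(x) * w / np.sqrt(x)            # (4.1.12) with sgn(phi) = 1
    return est

c = 5.0
est = stratified_estimate(lambda t: c * np.sqrt(t), 10)
print(abs(est - 2 * c / 3) < 1e-9)              # True, for any random draws
```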
4.2. Regular sequences of partitions.

We would like to find a partition {A_{in}}_{i=1}^n of A to minimize (4.1.11) for each n. In attempting to do this, however, we encounter some of the same problems encountered in attempting to find an optimal design for the problem considered by Sacks and Ylvisaker. Thus, following the approach of Sacks and Ylvisaker, we attempt to find an "asymptotically optimal" sequence of partitions. We consider the one-dimensional case only (and in fact take A = [0,1]) and consider only partitions s.t. A_{in} = [t_{i−1,n}, t_{i,n}] for t_{i−1,n} < t_{i,n}, i = 1,...,n. (For A = [0,1], t_{0n} = 0, t_{nn} = 1.) Asymptotic optimality is defined and asymptotically optimal sequences of partitions are obtained under certain conditions on φ and R in Section 4.3. In this section we obtain asymptotic results for a class of partitions, regular partitions, defined below. The asymptotically optimal sequences of partitions obtained in Section 4.3 belong to this class of partitions.
Definition 4.2.1. A sequence of partitions {A_{in}}_{i=1}^n, n = 1,2,..., of A = [0,1], with A_{in} = [t_{i−1,n}, t_{i,n}], i = 1,...,n, is called a regular sequence of partitions generated by h (RP(h)), for some continuous density h on [0,1], if {t_{0n},...,t_{nn}} form a RS(h); that is (Sacks and Ylvisaker, 1968), t_{0n} = 0, t_{nn} = 1, and t_{in}, i = 1,...,n−1, is the smallest value (in case of ambiguity) satisfying

(4.2.1)  ∫_0^{t_{in}} h(t)dt = i/n.
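The construction in (4.2.1) simply takes the i/n-quantiles of the generating density h. A short sketch (the density h(t) = 2t is a hypothetical choice, not from the thesis):

```python
import numpy as np

# Sketch of Definition 4.2.1: the RS(h) points t_{i,n} are i/n-quantiles
# of the generating density h.  Here h(t) = 2t on [0,1], with CDF
# H(t) = t**2, so t_{i,n} = sqrt(i/n).
def rs_points(n, H_inv):
    """Endpoints t_{0,n} = 0, ..., t_{n,n} = 1 of the regular partition."""
    return np.array([H_inv(i / n) for i in range(n + 1)])

t = rs_points(4, np.sqrt)          # inverse CDF for h(t) = 2t
masses = t[1:]**2 - t[:-1]**2      # H(t_i) - H(t_{i-1}): h-mass per cell
print(np.allclose(masses, 0.25))   # True: each cell carries mass 1/n
```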
Note that if the sequence of partitions {A_{in}}_i is RP(h), then Ḡ_n converges weakly to H with density h, as follows. Let f be a bounded continuous function on [0,1]. Then from (4.1.1) we have, by the Mean Value Theorem, where x_{in} ∈ [t_{i−1,n}, t_{i,n}], i = 1,...,n,

∫₀¹ f(t)dḠ_n(t) = Σ_{i=1}^n (1/n) ∫_{t_{i−1,n}}^{t_{i,n}} f(t)g_{in}(t)dt = Σ_{i=1}^n (1/n) f(x_{in})

(4.2.2)  = Σ_{i=1}^n f(x_{in}) ∫_{t_{i−1,n}}^{t_{i,n}} h(t)dt = ∫₀¹ Σ_{i=1}^n f(x_{in}) I_{(t_{i−1,n}, t_{i,n})}(t) h(t)dt → ∫₀¹ f(t)h(t)dt  as n → ∞

by the Bounded Convergence Theorem, since f_n(t) = Σ_{i=1}^n f(x_{in}) I_{(t_{i−1,n}, t_{i,n})}(t) h(t) is bounded in absolute value by the integrable function h(t) sup_t |f(t)| and converges a.e. in [0,1] to fh (for Σ_{i=1}^n f(x_{in}) I_{(t_{i−1,n}, t_{i,n})} converges pointwise to f on [0,1] less the union of the at most countable number of intervals where H is constant and thus h is zero). The weak convergence of Ḡ_n to some absolutely continuous distribution function is one of the assumptions in Theorem 2.2.
Define also

r(t) = R^{1/2}(t,t),
R^{jk}(s,t) = ∂^{j+k}R(s,t) / ∂s^j∂t^k,
R₋^{j0}(t,t) = lim_{s→t−} R^{j0}(s,t),
R₊^{j0}(t,t) = lim_{s→t+} R^{j0}(s,t),

(4.2.3)  α₁(t) = R₋^{10}(t,t) − R₊^{10}(t,t),  α₂(t) = r(t)r″(t) − R^{20}(t,t),

provided the derivatives and limits exist, where s,t ∈ (0,1).
We shall consider the following two sets of assumptions:

(4.2.4:1)  R^{10}(s,t) exists and is continuous in the complement of the diagonal on (0,1) × (0,1). r′(t) exists and is continuous on (0,1). r(t)r′(t), R₋^{10}(t,t), R₊^{10}(t,t) exist and may be extended to continuous and bounded functions on [0,1].

(4.2.4:2)  R^{10}(s,t), R^{11}(s,t), R^{20}(s,t) exist and are continuous on (0,1) × (0,1). r″(t) exists and is continuous on (0,1). r(t)r″(t), R^{20}(t,t) may be extended to continuous and bounded functions on [0,1].
Assumption (4.2.4:1) implies Z does not necessarily possess a quadratic mean derivative, while (4.2.4:2) implies Z has a quadratic mean derivative for all t. Also, R(s,t) = min(s,t) satisfies Assumptions A, B, C of Sacks and Ylvisaker (1966) and the similar assumption (4.2.4:1), but not (4.2.4:2). Further, we have by definition that α₁ under (4.2.4:1) and α₂ under (4.2.4:2) are continuous and bounded on [0,1]. Also, since R₋^{10}(t,t), R₊^{10}(t,t) exist and are continuous, α₁ is nonnegative on (0,1) (and thus on [0,1]), for it may be written as

(4.2.5)  α₁(t) = lim_{v−u→0+, u,v→t} [(R(v,v) − R(u,v))/(v − u) − (R(u,u) − R(v,u))/(u − v)]
       = lim_{v−u→0+, u,v→t} (R(v,v) − 2R(u,v) + R(u,u))/(v − u) ≥ 0.

Finally, the existence of R^{10} under (4.2.4:2) immediately implies that α₁ is identically zero.
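For Brownian motion the jump α₁ can be seen numerically: with R(s,t) = min(s,t), the one-sided diagonal derivatives are R₋^{10} = 1 and R₊^{10} = 0, so α₁ ≡ 1 (a quick finite-difference check, not from the thesis):

```python
import numpy as np

# alpha_1 of (4.2.3) for R(s,t) = min(s,t): one-sided s-derivatives at the
# diagonal are R_-^{10} = 1 and R_+^{10} = 0, so alpha_1(t) = 1, consistent
# with Z having no q.m. derivative under (4.2.4:1).
R = lambda s, t: np.minimum(s, t)
t0, eps = 0.5, 1e-6
left  = (R(t0 - eps, t0) - R(t0 - 2 * eps, t0)) / eps   # ~ R_-^{10}(t0,t0)
right = (R(t0 + 2 * eps, t0) - R(t0 + eps, t0)) / eps   # ~ R_+^{10}(t0,t0)
alpha1 = left - right
print(round(float(alpha1), 6))   # 1.0
```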
We shall show (Theorem 4.2.6) for e_{st,n}² given by (4.1.11) that under (4.2.4:k), k = 1,2, and certain additional assumptions, for a RP(h),

(4.2.6)  lim_{n→∞} n^{k+1} e_{st,n}² = (1/(k+2)!) ∫_{h>0} α_k(t)φ²(t)h^{−(k+1)}(t)dt,

where the integrand is taken to be zero for those t s.t. φ(t)h(t) = 0. Note the similarity between (4.2.6) with k = 1 and (1.3.3) due to Sacks and Ylvisaker, which holds for nonrandom designs and for R satisfying Assumptions A, B, C, which are similar to the assumptions in (4.2.4:1).
The following lemma shows that we need not consider separately the cases where Z has exactly one continuous q.m. derivative, exactly two, exactly three, and so forth, as do Sacks and Ylvisaker, who obtain different rates of convergence of the m.s.e. for non-random designs depending on the number of continuous q.m. derivatives of Z. (See Sacks and Ylvisaker, 1970b.) This follows since, if φ is nonzero a.e. on some sub-interval (a,b) of (0,1), the lemma implies that α₂ is strictly positive on a set of positive measure in (a,b) and thus that the R.H.S. of (4.2.6) is greater than zero for k = 2. Thus the result (4.2.6) with k = 2 is the best possible, and the rate of convergence of e_{st,n}² to zero cannot be improved under any stronger smoothness assumptions on R beyond (4.2.4:2).
Lemma 4.2.2. If (4.2.4:2) is satisfied and R is strictly positive definite, then α₂ is nonnegative on [0,1] and, for all 0 ≤ a < b ≤ 1, μ{t ∈ (a,b): α₂(t) > 0} > 0.

Proof.
2
We have for all t € (0,1)
2
} a. (t) = r(t)r"(t) - R 0(t,t)
2
s.t.
R(t,t)
= RI / 2(t,t) [R- l / 2 (t,t) (R20 (t,t)
+
r
0,
Rll(t,t))
_ R- 3/ 2 (t,t)(RlO (t,t))2] _ R20 (t,t)
(4.2.7)
= R-l(t,t)[1 IRC·,t)1, 12.1 \ROl(·,t)1 12_ <R(.,t),ROlC·,t)
> ]
62
where the nonns and inner products are those of H(R), the reproducing
kernel Hilbert space of R.
for all t
s.t.
R(t,t)
(4.2.8)
Thus by the Schwarz Inequality, aCt)
r°
~
r
°
=°
The positive definiteness of R implies that R(t,t)
for all except possibly one t
(12
°
with equality iff
f(t)R(',t) + ROl(',t)
for some f.
~
° for all
t
Nowassume (12
€
€
(0,1).
Thus by the continuity of a 2 ,
[0,1].
is identically zero on some interval
(a,b)
c
(0,1).
Then by (4.2.8),
s
t
R(s,t) = R(c,c) exp(-I c f(u)du - I c f(u)du)
forall
s, t
in (a,b) - a contradiction since R is strictly positive
definite.
Thus (12
(0,1).
is continuous, the zeroes of a 2 form a closed
[a,b], and thus (12 is strictly positive on same subset of
(a,b)
in
Since (12
subset of
(a,b)
is not identically zero on any interval
of positive Lebesgue measure.
0
The following lemma yields a lower bound on lim inf n^{k+1} e_{st,n}², k = 1,2, under conditions more general than those of Theorem 4.2.6.

Lemma 4.2.3. Consider e_{st,n}² as given by (4.1.11), where R is strictly positive definite and continuous on A × A = [0,1] × [0,1]. Assume φ and h are continuous on [0,1] and (4.2.4:k), k = 1,2, holds. Then for a RP(h),

(4.2.9)  lim inf_{n→∞} n^{k+1} e_{st,n}² ≥ (1/(k+2)!) ∫_{h>0} α_k(t)φ²(t)h^{−(k+1)}(t)dt.
63
Proof. Following the proof of Lemma 3.3 in Sacks and Ylvisaker (1968): Let ε > 0. For fixed n, let I₁(ε,n) = {i: φ ≠ 0, h > ε on (t_i, t_{i+1})} and let I₂(ε,n) = {0,1,...,n−1} − I₁(ε,n). Let S_n(ε) ⊃ T_n = {t₀, t₁,...,t_n} be s.t.

(4.2.10)  if i ∈ I₁, there exists an r s.t. s_r = t_i, s_{r+1} = t_{i+1};
          if i ∈ I₂, there exist s_r, s_{r+1},...,s_{r+m_i} s.t. s_r = t_i, s_{r+m_i} = t_{i+1}, and s_{j+1} − s_j ≤ γ_n n^{−3}, r ≤ j < r + m_i, where γ_n → 0.

Let u_n + 1 denote the cardinality of S_n, and let J₁(ε,n), J₂(ε,n) denote the sets of indices j for which [s_j, s_{j+1}] is contained in [t_i, t_{i+1}] for some i ∈ I₁, i ∈ I₂, respectively. For j ∈ J₁(ε,n) we have by the Mean Value Theorem

(4.2.11)  1/n = ∫_{t_i}^{t_{i+1}} h(t)dt = ∫_{s_j}^{s_{j+1}} h(t)dt = h(w_j)(s_{j+1} − s_j) > ε(s_{j+1} − s_j)

where w_j ∈ [s_j, s_{j+1}]. For j ∈ J₂(ε,n) we have s_{j+1} − s_j ≤ γ_n n^{−3}. Thus sup_j (s_{j+1} − s_j) → 0. Let
(4.2.12)  W(s,t) = |φ(s)φ(t)|√(R(s,s)R(t,t)) − φ(s)φ(t)R(s,t).

Then

(4.2.13)  e²_{st,n} = Σ_{i=0}^{n−1} ∫_{t_i}^{t_{i+1}}∫_{t_i}^{t_{i+1}} W(s,t) ds dt
         ≥ (Σ_{J₁} + Σ_{J₂}) ∫_{s_j}^{s_{j+1}}∫_{s_j}^{s_{j+1}} W(s,t) ds dt = E₁(ε,n) + E₂(ε,n), say,
where the inequality follows since W(s,t) is nonnegative for all s,t. We shall first show that E₂(ε,n) = o(n^{−3}) as n → ∞. Since R and φ are continuous and A is compact, W is bounded by M, say. Thus by (4.2.10) we have

(4.2.14)  E₂(ε,n) = Σ_{J₂} ∫_{s_j}^{s_{j+1}}∫_{s_j}^{s_{j+1}} W(s,t) ds dt ≤ M Σ_{J₂} (s_{j+1} − s_j)²
         ≤ M γ_n n^{−3} Σ_{J₂} (s_{j+1} − s_j) ≤ M γ_n n^{−3} = o(n^{−3}), as n → ∞.

Now consider E₁(ε,n). For j ∈ J₁ and s,t ∈ [s_j, s_{j+1}] we have
(4.2.15)  W(s,t) = φ(s)φ(t)[r(s)r(t) − R(s,t)].

Note that the expression in brackets is nonnegative for all s,t. Since φ is continuous we may apply the Mean Value Theorem twice to obtain

(4.2.16)  ∫_{s_j}^{s_{j+1}}∫_{s_j}^{s_{j+1}} W(s,t) ds dt = ∫_{s_j}^{s_{j+1}}∫_{s_j}^{s_{j+1}} φ(s)φ(t)[r(s)r(t) − R(s,t)] ds dt
         = φ(u_j)φ(v_j) ∫_{s_j}^{s_{j+1}}∫_{s_j}^{s_{j+1}} [r(s)r(t) − R(s,t)] ds dt

where u_j, v_j ∈ [s_j, s_{j+1}], j ∈ J₁(ε,n). Thus, for k = 1,2,

(4.2.17)  n^{k+1} e²_{st,n} ≥ n^{k+1} Σ_{J₁} φ(u_j)φ(v_j) ∫_{s_j}^{s_{j+1}}∫_{s_j}^{s_{j+1}} [r(s)r(t) − R(s,t)] ds dt + o(n^{k−2}).
Assume (4.2.4:1) holds. Then r'(t), R₋^{10}(t,t), R₊^{10}(t,t), and R^{10}(s,t), s ≠ t, exist and are continuous. Thus for s_j < s < t < s_{j+1} we may expand r(s)r(t) − R(s,t) in a one-sided Taylor series about s = t to obtain

(4.2.18)  r(s)r(t) − R(s,t) = r(t)[r(t) + (s−t)r'(x_{st})] − [R(t,t) + (s−t)R₋^{10}(x_{st},t)]
         = (s−t)(r(t)r'(x_{st}) − R₋^{10}(x_{st},t))

where s ≤ x_{st} ≤ t. Similarly for s_j < t < s < s_{j+1} we have

(4.2.19)  r(s)r(t) − R(s,t) = (s−t)(r(t)r'(y_{ts}) − R₊^{10}(y_{ts},t))

where t ≤ y_{ts} ≤ s. Note that since r(t), r'(t), R(s,t), R₋^{10}(s,t), R₊^{10}(s,t) are continuous functions of s and t for s ≠ t, it follows that x_{st}, y_{ts} may be chosen to be continuous in s,t. Thus via (4.2.18)-(4.2.19) and the Mean Value Theorem,

(4.2.20)  ∫_{s_j}^{s_{j+1}}∫_{s_j}^{s_{j+1}} [r(s)r(t) − R(s,t)] ds dt = (1/3!)(s_{j+1} − s_j)³ f₁(a_j,b_j,x_j,y_j)

where a_j, b_j, x_j, y_j ∈ [s_j, s_{j+1}], j ∈ J₁(ε,n), f₁(a_j,b_j,x_j,y_j) = R₋^{10}(x_j,a_j) − r(a_j)r'(x_j) + r(b_j)r'(y_j) − R₊^{10}(y_j,b_j), and we may write

(4.2.21)  f₁(a_j,b_j,x_j,y_j) = α₁(b_j) + [R₋^{10}(x_j,a_j) − R₋^{10}(b_j,b_j)] + [R₊^{10}(b_j,b_j) − R₊^{10}(y_j,b_j)]
         + [r(a_j)(r'(a_j) − r'(x_j))] + [r(b_j)r'(b_j) − r(a_j)r'(a_j)] + [r(b_j)(r'(y_j) − r'(b_j))]

where the terms in brackets converge to zero as s_{j+1} − s_j → 0 by the continuity of R₋^{10}, R₊^{10}, rr', r' and the boundedness of r.

Now assume (4.2.4:2) holds. Then we may expand r(s)r(t) − R(s,t) in a Taylor series about s = t to obtain

(4.2.22)  r(s)r(t) − R(s,t) = (s−t)(r(t)r'(t) − R^{10}(t,t)) + ½(s−t)²(r(t)r''(x_{st}) − R^{20}(x_{st},t))

where x_{st} is between s and t (inclusive).
Note that for all t s.t. R(t,t) ≠ 0, that is, for all except possibly one point in [0,1] since R is positive definite, r(t)r'(t) − R^{10}(t,t) = 0. By continuity this expression is then zero for all t, and we have

(4.2.23)  r(s)r(t) − R(s,t) = ½(s−t)²(r(t)r''(x_{st}) − R^{20}(x_{st},t)).

Since r(t), r''(t), R(s,t), R^{20}(s,t) are continuous functions of s and t, it follows that x_{st} may be chosen to be continuous. Then via (4.2.23) and the Mean Value Theorem, we have for j ∈ J₁(ε,n), as before,

(4.2.24)  ∫_{s_j}^{s_{j+1}}∫_{s_j}^{s_{j+1}} {r(s)r(t) − R(s,t)} ds dt
         = ½ ∫_{s_j}^{s_{j+1}}∫_{s_j}^{t} (s−t)²(r(t)r''(x_{st}) − R^{20}(x_{st},t)) ds dt
           + ½ ∫_{s_j}^{s_{j+1}}∫_{t}^{s_{j+1}} (s−t)²(r(t)r''(x_{st}) − R^{20}(x_{st},t)) ds dt
         = (1/24)(s_{j+1} − s_j)⁴ [r(a_j)r''(x_j) − R^{20}(x_j,a_j) + r(b_j)r''(y_j) − R^{20}(y_j,b_j)]

where a_j, b_j, x_j, y_j ∈ [s_j, s_{j+1}], j ∈ J₁(ε,n), so that the bracketed expression is f₂(a_j,b_j,x_j,y_j). Note that we may write

(4.2.25)  f₂(a_j,b_j,x_j,y_j) = α₂(b_j) + [R^{20}(b_j,b_j) − R^{20}(x_j,a_j)] + [R^{20}(b_j,b_j) − R^{20}(y_j,b_j)]
         + [r(a_j)(r''(x_j) − r''(a_j))] + [r(a_j)r''(a_j) − r(b_j)r''(b_j)] + [r(b_j)(r''(y_j) − r''(b_j))]

where the terms in brackets converge to zero as (s_{j+1} − s_j) → 0 by continuity assumptions and the boundedness of r. Also note that by (4.2.20), (4.2.24) and the nonnegativity of r(s)r(t) − R(s,t), f_k(a_j,b_j,x_j,y_j) is nonnegative, k = 1,2.
Thus from (4.2.16), (4.2.20), (4.2.24), we have for k = 1,2,

(4.2.26)  n^{k+1} E₁(ε,n) = (n^{k+1}/(k+2)!) Σ_{J₁} φ(u_j)φ(v_j)(s_{j+1} − s_j)^{k+2} f_k(a_j,b_j,x_j,y_j)
         = (1/(k+2)!) Σ_{J₁} φ(u_j)φ(v_j) f_k(a_j,b_j,x_j,y_j) (s_{j+1} − s_j)/h^{k+1}(w_j)

upon use of (4.2.11). This with (4.2.13)-(4.2.14) yields

(4.2.27)  n^{k+1} e²_{st,n} ≥ (1/(k+2)!) Σ_{J₁} φ(u_j)φ(v_j) f_k(a_j,b_j,x_j,y_j) (s_{j+1} − s_j)/h^{k+1}(w_j) + o(n^{k−2}).
Finally by (4.2.21), (4.2.25) and the boundedness of r, φ, h^{−1} on [t_j, t_{j+1}], j ∈ I₁(ε,n), uniformly in n, we have upon letting n → ∞

(4.2.28)  lim inf_{n→∞} n^{k+1} e²_{st,n} ≥ (1/(k+2)!) ∫_{h>ε, φ≠0} α_k(t)φ²(t)h^{−(k+1)}(t) dt.

Letting ε → 0 yields the desired result. (Note that we omit the condition φ ≠ 0 in (4.2.9) since the integrand is zero when φ = 0.) □

The integral in (4.2.9) is over the set {t ∈ A: h(t) ≠ 0, (α_kφ)(t) ≠ 0}, k = 1,2. We now show that (4.2.9) remains valid if the integral is taken over the set for which h ≠ 0 or α_kφ ≠ 0, k = 1,2. (Note that since α_k and φ are bounded, α_kφ ≠ 0 is equivalent to α_k ≠ 0, φ ≠ 0.) It follows immediately that h ≠ 0, α_kφ ≠ 0 may be replaced by h ≠ 0 in (4.2.9) since the integrand is zero when α_kφ = 0. We show in the following lemma that (4.2.9) remains valid if we allow h(t) = 0 when (α_kφ)(t) ≠ 0. This with Lemma 4.2.3 then implies that under the hypotheses of Lemma 4.2.3, for k = 1,2,

(4.2.29)  lim inf_{n→∞} n^{k+1} e²_{st,n} ≥ (1/(k+2)!) ∫_{h≠0 or α_kφ≠0} α_k(t)φ²(t)h^{−(k+1)}(t) dt

where the integral may be finite or infinite and where the integrand is taken to be zero for those t s.t. (α_kφ)(t) = h(t) = 0.
Lemma 4.2.4. Let e²_{st,n}, R, φ, h be as in Lemma 4.2.3, and let B_k = {t ∈ A: h(t) = 0, (α_kφ)(t) ≠ 0}, k = 1,2. For k = 1,2, assume (4.2.4:k) holds and μ(B_k) > 0. Then for a RP(h), n^{k+1} e²_{st,n} → ∞ as n → ∞.

Proof. (This proof follows the lines of that of Lemma 3.4 in Sacks and Ylvisaker, 1968.) Let ε > 0 and fix k = 1,2. Assume (4.2.4:k) holds. Assume also that ε is small enough s.t. μ{t: |φ(t)| > ε, α_k(t) > ε} > 0. If there is no i = 0,1,...,n−1 s.t. |φ(t)| > ε, α_k(t) > ε for all t ∈ [t_i, t_{i+1}], let u, v be s.t. |φ(t)| > ε, α_k(t) > ε for all t ∈ [u,v] and s.t. t_l < u < v < t_{l+1} for some l = 0,1,...,n−1, and let T'_n = {t'_0, t'_1,...,t'_{n+2}} = {t_0,...,t_l, u, v, t_{l+1},...,t_n}. If there is an i s.t. |φ(t)| > ε, α_k(t) > ε for all t ∈ [t_i, t_{i+1}], let T'_n = T_n. Let n' denote the cardinality of T'_n. (Note that if {t: (α_kφ)(t) ≠ 0} ⊂ {t: h(t) = 0}, there exists no i, n, or ε s.t. |φ(t)| > ε, α_k(t) > ε for all t ∈ [t_i, t_{i+1}]. This follows since the t_i form a RS(h).) Let I₁(ε,n:k) = {i: |φ(t)| > ε, α_k(t) > ε for all t ∈ [t'_i, t'_{i+1}]}, I₂(ε,n:k) = {0,1,...,n'−1} − I₁(ε,n:k). Define S_n(ε:k), J₁(ε,n:k), J₂(ε,n:k), E₁(ε,n:k), E₂(ε,n:k) as in Lemma 4.2.3 using the present definition of I₁.
First of all observe that by the same reasoning as in (4.2.13) we still have e²_{st,n} ≥ E₁(ε,n:k) + E₂(ε,n:k). (In the second line of (4.2.13), however, the equality is changed to a "≥".) Note that by construction I₁ is not empty. Under these definitions we may have sup_j(s_{j+1} − s_j) ↛ 0. We have (4.2.17), (4.2.26) as before since |φ(t)| > ε implies φ(t) ≠ 0 and since neither h > ε nor T_n = RS(h) was used in obtaining (4.2.17), (4.2.26).

Let {n_m} be a subsequence s.t. along {n_m}, sup_j(s_{j+1} − s_j) → δ > 0. Then there exists a sequence {j_m} s.t. s_{j_m+1} − s_{j_m} → δ > 0. From (4.2.17) we then obtain

(4.2.30)  n_m^{k+1} e²_{st,n_m} ≥ n_m^{k+1} φ(u_{j_m})φ(v_{j_m}) ∫_{s_{j_m}}^{s_{j_m+1}}∫_{s_{j_m}}^{s_{j_m+1}} [r(s)r(t) − R(s,t)] ds dt + o(n_m^{k−2}).

Now the strict positive definiteness of R and (s_{j_m+1} − s_{j_m}) → δ > 0 imply that the integral in (4.2.30) is positive and bounded away from zero, uniformly in m. Thus we have as m → ∞,

(4.2.31)  n_m^{k+1} e²_{st,n_m} → ∞.

Now let
{n_m} be a subsequence s.t. along {n_m}, sup_j(s_{j+1} − s_j) → 0. Let ε be small enough s.t. μ{t ∈ A: h(t) = 0, |φ(t)| > ε, α_k(t) > ε} > 0. There exists such an ε since μ(B_k) > 0. Recall that in the proof of Lemma 4.2.3 (specifically (4.2.21), (4.2.25) plus adjacent comments) it was shown that f_k(a_j,b_j,x_j,y_j) = α_k(b_j) + o(1) as s_{j+1} − s_j → 0, j ∈ J₁(ε,n:k), and that f_k(a_j,b_j,x_j,y_j) is nonnegative. Define for η > 0,

(4.2.32)  J₃(ε,n:k) = {j ∈ J₁(ε,n:k): h(t) ≤ η for all t ∈ [t_{i,n}, t_{i+1,n}], where i is s.t. [s_j, s_{j+1}] ⊂ [t_{i,n}, t_{i+1,n}]}.
Then letting n = n_m in what follows, we have from (4.2.26), (4.2.13)-(4.2.14),

(4.2.33)  n^{k+1} e²_{st,n} ≥ (n^{k+1}/(k+2)!) Σ_{J₃(ε,n:k)} φ(u_j)φ(v_j)(s_{j+1} − s_j)^{k+2} f_k(a_j,b_j,x_j,y_j) + o(n^{k−2})
         ≥ (n^{k+1}/(k+2)!) ε²(ε + o(1)) Σ_{J₃} (s_{j+1} − s_j)^{k+2} + o(n^{k−2})
         ≥ (ε²(ε + o(1))/(k+2)!) η^{−(k+1)} Σ_{J₃} (s_{j+1} − s_j) + o(n^{k−2}).

The last inequality follows since for j ∈ J₃(ε,n:k), [s_j, s_{j+1}] ⊂ [t_i, t_{i+1}] with i s.t.

(4.2.34)  1/n = ∫_{t_i}^{t_{i+1}} h(t) dt ≤ η(t_{i+1} − t_i),

so that n(s_{j+1} − s_j) ≥ 1/η whenever [s_j, s_{j+1}] = [t_i, t_{i+1}]. Then by (4.2.32) we have, since sup_j(s_{j+1} − s_j) → 0 along {n_m},

(4.2.35)  lim inf_{n→∞} Σ_{J₃(ε,n:k)} (s_{j+1} − s_j) ≥ μ{h = 0, |φ| > ε, α_k > ε} > 0.

Thus

(4.2.36)  lim inf_{n→∞} n^{k+1} e²_{st,n} ≥ (ε³/(k+2)!) η^{−(k+1)} μ{h = 0, |φ| > ε, α_k > ε}

which implies upon letting η → 0,

(4.2.37)  n^{k+1} e²_{st,n} → ∞ as n → ∞.

This and (4.2.31) give the conclusion of the lemma. □
Note that (as follows from Lemma 4.2.3) we may have n^{k+1} e²_{st,n} → ∞ even if μ(t: h(t) = 0, (α_kφ)(t) ≠ 0) = 0, by having ∫ α_k(t)φ²(t)h^{−(k+1)}(t)dt = ∞. For example, consider R(s,t) = min(s,t) + m(s)m(t) where m is continuously differentiable on [0,1]. Then (4.2.4:1) holds with α₁ = 1. Assume for simplicity that φ = 1. Then if h is chosen as h(t) = (3/2)t^{1/2}, t ∈ [0,1], the integral on the RHS of (4.2.9), with k = 1, is infinite and we have n²e²_{st,n} → ∞. (It may be shown directly that n²e²_{st,n}/(ln n) → c where c = 4/81.)
So far we have obtained lower bounds on the lim inf of n^{k+1} e²_{st,n}. Now we turn our attention to conditions on φ and h s.t. the limit exists and (4.2.6) holds. The following example shows that if φ takes on both positive and negative values, the limit may not exist. For this example, we have

0 < lim inf n^{k+1} e²_{st,n} < lim sup n^{k+1} e²_{st,n} = ∞.
Example 4.2.5. Let φ(t) = t − z for z ∈ (0,1), and let h(t) = c^{−1}|t − z|^{1/2} where c = ∫₀¹ |t − z|^{1/2} dt. Assume R (stationary) satisfies (4.2.4:2) with α₂ = 1, R(t,t) = 1 for all t. For each n let u_n = t_{i_n,n}, v_n = t_{i_n+1,n} be s.t. u_n ≤ z < v_n. Then

(4.2.38)  ∫_{u_n}^{v_n}∫_{u_n}^{v_n} {|φ(s)φ(t)|r(s)r(t) − φ(s)φ(t)R(s,t)} ds dt
         = ∫_{u_n}^{v_n}∫_{u_n}^{v_n} φ(s)φ(t)(r(s)r(t) − R(s,t)) ds dt + 4 ∫_{u_n}^{z}∫_{z}^{v_n} (−φ(s)φ(t))r(s)r(t) ds dt.

Thus, with k = 2,

(4.2.39)  n^{k+1} e²_{st,n} = n^{k+1} Σ_{i=0}^{n−1} ∫_{t_i}^{t_{i+1}}∫_{t_i}^{t_{i+1}} φ(s)φ(t)(r(s)r(t) − R(s,t)) ds dt
         + 4 n^{k+1} ∫_{u_n}^{z}∫_{z}^{v_n} (−φ(s)φ(t))r(s)r(t) ds dt.

It may be shown directly, as it is shown in the proof of Theorem 4.2.6, (4.2.47)-(4.2.48), that

(4.2.40)  lim_{n→∞} n³ Σ_{i=0}^{n−1} ∫_{t_i}^{t_{i+1}}∫_{t_i}^{t_{i+1}} φ(s)φ(t)(r(s)r(t) − R(s,t)) ds dt = (c³/24) ∫₀¹ |φ(t)|^{1/2} dt = c⁴/24.

Note that by definition of u_n, v_n, since the t_i form a RS(h),

i_n/n = (2/(3c))(z^{3/2} − (z − u_n)^{3/2}),  (i_n + 1)/n = (2/(3c))(z^{3/2} + (v_n − z)^{3/2}),

and so

(4.2.41)  ε_n = (2n/(3c)) z^{3/2} − i_n,  where  (z − u_n)^{3/2} = (3c/(2n)) ε_n,  (v_n − z)^{3/2} = (3c/(2n))(1 − ε_n),

and thus 0 ≤ ε_n < 1. Thus the last term E_n, say, in (4.2.39) may be rewritten

(4.2.42)  E_n = 4n³ ∫_{u_n}^{z}∫_{z}^{v_n} (−φ(s)φ(t))r(s)r(t) ds dt = 4n³ ∫_{u_n}^{z} (z − t) dt ∫_{z}^{v_n} (s − z) ds
         = n³ (z − u_n)²(v_n − z)² = n^{1/3} (3c/2)^{8/3} [ε_n(1 − ε_n)]^{4/3}.

Since ε_n is the fractional part of 2z^{3/2}n/(3c), it does not converge. For example, if z = .5, ε_n = 0 if n is even, = .5 if n is odd. Thus we may choose a subsequence s.t. along this subsequence [ε_n(1 − ε_n)]^{4/3} = o(n^{−1/3}) and so E_n → 0. We may also choose a subsequence s.t. along that subsequence ε_n converges to a positive constant and so E_n → ∞. Thus lim inf n³e²_{st,n} = c⁴/24 and lim sup n³e²_{st,n} = ∞.
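The oscillation of ε_n in (4.2.41) is easy to exhibit numerically. The sketch below takes z = 1/2, computes c from its definition, and evaluates the fractional part 2z^{3/2}n/(3c) (mod 1), which alternates between 0 and 1/2 along the even and odd integers.

```python
import math

z = 0.5
c = (2.0 / 3.0) * (z ** 1.5 + (1 - z) ** 1.5)   # c = int_0^1 |t - z|^{1/2} dt
ratio = 2 * z ** 1.5 / (3 * c)                  # equals exactly 1/2 at z = 1/2

def eps(n):
    # eps_n = fractional part of 2 z^{3/2} n / (3c)
    return math.modf(ratio * n)[0]

evens = [eps(n) for n in range(2, 21, 2)]   # congruent to 0 (mod 1)
odds = [eps(n) for n in range(1, 20, 2)]    # equal to 1/2
```

Along the even subsequence E_n of (4.2.42) vanishes, while along the odd subsequence ε_n(1 − ε_n) = 1/4 and E_n grows like n^{1/3}, exactly the two subsequences used above.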
Consider the following two sets of assumptions on φ and h:

(4.2.43:1)  φ has at most finitely many zeroes, and at each zero of φ (if any) h^{−2}φ² is continuous and equals zero.

(4.2.43:2)  φ has at most finitely many zeroes, and for each zero z of φ (if any), there exist 0 < m(z) ≤ M(z) < ∞, 0 < p(z) < ∞ s.t.
  (a) h^{−3}|φ|^{2−1/p(z)} is continuous and equals zero at z, and
  (b) m(z)|s − z|^{p(z)} ≤ |φ(s)| ≤ M(z)|s − z|^{p(z)} for all s in some neighborhood of z.

Note the similarity between (4.2.43:2b) and (1.3.6) used in, for example, Sacks and Ylvisaker (1970a) when Z possesses one or more q.m. derivatives. It might be noted that φ and h of the preceding example (in which the limit does not exist) do not satisfy (4.2.43:2), for they satisfy part (a) of (4.2.43:2) only when p > 2 and (b) only when p = 1. The following theorem gives conditions under which the limit exists and is finite.
Theorem 4.2.6. Consider e²_{st,n} as given by (4.1.11) where R is strictly positive definite and continuous on A × A = [0,1] × [0,1]. Let h be a continuous density on [0,1]. Assume φ is continuous and, for k = 1,2, assume (4.2.4:k) and (4.2.43:k) hold and that h^{−(k+1)}φ² is Riemann-integrable. Then for a RP(h),

(4.2.44)  lim_{n→∞} n^{k+1} e²_{st,n} = (1/(k+2)!) ∫₀¹ α_k(t)φ²(t)h^{−(k+1)}(t) dt.
Proof. The Riemann-integrability of h^{−(k+1)}φ² and (4.2.43:k) imply h is nonzero a.e. and thus sup_i(t_{i+1} − t_i) → 0 as n → ∞, as follows: Let H denote the distribution function associated with h, and H^{−1} the inverse function of H. Since h is nonzero a.e., H is strictly increasing and continuous and H^{−1} is continuous on [0,1] and hence uniformly continuous and

(4.2.45)  sup_i (t_{i+1} − t_i) ≤ sup_{a,b∈[0,1], b−a=1/n} (H^{−1}(b) − H^{−1}(a)) → 0 as n → ∞.

Let I₁(n) = {i: φ is positive (negative) on [t_i, t_{i+1}]}, I₂(n) = {0,1,...,n−1} − I₁(n). Then

(4.2.46)  e²_{st,n} = (Σ_{I₁(n)} + Σ_{I₂(n)}) ∫_{t_i}^{t_{i+1}}∫_{t_i}^{t_{i+1}} (|φ(s)φ(t)|r(s)r(t) − φ(s)φ(t)R(s,t)) ds dt = E₁(n) + E₂(n), say.

First consider E₁. From the proof of Lemma 4.2.3, since h > ε was not used in obtaining (4.2.26) from (4.2.16) but only to show the boundedness of the summand, we have

(4.2.47)  n^{k+1} E₁(n) = n^{k+1} Σ_{I₁} ∫_{t_i}^{t_{i+1}}∫_{t_i}^{t_{i+1}} φ(s)φ(t)[r(s)r(t) − R(s,t)] ds dt
         = (1/(k+2)!) Σ_{I₁} φ(u_i)φ(v_i) f_k(a_i,b_i,x_i,y_i) (t_{i+1} − t_i)/h^{k+1}(w_i)
         = (1/(k+2)!) Σ_{I₁} φ(u_i)φ(v_i)(α_k(w_i) + o(1)) (t_{i+1} − t_i)/h^{k+1}(w_i)
where a_i, b_i, x_i, y_i, u_i, v_i, w_i ∈ [t_i, t_{i+1}] and where o(1) denotes terms that converge to zero as (t_{i+1} − t_i) → 0. (See (4.2.21), (4.2.25).) The Riemann-integrability of φ²h^{−(k+1)} and the continuity of α_k then yield as n → ∞

(4.2.48)  n^{k+1} E₁(n) → (1/(k+2)!) ∫_{φ≠0} α_k(t)φ²(t)h^{−(k+1)}(t) dt.
Now consider E₂. First, if φ is nonzero on [0,1], then E₂ = 0, and (4.2.46) and (4.2.48) yield (4.2.44). In general, since |φ(s)φ(t)|r(s)r(t) − φ(s)φ(t)R(s,t) ≤ 2|φ(s)φ(t)|r(s)r(t),

(4.2.49)  E₂(n) ≤ 2 Σ_{I₂} (∫_{t_i}^{t_{i+1}} |φ(t)| r(t) dt)².

Then for k = 1 we have by the Mean Value Theorem, where u_i, v_i ∈ [t_i, t_{i+1}], and by definition of RP(h),

(4.2.50)  n² E₂(n) ≤ 2 Σ_{I₂} r²(u_i) (φ²h^{−2})(v_i).

For k = 2 we have for large n, by (4.2.43:2) and from (2.20)-(2.23) in the proof of Lemma 4 in Sacks and Ylvisaker (1970a), where 0 < B < ∞ and v_i ∈ [t_i, t_{i+1}],

(4.2.51)  (∫_{t_i}^{t_{i+1}} |φ(t)| dt)² ≤ B² φ²(v_i)(t_{i+1} − t_i)²

by the Hölder Inequality with p = 4 and the definition of RP(h). Use of the Mean Value Theorem and (4.2.43:2b) then yields, for u_i ∈ [t_i, t_{i+1}] and 0 < M_i < ∞,

(4.2.52)  n³ Σ_{I₂} r²(u_i)(∫_{t_i}^{t_{i+1}} |φ(t)| dt)² ≤ 2 Σ_{I₂} B M_i^{1/p_i} r²(u_i) (h^{−3}|φ|^{2−1/p_i})(v_i).

Then since r is continuous on [0,1] and thus bounded, since the cardinality of I₂(n) is less than or equal to the number of sign changes of φ, a finite number, and since h^{−2}φ² for k = 1, h^{−3}|φ|^{2−1/p_i} for k = 2, is continuous and equals zero when φ is zero, we have n^{k+1}E₂(n) → 0 as n → ∞. This with (4.2.46) and (4.2.48) yields (4.2.44). □
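The limit (4.2.44) can be checked numerically in a case where all hypotheses hold trivially. The sketch below assumes, for illustration, R(s,t) = min(s,t) (Brownian motion, so α₁ = 1 by the earlier remark), φ = 1, and the uniform density h = 1, for which the limit is 1/6; the per-cell integrals are in closed form.

```python
def bm_uniform_mse(n):
    # e^2_{st,n} of (4.1.11) for R(s,t) = min(s,t), phi = 1, and the
    # uniform density h = 1 on [0,1], so t_i = i/n.  Per cell [c, d]:
    #   (int_c^d sqrt(t) dt)^2 - int_c^d int_c^d min(s,t) ds dt,
    # both evaluated in closed form.
    total = 0.0
    for i in range(n):
        c, d = i / n, (i + 1) / n
        a = ((2.0 / 3.0) * (d ** 1.5 - c ** 1.5)) ** 2
        b = c * (d - c) ** 2 + (d - c) ** 3 / 3.0
        total += a - b
    return total

scaled = 200 ** 2 * bm_uniform_mse(200)   # close to 1/6 by (4.2.44)
```

At n = 200 the scaled m.s.e. is already within about half a percent of the limiting value (1/6)∫₀¹ α₁φ²h^{−2} dt = 1/6.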
It might be noted that Theorem 4.2.6 remains valid if we change (4.2.43:k), k = 1,2, by allowing φ to be zero on at most a finite number of intervals and requiring it to behave as in (4.2.43:k) around these intervals of zeroes. Define μ₀(B) = μ(B ∩ {φ ≠ 0}) for all Borel subsets of [0,1]. Then the Riemann-integrability of h^{−(k+1)}φ² implies μ₀(h = 0) = 0 and sup_i μ₀(t_i, t_{i+1}) → 0 as n → ∞ and finally sup_{i∈I₁(n)} (t_{i+1} − t_i) → 0. Then (4.2.47) follows as before and also (4.2.49)-(4.2.50). For k = 2 note that (4.2.52) holds as before if h ≠ 0 a.e. on [t_i, t_{i+1}]. If φ is identically zero on some (closed) subinterval B_i of [t_i, t_{i+1}], then h may equal zero on some subset of B_i of positive Lebesgue measure. Let s_i be the inf of the set of zeroes of h in B_i and s_{i+1} the sup. If φ, and thus h, is identically zero on [t_i, t_{i+1}], the term is zero. If φ is not identically zero on [t_i, t_{i+1}], we may replace [t_i, t_{i+1}] by [t_i, s_i] ∪ [s_{i+1}, t_{i+1}] in (4.2.49) and therefore (t_{i+1} − t_i) by (t_{i+1} − s_{i+1} + s_i − t_i) in (4.2.51), since φ is identically zero on [s_i, s_{i+1}]. (The equality in (4.2.51) becomes an inequality, "≤".) The conclusion of the theorem then follows as before.
Note also that Theorem 4.2.6 remains valid if we do not place a lower bound on |φ| in (4.2.43:2) but instead assume there exist 0 < m(z) ≤ M(z) < ∞, 0 < q(z) < ∞, for each zero of φ which is also a zero of h, s.t.

m(z)|s − z|^{q(z)} ≤ |h(s)| ≤ M(z)|s − z|^{q(z)}

for all s in some neighborhood of z. Then for k = 2 (4.2.49) becomes, via (4.2.43:2) with the above modification, the Mean Value Theorem, and (2.20)-(2.23) in Sacks and Ylvisaker (1970a),

n³ E₂(n) ≤ 2n³ Σ_{I₂} (φ²r²/h³)(u_i) (∫_{t_i}^{t_{i+1}} h^{3/2}(t) dt)² (t_{i+1} − t_i)
         ≤ 2 Σ_{I₂} M_i^{1/p_i} (|φ|^{2−1/p_i} r²/h³)(u_i) (∫_{t_i}^{t_{i+1}} h^{3/2}(t) dt)² (t_{i+1} − t_i) / (∫_{t_i}^{t_{i+1}} h(t) dt)³
         ≤ 2 Σ_{I₂} M_i^{1/p_i} B (|φ|^{2−1/p_i} r²/h³)(u_i),  0 < p_i < ∞, 0 < B < ∞.
4.3. Asymptotically optimal sequences of partitions.

Following Sacks and Ylvisaker, we now consider the "asymptotic optimality" of a sequence of partitions for use in e²_{st,n} as given by (4.1.11). It will be shown that under conditions implying the existence of the limit in (4.2.6) for certain h, a RP(h) does just as well as, or better than, in this asymptotic sense, any sequence of partitions s.t. A_in = [t_in, t_{i+1,n}], i = 0,...,n−1, 0 = t_0n < t_1n < ... < t_nn = 1. In particular the RP(h) where h minimizes the RHS of (4.2.6) is asymptotically optimal (provided, of course, the limit in (4.2.6) exists for this h). The following definition of asymptotic optimality for a sequence of partitions is suggested by the definition of an "asymptotically optimal sequence of designs" set forward by Sacks and Ylvisaker (1968, 1970).

Definition 4.3.1. For e²_{st,n} as given by (4.1.11) we say that a sequence of partitions {T*_n}_n is an asymptotically optimal sequence of partitions if

(4.3.1)  lim_{n→∞} e²_{st,n,T*_n} / inf_{T_n} e²_{st,n,T_n} = 1

where e²_{st,n,T_n} denotes e²_{st,n} when the partition is defined by T_n = {0 = t_0n < t_1n < ... < t_nn = 1} and where the inf is taken over all such partitions of A = [0,1].
Theorem 4.3.2. Consider e²_{st,n} as given by (4.1.11) where R is strictly positive definite and continuous on A × A = [0,1] × [0,1]. Assume φ is continuous on [0,1] and nonzero a.e. For k = 1,2, assume (4.2.4:k) holds. Then

(4.3.2)  lim inf_{n→∞} n^{k+1} inf e²_{st,n} ≥ (1/(k+2)!) {∫₀¹ (α_k(t)φ²(t))^{1/(k+2)} dt}^{k+2}

where the inf is taken over all partitions of the form A_in = [t_in, t_{i+1,n}], i = 0,...,n−1, 0 = t_0n < t_1n < ... < t_nn = 1. Further assume (4.2.43:k) is satisfied for h = h_k given by (4.3.3) and h_k^{−(k+1)}φ² is Riemann-integrable. Then the RP(h_k) with

(4.3.3)  h_k = (α_kφ²)^{1/(k+2)} / ∫₀¹ (α_kφ²)^{1/(k+2)}(t) dt

is asymptotically optimal and lim_{n→∞} n^{k+1} e²_{st,n} equals the RHS of (4.3.2).

Note: If α_k is bounded away from zero, we have immediately that h_k^{−(k+1)}φ², a constant multiple of (α_k^{−(k+1)}φ²)^{1/(k+2)}, is Riemann-integrable, that for k = 1 (4.2.43:1) is satisfied (provided φ has a finite number of sign changes), and that for k = 2 (4.2.43:2a) is satisfied when 2 < p(z) < ∞.
Proof. (This proof follows the lines of that of Theorem 3.1 in Sacks and Ylvisaker (1966).) Let ε > 0 and fix k = 1,2. Assume (4.2.4:k) holds. For fixed n consider a partition of A s.t. A_i = [t_i, t_{i+1}], i = 0,1,...,n−1, 0 = t_0 < t_1 < ... < t_n = 1. We shall first find a lower bound on lim inf n^{k+1} e²_{st,n} where the inf is taken over all partitions of this form. For fixed n, let I₁(n) = {i: φ is positive (negative) on [t_i, t_{i+1}]}, I₂(n) = {0,1,...,n−1} − I₁(n). Since h > ε was not used in obtaining (4.2.13)-(4.2.14), (4.2.26) in the proof of Lemma 4.2.3 and since the cardinality of I₁(n), card(I₁(n)), is less than or equal to n, we have

n^{k+1} e²_{st,n} ≥ (n^{k+1}/(k+2)!) Σ_{I₁(n)} φ(u_i)φ(v_i)(t_{i+1} − t_i)^{k+2} f_k(a_i,b_i,x_i,y_i) + o(n^{k−2})
         ≥ ((card(I₁(n)))^{k+1}/(k+2)!) Σ_{I₁(n)} φ(u_i)φ(v_i)(t_{i+1} − t_i)^{k+2} f_k(a_i,b_i,x_i,y_i) + o(n^{k−2})

where u_i, v_i, a_i, b_i, x_i, y_i ∈ [t_i, t_{i+1}] and where f_k, k = 1,2, is given by (4.2.21), (4.2.25), respectively. Recall that f_k(a_i,b_i,x_i,y_i) is nonnegative. Note also that φ(u_i)φ(v_i), i ∈ I₁, is nonnegative. An application of the Hölder Inequality with p = k+2 then yields

(4.3.4)  n^{k+1} e²_{st,n} ≥ (1/(k+2)!) {Σ_{I₁(n)} (φ(u_i)φ(v_i) f_k(a_i,b_i,x_i,y_i))^{1/(k+2)} (t_{i+1} − t_i)}^{k+2} + o(n^{k−2}).

By an argument similar to that in the proof of Lemma 4.2.4 (in particular, the paragraph containing (4.2.30)-(4.2.31)), we have, since φ is nonzero a.e. and since the Riemann-integrability of h_k^{−(k+1)}φ², a constant multiple of (α_k^{−(k+1)}φ²)^{1/(k+2)}, implies α_k is also nonzero a.e., that if sup_i(t_{i+1} − t_i) → δ > 0, then n^{k+1}e²_{st,n} → ∞ as n → ∞. Thus partitions for which sup_i(t_{i+1} − t_i) → δ > 0 may be omitted from consideration since such partitions will not yield a lower bound on lim inf n^{k+1}e²_{st,n}. Thus we shall assume sup_i(t_{i+1} − t_i) → 0 as n → ∞. Letting n → ∞ in (4.3.4) then yields (4.3.2) for any sequence of designs.

To complete the proof we shall show that the RP(h_k), where h_k is given by (4.3.3), yields this lower bound. It may be seen immediately that h_k satisfies the assumptions of Theorem 4.2.6. An application of that theorem then gives the desired result. □
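The optimality of the density (4.3.3) can be illustrated numerically by comparing the limiting constant (1/(k+2)!)∫α_kφ²h^{−(k+1)} of (4.2.44) under h = h_k with the same constant under a non-optimal density. The sketch below takes k = 1 with α₁ ≡ 1 and the illustrative weight φ(t) = t (both assumptions for illustration only) and uses simple midpoint quadrature.

```python
def midpoint(f, a, b, n=20000):
    # Composite midpoint rule on [a, b].
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))

alpha1 = lambda t: 1.0   # e.g. Brownian motion; constant alpha_1 assumed
phi = lambda t: t        # illustrative weight, nonzero a.e.

# Optimal density h_1 of (4.3.3): proportional to (alpha_1 phi^2)^{1/3}.
norm = midpoint(lambda t: (alpha1(t) * phi(t) ** 2) ** (1 / 3), 0, 1)
h1 = lambda t: (alpha1(t) * phi(t) ** 2) ** (1 / 3) / norm

def limit_k1(h):
    # Limiting value of n^2 e^2_{st,n} from (4.2.44) for density h, k = 1.
    return midpoint(lambda t: alpha1(t) * phi(t) ** 2 / h(t) ** 2, 0, 1) / 6

opt = limit_k1(h1)              # (1/6) (int (alpha_1 phi^2)^{1/3} dt)^3
uni = limit_k1(lambda t: 1.0)   # uniform density for comparison
```

Here ∫₀¹ t^{2/3} dt = 3/5, so the optimal constant is (1/6)(3/5)³ = 0.036, strictly smaller than the uniform-density value (1/6)∫₀¹ t² dt = 1/18, consistent with the asymptotic optimality of RP(h₁).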
For k = 1, this stratified sampling scheme does surprisingly well. Under certain assumptions similar to (4.2.4:1), Sacks and Ylvisaker (1966, 1970b) obtain (see Section 1.3)

(4.3.5)  lim_{n→∞} n² E(I − Z*_n)² = (1/12) {∫₀¹ (α₁(t)φ²(t))^{1/3} dt}³

where Z*_n is the best mean square estimator of I of the form

(4.3.6)  Σ_{i=0}^{n} c_in Z(t_in)

for a fixed (nonrandom) set of n+1 sample points, 0 = t_0n < t_1n < ... < t_nn = 1, and where the {t_0n,...,t_nn}_n form an asymptotically optimal sequence of designs. Thus the rate of convergence, n^{−2}, is the same for both sampling schemes, with the RHS of (4.3.2) twice the RHS of (4.3.5). This occurs despite the facts that the sample points chosen under stratified sampling do not necessarily form an asymptotically optimal sequence of designs and that Z_n is not necessarily as good an estimator as Z*_n, in that each c_in in (4.3.6) depends on all n+1 sample points chosen instead of just t_in. It should be pointed out however that Z_n has the advantage over Z*_n that its coefficients are more easily obtainable. (See Sacks and Ylvisaker (1970b, p.132) for a discussion of the calculation of the coefficients c_in.) It should also be pointed out that Z_n under the nonrandom sampling scheme described in Chapter 7 does as well asymptotically as Z*_n. (See Chapter 7.)
4.4 Examples

Example 4.4.1. Wide sense stationary Z. Assume A is bounded, φ ∈ L₂(A), and Z has mean m and covariance R₀(s,t) = C(t−s) where C is continuous and strictly positive definite. Let σ² = C(0). Then R(s,t) = C(t−s) + m² is continuous and strictly positive definite and satisfies (1.1.2). For st.s., the optimal distribution of X_in over A_in, where {A_in}ⁿ_{i=1} is a fixed partition of A and under the assumption c_n g_n = φ, is given by

g_in(t) = |φ(t)| / ∫_{A_in} |φ(s)| ds,  t ∈ A_in, i = 1,...,n,

yielding

Z_n = Σ_{i=1}^{n} Z(X_in) sgn(φ(X_in)) ∫_{A_in} |φ(s)| ds

and

e²_{st,n} = Σ_{i=1}^{n} {σ² (∫_{A_in} |φ(t)| dt)² + m² [(∫_{A_in} |φ(t)| dt)² − (∫_{A_in} φ(t) dt)²] − ∫_{A_in}∫_{A_in} C(t−s)φ(t)φ(s) ds dt}.

Note that if for each i = 1,...,n, φ is nonnegative (or nonpositive) on A_in, then the coefficient of m² in the expression for the m.s.e. is zero.

Now assume A = [0,1], C is continuously differentiable for all nonzero t with C'(0−) ≠ C'(0+), and φ is continuous and has finitely many zeroes. Then the sequence of partitions of A, RP(h), is asymptotically optimal for

(4.4.1)  h(t) = |φ(t)|^{2/3} / ∫₀¹ |φ(s)|^{2/3} ds,

in which case as n → ∞

(4.4.2)  n² e²_{st,n} → ((C'(0−) − C'(0+))/6) {∫₀¹ |φ(t)|^{2/3} dt}³.

If A = [0,1], C is twice continuously differentiable, and φ is continuous, has finitely many zeroes, and satisfies (4.2.43:2b) for p(z) > 2, then the sequence of partitions of A, RP(h), is asymptotically optimal for

(4.4.3)  h(t) = |φ(t)|^{1/2} / ∫₀¹ |φ(s)|^{1/2} ds,

in which case

(4.4.4)  n³ e²_{st,n} → (−C''(0)/12) {∫₀¹ |φ(t)|^{1/2} dt}⁴.

For example, if C(t−s) = exp(−p|s−t|), we have n² e²_{st,n} → (p/3){∫₀¹ |φ(t)|^{2/3} dt}³, and if C(t−s) = exp(−p(s−t)²), we have n³ e²_{st,n} → (p/6){∫₀¹ |φ(t)|^{1/2} dt}⁴,
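The exponential-covariance limit just stated can be checked in closed form. The sketch below assumes φ = 1 and m = 0 (so h of (4.4.1) is uniform and ∫|φ|^{2/3} = 1), for which each cell of length L contributes L² − ∫∫_{[0,L]²} e^{−p|s−t|} ds dt to the m.s.e.

```python
import math

def ou_stratified_mse(n, p):
    # e^2_{st,n} for C(u) = exp(-p|u|), phi = 1, m = 0, and n equal
    # cells of length L = 1/n (the uniform density is (4.4.1) here).
    # Per cell: L^2 - 2L/p + 2(1 - e^{-pL})/p^2, since
    # int int e^{-p|s-t|} over [0,L]^2 = 2L/p - 2(1 - e^{-pL})/p^2.
    L = 1.0 / n
    cell = L * L - 2 * L / p - 2 * math.expm1(-p * L) / p ** 2
    return n * cell

p = 2.0
val = 500 ** 2 * ou_stratified_mse(500, p)   # close to p/3 by (4.4.2)
```

Expanding the exponential shows the per-cell term is pL³/3 − p²L⁴/12 + ..., so n²e²_{st,n} = p/3 − p²/(12n) + ..., and at n = 500 the scaled m.s.e. sits within a tenth of a percent of p/3 (here α₁ = C'(0−) − C'(0+) = 2p, so (4.4.2) gives 2p/6 = p/3).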
Example 4.4.2. Wide sense stationary Z, φ = 1/μ(A). Let Z, A be as in Example 4.4.1, and let φ = 1/μ(A). For st.s., the optimal distribution of X_in over A_in, where {A_in}ⁿ_{i=1} is a fixed partition of A and under the assumption c_n g_n = φ, is the uniform distribution

g_in(t) = 1/μ(A_in),  t ∈ A_in, i = 1,...,n,

yielding

Z_n = (1/μ(A)) Σ_{i=1}^{n} μ(A_in) Z(X_in)

and

e²_{st,n} = μ^{−2}(A) Σ_{i=1}^{n} {σ² μ²(A_in) − ∫_{A_in}∫_{A_in} C(t−s) ds dt}.

If the A_in are chosen s.t. μ(A_in) = 1/n for all i, then g_in and Z_n are those traditionally chosen for st.s. and the expression for the m.s.e. is that obtained by Zubrzycki (1.2.12b). For A = [0,1], assume either C is continuously differentiable for all nonzero t or C is twice continuously differentiable for all t. Then the sequence of partitions RP(h) is asymptotically optimal for h the uniform density over [0,1], yielding (4.4.2), (4.4.4), resp., with φ = 1. (Compare (4.4.4) with (1.2.16) due to Tubilla.) Also

A_in = [(i−1)/n, i/n],  Z_n = (1/n) Σ_{i=1}^{n} Z(X_in),

and

e²_{st,n} = σ²/n − Σ_{i=1}^{n} ∫_{(i−1)/n}^{i/n}∫_{(i−1)/n}^{i/n} C(t−s) ds dt.
Example 4.4.3. Z a stationary process with a trend. Assume A is bounded, φ ∈ L₂(A) and is nonzero a.e., and Z has mean m(t), m ∈ L₂(A), and covariance R₀(s,t) = C(t−s), where m and C are continuous and C is strictly positive definite. Let σ² = C(0). For st.s., the optimal distribution of X_in over A_in, for a fixed partition {A_in}_i of A under the assumption c_n g_n = φ, is given by

g_in(t) = √(m²(t) + σ²) |φ(t)| / Q_in,  t ∈ A_in,  where  Q_in = ∫_{A_in} √(m²(s) + σ²) |φ(s)| ds,

yielding

Z_n = Σ_{i=1}^{n} sgn(φ(X_in)) (Q_in / √(m²(X_in) + σ²)) Z(X_in).

Now assume A = [0,1], C is continuously differentiable at all nonzero t with C'(0−) ≠ C'(0+), m is continuously differentiable at all t, and φ is continuous and has finitely many zeroes. Then the sequence of partitions of A, RP(h), is asymptotically optimal for h given by (4.4.1), yielding (4.4.2) as n → ∞. Note that this asymptotically optimal sequence of partitions does not depend on the trend and yields the same limiting value for n²e²_{st,n} as was obtained under the assumption of no trend. The densities g_in do depend on the trend though.

Now assume A = [0,1], C and m are twice continuously differentiable, and φ is continuous, has finitely many zeroes, and satisfies (4.2.43:2b) for p(z) > 2. Then the sequence of partitions RP(h) is asymptotically optimal for

h = K (α₂φ²)^{1/4}

for some constant K, where

α₂(t) = −2C''(0) + 2σ²(m'(t))² / (σ² + m²(t)),

in which case as n → ∞

n³ e²_{st,n} → (1/24) {∫₀¹ (α₂(t)φ²(t))^{1/4} dt}⁴.

This asymptotically optimal sequence of partitions does depend on the trend. In particular if m(t) = t/σ (linear trend), we have

α₂(t) = −2C''(0) + 2σ² / (σ⁴ + t²).
Example 4.4.4. Brownian motion on [0,1] with or without trend. Assume A = [0,1], Z has mean m(t) and covariance R₀(s,t) = σ² min(s,t), where m is continuous on [0,1], and φ ∈ L₂[0,1] and is nonzero a.e. For st.s., the optimal distribution of X_in over A_in, where {A_in}_i is a fixed partition of A and under the assumption c_n g_n = φ, is given by

g_in(t) = √(m²(t) + σ²t) |φ(t)| / Q_in,  t ∈ A_in,  where  Q_in = ∫_{A_in} √(m²(s) + σ²s) |φ(s)| ds,

yielding

Z_n = Σ_{i=1}^{n} sgn(φ(X_in)) (Q_in / √(m²(X_in) + σ²X_in)) Z(X_in).

Assume also that φ is continuous and has finitely many zeroes and m is continuously differentiable. Then the sequence of partitions RP(h) is asymptotically optimal for h given by (4.4.1), in which case as n → ∞

n² e²_{st,n} → (σ²/6) {∫₀¹ |φ(t)|^{2/3} dt}³.

Note this asymptotically optimal sequence of partitions does not depend on the trend, if any, although g_in does depend on the trend (for each i).
Example 4.4.5. d-dimensional Brownian motion. Assume A = [0,1]^d, φ = 1, and Z has mean zero and covariance R₀(s,t) = Π_{i=1}^{d} min(s_i, t_i). Define Π(t) = Π_{i=1}^{d} t_i. For st.s., the optimal distribution of X_in over A_in, where the A_in form a fixed partition of A and under the assumption c_n g_n = φ, is given by

g_in(t) = (1/Q_in)(Π(t))^{1/2},  t ∈ A_in, i = 1,...,n,  where  Q_in = ∫_{A_in} (Π(t))^{1/2} dt,

yielding

Z_n = Σ_{i=1}^{n} (Q_in / (Π(X_in))^{1/2}) Z(X_in).

Now consider a partition of A s.t., with m^d = n, m an integer,

A_i = [(i₁−1)/m, i₁/m] × ... × [(i_d−1)/m, i_d/m],  i_j = 1,...,m, j = 1,...,d.

From Example 4.4.4 we have that if d = 1, this is an asymptotically optimal sequence of partitions. We would like to determine the rate of convergence to zero of the m.s.e. for this sequence of partitions. By the product form of the covariance,

e²_{st,n} = {(4/(9m³)) Σ_{i=1}^{m} (i^{3/2} − (i−1)^{3/2})²}^d − {(1/(3m³)) Σ_{i=1}^{m} (i² + (i−1)i − 2(i−1)²)}^d,

where i² + (i−1)i − 2(i−1)² = 3i − 2. Then via binomial expansions, for large n, m,

e²_{st,n} = (2m)^{−d} {(1 + o(1/m))^d − (1 − 1/(3m) + o(1/m))^d} = (2m)^{−d} (d/(3m))(1 + o(1)).

Thus as n → ∞

n^{1+1/d} e²_{st,n} → d/(3·2^d).
Observe that when d = 2, the convergence rate of n^{−3/2} is better than n^{−4/3}, the best rate obtained by Ylvisaker (1975) for the nonrandom design problem. This rate, however, is not as good as n^{−2}, the lower bound (Ylvisaker, 1975) for that design problem.
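The product-form computation above is easy to carry out exactly for moderate m, which gives a numerical check on the rate constant d/(3·2^d) (equal to 1/6 for both d = 1 and d = 2). The following sketch evaluates the two one-dimensional sums directly.

```python
def bm_d_mse(m, d):
    # e^2_{st,n} for d-dimensional Brownian motion, phi = 1, n = m^d
    # congruent cubical cells, with in-cell density proportional to
    # (t_1 ... t_d)^{1/2}; the product structure gives S_A^d - S_B^d.
    S_A = (4.0 / (9 * m ** 3)) * sum(
        (i ** 1.5 - (i - 1) ** 1.5) ** 2 for i in range(1, m + 1))
    S_B = (1.0 / (3 * m ** 3)) * sum(3 * i - 2 for i in range(1, m + 1))
    return S_A ** d - S_B ** d

def rate(m, d):
    # n^{1 + 1/d} e^2_{st,n}, which should approach d / (3 * 2^d).
    n = m ** d
    return n ** (1 + 1 / d) * bm_d_mse(m, d)
```

For m = 200, d = 1 and for m = 50, d = 2 the scaled m.s.e. is within a couple of percent of the limiting value 1/6, the discrepancy being the O((ln m)/m) correction suppressed in the binomial expansion.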
4.5 Incorrect choice of covariance kernel.

We now study the increase in the m.s.e. when the covariance kernel R is guessed incorrectly. We show that if the guess S is close to the true kernel R in some appropriate way, then the increase in m.s.e. is small (for fixed n or asymptotically), so that Z_n (under the best design corresponding to S under st.s.) is in this sense robust with respect to choice of covariance kernel.

For simplicity, we shall consider only strictly positive definite kernels and assume c_n g_n = φ. Define e²_{st,n}(S|R) as the m.s.e. when the design chosen is that corresponding to S when R is the true kernel R(u,v) = EZ(u)Z(v) of Z. Then, letting s(u) = √(S(u,u)), we have (from (4.1.11) under the assumptions c_n g_n = φ and R, S strictly positive definite),

(4.5.1)  e²_{st,n}(S|R) = Σ_{i=1}^{n} {∫_{A_in} (r²(u)/s(u)) |φ(u)| du ∫_{A_in} s(v) |φ(v)| dv − ∫_{A_in}∫_{A_in} R(u,v)φ(u)φ(v) du dv}

where the partition {A_in}ⁿ_{i=1} of A may depend on S. We shall consider two cases. For the first, the partition is chosen the same for all S. For the second case, the partition depends on S and, in fact, belongs to an asymptotically optimal sequence of partitions as considered in Section 4.3, for A = [0,1] and S, φ satisfying certain conditions. Let

(4.5.2)  d(S,R|R) = e²_{st,n}(S|R) − e²_{st,n}(R|R).
It is easily seen from Schwarz's Inequality that d ≥ 0. Then d is the increase in the m.s.e. when the design chosen is that corresponding to S instead of R when R is the true kernel R(u,v) = EZ(u)Z(v). It will be seen that if the partition does not depend on S, then d depends only on the values of R and S on the diagonal (as was true for r.s.). If, however, the partition depends on S, then d may also depend on the behavior of R and S in a neighborhood of the diagonal.

First, if for fixed n the partition {A_in} is chosen the same for all S, then we have Proposition 3.10 with the obvious modifications. For example, Proposition 3.10(bi) becomes (in the notation of this section): If φr ∈ L₂(A), then for a constant k_R depending on R and the partition,

(4.5.3)  d(S,R|R) ≤ k_R sup_{u,v∈A} (s(u)r(v) − r(u)s(v))²;

and Proposition 3.10(c): If S(t,t) ≤ M R(t,t) for all t ∈ A and some M, 0 < M < ∞, and ε = sup_{t∈A} |S(t,t) − R(t,t)|, then as ε → 0 we have for some constant k_{R,S} depending on R, S, and the partition,

(4.5.4)  d(S,R|R) ≤ k_{R,S} ε + o(ε).

In particular, as for r.s., if S(u,u) = cR(u,u) for all u and some constant c, then d = 0. This follows by applying (4.5.3) or more directly by noting that, by Proposition 4.1.2, for a fixed partition, R and S yield the same optimal sampling density for use in (4.1.7). Note that S(u,u) = cR(u,u) holds, for example, when S and R are both stationary.
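Both facts, d ≥ 0 and d = 0 under diagonal proportionality, can be illustrated numerically from (4.5.1)-(4.5.2). The sketch below assumes, for illustration only, φ = 1, true diagonal r(u) = √u (Brownian motion) and the incorrect guesses S a Brownian bridge kernel and S = 4R; only the diagonals enter, and the double integral of R in (4.5.1) cancels in the difference d.

```python
import math

def excess(s_diag, r_diag, n=4, K=400):
    # d(S,R|R) of (4.5.2) for phi = 1 on a fixed partition of [0,1]
    # into n equal cells: per cell, int (r^2/s) * int s - (int r)^2,
    # by midpoint quadrature with K interior nodes per cell.
    d = 0.0
    for i in range(n):
        w = 1.0 / (n * K)
        nodes = [i / n + (j + 0.5) / (n * K) for j in range(K)]
        I_rr_s = w * sum(r_diag(u) ** 2 / s_diag(u) for u in nodes)
        I_s = w * sum(s_diag(u) for u in nodes)
        I_r = w * sum(r_diag(u) for u in nodes)
        d += I_rr_s * I_s - I_r ** 2
    return d

r = lambda u: math.sqrt(u)                    # true diagonal: R(u,u) = u
s_bridge = lambda u: math.sqrt(u * (1 - u))   # guess: Brownian bridge diag.
s_scaled = lambda u: 2 * r(u)                 # guess: S = 4R

d_bridge = excess(s_bridge, r)
d_scaled = excess(s_scaled, r)
```

The Cauchy-Schwarz inequality holds exactly for the discrete quadrature sums, so d_bridge is strictly positive (the ratio s/r is not constant on any cell), while for s = 2r the two single integrals recombine into (∫r)² and d vanishes up to rounding.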
We now consider some asymptotic results for A = [0,1].

Proposition 4.5.1. Assume R, S are strictly positive definite and continuous on [0,1] × [0,1], φ is continuous on [0,1], and the density h is continuous on [0,1].

(a) Assume φ is positive (negative) on [0,1], R satisfies (4.2.4:1) or (4.2.4:2), S satisfies (4.2.4:k), k = 1 or 2, and is s.t. r²s'/s is continuous, and h is s.t. h^{−2}φ² is Riemann-integrable. Then for a RP(h), as n → ∞,

(4.5.5)  n² e²_{st,n}(S|R) → (1/6) ∫₀¹ α_{1,R}(t) φ²(t) h^{−2}(t) dt,

where α_{1,R} denotes the function α₁ of (4.2.4:1) computed from R.

(b) Assume φ is positive (negative) on [0,1], R satisfies (4.2.4:2), S satisfies (4.2.4:2) and is s.t. r²s''/s, rr's'/s, rs'/s exist and are continuous, and h is s.t. h^{−3}φ² is Riemann-integrable. Then for a RP(h), as n → ∞,

(4.5.6)  n³ e²_{st,n}(S|R) → (1/24) ∫₀¹ [α_{2,R} + 2(r' − rs'/s)²](t) φ²(t) h^{−3}(t) dt

where α_{2,R}(u) = 2r(u)r''(u) − 2R^{20}(u,u).

Proof. (a) Since φ is continuous and nonnegative (nonpositive), we have as in the proof of Lemma 4.2.3,
~U!l
= .~n ~(ui)~(vi) 1
1 (~S(V)-R(u,v))dudv
1-1
t.1- 1 t.1- 1
(4.5.7)
where u.,
1 V.1 E [to1- 1,t.],
1 i=l, ... ,n. Then expanding s(v), R(u,v) in
Taylor series with remainder we have, again as in the proof of Lemma 4.2.3,
for
•
R satisfying (4.2.4:1),
(4.5.8)
S'
(y.)
1
}
- R+10 (y.,b.)
1
1
xuv ~ v, v ~ yvu ~ u and x.1 ~ a.,
b.1 ~ y.1 E [to1- 1,t.],
l
1
i = 1, ... ,no The last equality follows from the Mean Value Theorem
where u
~
(since, as in the proof of Lemma 4.2.3,
xuv ' yvu may be chosen to be
continuous functions of u and v and since (v-u) is nOIUlegative,
nonpositive, for u on
[t i _1 ,v],
[v,t i ], respectively).
Thus as in
the proof of Lemma 4.2.3,
1 ~
3 10
e 2t (SiR) = -6
l. ep(u·H(v.)(t.-t. 1) [R
(x.,a.)-R+10 (y.,b.)
s ,n
i=l
1
1 1 1- 1 1
1 1
.
(4.5.9)
2
r (a )
i
s(ai )
98
ti-t i - l { 10
10
2 ..
R (x.,a.)-R (y.,b.)
h (w.)
-
1
1
+
1
1
1
2
r (b.)
+
where w.1
is
E
1
s(b i )
S'(Yi)}
[to1- l,t.],
i = l, ... ,n.
1
Ri~nann-integrable,
This yields (4.5.5) since $2/ h2
implying h is nonzero a.e. and thus
1 ) ~ 0 as n ~ 00, and the tenn in brackets in (4.5.9)
max(t.1,n - t.1-,n
may be rewritten as al,R(b i ) + 0(1) as n ~ 00.
(b) Expanding s(v), R(u,v) in Taylor series with remainder we have

(4.5.10)    (r²(u)/s(u)) s(v) - R(v,u) = (v-u)( (r²(u)/s(u)) s'(u) - R^{10}(u,u) )
                                        + (1/2)(v-u)²( (r²(u)/s(u)) s''(x_uv) - R^{20}(x_uv, u) ),

where x_uv is between u and v inclusive. Let

(4.5.11)    B(u) = (r²(u)/s(u)) s'(u) - R^{10}(u,u) = (r²(u)/s(u)) s'(u) - r(u)r'(u).

An expansion of B in a Taylor series about u = t_i and substitution of this expression in (4.5.10) yields further

(4.5.12)    (r²(u)/s(u)) s(v) - R(v,u) = (v-u)( B(t_i) + (u-t_i) B'(y_u) )
                                        + (1/2)(v-u)²( (r²(u)/s(u)) s''(x_uv) - R^{20}(x_uv, u) ),

where y_u is between u and t_i inclusive.
Then combining (4.5.7), (4.5.12), we have via the Mean Value Theorem, since (v-u)(u-t_i) is nonpositive (nonnegative) for u on [t_{i-1}, v] ([v, t_i]), and since, as before, x_uv, y_u may be chosen as continuous functions of u and v,

(4.5.13)

where a_i, b_i, u_i, v_i, w_i, x_i, y_i ∈ [t_{i-1}, t_i]. Then since φ²/h³ is integrable, which implies h is nonzero a.e. and thus max(t_{i,n} - t_{i-1,n}) → 0 as n → ∞, and since the expression in brackets in (4.5.13) may be rewritten as a_{2,R}(b_i) + 2(r' - rs'/s)²(b_i) + o(1), we have (4.5.6). □
Notice that when h does not depend on S, (4.5.5) does not depend on S, while (4.5.6) depends only on the values of S on the diagonal. Also, the rate of convergence to zero of e²_{st,n}(S|R) depends only on the behavior of R in a neighborhood of the diagonal; that is, on the existence of R^{10} on the diagonal. Thus we have the following proposition dealing with the asymptotic behavior of d = d_n.
Proposition 4.5.2. (a) Assume R, S are strictly positive definite and continuous on [0,1] × [0,1], φ and h are continuous, φ is positive (negative) on [0,1], and h is a density on [0,1] which is nonzero a.e. Assume the partitions form a RP(h), where h does not depend on S (or R).

(i) Under the assumptions of Proposition 4.5.1(a), as n → ∞,

(4.5.14)

(ii) Under the assumptions of Proposition 4.5.1(b), as n → ∞,

(4.5.15)    n³ d_n(S,R|R) → (1/12) ∫_0^1 (φ²/h³)(r' - rs'/s)²(t) dt.

(iii) Assume R, S are s.t. r', rr'', r²s''/s, rr's'/s, rs'/s exist and are continuous on [0,1]. Assume also φ²/h³ is Riemann integrable and φ is continuously differentiable. Then as n → ∞, (4.5.15) holds.
(b) Assume the partitions form an asymptotically optimal sequence of partitions under the assumption S is the kernel (for e²_{st,n}(S|R), or R for e²_{st,n}(R|R)), and that φ is nonzero a.e.

(i) Assume R satisfies (4.2.4:1) with a_{1,R} bounded away from zero, and S satisfies (4.2.4:k), k = 1 or 2, with a_{k,S} bounded away from zero and is s.t. r²s'/s is continuous on [0,1]. Then as n → ∞,

(4.5.16)

(ii) Assume R satisfies (4.2.4:2) with a_{2,R} bounded away from zero, and S satisfies (4.2.4:k), k = 1 or 2, with a_{k,S} bounded away from zero and is s.t. r²s''/s, rr's'/s, rs'/s exist and are continuous on [0,1]. Then as n → ∞,

(4.5.17)

Note: Observe that (a)(iii) makes no assumption as to the differentiability of R(u,v) (off the diagonal), while (a)(i) and (a)(ii) do. On the other hand, (a)(iii) assumes that φ is continuously differentiable, while (a)(i) and (a)(ii) assume only that φ is continuous.
Proof. Parts (a)(i), (a)(ii), (b) follow directly from Proposition 4.5.1 and the definition of d = d_n. For part (b) note that, from Theorem 4.3.2, the optimal h for e²_{st,n}(S|R) is proportional to (a_{k,S} φ²)^{1/(k+2)}. Further, the optimal h for e²_{st,n}(R|R) is proportional to (a_{1,R} φ²)^{1/3} in part (b)(i) and to (a_{2,R} φ²)^{1/4} in part (b)(ii).
For part (a)(iii), note that we may write

(4.5.18)
d_n(S,R|R) = Σ_{i=1}^n ∫_{t_{i-1}}^{t_i} ∫_{t_{i-1}}^{t_i} φ(u)φ(v)( (r²(u)/s(u)) s(v) - R(u,v) ) du dv
             - Σ_{i=1}^n ∫_{t_{i-1}}^{t_i} ∫_{t_{i-1}}^{t_i} φ(u)φ(v)( r(u)r(v) - R(u,v) ) du dv
           = Σ_{i=1}^n ∫_{t_{i-1}}^{t_i} ∫_{t_{i-1}}^{t_i} φ(u)φ(v)( (r²(u)/s(u)) s(v) - r(u)r(v) ) du dv,

where {t_i}_{i=0}^n forms a RS(h). Expansion of r²(u)s(v)/s(u) - r(u)r(v) in a Taylor series about v = u yields

r²(u)s(v)/s(u) - r(u)r(v) = (v-u)( (r²(u)/s(u)) s'(u) - r(u)r'(u) )
                            + (1/2)(v-u)²( (r²(u)/s(u)) s''(x_uv) - r(u)r''(x_uv) ),

where x_uv is between u and v, inclusive. Let

B(u) = (r²(u)/s(u)) s'(u) - r(u)r'(u),
γ(u,v) = (r²(u)/s(u)) s''(x_uv) - r(u)r''(x_uv).
Then we have

r²(u)s(v)/s(u) - r(u)r(v) = (v-u)B(u) + (1/2)(v-u)²γ(u,v)
                          = (v-u)( B(t_i) + (u-t_i)B'(x_ui) ) + (1/2)(v-u)²γ(u,v),

upon expansion of B in a Taylor series about u = t_i, where x_ui is between u and t_i inclusive. Similarly we have

φ(u)φ(v) = φ²(u) + (v-u)φ(u)φ'(y_uv) = φ²(t_i) + 2(u-t_i)φ(y_ui)φ'(y_ui) + (v-u)φ(u)φ'(y_uv),

upon expanding φ(u)φ(v) in a Taylor series about u = v and then φ²(u) in a Taylor series about u = t_i, where y_uv is between u and v inclusive and where y_ui is between u and t_i inclusive. Then (4.5.18) may be rewritten as

d_n(S,R|R) = Σ_{i=1}^n ∫∫ φ(u)φ(v)[ (v-u)B(t_i) + (v-u)(u-t_i)B'(x_ui) + (1/2)(v-u)²γ(u,v) ] du dv
           = Σ_{i=1}^n ∫∫ { (v-u)B(t_i)[ φ²(t_i) + 2(u-t_i)φ(y_ui)φ'(y_ui) + (v-u)φ(u)φ'(y_uv) ]
                            + φ(u)φ(v)[ (v-u)(u-t_i)B'(x_ui) + (1/2)(v-u)²γ(u,v) ] } du dv,

the integrals being over [t_{i-1}, t_i] × [t_{i-1}, t_i]. Since ∫∫ (v-u) du dv = 0 over each square, the φ²(t_i)B(t_i) term contributes nothing, and applying (4.2.11) and the Mean Value Theorem (x_uv, y_uv, x_ui, y_ui may be chosen as continuous functions of u and v) gives a representation (1/n³) Σ_{i=1}^n {…} with x_ij, y_ij, u_ij, v_ij, w_i ∈ [t_{i-1}, t_i], i = 1,…,n, j = 1,2,3. Then, since φ²h⁻³ is Riemann-integrable, h is nonzero a.e. and thus sup_i (t_i - t_{i-1}) → 0 as n → ∞, and we have as n → ∞,

n³ d_n(S,R|R) → (1/12) ∫_0^1 (φ²/h³)(t)( -B'(t) + γ(t,t) ) dt
              = (1/12) ∫_0^1 (φ²/h³)(t)( -2rr's'/s - r²s''/s + r²(s')²/s² + (r')² + rr'' + r²s''/s - rr'' )(t) dt
              = (1/12) ∫_0^1 (φ²/h³)(t)(r' - rs'/s)²(t) dt,

which yields (4.5.15) as desired. □
Note that (4.5.14) - (4.5.15) hold in particular if a_{k,S} is bounded away from zero, k = 1 or 2, φ is nonzero, and h is chosen proportional to (a_{k,S} φ²)^{1/(k+2)}. Then (4.5.14) - (4.5.15) give an asymptotic expression for the difference e²_{st,n}(S|R) - e²_{st,n}(R|R), where for both e²_{st,n}(S|R) and e²_{st,n}(R|R) the partitions form an asymptotically optimal sequence of partitions under the assumption S is the kernel. Notice also that, as suggested by (4.2.2), the asymptotic value of the difference in the m.s.e.'s depends more heavily on the partitions chosen (i.e., h) than on the distributions chosen (i.e., g_in for X_in, i = 1,…,n; n = 1,2,…).
CHAPTER FIVE

SYSTEMATIC SAMPLING

5.1. Introduction. Systematic sampling (sy.s.) is a form of sampling in which the sample points are highly dependent on one another. Consider a partition {A_in}_{i=1}^n of A and a set of transformations T_ijn : A_jn → A_in, 1-1, onto, s.t. T_ijn = T_ikn T_kjn for all i, j, k. For sy.s., we assume that X_1n has distribution G_1n on A_1n (and density g_1n) and that X_in = T_i1n X_1n a.s. Let G_in denote the distribution of X_in and g_in its density. From (2.1) we have

(5.1.1)
e²_{sy,n} = E{ (1/n²) Σ_{i=1}^n Σ_{j=1}^n R(X_in, T_jin X_in) c_n(X_in) c_n(T_jin X_in)
               - (2/n) Σ_{i=1}^n ∫_A R(s, X_in) c_n(X_in) φ(s) ds } + ∫_A ∫_A R(s,t) φ(s) φ(t) ds dt
          = (1/n²) Σ_{i=1}^n Σ_{j=1}^n ∫_{A_in} R(t, T_jin t) c_n(t) c_n(T_jin t) g_in(t) dt
               - (2/n) Σ_{i=1}^n ∫_{A_in} ∫_A R(s,t) c_n(t) g_in(t) φ(s) ds dt
               + ∫_A ∫_A R(s,t) φ(s) φ(t) ds dt.
From (2.2) and the above assumptions we may write (as in st.s.)

(5.1.2)    g̃_n(t) = (1/n) g_in(t),    t ∈ A_in, i = 1,…,n,

where g̃_n is a density over A with associated distribution function G̃_n. Substitution of (5.1.2) into (5.1.1) then gives

(5.1.3)
e²_{sy,n} = ∫_A ∫_A R(s,t)(c_n g̃_n - φ)(s)(c_n g̃_n - φ)(t) ds dt
            + ( (1/n) Σ_{i=1}^n Σ_{j=1}^n ∫_{A_in} R(t, T_jin t)(c_n g̃_n)(t) c_n(T_jin t) dt
                - ∫_A ∫_A R(s,t)(c_n g̃_n)(s)(c_n g̃_n)(t) ds dt )
          = e_{1,n} + e_{2,n}, say.

Observe that e²_{sy,n} is nonnegative for each n and for all φ, and thus, letting φ = c_n g̃_n (since e_{2,n} does not depend on φ), we have that e_{2,n} is nonnegative for each n. Notice also that e_{1,n} is a quadratic form and thus nonnegative. Thus, as was also the case under st.s., the m.s.e. under sy.s., e²_{sy,n}, is the sum of two nonnegative terms.
The following proposition (similar in nature to Proposition 4.1.1) gives a bound on e²_{sy,n} for finite n and indicates that the rate at which e²_{sy,n} converges to zero, if indeed it does, depends on the smoothness of R and c_n. Note that the bound obtained here is as large or larger than that obtained in Proposition 4.1.1 for st.s.

Proposition 5.1.1. Assume c_n g̃_n ∈ L₂(dμ). Then

(5.1.4)    e²_{sy,n} ≤ e_{1,n} + e_{3,n},

where

(5.1.5)    e_{3,n} = (1/n²) Σ_{i=1}^n Σ_{j=1}^n (ess sup - ess inf)_{s ∈ A_in, t ∈ A_jn} ( R(s,t) c_n(s) c_n(t) ).

If further R is uniformly continuous a.e. on A × A, {c_n}_n is equicontinuous a.e. on A, max_i ess diam(A_in) → 0 as n → ∞ (where ess diam(A_in) is the essential supremum of the Euclidean distance between s and t ∈ A_in), and ‖R^{1/2} h_n‖₂ → 0 as n → ∞, where h_n = c_n g̃_n - φ and the norm is the L₂-norm, then we have as n → ∞,

(5.1.6)    e²_{sy,n} → 0.
Proof. From (5.1.3) we have

e_{2,n} = (1/n) Σ_i Σ_j ( ∫_{A_in} R(T_jin t, t) c_n(T_jin t) c_n(t) g̃_n(t) dt
            - n ∫_{A_in} ∫_{A_jn} R(s,t) c_n(s) c_n(t) g̃_n(s) g̃_n(t) ds dt )
        ≤ (1/n²) Σ_i Σ_j (ess sup - ess inf)_{s ∈ A_jn, t ∈ A_in} ( R(s,t) c_n(s) c_n(t) ) = e_{3,n},

so that (5.1.4) follows, as desired. The a.e. uniform continuity of R and the a.e. equicontinuity of {c_n}_n imply that {R(s,t)c_n(s)c_n(t)}_n is equicontinuous a.e. Thus for any ε > 0 there exists δ(ε) > 0 s.t. ess diam(B_i) < δ, i = 1, 2, implies

(ess sup - ess inf)_{s ∈ B₁, t ∈ B₂} ( R(s,t) c_n(s) c_n(t) ) < ε.

Thus if ess diam(A_in) < δ for i = 1,…,n, we have e_{3,n} < ε. Also, the condition on c_n g̃_n yields e_{1,n} → 0. Thus for all n s.t. max_i ess diam(A_in) < δ(ε),

e²_{sy,n} ≤ o(1) + ε,

which implies lim sup_{n→∞} e²_{sy,n} ≤ ε. Now ε may be made arbitrarily small since max_i ess diam(A_in) → 0, yielding (5.1.6). □
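To make the oscillation bound (5.1.5) concrete, the following sketch (an added illustration, not part of the original text) evaluates e_{3,n} for the illustrative choices R(s,t) = exp(-ρ|s-t|), c_n = 1, and the equispaced partition A_in = [(i-1)/n, i/n] of A = [0,1]; the kernel and parameter values are arbitrary assumptions.

```python
import numpy as np

def e3_bound(n, rho=1.0):
    """e_{3,n} of (5.1.5) for R(s,t) = exp(-rho|s-t|), c_n = 1, and
    A_in = [(i-1)/n, i/n]: oscillation of R over each A_in x A_jn."""
    edges = np.linspace(0.0, 1.0, n + 1)
    total = 0.0
    for i in range(n):
        for j in range(n):
            a, b = edges[i], edges[i + 1]
            c, d = edges[j], edges[j + 1]
            if i != j and b <= c:          # A_in to the left of A_jn
                dmin, dmax = c - b, d - a
            elif i != j and d <= a:        # A_jn to the left of A_in
                dmin, dmax = a - d, b - c
            else:                          # i == j: same cell
                dmin, dmax = 0.0, b - a
            # sup - inf of exp(-rho|s-t|) over the rectangle
            total += np.exp(-rho * dmin) - np.exp(-rho * dmax)
    return total / n**2

e3 = [e3_bound(n) for n in (4, 8, 16)]
```

For this kernel the bound decays roughly like 1/n as the cell diameters shrink, consistent with e_{3,n} → 0 in the proposition.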
The conclusions of this proposition do not depend on the dimensionality of A and may thus be thought of as yielding a lower bound on the rate at which the m.s.e. converges to zero for A of any dimension. Recall that Tubilla (1.2.14) obtains an exact rate, n^{-2/d}, of convergence of the m.s.e. to zero for A the d-dimensional unit cube and with R, c_n, A_in satisfying certain assumptions described in the paragraph containing (1.2.14). Observe that this exact rate is o(1) as n → ∞ for all d.

Note also that (5.1.3) and Proposition 5.1.1 may be used to provide analogs to Theorems 2.1, 2.2, since (5.1.3) and the remark following it imply that e²_{sy,n} → 0 iff e_{1,n} → 0 and e_{2,n} → 0, and Proposition 5.1.1 gives sufficient conditions for e_{2,n} → 0. Let h_n = c_n g̃_n - φ. Then (5.1.3) implies e²_{sy,n} → 0 only if ‖R^{1/2} h_n‖ → 0 as n → ∞. Further, assume as in Proposition 5.1.1 that R is uniformly continuous a.e. on A × A, {c_n}_n is equicontinuous a.e. on A, and max_i ess diam(A_in) → 0 as n → ∞. Then e²_{sy,n} → 0 iff ‖R^{1/2} h_n‖ → 0 as n → ∞. If in addition h_n → cg - φ, say, in L₂, then we have e²_{sy,n} → 0 iff cg - φ ∈ N(R^{1/2}) (or cg = φ a.e. when R is strictly positive definite).
For reasons similar to those stated in Section 4.1, we are led to assume c_n g̃_n - φ = ψ_n ∈ N(R^{1/2}) for each n. Then (5.1.3) becomes (from (5.1.2))

(5.1.7)
e²_{sy,n} = (1/n) Σ_{i=1}^n Σ_{j=1}^n ∫_{A_in} R(t, T_jin t)(φ + ψ_n)(t)((φ + ψ_n)/g̃_n)(T_jin t) dt
            - ∫_A ∫_A R(s,t) φ(s) φ(t) ds dt.

Even under this simplification and the assumption that R is strictly positive definite, the problem of finding a partition {A_in}_{i=1}^n, a set of transformations T_ijn, i,j = 1,…,n, a sampling density g̃_n (and a weighting function c_n) so as to minimize e²_{sy,n} seems intractable.
For instance, if R and φ are nonnegative and R is strictly positive definite, we have for each i, j, by the same reasoning as in the proof of Proposition 4.1.2, that

(5.1.8)

with equality iff g_jn(T_jin t) = k_ijn √( R(t, T_jin t) φ(t) φ(T_jin t) ) a.e. on A_in for some constant k_ijn; that is, iff

(5.1.9)    g_jn(T_jin t) = √( R(t, T_jin t) φ(T_jin t) φ(t) ) / ∫_{A_in} √( R(s, T_jin s) φ(T_jin s) φ(s) ) ds

a.e. on A_in, i,j = 1,…,n. The constraints (5.1.9) imply, however,

(5.1.10)    R(t, T_jin t) φ(T_jin t) / ( R(t,t) φ(t) )
            = ( ∫_{A_in} √( R(s, T_jin s) φ(T_jin s) φ(s) ) ds )² / ( ∫_{A_in} √( R(s,s) ) φ(s) ds )²

a.e. on A_in, i,j = 1,…,n. That there do not necessarily exist a set of transformations {T_ijn}_{ij} and a partition {A_in}_i s.t. (5.1.10) holds may be seen by considering the case φ = 1, A = [0,1], and R(s,t) = min(s,t). Thus it is not in general possible to minimize (5.1.7) term by term.
A special example where the conditions (5.1.10) are consistent for certain choices of {T_ijn}_{ij}, {A_in}_i is the stationary case φ = 1, A = [0,1], R(s,t) = C(s-t), C(0) = 1, and R strictly positive definite and nonnegative. Then (5.1.10) becomes

(5.1.11)    C(T_ijn t - t) = (1/μ(A_in)) ∫_{A_in} C(T_ijn s - s) ds

a.e. on A_in, i,j = 1,…,n. This is satisfied if T_ijn t = d_ijn + t a.e. on A_in for some constant d_ijn. (Observe that (5.1.11) is then immediately satisfied.) Further, T_ijn t = d_ijn + t is satisfied iff A_in = d_ijn + A_jn, iff A_in = [(i-1)/n, i/n] for all i. Thus the conditions (5.1.11) are consistent when T_ijn t = t + (i-j)/n, A_in = [(i-1)/n, i/n], and yield g_in = n on A_in, i,j = 1,…,n. Thus for this stationary case, if A_in = [(i-1)/n, i/n] and T_ijn t = t + (i-j)/n, i,j = 1,…,n, then (5.1.7) is minimized over all possible (consistent) {g_in}_i when g_in = n for each i. This is the choice used by Tubilla (1975), among others.

For these reasons we do not attempt to solve any general minimization problems. Instead we now turn our attention to the asymptotic behavior of e²_{sy,n} for A = [0,1] and R strictly positive definite.
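Before turning to asymptotics, a small simulation (added here as an illustration, not from the original text) compares the three designs for A = [0,1], φ = 1, c_n = 1, uniform sampling densities, and a zero-mean Gaussian process with the exponential covariance R(s,t) = exp(-ρ|s-t|) used later in Example 5.2.3; the grid size, ρ, n, and replication count are arbitrary choices, and paths are generated on a discrete grid.

```python
import numpy as np

rng = np.random.default_rng(0)
rho, n, reps = 2.0, 8, 2000
grid = np.linspace(0.0, 1.0, 401)
cov = np.exp(-rho * np.abs(grid[:, None] - grid[None, :]))
chol = np.linalg.cholesky(cov + 1e-8 * np.eye(grid.size))

def idx(x):
    # nearest grid index for sample points in [0, 1]
    return np.clip(np.rint(x * (grid.size - 1)).astype(int), 0, grid.size - 1)

sq_err = {"r": [], "st": [], "sy": []}
for _ in range(reps):
    z = chol @ rng.standard_normal(grid.size)                  # one path of Z
    integral = np.sum((z[:-1] + z[1:]) / 2) / (grid.size - 1)  # trapezoid "truth"
    designs = {
        "r": rng.uniform(0.0, 1.0, n),                         # simple random
        "st": (np.arange(n) + rng.uniform(0.0, 1.0, n)) / n,   # stratified
        "sy": (np.arange(n) + rng.uniform()) / n,              # systematic
    }
    for key, x in designs.items():
        sq_err[key].append((z[idx(x)].mean() - integral) ** 2)

mse = {k: float(np.mean(v)) for k, v in sq_err.items()}
```

For this kernel the simulated m.s.e.'s reproduce the ordering established in Chapters 4-6: both stratified and systematic sampling improve markedly on simple random sampling, with systematic sampling slightly ahead of stratified sampling.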
5.2. Regular sequences of partitions. In this section we derive an asymptotic expression for e²_{sy,n} under certain assumptions on the partition {A_in}_i, the set of transformations {T_ijn}_{ij}, and the sampling densities g_in, i = 1,…,n, for A = [0,1]. In obtaining this asymptotic expression we also make some simplifying assumptions on φ and h and thus avoid some of the difficulties encountered in Section 4.2.

As in Section 4.2 we consider sequences of partitions which form a RP(h) for some continuous density h on A = [0,1]. Then, as in (4.2.2), G̃_n defined by (2.2) converges weakly to a distribution function H on [0,1] with density h. We also consider only transformations of the form

(5.2.1)    T_ijn t = G_in^{-1} G_jn t,    i,j = 1,…,n,

where G_in is the distribution function of the density g_in, assumed nonzero a.e. on A_in. This is a restrictive assumption, but it simplifies the analysis considerably. For instance, if the value of X_1n is the pth percentile, say, of g_1n, then the value of X_in is the pth percentile of g_in for all i, and not perhaps the (100-p)th percentile if i is even, say. However, this set of transformations is consistent with any choice of {G_in}_i s.t. g_in is nonzero a.e. on A_in for all i.

Consider the following set of assumptions. (Recall that g̃_n is not necessarily continuous at t_1n,…,t_{n-1,n}, where A_in = [t_{i-1,n}, t_in], i = 1,…,n.)
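The percentile-matching property of the transformations (5.2.1) can be seen in a short sketch (an added illustration; the two densities below are arbitrary choices): with T_ijn = G_in^{-1} G_jn, the p-th percentile of g_jn is sent to the p-th percentile of g_in.

```python
import numpy as np

def make_cdf(pdf, a, b, m=4001):
    # tabulate the distribution function of a density proportional to pdf on [a, b]
    x = np.linspace(a, b, m)
    y = pdf(x)
    F = np.concatenate([[0.0], np.cumsum((y[1:] + y[:-1]) / 2 * np.diff(x))])
    return x, F / F[-1]

x1, F1 = make_cdf(lambda t: 1.0 + t, 0.0, 0.5)       # g_1n on A_1n = [0, 1/2]
x2, F2 = make_cdf(lambda t: np.exp(-t), 0.5, 1.0)    # g_2n on A_2n = [1/2, 1]

def T12(t):
    # T_12n = G_1n^{-1} G_2n : A_2n -> A_1n, as in (5.2.1)
    return np.interp(np.interp(t, x2, F2), F1, x1)

# the p-th percentile of g_2n is mapped to the p-th percentile of g_1n
p = 0.5
q2 = np.interp(p, F2, x2)     # median of g_2n
q1 = np.interp(p, F1, x1)     # median of g_1n
```

Here `np.interp` against the tabulated distribution function plays the role of G^{-1}; any other pair of a.e. nonzero densities would do.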
(5.2.2)
g̃_n is strictly positive on (0,1] and converges a.e. to h. (φ/g̃_n) and its first two derivatives exist and are continuous on the interior of A_in for all i, n and converge a.e. to functions continuous on (0,1]. For each ℓ, m ≥ 0, ℓ + m ≤ 2, R^{ℓm} exists and is continuous on the complement of the diagonal in [0,1] × [0,1]. (Denote the a.e. limit of K_n, where K_n(s,t) = R(s,t)(φ/g̃_n)(s)(φ/g̃_n)(t), by L.) R_-^{10}(t,t), R_+^{10}(t,t) exist and are continuous on [0,1].

Observe that g̃_n is allowed to take the value zero at zero.
We have the following theorem.

Theorem 5.2.1. Consider e²_{sy,n} as given by (5.1.7), where R is strictly positive definite, {A_in}_i forms a RP(h) for h a continuous and strictly positive density on [0,1], {T_ijn}_{ij} is given by (5.2.1), and R, φ, g̃_n satisfy (5.2.2) for h as above. Then as n → ∞,

(5.2.3)

where a_{1,R} is given in (4.2.3).
Proof. We may write (5.1.7) as

(5.2.4)    e²_{sy,n} = e²_{st,n} + E_n,

where e²_{st,n} is given by (4.1.7) and E_n by

E_n = Σ_{i≠j} [ (1/n) ∫_{A_in} R(t, T_jin t) φ(t) φ(T_jin t) / g̃_n(T_jin t) dt
                - ∫_{A_in} ∫_{A_jn} R(s,t) φ(s) φ(t) ds dt ].

Further, by (5.2.1), and since from (5.2.2) g_in is nonzero on the interior of A_in for all i, n, we have G_in T_ijn t = G_jn t for t ∈ A_jn, and hence

(5.2.5)    T'_ijn(t) = (d/dt) T_ijn t = g_jn(t) / g_in(T_ijn t) = g̃_n(t) / g̃_n(T_ijn t)

in A_jn. Then, using the substitution u = T_jin t (t = T_ijn u, dt = (g̃_n(u)/g̃_n(T_ijn u)) du) and noting that g_jn = n g̃_n is a density on A_jn, we have

E_n = 2 Σ_{i<j} ( (1/n) ∫_{A_in} R(t, T_jin t) φ(t) φ(T_jin t) / g̃_n(T_jin t) dt
                  - ∫_{A_in} ∫_{A_jn} R(s,t) φ(s) φ(t) ds dt )
    = 2 Σ_{i<j} ∫_{A_jn} ∫_{A_jn} g̃_n(u) g̃_n(s) ( K_n(T_ijn u, u) - K_n(T_ijn u, s) ) ds du,

where K_n(s,t) = R(s,t)(φ/g̃_n)(s)(φ/g̃_n)(t).
Expanding K_n(T_ijn u, u) - K_n(T_ijn u, s) in a Taylor series about s = u, and then K_n^{01}(T_ijn u, u) in a Taylor series about u = t_jn, we have

K_n(T_ijn u, u) - K_n(T_ijn u, s)
  = -(s-u) K_n^{01}(T_ijn u, u) - (1/2)(s-u)² K_n^{02}(T_ijn u, x_us)
  = -(s-u){ K_n^{01}(t_in, t_jn) + (u-t_jn)[ K_n^{11}(y_ui, x_uj) T'_ijn(x_uj) + K_n^{02}(y_ui, x_uj) ] }
    - (1/2)(s-u)² K_n^{02}(T_ijn u, x_us),

where T'_ijn is given by (5.2.5) and where x_us is between u and s, x_uj between u and t_jn, and y_ui between T_ijn u and t_in, inclusive. Similarly, φ(u)φ(s) may be expanded about t_jn for s, u in the interior of A_jn, with b_uj between u and t_jn and b_sj between s and t_jn, inclusive. Combining these expansions in E_n and applying the Mean Value Theorem and (5.2.5) (b_sj, b_uj, y_ui, x_uj, x_us may be chosen as continuous functions of u and s), with a_jm, b_jm, x_jm, y_jm ∈ [t_{j-1,n}, t_jn] for all m, we obtain as n → ∞,

(5.2.6)

where w_i ∈ [t_{i-1}, t_i], by (4.2.11). Further,

e²_{st,n} = Σ_{i=1}^n ∫_{A_in} ∫_{A_in} g̃_n(s) g̃_n(t) ( K_n(t,t) - K_n(s,t) ) ds dt,

and a Taylor series expansion (with x_st between s and t, inclusive, chosen as a continuous function of s, t) together with the Mean Value Theorem and (4.2.11) gives

(5.2.7)

where w_i, a_im, b_im, c_im are all in [t_{i-1}, t_i], m = 1, 2. Note that

K_{n∓}^{10}(t,t) = R_∓^{10}(t,t)(φ/g̃_n)²(t) + R(t,t)(φ/g̃_n)'(t)(φ/g̃_n)(t),

and thus K_{n-}^{10}(t,t) - K_{n+}^{10}(t,t) converges a.e. to a_{1,R} φ²/h². Recall that g̃_n converges a.e. to h, K_n converges a.e. to L, and that h strictly positive implies max_i (t_i - t_{i-1}) → 0 as n → ∞. Then (5.2.4), (5.2.6), (5.2.7) yield (5.2.3) upon letting n → ∞. □

Recall that for stratified sampling the optimal g̃_n is proportional to √( R(t,t) φ²(t) ) on A_in for all i, n. In general, for g̃_n proportional to aφ, say, on A_in for all i, n, we have the following, which is used in Examples 5.2.3 - 5.2.5 for comparisons between stratified and systematic sampling.
Example 5.2.2. Consider e²_{sy,n} as given by (5.1.7), where R is strictly positive definite, T_ijn is given by (5.2.1), and the partition {A_in}_i forms a RP(h) with h continuous and strictly positive on [0,1]. Then if g̃_n is proportional to aφ, say, on A_in for all i, n and (5.2.2) holds, as n → ∞,

(5.2.8)

where a_{1,R} is given in (4.2.3).
Observe that g̃_n proportional to aφ on A_in makes (φ/g̃_n) proportional to 1/a there, and, for t ∈ (t_{i-1,n}, t_in), the normalizing constants converge by the Mean Value Theorem and (4.2.11), where u_i, w_i ∈ [t_{i-1}, t_i]. Thus K_n^{11}(s,t) converges a.e. to

(aφ/h)(s) (aφ/h)(t) ∂²/∂s∂t ( R(s,t) / (a(s)a(t)) ),

and (5.2.8) follows upon substitution in (5.2.3).

Note in particular that if R^{10} does not exist on the diagonal, lim n²E_n may be negative, in which case (from (5.2.4)) systematic sampling will perform better asymptotically than stratified sampling (with the same sampling densities g_in, i = 1,…,n). The following examples illustrate this point.
Example 5.2.3. R(s,t) = exp(-ρ|s-t|), φ = a = h = 1. (These choices for a, h are asymptotically optimal under st.s. See Example 4.4.2.) Let C(t) = exp(-ρ|t|). Then as n → ∞,

n² E_n → -(1/6)(1 - e^{-ρ}) < 0

and

n² e²_{sy,n} → (1/6)(1 - e^{-ρ} + ρ) = (1/6)( C(0) - C(1) - C'(0+) ).

Compare (1.2.14), due to Tubilla (1975). In this example R^{10} does not exist on the diagonal, and this sy.s. scheme is better asymptotically than the best st.s. scheme.
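The limit in Example 5.2.3 can be checked numerically (a verification sketch added here, not in the original text). For φ = 1, c_n = 1, A_in = [(i-1)/n, i/n], and T_ijn t = t + (i-j)/n, the sample is X_i = U + (i-1)/n with U uniform on [0, 1/n]; averaging E(I - Z̄_n)² over U makes the cross term equal the double integral of C, leaving a closed sum-plus-integral expression. The values of ρ and n below are arbitrary.

```python
import numpy as np

rho, n = 1.0, 200
C = lambda u: np.exp(-rho * np.abs(u))

# A = double integral of C(s-t) over [0,1]^2 = 2 * int_0^1 (1-u) C(u) du,
# evaluated with a fine trapezoid rule
u = np.linspace(0.0, 1.0, 20001)
g = (1.0 - u) * C(u)
A = 2.0 * np.sum((g[1:] + g[:-1]) / 2 * np.diff(u))

# E(I - Zbar_n)^2 for X_i = U + (i-1)/n, U ~ Uniform[0, 1/n], c_n = 1:
# the U-average leaves only the grid autocovariances minus A
k = np.arange(1, n)
msy = C(0.0) / n + (2.0 / n**2) * np.sum((n - k) * C(k / n)) - A

# limiting constant (1/6)(C(0) - C(1) - C'(0+)), with C'(0+) = -rho
limit = (C(0.0) - C(1.0) + rho) / 6.0
```

At n = 200 the scaled m.s.e. n²·msy agrees with the limit to well under one percent, illustrating the n^{-2} rate.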
Example 5.2.4. R(s,t) = exp(-ρ(s-t)²), φ = a = h = 1. (These choices for a, h are asymptotically optimal under st.s. See Example 4.4.2.) Let C(t) = exp(-ρt²). Then as n → ∞,

n² e²_{sy,n} → (1/6)(1 - e^{-ρ}) = (1/6)( C(0) - C(1) ).

Compare (1.2.14), due to Tubilla (1975). Here R^{10} exists on the diagonal (and also R^{11}). This sy.s. scheme is worse asymptotically than the best st.s. scheme. In fact, e²_{st,n} converges to zero at a rate n⁻³, while e²_{sy,n} here converges to zero at a rate n⁻².
The final example shows also the dependence of the limit (5.2.8) on the sampling density (assumed proportional to aφ on A_in for all i, n).

Example 5.2.5. R(s,t) = min(s,t), φ = h = 1. (This choice for h is asymptotically optimal under st.s. See Example 4.4.4.) Both a(t) = 1 and a(t) = √t (the latter the optimal choice under st.s. for a fixed partition) may be considered on [0,1], and the two choices yield different limits in (5.2.8); with a(t) = 1 we have, as n → ∞, n²E_n converging to a negative limit and n² e²_{sy,n} → 1/8.
CHAPTER SIX
COMPARISON OF DESIGNS
In this chapter random, stratified, and systematic sampling
schemes are compared for fixed n when R is strictly positive
definite.
It is shown, under certain assumptions, that the m.s.e.
under r.s. is greater than or equal to that under st.s. for all fixed
n.
Also conditions are obtained under which the m.s.e. under st.s.
is greater than or equal to that under sy.s., and vice versa.
First we have the following result relating r.s. and st.s.
This
result shows in particular that when R is strictly positive definite
the optimal sampling scheme under st.s. (if one exists) is better in
the m.s. sense than the optimal design under r.s.
Proposition 6.1. Consider e²_{r,n} as given by (3.3), where each X_in, i = 1,…,n, has density a, and e²_{st,n} as given by (4.1.7), where the density of X_in is proportional to a on A_in and ∫_{A_in} a = 1/n for all i = 1,…,n. Assume R is strictly positive definite. Then for each n ≥ 1,

(6.1)    e²_{st,n} ≤ e²_{r,n},

with equality iff n = 1. In particular, (6.1) holds when e²_{r,n} is given by (3.8) and e²_{st,n} by (4.1.11) for the optimal choice of {A_in}_i (if one exists).

Proof. Since ∫_{A_in} a = 1/n, we have g̃_n = a in (4.1.1), (4.1.7), and thus from (3.3), (4.1.7),

e²_{r,n} = (1/n){ ∫_A R(t,t)(φ²(t)/a(t)) dt - ∫_A ∫_A R(s,t) φ(s) φ(t) ds dt },
e²_{st,n} = (1/n) ∫_A R(t,t)(φ²(t)/a(t)) dt - Σ_{i=1}^n ∫_{A_in} ∫_{A_in} R(s,t) φ(s) φ(t) ds dt.

Also, since R is positive definite, we have

(6.2)
0 ≤ ∫_A ∫_A R(s,t) φ(s)(I_{A_in} - I_{A_jn})(s) φ(t)(I_{A_in} - I_{A_jn})(t) ds dt
  = ( ∫_{A_in} ∫_{A_in} + ∫_{A_jn} ∫_{A_jn} ) R(s,t) φ(s) φ(t) ds dt - 2 ∫_{A_in} ∫_{A_jn} R(s,t) φ(s) φ(t) ds dt.

Then, upon summing over i, j and dividing by 2, we obtain

(6.3)    0 ≤ n Σ_{i=1}^n ∫_{A_in} ∫_{A_in} R(s,t) φ(s) φ(t) ds dt - ∫_A ∫_A R(s,t) φ(s) φ(t) ds dt,

which gives (6.1).
Since R is strictly positive definite, for each i ≠ j the inequality (6.2) is an equality iff φ(I_{A_in} - I_{A_jn}) = 0 a.e. on A, i.e. iff φ = 0 a.e. on A_in ∪ A_jn. Then, since by assumption φ does not belong to N(R^{1/2}) and thus does not equal zero a.e. on A, the inequality (6.3) is an equality iff n = 1.

The rest of the proposition follows upon noting that if a(t) is proportional to √(R(t,t)) |φ(t)|, then e²_{r,n} is given by (3.8) and e²_{st,n} by (4.1.7) for {A_in}_i satisfying the assumptions of this proposition. Further, the optimal choice of {A_in}_i for use in (4.1.7) will produce a m.s.e. e²_{st,n} as small if not smaller than e²_{st,n} as above. □

Note that the asymptotically optimal design under st.s. for R stationary (with no trend) and φ = 1/μ(A), where A is bounded, yields ∫_{A_in} a = 1/n for all i, n. (See Example 4.4.1.) Proposition 6.1 thus contains the result of Zubrzycki (1958), (1.2.12), for the case where R is stationary and φ = 1/μ(A) for A bounded and two-dimensional. It is also similar to the result of Cochran (1946), (1.2.7), for a finite population.
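As a numerical check on Proposition 6.1 (an added sketch using the expressions from the proof, with the illustrative choices A = [0,1], φ = a = 1, A_in = [(i-1)/n, i/n], and R(s,t) = exp(-ρ|s-t|)), the gap e²_{r,n} - e²_{st,n} = Σ_i ∫∫_{A_in × A_in} R φ φ - (1/n) ∫∫_{A × A} R φ φ is zero for n = 1 and strictly positive otherwise.

```python
import numpy as np

rho = 1.0
m = 800                                  # midpoint-rule quadrature points per axis
t = (np.arange(m) + 0.5) / m
w = 1.0 / m
Rm = np.exp(-rho * np.abs(t[:, None] - t[None, :]))
full = Rm.sum() * w * w                  # double integral of R over [0,1]^2

def gap(n):
    """e^2_{r,n} - e^2_{st,n} for phi = a = 1 and A_in = [(i-1)/n, i/n]."""
    assert m % n == 0
    b = m // n
    diag = sum(Rm[i*b:(i+1)*b, i*b:(i+1)*b].sum() * w * w for i in range(n))
    return diag - full / n

gaps = [gap(n) for n in (1, 2, 4, 8)]
```

The n = 1 gap vanishes identically (the two designs coincide), matching the equality case of (6.1).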
The next result gives sufficient conditions under which e²_{st,n} ≤ e²_{sy,n}, or e²_{sy,n} ≤ e²_{st,n}.

Proposition 6.2. Consider e²_{st,n} as given by (4.1.7), where X_in has density g_in on A_in, i = 1,…,n, and e²_{sy,n} as given by (5.1.7), where X_in again has density g_in on A_in, i = 1,…,n, and the transformations T_ijn, i,j = 1,…,n, are consistent with this choice of a partition and sampling densities. Assume R is strictly positive definite. For each i, j, where s, t ∈ A_in, let

(6.4)
k_jin(s,t) = R(s, T_jin t)(φ/g̃_n)(s)(φ/g̃_n)(T_jin t),
M_jin = ∫_{A_in} k_jin(s,s) g_in(s) ds - ∫_{A_in} ∫_{A_in} k_jin(s,t) g_in(s) g_in(t) ds dt.
If M_jin is nonnegative for all i ≠ j, then

(6.5)    e²_{sy,n} ≥ e²_{st,n}.

If M_jin is nonpositive for all i ≠ j, then

(6.6)    e²_{sy,n} ≤ e²_{st,n}.

Proof. We may rewrite (4.1.7), (5.1.7) as follows:

e²_{st,n} = (1/n²) E Σ_i Σ_j R(X_in, X_jn)(φ/g̃_n)(X_in)(φ/g̃_n)(X_jn) - ∫_A ∫_A R(s,t) φ(s) φ(t) ds dt,
e²_{sy,n} = (1/n²) E Σ_i Σ_j R(X_in, T_jin X_in)(φ/g̃_n)(X_in)(φ/g̃_n)(T_jin X_in) - ∫_A ∫_A R(s,t) φ(s) φ(t) ds dt.

For each i (n is assumed fixed) let Y_in be a random variable independent of and identically distributed with X_in. Then, since X_jn = T_jin X_in a.s., T_jin Y_in has the same distribution as X_jn, and we have

(6.7)    n²( e²_{sy,n} - e²_{st,n} ) = Σ_{i≠j} E( k_jin(X_in, X_in) - k_jin(X_in, Y_in) ) = Σ_{i≠j} M_jin,

from which (6.5), (6.6) follow. □
When R is stationary, φ = 1/μ(A) for A bounded, the g_in are uniform densities (over A_in), and the T_jin are translations (by τ_jin, say), i,j = 1,…,n, then

M_jin = R(τ_jin) - (1/μ(A_in)²) ∫_{A_in} ∫_{A_jn} R(t-s) ds dt,

and thus Proposition 6.2 gives Theorem 2 of Zubrzycki (1958). Zubrzycki (1958) shows further that when R(s,t) = exp(-ρ|s-t|), where s, t ∈ A, two-dimensional and bounded, then for sufficiently large ρ, M_jin < 0 for all i ≠ j and thus e²_{sy,n} ≤ e²_{st,n} (Theorem 4). He also shows that if the A_in's are disjoint circles of equal radius and A = ∪_i A_in has diameter less than 1/ρ, then M_jin > 0 for all i ≠ j and thus e²_{st,n} ≤ e²_{sy,n} (Theorem 3).
We now consider sufficient conditions that are easier to verify than M_jin ≥ 0 (or ≤ 0) for all i ≠ j. Note that if for each i ≠ j, k_jin satisfies the inequality

(6.8)    k(s,s) + k(t,t) - k(s,t) - k(t,s) ≥ 0  (or ≤ 0)  a.e.

for s, t ∈ A_in, then upon integration with respect to g_in(s) g_in(t) ds dt we have M_jin ≥ 0 (or ≤ 0). Thus (6.8) is a sufficient condition. It also has the intuitive appeal that it can be written as

E( X(s) - X(t) )( X(T_jin s) - X(T_jin t) ) ≥ 0  (or ≤ 0)

for a.e. s, t ∈ A_in, where X(s) = Z(s)(φ/g̃_n)(s), and so requires that the increments of the processes X and X ∘ T_jin, over the same interval of A_in, should be positively (or negatively) correlated for all i ≠ j.
For A one-dimensional, a further sufficient condition is the quasi-monotonicity (or quasi-antitonicity) of k_jin, i ≠ j. A function k(s,t) is called quasi-monotone on B × B if for all s ≤ s', t ≤ t' in B,

(6.9)    k(s,t) + k(s',t') - k(s,t') - k(s',t) ≥ 0,

and it is called quasi-antitone if -k is quasi-monotone. Then, under the assumptions of Proposition 6.2, we have that (6.5) (or (6.6)) holds if for all i ≠ j, k_jin is a right-continuous quasi-monotone (or -antitone) function on A_in × A_in. This follows directly from Proposition 6.2, as (6.9) clearly implies (6.8). It also follows from Theorem 1 of Cambanis et al. (1976), which gives that M_jin ≥ 0 (or ≤ 0) when k_jin is quasi-monotone (or -antitone), for (X_in, Y_in) in (6.7) is stochastically smaller than (X_in, X_in) in that for all s, t ∈ A_in we have

P(X_in < s, Y_in < t) = P(X_in < s) P(X_in < t) ≤ P(X_in < s, X_in < t).

If T_jin is monotone (i.e., T_jin s < T_jin t whenever s < t in A_in), then k_jin is quasi-monotone (-antitone) on A_in × A_in iff K_n, defined by K_n(s,t) = R(s,t)(φ/g̃_n)(s)(φ/g̃_n)(t), is quasi-monotone (-antitone) on A_in × A_jn.
Cambanis et al. (1976, p. 292) give examples of covariances that are quasi-monotone (-antitone). Further, if T_jin is monotone and K_n is absolutely continuous on A_in × A_jn for all i ≠ j, then by the above and Cambanis et al. (1976, p. 292) we have that k_jin is quasi-monotone (-antitone) iff K_n^{11} ≥ 0 (≤ 0) a.e. on A_in × A_jn. Note further that when the indicated derivatives exist we have k_jin^{11}(s,t) = T'_jin(t) K_n^{11}(s, T_jin t). The transformations T_jin given by (5.2.1) are monotone and also differentiable with a positive derivative. Thus, when K_n^{11} exists off the diagonal and is locally integrable, we have from the preceding paragraph that k_jin is quasi-monotone (-antitone) iff K_n^{11}(s,t) ≥ 0 (≤ 0) for a.e. s ∈ A_in, t ∈ A_jn. In Examples 5.2.2 - 5.2.5, φ/g̃_n is chosen proportional to 1/a on A_in for all i, n for some function a. Under this choice, then, we have that k_jin is quasi-monotone (-antitone) if

(6.10)    ∂²/∂s∂t ( R(s,t) / (a(s)a(t)) ) ≥ 0  (≤ 0)

off the diagonal.
This expression plays a central role in (5.2.8) and Examples 5.2.2 - 5.2.5. For the example in which e²_{sy,n} is asymptotically larger than e²_{st,n}, Example 5.2.4, the LHS of (6.10) is positive a.e., and thus we have e²_{sy,n} ≥ e²_{st,n} for each fixed n. Similarly, for the examples in which e²_{sy,n} is asymptotically smaller than e²_{st,n}, Examples 5.2.3 and 5.2.5 with a = 1, we have via (6.10) that e²_{st,n} ≥ e²_{sy,n} for each fixed n.

As a final example, when A is one-dimensional, e²_{sy,n} ≤ e²_{st,n} holds if T_jin is monotone for all i ≠ j and the covariance K_n(s,t) (or R(s,t)/(a(s)a(t))) is biconvex, for a biconvex function is quasi-antitone.
This follows directly fran the definition of
biconvexity given in Berman (1978), which extends the notion of
stationary convex covariance functions.
A symmetric function K defined on [a,b] × [a,b] is called biconvex if for all a ≤ s < t ≤ b,

    K(s,t) ≥ 0,  K^{10}(s,t) ≥ 0,  K^{01}(s,t) ≤ 0,  K^{11}(s,t) ≤ 0.

(Berman (1978) shows that a biconvex function is a covariance function.) Of the covariance functions K(s,t) = R(s,t)/(a(s)a(t)) considered in Examples 5.2.3 - 5.2.5, min(s,t)/√(st) and exp(−β|s−t|) defined on [0,1] × [0,1] are biconvex while exp(−β(s−t)²) is not. A further example of a (non-stationary) biconvex covariance is K(s,t) = (1/2)[B(s) + B(t) − B(t−s)] for 0 ≤ s < t, where B(t) is nonnegative, concave and nondecreasing for t > 0 (Berman, 1978).
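As a numerical aside (not part of the original argument), these properties can be checked directly from the rectangle inequality: K is quasi-antitone on a region when K(s,t) + K(s',t') ≤ K(s,t') + K(s',t) for every rectangle with s < s', t < t' in that region. The sketch below, with an assumed scale parameter β = 1 and rectangles restricted to lie below the diagonal, confirms this for min(s,t)/√(st) and exp(−β|s−t|) and exhibits a violation for exp(−β(s−t)²):

```python
import math
from itertools import combinations

def rect_excess(K, s, s2, t, t2):
    # Quasi-antitone requires K(s,t) + K(s2,t2) - K(s,t2) - K(s2,t) <= 0
    # for every rectangle s < s2, t < t2; return the left-hand side.
    return K(s, t) + K(s2, t2) - K(s, t2) - K(s2, t)

def max_excess(K, pts):
    # Largest excess over rectangles below the diagonal (s < s2 <= t < t2),
    # where all three kernels are smooth.
    return max(rect_excess(K, s, s2, t, t2)
               for s, s2 in combinations(pts, 2)
               for t, t2 in combinations(pts, 2) if s2 <= t)

beta = 1.0
ou = lambda s, t: math.exp(-beta * abs(s - t))        # exp(-beta|s-t|)
wiener = lambda s, t: min(s, t) / math.sqrt(s * t)    # min(s,t)/sqrt(st)
gauss = lambda s, t: math.exp(-beta * (s - t) ** 2)   # exp(-beta(s-t)^2)

pts = [0.05 * k for k in range(1, 20)]                # grid in (0,1)
print(max_excess(ou, pts) <= 0)       # True: quasi-antitone
print(max_excess(wiener, pts) <= 0)   # True: quasi-antitone
print(max_excess(gauss, pts) > 0)     # True: inequality violated
```

The Gaussian kernel fails because its second mixed derivative is positive near the diagonal, exactly the criterion K^{11} ≤ 0 discussed above.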
It might be noted here that Cochran (1946) obtains (6.6), (1.2.8) with φ = 1 and c = 1 for a finite population when R is stationary, under the convexity assumption ρ_{i−1} + ρ_{i+1} − 2ρ_i ≥ 0 in addition to certain general assumptions including the nonnegativity of all ρ_i. This is the discrete parameter analog of the continuous parameter case where R is nonnegative and convex and thus R(s,t) = R(t−s) is quasi-antitone.
If e²_{sy,n} ≤ e²_{st,n}, then

(6.11)    e²_{sy,n} ≤ e²_{st,n} ≤ e²_{r,n}

for {A_{in}}_i satisfying the assumptions of Proposition 6.1. If e²_{sy,n} ≥ e²_{st,n}, then in general the order between the m.s.e.'s in r.s. and in sy.s. is not fixed. Under the assumptions of Theorem 5.2.1, it is clear from that theorem and (3.3) that for large n, e²_{sy,n} ≤ e²_{r,n}. On the other hand, as is seen in the following example based on one given in Cochran (1963, pp. 218-219), we may have e²_{sy,n} ≥ e²_{r,n} for certain fixed n.
Example 6.3. Consider a stochastic process

    Z(t) = Y(t) + sin(2mπt),    t ∈ [0,1],

where EY(t) = 0, EY(s)Y(t) = exp(−β|s−t|), 0 < β < ∞. Let φ = 1, and let m be a multiple of n. Then with A_{in} = [(i−1)/n, i/n], g = 1, and T_{jin}t = t + (j−i)/n for all i, j, we have for sy.s. from (5.1.7)

    e²_{sy,n} = (1/n) Σ_{i,j} ∫_{(i−1)/n}^{i/n} [exp(−(β/n)|i−j|) + sin(2mπt) sin(2mπt + 2mπ(j−i)/n)] dt
                − ∫_0^1 ∫_0^1 [e^{−β|s−t|} + sin(2mπs) sin(2mπt)] ds dt

              = (1/n) Σ_{i,j} ∫_{(i−1)/n}^{i/n} [exp(−(β/n)|i−j|) + sin²(2mπt)] dt
                − ∫_0^1 ∫_0^1 [e^{−β|s−t|} + sin(2mπs) sin(2mπt)] ds dt

              = B(β,n) + 1/2 − C(β)

where

    B(β,n) = (1/n²) Σ_{i,j} exp(−(β/n)|i−j|),

    C(β) = ∫_0^1 ∫_0^1 e^{−β|s−t|} ds dt = (2/β)(1 − (1/β)(1 − e^{−β})).
Similarly for r.s. we have from (3.3) with g = 1,

    e²_{r,n} = (1/n) { ∫_0^1 (1 + sin²(2mπt)) dt − ∫_0^1 ∫_0^1 (e^{−β|s−t|} + sin(2mπs) sin(2mπt)) ds dt }
             = (1/n)(3/2 − C(β)).

For all β ∈ (0,∞) and n ≥ 3 s.t. n is a divisor of m, we have e²_{sy,n} > e²_{r,n}, as follows. First note that B(β,n) > C(β), and thus e²_{sy,n} > 1/2 for all β, n. Further, C(β) > 0 for all β ∈ (0,∞), and thus e²_{r,n} < 3/(2n). Thus if n ≥ 3,

    e²_{r,n} < 3/(2n) ≤ 1/2 < e²_{sy,n}.

(It also appears that e²_{r,2} < e²_{sy,2}.)
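The closed forms above make this easy to verify numerically; the following sketch (an illustration only, simply evaluating the formulas of the example) checks B(β,n) > C(β) and the resulting ordering for several β and n:

```python
import math

def B(beta, n):
    # B(beta, n) = n^{-2} * sum_{i,j} exp(-(beta/n)|i-j|)
    return sum(math.exp(-(beta / n) * abs(i - j))
               for i in range(1, n + 1) for j in range(1, n + 1)) / n ** 2

def C(beta):
    # C(beta) = integral over [0,1]^2 of exp(-beta|s-t|)
    return (2.0 / beta) * (1.0 - (1.0 - math.exp(-beta)) / beta)

def e2_sy(beta, n):  # systematic sampling m.s.e. when n divides m
    return B(beta, n) + 0.5 - C(beta)

def e2_r(beta, n):   # simple random sampling m.s.e.
    return (1.5 - C(beta)) / n

for beta in (0.5, 1.0, 5.0):
    for n in (3, 5, 10, 50):
        assert B(beta, n) > C(beta) > 0
        assert e2_sy(beta, n) > 0.5
        assert e2_r(beta, n) < 3 / (2 * n) <= 0.5
print("e2_sy,n > 1/2 > e2_r,n whenever n >= 3 divides m")
```

That B(β,n) > C(β) for every n reflects the fact that B(β,n) is twice the trapezoidal sum of the convex function (1−u)e^{−βu}, which overestimates its integral.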
CHAPTER SEVEN

MEDIAN SAMPLING WHEN A = [0,1]
Consider an estimator of (1.1.1) of the form (1.1.4) with the sample points nonrandom, i.e., X_{in} = m_{in} ∈ A_{in}, i = 1,...,n. It is shown here that if A = [0,1], the A_{in} form a RP(h), R satisfies (4.2.4:1), and certain other assumptions are satisfied, then this estimator does as well asymptotically as the best m.s. estimator of the form

(7.1)    Σ_{i=1}^n c_{in}(t_{1n},...,t_{nn}) Z(t_{in}).

Thus, it might be expected that little is lost in the preceding chapters in terms of asymptotic size of m.s.e. by considering only estimators of the form (1.1.4) with X_{in}, i = 1,...,n, random, as opposed to more general estimators of the form (7.1) with, then, t_{in}, i = 1,...,n, random. In general the "best" constants c_{in} for use in (7.1) are difficult to obtain (Sacks and Ylvisaker, 1970b), while the "best" c_n for use in (1.1.4) is, as has been seen, fairly easily obtainable. (Compare Sacks and Ylvisaker (1970b, pp. 132-135) for other estimators that do as well asymptotically as the best m.s. one of the form (7.1).) We consider only the case for which R^{10} does not exist on the diagonal.
First we have the following lemma, whose assumptions are satisfied, for example, by the asymptotically best st.s. scheme under (4.2.4:1) with a_1 bounded away from zero, R(t,t) nonzero except possibly at t = 0, and φ continuous and positive. (For this example, n k_{in} = ∫_{A_{in}} rφ and h_0 = rφ, where g_{in} = rφ/∫_{A_{in}} rφ on A_{in}, i = 1,...,n, and r is defined in (4.2.3). Also h = k_1(a_1φ²)^{1/3} for some constant k_1.)
Lemma 7.1. Assume {A_{in}}_i with A_{in} = [t_{i−1,n}, t_{in}] forms a RP(h) where h is nonzero a.e., g_{in} is a continuous density nonzero a.e. on A_{in} for all i, and there exist constants k_{in} s.t. h_n, defined by h_n = n k_{in} g_{in} on A_{in}, i = 1,...,n, converges uniformly to h_0, say, where h_0 is continuous and nonzero on (0,1]. Let m_{in} ∈ [t_{i−1,n}, t_{in}] be the median for the density g_{in}, i = 1,...,n. Then as n → ∞, for all i = i_n s.t. m_{in} converges to some point in (0,1],

    ρ_{in} = (m_{in} − t_{i−1,n})/(t_{in} − t_{i−1,n}) → 1/2.

Note: If m_{in} → 0, the result does not necessarily hold. If φ = 1 and r(t) = h_0(t) = t^{1/2}, then m_{1n}/t_{1n} = 2^{−2/3} for all n and thus does not converge to 1/2.
Proof. By definition of g_{in} and h_n and by the Mean Value Theorem,

    h_n(x_{in})(m_{in} − t_{i−1,n}) = ∫_{t_{i−1,n}}^{m_{in}} h_n(t) dt = n k_{in} ∫_{t_{i−1,n}}^{m_{in}} g_{in}(t) dt
        = (1/2) ∫_{t_{i−1,n}}^{t_{in}} h_n(t) dt = (1/2) h_n(y_{in})(t_{in} − t_{i−1,n}),

where x_{in}, y_{in} ∈ [t_{i−1,n}, t_{in}], i = 1,...,n. Thus

    (m_{in} − t_{i−1,n})/(t_{in} − t_{i−1,n}) = (1/2) h_n(y_{in})/h_n(x_{in}).

Then, since h_n converges uniformly to h_0, which is continuous and nonzero on (0,1], and since h nonzero a.e. implies max_i (t_{in} − t_{i−1,n}) → 0 as n → ∞, we have the desired result. □
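The lemma's conclusion and the note's counterexample are both visible numerically. In the sketch below (an illustration only), the RP(h) generated by h(t) = (3/2)t^{1/2} has breakpoints t_i = (i/n)^{2/3}, and a density g_{in} proportional to t^{1/2} on each cell has median m_{in} = ((i − 1/2)/n)^{2/3}; the first-cell fraction stays at 2^{−2/3} for every n while interior fractions approach 1/2:

```python
n = 1000
# breakpoints of the RP(h) for h(t) = (3/2)sqrt(t):  H(t_i) = t_i^{3/2} = i/n
t = [(i / n) ** (2 / 3) for i in range(n + 1)]
# g_in proportional to sqrt(t) on A_in, so the median solves m^{3/2} = (i - 1/2)/n
m = [((i - 0.5) / n) ** (2 / 3) for i in range(1, n + 1)]
rho = [(m[i - 1] - t[i - 1]) / (t[i] - t[i - 1]) for i in range(1, n + 1)]

print(round(rho[0], 4))        # 0.63 = 2**(-2/3) rounded: does not converge to 1/2
print(round(rho[n // 2], 4))   # interior cell: close to 1/2
```

Only the cells that stay near the origin escape the 1/2 limit, exactly as the lemma's restriction to m_{in} converging into (0,1] requires.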
For simplicity we let c_n = φ/h, since (for random sampling schemes) Theorem 2.1 implies the asymptotically best choice for c_n is φ/g, where G_n → G weakly (and certain other conditions are satisfied), and (4.2.2) implies G_n → H when the partitions form a RP(h). Then consider

(7.2)    Z_n = (1/n) Σ_{i=1}^n (φ/h)(m_{in}) Z(m_{in}),

where m_{in} is the median of the continuous density g_{in} on A_{in}, and

(7.3)    e²_{m,n} = E(I − Z_n)²
         = ∫_0^1 ∫_0^1 R(s,t) φ(s)φ(t) ds dt − (2/n) Σ_{i=1}^n ∫_0^1 R(m_{in},t) (φ/h)(m_{in}) φ(t) dt
           + (1/n²) Σ_{i=1}^n Σ_{j=1}^n R(m_{in}, m_{jn}) (φ/h)(m_{in}) (φ/h)(m_{jn}).
The following theorem shows that (under certain assumptions on, in particular, φ and h) Z_n as given by (7.2) yields asymptotically as small a m.s.e. as the best m.s. estimator of the form (7.1). Compare (7.4), (7.5) with (1.3.4), (1.3.5).
Theorem 7.2. Assume (4.2.4:1) holds, R^{20} exists off the diagonal, φ/h is twice continuously differentiable, h is a continuous density on [0,1] and is bounded away from zero, φ is bounded away from zero, and the continuous density g_{in} is nonzero on [t_{i−1,n}, t_{in}] for i > 1 and on (0, t_{1n}] for i = 1. Then for a RP(h), as n → ∞,

(7.4)    n² e²_{m,n} → (1/12) ∫_0^1 α(t) φ²(t)/h²(t) dt,

where α(t) = R_−^{10}(t,t) − R_+^{10}(t,t). In particular, if h is chosen proportional to (αφ²)^{1/3}, which is bounded away from zero, we have as n → ∞

(7.5)    n² e²_{m,n} → (1/12) ( ∫_0^1 (α(t)φ²(t))^{1/3} dt )³.
Proof. Let

    K(s,t) = R(s,t)(φ/h)(s)(φ/h)(t).

Then K satisfies (4.2.4:1), and K^{20} exists and is continuous off the diagonal. Further, omitting the subscript n in what follows, we have from (7.3)

(7.6)    e²_{m,n} = Σ_{i=1}^n Σ_{j=1}^n ∫_{t_{i−1}}^{t_i} ∫_{t_{j−1}}^{t_j} h(s)h(t) W_{ij}(s,t) dt ds,

where

    W_{ij}(s,t) = K(s,t) − K(s,m_j) − K(m_i,t) + K(m_i,m_j).

For i ≠ j, Taylor series expansions about t = m_j and then s = m_i yield
    W_{ij}(s,t) = (t−m_j)(K^{01}(s,m_j) − K^{01}(m_i,m_j)) + (1/2)(t−m_j)²(K^{02}(s,x_{jst}) − K^{02}(m_i,x_{jst}))

with

    K^{01}(s,m_j) − K^{01}(m_i,m_j) = (s−m_i) K^{11}(y_{is},m_j),

where x_{jst} is between t and m_j and y_{is} is between s and m_i, inclusive. Let ρ_i = (m_i − t_{i−1})/(t_i − t_{i−1}). Then by (4.2.11) and the Mean Value Theorem, since x_{jst}, y_{is} may be chosen as continuous functions of s, t, and since (t−m_j)(s−m_i) is positive for (s,t) ∈ (t_{i−1},m_i) × (t_{j−1},m_j), negative on (t_{i−1},m_i) × (m_j,t_j), and so forth, we have
(7.7)
    n² Σ_{i≠j} ∫_{t_{i−1}}^{t_i} ∫_{t_{j−1}}^{t_j} h(s)h(t) W_{ij}(s,t) dt ds

    = n² Σ_{i≠j} { (1/4)(t_i−t_{i−1})²(t_j−t_{j−1})² Σ_{k=1}^4 (−1)^{k+1} ρ_{ik}² ρ_{jk}² h(a_{ik})h(b_{jk}) K^{11}(y_{ik},m_j)
        + (1/6)(t_i−t_{i−1})(t_j−t_{j−1})³ h(a_i)h(b_j)(K^{02}(a_i,y_j) − K^{02}(m_i,y_j))(ρ_j³ − (1−ρ_j)³) }

    = Σ_{i≠j} (t_i−t_{i−1})(t_j−t_{j−1}) { (1/4) Σ_{k=1}^4 (−1)^{k+1} ρ_{ik}² ρ_{jk}² (h(a_{ik})h(b_{jk})/(h(w_i)h(w_j))) K^{11}(y_{ik},m_j)
        + (1/6) (h(a_i)h(b_j)/h²(w_j)) (K^{02}(a_i,y_j) − K^{02}(m_i,y_j))(ρ_j³ − (1−ρ_j)³) },

where ρ_{ik} = ρ_i^{I_{(1,2)}(k)} (1−ρ_i)^{1−I_{(1,2)}(k)} and ρ_{jk} = ρ_j^{I_{(1,4)}(k)} (1−ρ_j)^{1−I_{(1,4)}(k)} run over the four quadrants of A_i × A_j determined by (m_i,m_j), I_{(1,2)}(k) = 1 if k = 1,2, = 0 otherwise, and a_i, a_{ik}, y_{ik}, w_i ∈ [t_{i−1},t_i], b_j, b_{jk}, y_j, w_j ∈ [t_{j−1},t_j]. Denote the term in braces in the last expression by C_{ij}, and for ε > 0 let
    I_1(ε,n) = {(i,j): i ≠ j, t_i < ε or t_j < ε},    I_2(ε,n) = {(i,j): i ≠ j} − I_1(ε,n).

By Lemma 7.1, ρ_i, ρ_j → 1/2 on I_2(ε,n), and thus

    Σ_{I_2(ε,n)} (t_i−t_{i−1})(t_j−t_{j−1}) C_{ij} → 0

as n → ∞. Further, C_{ij} is uniformly bounded (for all n) by some constant C, say, and thus for all sufficiently large n,

    | Σ_{I_1(ε,n)} (t_i−t_{i−1})(t_j−t_{j−1}) C_{ij} | ≤ C Σ_{I_1(ε,n)} (t_i−t_{i−1})(t_j−t_{j−1}) ≤ C' ε.

Since this holds for all ε > 0, we have

    n² Σ_{i≠j} ∫_{t_{i−1}}^{t_i} ∫_{t_{j−1}}^{t_j} h(s)h(t) W_{ij}(s,t) dt ds → 0.
Next consider the sum over i = j. For s < t < m_i we have, via a Taylor series expansion of W_{ii}(s,t) about t = m_i,

    W_{ii}(s,t) = (t−m_i)(K^{01}(s,x_{ist}) − K^{01}(m_i,y_{ist})),

with analogous expansions (in terms of the one-sided derivatives K_±^{01}, K_±^{10})
for s < m_i < t and m_i < s < t, respectively, where x_{ist}, y_{ist} are between t and m_i and v_{ist} is between s and m_i, inclusive. Note that x_{ist}, y_{ist}, v_{ist} may be chosen as continuous functions of s and t. Thus, by the symmetry of W_{ii} and by the Mean Value Theorem, since (t−m_i) is of constant sign on s < t < m_i and on s < m_i < t, and (s−m_i) is of constant sign on m_i < s < t, we have

    n² Σ_{i=1}^n ∫_{t_{i−1}}^{t_i} ∫_{t_{i−1}}^{t_i} h(s)h(t) W_{ii}(s,t) ds dt
    = 2n² Σ_{i=1}^n ∫_{t_{i−1}}^{t_i} ∫_{t_{i−1}}^{t} h(s)h(t) W_{ii}(s,t) ds dt

(7.8)
    = 2n² Σ_{i=1}^n { (1/6) h(a_{i1})h(b_{i1})(K_−^{10}(x_{i1},m_i) − K_+^{10}(y_{i1},a_{i1}))(m_i−t_{i−1})³
        + (1/2) h(a_{i2})h(b_{i2})(K_+^{10}(x_{i2},a_{i2}) − K_+^{10}(y_{i2},m_i))(t_i−m_i)²(m_i−t_{i−1})
        + (1/6) h(a_{i3})h(b_{i3})(K_−^{10}(x_{i3},b_{i3}) − K_+^{10}(y_{i3},m_i))(t_i−m_i)³ }

    = (1/3) Σ_{i=1}^n ((t_i−t_{i−1})/h²(w_i)) { h(a_{i1})h(b_{i1})(K_−^{10}(x_{i1},m_i) − K_+^{10}(y_{i1},a_{i1})) ρ_i³
        + 3 h(a_{i2})h(b_{i2})(K_+^{10}(x_{i2},a_{i2}) − K_+^{10}(y_{i2},m_i))(1−ρ_i)² ρ_i
        + h(a_{i3})h(b_{i3})(K_−^{10}(x_{i3},b_{i3}) − K_+^{10}(y_{i3},m_i))(1−ρ_i)³ }
where a_{ik}, b_{ik}, x_{ik}, y_{ik}, w_i ∈ [t_{i−1}, t_i], by (4.2.11). Denote the term in braces in the last expression by C_i, and for ε > 0 let

    J_1(ε,n) = {i: t_i < ε},    J_2(ε,n) = {1,...,n} − J_1(ε,n).

As before, by Lemma 7.1, ρ_i → 1/2 on J_2(ε,n), and thus

    (1/3) Σ_{J_2(ε,n)} ((t_i−t_{i−1})/h²(w_i)) C_i → (1/12) ∫_ε^1 (K_−^{10}(t,t) − K_+^{10}(t,t)) dt

as n → ∞. Further, C_i/h²(w_i) is nonnegative and uniformly bounded above by some constant C, say, and thus for all n,

    (1/3) Σ_{J_1(ε,n)} ((t_i−t_{i−1})/h²(w_i)) C_i ≤ C Σ_{J_1(ε,n)} (t_i−t_{i−1}) ≤ C ε.

Then

    limsup_n (1/3) Σ_{i=1}^n ((t_i−t_{i−1})/h²(w_i)) C_i ≤ (1/12) ∫_ε^1 (K_−^{10}(t,t) − K_+^{10}(t,t)) dt + C ε

and

    liminf_n (1/3) Σ_{i=1}^n ((t_i−t_{i−1})/h²(w_i)) C_i ≥ (1/12) ∫_ε^1 (K_−^{10}(t,t) − K_+^{10}(t,t)) dt,

and since this is true for all ε > 0, it follows that

    (1/3) Σ_{i=1}^n ((t_i−t_{i−1})/h²(w_i)) C_i → (1/12) ∫_0^1 (K_−^{10}(t,t) − K_+^{10}(t,t)) dt

as n → ∞.
Then (7.4) follows upon noting that

    K_−^{10}(t,t) − K_+^{10}(t,t) = α(t) φ²(t)/h²(t),

and (7.5) follows directly from (7.4). □
It might be noted that if m_{in} is chosen s.t.

    ∫_{t_{i−1,n}}^{m_{in}} g_{in} = p(t_{i−1,n}),

where p is a continuous function on [0,1] to [0,1], we have from (7.6) - (7.8), under the assumptions of Theorem 7.2 except that now m_{in} is not necessarily the median,

    n² e²_{m,n} → ∫∫_{s≠t} K^{11}(s,t)(p(s) − 1/2)(p(t) − 1/2) ds dt + (1/3) ∫_0^1 α_{1,K}(t)(p³ + (1−p)³)(t) dt,

where α_{1,K}(t) = K_−^{10}(t,t) − K_+^{10}(t,t). (See (7.9).) Note further that if p is constant (p-percentile sampling), then as n → ∞

    n² e²_{p,n} → (p − 1/2)²(K(1,1) − 2K(0,1) + K(0,0)) + (1/12) ∫_0^1 α_{1,K}(t) dt

since

    ∫∫_{s≠t} K^{11}(s,t) ds dt = [ ∫_0^1 ∫_0^t + ∫_0^1 ∫_t^1 ] K^{11}(s,t) ds dt
        = ∫_0^1 (K_−^{01}(t,t) − K^{01}(0,t) + K^{01}(1,t) − K_+^{01}(t,t)) dt
        = K(1,1) − 2K(0,1) + K(0,0) − ∫_0^1 α_{1,K}(t) dt.
Thus median sampling, p = 1/2, yields asymptotically the smallest m.s.e. among all p-percentile sampling schemes.
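These limits are easy to check numerically. The sketch below (an assumed illustration, not from the text) takes the Ornstein-Uhlenbeck covariance R(s,t) = exp(−β|s−t|) on [0,1] with φ = h = 1, so that K = R, α_{1,K} ≡ 2β, and every term of (7.3) has a closed form:

```python
import math

beta = 1.0
K = lambda s, t: math.exp(-beta * abs(s - t))   # phi = h = 1, so K = R

def e2_p(p, n):
    # (7.3) for the p-percentile design m_i = (i-1+p)/n; each term is closed-form.
    m = [(i - 1 + p) / n for i in range(1, n + 1)]
    dbl = (2 / beta) * (1 - (1 - math.exp(-beta)) / beta)           # double integral of K
    cross = sum((2 - math.exp(-beta * mi) - math.exp(-beta * (1 - mi))) / beta
                for mi in m)                                         # sum_i of int K(m_i, t) dt
    grid = sum(K(mi, mj) for mi in m for mj in m)                    # sum_ij K(m_i, m_j)
    return dbl - 2 * cross / n + grid / n ** 2

def limit(p):
    # (p - 1/2)^2 (K(1,1) - 2K(0,1) + K(0,0)) + (1/12) * integral of alpha_{1,K} = 2*beta
    return (p - 0.5) ** 2 * (2 - 2 * math.exp(-beta)) + beta / 6

n = 400
for p in (0.5, 0.25, 0.9):
    print(p, abs(n ** 2 * e2_p(p, n) - limit(p)) < 0.05 * limit(p))  # True
print(min((0.1, 0.25, 0.5, 0.75, 0.9), key=limit))                   # 0.5: median is best
```

For p = 1/2 the limiting constant is β/6, the classical Sacks-Ylvisaker rate for midpoint-type designs, and the quadratic term in (p − 1/2)² makes every other percentile strictly worse.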
Note also that the proof of this proposition might be easily modified for φ/h̃ to be used in place of φ/h in (7.2), provided φ/h̃ possesses a second derivative on (t_{i−1,n}, t_{in}), i = 1,...,n, which converges pointwise to some continuous function on (0,1]. The pointwise limit of φ/h̃ (as of φ/h) would then enter into the final result, and the pointwise limit of the first or second derivative of φ/h̃ would not appear.
APPENDIX

Chesson (1976) obtains a decomposition for bivariate distributions that are not necessarily φ²-bounded. Using his result we shall obtain an expression for Eh(X,Y) for a certain class of h's and also investigate the form of this expression when the bivariate distribution for X and Y is symmetric in its two arguments.

Chesson considers bivariate distributions for general random variables (for example, random functions), but we shall consider bivariate distributions only for vector-valued random variables. Let X and Y be random variables taking values in R^d and defined on some probability space (Ω, A, Q). Denote the joint distribution function of X and Y by J, the marginal distribution function of X by F, and that of Y by G. Let F be the σ-field generated by X and G the σ-field generated by Y. Finally define

    H = {(f(X), g(Y)): f, g are real valued, f ∈ L_2(R^d, B^d, dF), g ∈ L_2(R^d, B^d, dG)}.
Chesson notes that H is a real Hilbert space under the inner product

    ⟨(f(X), g(Y)), (f_1(X), g_1(Y))⟩ = (1/2)(E[f(X)f_1(X)] + E[g(Y)g_1(Y)]).

Then, using the bounded self-adjoint operator B in H defined by

    B(f(X), g(Y)) = (E(g(Y)|X), E(f(X)|Y))

and its spectral decomposition, he proves the following (which we restate for vector-valued random variables):
Theorem A.1 (Chesson). There exists a unique family of subspaces M_r of H, 0 ≤ r ≤ 1, such that

(i) ∩_{r>r'} M_r = M_{r'}, 0 ≤ r' < 1, and M_0 = {0}.

(ii) If {(ξ_t(X), η_t(Y))}_{t∈T(r)} is an orthonormal basis for M_r, then {ξ_t}_{t∈T(r)} and {η_t}_{t∈T(r)} are orthonormal families of functions on the spaces L_2(R^d, B^d, dF) and L_2(R^d, B^d, dG) respectively.

(iii) For (f(X), g(Y)) ∈ H and {(ξ_t(X), η_t(Y))}_{t∈T(r)} as in (ii),

    E[f(X)g(Y)] = ∫_{(0,1]} r dQ(r),

where the right-continuous function Q of bounded variation is given by

    Q(r) = Σ_{t∈T(r)} ( ∫ f(x)ξ_t(x) dF(x) )( ∫ g(y)η_t(y) dG(y) ).

(iv) If (f(X), g(Y)) ∈ M_β ⊖ M_α, 0 ≤ α < β ≤ 1, and Ef²(X) = 1 (= Eg²(Y)), then α < E[f(X)g(Y)] ≤ β.
This theorem immediately suggests the following theorem, which is used in obtaining a representation for

    ∫∫ R(s,t) c(s) c(t) J(ds,dt)

in the proof of Theorem 2.1.
Theorem A.2. For any function h ∈ L_2(dF × dG) s.t.

(1)    h(x,y) = Σ_{i,j=1}^∞ a_{ij} f_i(x) g_j(y)    in L_2(dF × dG),

where

(2)    Σ_{i,j=1}^∞ |a_{ij}| < ∞

and where the f_i's form an orthonormal set in L_2(dF) and the g_j's an orthonormal set in L_2(dG), and for {(ξ_t, η_t)}_{t∈T(r)}, F, and G as above,

(3)    Eh(X,Y) = ∫_{(0,1]} r Q(dr; h),

where for fixed h, Q is right continuous and of bounded variation and is given by

(4)    Q(r;h) = Σ_{t∈T(r)} ∫∫ h(x,y) ξ_t(x) η_t(y) dF(x) dG(y).
Proof. By Chesson's Theorem, since f_i ∈ L_2(dF) and g_j ∈ L_2(dG),

    E f_i(X) g_j(Y) = ∫_{(0,1]} r Q(dr; f_i g_j).

Note that
(5)    Eh(X,Y) = E Σ_{i,j=1}^∞ a_{ij} f_i(X) g_j(Y),

and observe that by the Schwarz Inequality and (2),

    E Σ |a_{ij} f_i(X) g_j(Y)| ≤ Σ |a_{ij}| < ∞.

Thus the expectation and summation signs in (5) may be interchanged to give

    Eh(X,Y) = Σ_{i,j=1}^∞ a_{ij} E[f_i(X) g_j(Y)]
            = lim_{n→∞} Σ_{i+j=2}^n a_{ij} E[f_i(X) g_j(Y)]
            = lim_{n→∞} Σ_{i+j=2}^n a_{ij} ∫_{(0,1]} r Q(dr; f_i g_j)
            = lim_{n→∞} ∫_{(0,1]} r Q_n(dr; h),

where

(6)    Q_n(r;h) = Σ_{i+j=2}^n a_{ij} Q(r; f_i g_j).
We shall first show that as n → ∞, Q_n(r;h) converges for all r to a limit, denoted by Q(r;h), and then that the total variation of Q_n is bounded for all n by some constant M. Together these imply that Q is of bounded variation and that (3) holds (Rudin, 1964, pp. 154-155). Last we shall show that the convergence of Q_n(r;h) is uniform in r, 0 < r ≤ 1, and that Q(r;h) is right continuous.
From Chesson's Theorem and (6), we have

(7)    Q_n(r;h) = Σ_{i+j=2}^n a_{ij} Σ_{t∈T(r)} E[f_i(X)ξ_t(X)] E[g_j(Y)η_t(Y)]

for all n and r. By the Schwarz Inequality, (2), and the orthonormality,

    Σ_{t∈T(r)} Σ_{i+j=2}^n |a_{ij} E[f_i(X)ξ_t(X)] E[g_j(Y)η_t(Y)]|
    ≤ ( Σ_{t∈T(r)} Σ_{i+j=2}^n |a_{ij}| E²[f_i(X)ξ_t(X)] )^{1/2} ( Σ_{t∈T(r)} Σ_{i+j=2}^n |a_{ij}| E²[g_j(Y)η_t(Y)] )^{1/2}
    ≤ ( Σ_{i,j=1}^∞ |a_{ij}| Ef_i²(X) )^{1/2} ( Σ_{i,j=1}^∞ |a_{ij}| Eg_j²(Y) )^{1/2} = Σ_{i,j=1}^∞ |a_{ij}| < ∞.

Also,

    Σ_{i,j=1}^∞ ∫∫ |a_{ij} f_i(x)ξ_t(x) g_j(y)η_t(y)| dF(x) dG(y)
    ≤ Σ_{i,j=1}^∞ |a_{ij}| (Ef_i²(X) Eξ_t²(X) Eg_j²(Y) Eη_t²(Y))^{1/2} = Σ_{i,j=1}^∞ |a_{ij}| < ∞.
Thus

    lim_{n→∞} Q_n(r;h) = Σ_{i,j=1}^∞ a_{ij} Σ_{t∈T(r)} ∫∫ f_i(x)ξ_t(x) g_j(y)η_t(y) dF(x) dG(y)
    = Σ_{t∈T(r)} ∫∫ ( Σ_{i,j=1}^∞ a_{ij} f_i(x) g_j(y) ) ξ_t(x)η_t(y) dF(x) dG(y)
    = Σ_{t∈T(r)} ∫∫ h(x,y) ξ_t(x)η_t(y) dF(x) dG(y)
    = Q(r;h),

as desired. Note that by (7) the series is absolutely convergent.
The total variation of Q_n is given by the following expression, where the sup is taken over all partitions {(r_{k−1}, r_k]}_{k=1}^s, 0 = r_0 ≤ r_1 ≤ ... ≤ r_s = 1, s = 1,2,..., and where we take Q_n(0;h) = Q_n(0+;h):

    V(Q_n) = sup Σ_{k=1}^s | Σ_{i+j=2}^n a_{ij} [Q(r_k; f_i g_j) − Q(r_{k−1}; f_i g_j)] |
           ≤ sup Σ_{i+j=2}^n Σ_{k=1}^s |a_{ij}| |Q(r_k; f_i g_j) − Q(r_{k−1}; f_i g_j)|
           ≤ Σ_{i,j=1}^n |a_{ij}| V(Q(·; f_i g_j)).

From Chesson, Q(r; fg) may be written as the difference of two increasing functions of r, as follows (where P_{(0,r]} is the orthogonal projection from H onto M_r):
    Q(r; fg) = ⟨P_{(0,r]}(f(X), g(Y)), (f(X), g(Y))⟩ − ⟨P_{(0,r]}(f(X), −g(Y)), (f(X), −g(Y))⟩,

giving for the total variation

    V(Q(·; fg)) = ⟨P_{(0,1]}(f(X), g(Y)), (f(X), g(Y))⟩ + ⟨P_{(0,1]}(f(X), −g(Y)), (f(X), −g(Y))⟩ ≤ 2.

Thus

    V(Q_n) ≤ 2 Σ_{i,j=1}^n |a_{ij}| ≤ 2 Σ_{i,j=1}^∞ |a_{ij}| < ∞,

as desired.
In showing that Q(r;h) is right continuous, we first show that Q_n(r;h) converges uniformly in r, 0 < r ≤ 1. As in (7) we have, for n < m,

    |Q_n(r;h) − Q_m(r;h)| = | Σ_{i+j=n+1}^m a_{ij} Q(r; f_i g_j) |
    = | Σ_{i+j=n+1}^m a_{ij} Σ_{t∈T(r)} E[f_i(X)ξ_t(X)] E[g_j(Y)η_t(Y)] |
    ≤ Σ_{t∈T(r)} Σ_{i+j=n+1}^m |a_{ij} E[f_i(X)ξ_t(X)] E[g_j(Y)η_t(Y)]|
    ≤ ( Σ_{t∈T(r)} Σ_{i+j=n+1}^m |a_{ij}| E²[f_i(X)ξ_t(X)] )^{1/2} ( Σ_{t∈T(r)} Σ_{i+j=n+1}^m |a_{ij}| E²[g_j(Y)η_t(Y)] )^{1/2}
    ≤ ( Σ_{i+j=n+1}^m |a_{ij}| Ef_i²(X) )^{1/2} ( Σ_{i+j=n+1}^m |a_{ij}| Eg_j²(Y) )^{1/2} = Σ_{i+j=n+1}^m |a_{ij}|,

and by (2), Q_n is uniformly Cauchy. Then since Q_n converges pointwise to Q, it also converges uniformly to Q. By Chesson's Theorem, Q(r; fg) is right continuous; that is, for 0 < r_0 ≤ 1,

    lim_{r→r_0+} Q(r; fg) = Q(r_0; fg).

This implies that

    lim_{r→r_0+} Q_n(r; h) = Q_n(r_0; h),

since Q_n(r;h) is a finite sum of terms of the form Q(r; fg), and then, by the uniform convergence of Q_n to Q, that (Rudin, 1964, p. 135)

    lim_{r→r_0+} Q(r; h) = Q(r_0; h).

Thus Q is right continuous. □
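For finitely supported X and Y, the spectral decomposition underlying Theorems A.1 and A.2 reduces to a singular value decomposition of J(x,y)/√(F(x)G(y)), so part (iii) of Chesson's Theorem can be verified directly. A numerical sketch (the 5-point support and all variable names are assumptions of the illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((5, 5))
J = A + A.T                      # a joint pmf on {0,...,4}^2, symmetric in its arguments
J /= J.sum()
F = J.sum(axis=1)                # common marginal distribution (= G by symmetry)

# Chesson's decomposition here is the SVD of J(x,y)/sqrt(F(x)F(y)).
M = J / np.sqrt(np.outer(F, F))
U, r, Vt = np.linalg.svd(M)
xi = U.T / np.sqrt(F)            # xi_t(x) = U[x,t]/sqrt(F(x)): orthonormal in L2(dF)
eta = Vt / np.sqrt(F)            # eta_t(y): orthonormal in L2(dG)

f = rng.standard_normal(5)
g = rng.standard_normal(5)
lhs = f @ J @ g                  # E[f(X)g(Y)]
rhs = sum(r[t] * (F * f * xi[t]).sum() * (F * g * eta[t]).sum() for t in range(5))
print(np.isclose(lhs, rhs))      # True: E[f(X)g(Y)] = sum_t r_t (int f xi_t dF)(int g eta_t dG)
print(np.isclose(r[0], 1.0))     # True: (1,1) carries the eigenvalue 1
```

The singular values r_t play the role of the spectral parameter r, and the top singular vector is √F, which after rescaling is the constant pair (1,1) discussed below.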
Note that (1,1) is an eigenvector of B with corresponding eigenvalue 1. Also, since M_r is based on the spectral decomposition of B in that M_r = P_{(0,r]}H, we have (1,1) ∈ M_1 ⊖ M_{1−}. Further, since the only restriction on {(ξ_t(X), η_t(Y))}_{t∈T(r)} is that it be an orthonormal basis for M_r, 0 < r ≤ 1, we have that {(ξ_t, η_t)}_{t∈T(1)} may be chosen so as to include (1,1), i.e., as {(1,1)} ∪ {(ξ_t, η_t)}_{t∈S(1)} for some set S(1). Finally note that by part (i) of Chesson's Theorem, ∩_{r>0} M_r = {0} and thus Q(0+) = 0.
Now consider a bivariate distribution for X and Y which is symmetric in its arguments and such that E[f(X)f(Y)] ≥ 0 for all f such that Ef²(X) = Ef²(Y) < ∞. (Note that the symmetry of the bivariate distribution implies of course that F = G.) It will be shown that if {(ξ_t(X), η_t(Y))}_{t∈T(r)} is a basis for M_r, then ξ_t = η_t a.e. (dF), t ∈ T(r). Thus if h, in addition to satisfying (1), is a covariance function, then Q(r;h) as given by (4) is a sum of quadratic forms.
Proposition A.3. Assume that the bivariate distribution of X and Y is symmetric in its arguments and such that E[f(X)f(Y)] ≥ 0 for all f such that Ef²(X) < ∞. Then every element (f(X), g(Y)), say, of M_r, 0 < r ≤ 1, is such that f = g a.s. (dF), where F is the common distribution of X and Y.

Proof. Assume that (f(X), g(Y)) ∈ H and define the operator B_1: H → H by
    B_1(f(X), g(Y)) = (g(X), f(Y)).

Note that under the hypothesis of symmetry B_1 is well-defined on H and is in fact an isomorphism. We shall use this operator B_1 in showing that (f(X), g(Y)) ∈ M_r implies (g(X), f(Y)) ∈ M_r. First it will be shown that B_1 commutes with B. Let f_0(Y) = E(f(X)|Y), g_0(X) = E(g(Y)|X). Then

    B B_1(f(X), g(Y)) = B(g(X), f(Y)) = (f_0(X), g_0(Y)) = B_1(g_0(X), f_0(Y)) = B_1 B(f(X), g(Y)).

Thus B_1 commutes with B, and hence with P_{(0,r]} for all positive r. Now assume (f(X), g(Y)) ∈ M_r. Then

    (g(X), f(Y)) = B_1(f(X), g(Y)) = B_1 P_{(0,r]}(f(X), g(Y)) = P_{(0,r]} B_1(f(X), g(Y)) = P_{(0,r]}(g(X), f(Y))

implies (g(X), f(Y)) is also in M_r. Since M_r is a subspace, ((f−g)(X), (g−f)(Y)) ∈ M_r. Part (iv) of Chesson's Theorem then implies

    E[(f−g)(X)(g−f)(Y)] ≥ 0,

with equality iff f = g a.s. (dF). Now by hypothesis

    E[(f−g)(X)(f−g)(Y)] ≥ 0.

It follows that E[(f−g)(X)(f−g)(Y)] = 0 and thus f = g a.s. (dF). □
Proposition A.4. If the bivariate distribution for X and Y satisfies the hypothesis of Proposition A.3, where F is the common distribution of X and Y, and if h is such that

    h(x,y) = Σ_{i=1}^∞ a_i f_i(x) f_i(y)    in L_2(dF × dF),

where a_i ≥ 0, i = 1,2,..., Σ_{i=1}^∞ a_i < ∞, and the f_i's form an orthonormal set in L_2(dF), then Q(r;h) is a nondecreasing function of r.
Proof. By Theorem A.2 and Proposition A.3,

    Q(r;h) = Σ_{t∈T(r)} ∫∫ h(x,y) ξ_t(x)ξ_t(y) dF(x) dF(y)
           = Σ_{t∈T(r)} ∫∫ Σ_{i=1}^∞ a_i f_i(x) f_i(y) ξ_t(x)ξ_t(y) dF(x) dF(y),

where {(ξ_t(X), ξ_t(Y))}_{t∈T(r)} is an orthonormal basis for M_r. By arguments similar to those in the proof of Theorem A.2, we may interchange the integral and summation signs to give

    Q(r;h) = Σ_{i=1}^∞ a_i Σ_{t∈T(r)} [ ∫ f_i(x)ξ_t(x) dF(x) ]².
Define for 0 ≤ r ≤ 1 the following subspaces of L_2(dF): N_r = {f: (f(X), f(Y)) ∈ M_r}. (By Proposition A.3, M_r is of this form.) Note that if {(ξ_t(X), ξ_t(Y))}_{t∈T(r)} is a complete orthonormal set (CONS) in M_r, then {ξ_t}_{t∈T(r)} is a CONS in N_r. This may be seen as follows. Consider an element f in N_r. Then (f(X), f(Y)) ∈ M_r, and for all t ∈ T(r),

    ⟨(f(X), f(Y)), (ξ_t(X), ξ_t(Y))⟩_{M_r} = ⟨f, ξ_t⟩_{N_r}.

Now if E[f(X)ξ_t(X)] = 0 for all t ∈ T(r), then ⟨(f(X), f(Y)), (ξ_t(X), ξ_t(Y))⟩ = 0 for all t ∈ T(r). Then since {(ξ_t(X), ξ_t(Y))}_{t∈T(r)} is a CONS in M_r, (f(X), f(Y)) = 0 as an element of M_r. This in turn implies f = 0, as desired. Thus {ξ_t}_{t∈T(r)} is a CONS in N_r.

Let f_{ir} denote the projection of f_i onto N_r, 0 < r ≤ 1, i = 1,2,.... Then

    Q(r;h) = Σ_{i=1}^∞ a_i ||f_{ir}||²,

where ||f|| is the norm of f in L_2(dF). By part (i) of Chesson's Theorem, ∩_{r>r'} M_r = M_{r'}, 0 ≤ r' < 1, which implies ∩_{r>r'} N_r = N_{r'}, which in turn implies ||f_{ir}|| ≥ ||f_{ir'}||, 0 ≤ r' < r ≤ 1. Finally, since the a_i are nonnegative, we have that Q(r;h) is nondecreasing in r. □
BIBLIOGRAPHY

Berman, S. (1978). Gaussian processes with biconvex covariances. Journal of Multivariate Analysis 8: 30-44.

Billingsley, P. (1968). Convergence of Probability Measures. Wiley, New York.

Cambanis, S. (1973). Representation of stochastic processes of second order and linear operations. Journal of Mathematical Analysis and Applications 41: 603-620.

Cambanis, S. (1975). The measurability of a stochastic process of second order and its linear space. Proceedings of the American Mathematical Society 47: 467-475.

Cambanis, S., G. Simons, and W. Stout (1976). Inequalities for Ek(X,Y) when the marginals are fixed. Z. Wahrscheinlichkeitstheorie verw. Gebiete 36: 285-294.

Chesson, P. (1976). The canonical decomposition of bivariate distributions. Journal of Multivariate Analysis 6: 526-537.

Cochran, W. (1946). Relative accuracy of systematic and stratified random samples for a certain class of populations. Annals of Mathematical Statistics 17: 164-177.

Cochran, W. (1963). Sampling Techniques, 2nd ed. Wiley, New York.

Dalenius, T., J. Hájek and S. Zubrzycki (1961). On plane sampling and related geometrical problems, in Proceedings of the Fourth Berkeley Symposium, Vol. 1, pp. 125-150, University of California Press, Berkeley.

Hansen, M. and W. Hurwitz (1949). On the determination of optimum probabilities in sampling. Annals of Mathematical Statistics 20: 426-432.

Jowett, G. (1952). The accuracy of systematic sampling from conveyor belts. Applied Statistics 1: 50-59.

Madow, W. and L. Madow (1944). On the theory of systematic sampling I. Annals of Mathematical Statistics 15: 1-24.

Quenouille, M. (1949). Problems in plane sampling. Annals of Mathematical Statistics 20: 355-375.

Rudin, W. (1964). Principles of Mathematical Analysis, 2nd ed. McGraw-Hill, New York.

Rudin, W. (1966). Real and Complex Analysis. McGraw-Hill, New York.

Sacks, J. and D. Ylvisaker (1966). Designs for regression problems with correlated errors. Annals of Mathematical Statistics 37: 66-89.

Sacks, J. and D. Ylvisaker (1968). Designs for regression problems with correlated errors; many parameters. Annals of Mathematical Statistics 39: 49-69.

Sacks, J. and D. Ylvisaker (1970a). Designs for regression problems with correlated errors III. Annals of Mathematical Statistics 41: 2057-2074.

Sacks, J. and D. Ylvisaker (1970b). Statistical designs and integral approximation, in Proceedings of the Twelfth Biennial Seminar of the Canadian Mathematical Congress, pp. 115-136, Canadian Mathematical Congress, Montreal.

Smith, T. (1976). The foundations of survey sampling: A review (with discussion). Journal of the Royal Statistical Society A 139: 183-204.

Tubilla, A. (1975). Error convergence rates for estimates of multidimensional integrals of random functions. Technical Report No. 72, Department of Statistics, Stanford University, Stanford.

Williams, R. (1956). The variance of the mean of systematic samples. Biometrika 43: 137-148.

Yates, F. (1949). Systematic sampling. Philosophical Transactions of the Royal Society A 241: 345-377.

Ylvisaker, D. (1975). Designs on random fields, in A Survey of Statistical Design and Linear Models, pp. 593-607, North-Holland, Amsterdam.

Zubrzycki, S. (1958). Remarks on random, stratified, and systematic sampling in a plane. Colloquium Mathematicum VI: 251-264.
REPORT DOCUMENTATION

Title: Random Designs for Estimating Integrals of Stochastic Processes
Author: Carol Schoenfelder
Type of report: Technical. Performing organization report number: Mimeo Series No. 1201
Contract or grant number: AFOSR-75-2796
Performing organization: Department of Statistics, University of North Carolina, Chapel Hill, North Carolina 27514
Controlling office: Air Force Office of Scientific Research, Bolling AFB, Washington, D.C.
Report date: November, 1978. Number of pages: 158
Security classification: UNCLASSIFIED. Distribution statement: Approved for public release; distribution unlimited.
Key words: Random designs; Second order processes; Simple, stratified and systematic sampling; Asymptotically optimal designs.