UNIFORM CONSISTENCY OF A CROSS-VALIDATED DENSITY ESTIMATOR
by
James Stephen Marron
University of North Carolina
Chapel Hill, North Carolina
AMS 1980 Subject Classification: Primary 62G05, secondary 62G20.

Key Words and Phrases: Nonparametric density estimation, kernel estimator, cross-validation.
ABSTRACT

In the problem of nonparametric estimation of a probability density, kernel estimators are considered. Uniform consistency is established when the bandwidth is chosen by a version of cross-validation. This estimator has been shown in Marron (1983a) to have excellent mean integrated square error properties.
1. INTRODUCTION
Consider the problem of estimating a univariate probability density, f, using a sample $X_1,\dots,X_n$ from f. A very popular estimator is the "kernel estimator" defined as follows. Given a "kernel function", K, and a "bandwidth", $h > 0$, let

(1.1)    $\hat f(x,h) = \frac{1}{nh}\sum_{i=1}^n K\!\left(\frac{x - X_i}{h}\right).$
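For concreteness, (1.1) can be transcribed directly into code. The following Python sketch (the function names and interface are illustrative, not part of Marron (1983a)) uses the Epanechnikov kernel $K(u) = \tfrac{3}{4}(1-u^2)1_{[-1,1]}(u)$, one kernel satisfying the assumptions (K.1)-(K.4) of Section 2:

```python
import numpy as np

def epanechnikov(u):
    """K(u) = 0.75(1 - u^2) on [-1,1]: bounded, of bounded variation, integral 1."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def f_hat(x, X, h):
    """The kernel estimator (1.1): (nh)^{-1} sum_i K((x - X_i)/h)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    u = (x[:, None] - X[None, :]) / h
    return epanechnikov(u).sum(axis=1) / (len(X) * h)

# example: estimate a standard normal density from 500 observations
rng = np.random.default_rng(0)
X = rng.normal(size=500)
print(f_hat([0.0, 1.0], X, h=0.4))
```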
There are many results giving various asymptotic properties of $\hat f(x,h)$. In particular, it is well known that if $h = h(n)$ is to be chosen deterministically in an optimal fashion, then h must depend on the unknown "smoothness" of f. In Marron (1983a), a data-based means of choosing h is proposed. This is seen to have the same optimal asymptotic properties as the deterministic choice of h, but does not make use of the smoothness of f.
The estimator of Marron (1983a) is defined as follows. For $j = 1,\dots,n$ define the "leave one out" estimators,

(1.2)    $\hat f_j(x,h) = \frac{1}{(n-1)h}\sum_{i\ne j} K\!\left(\frac{x - X_i}{h}\right), \qquad \hat f_j^+(x,h) = \max\!\left(\hat f_j(x,h),\ 0\right).$

Find an interval $[a,b]$ on which f is known to be bounded above 0, and define

(1.3)    $p(y) = \int_a^b \frac{1}{h}\,K\!\left(\frac{x-y}{h}\right) dx.$
Now define the "estimated likelihood",

(1.4)    $\hat L(h) = \prod_{j=1}^n \hat f_j^+(X_j,h)^{1_{[a,b]}(X_j)}\, e^{-p(X_j)},$

and take $\hat h$ to maximize $\hat L(h)$. This choice of h is a modification of the technique of cross-validation introduced by Habbema, Hermans and van den Broek (1974). This particular modification is heuristically motivated in Marron (1983a).
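The definitions (1.2)-(1.4) likewise admit a direct numerical transcription. Below is a minimal Python sketch (names are illustrative; the grid search over h is only a stand-in for exact maximization over $h > 0$, and the sample and interval $[a,b]$ are arbitrary choices):

```python
import numpy as np

def epanechnikov(u):
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def epanechnikov_cdf(u):
    """Antiderivative of K, clipped to the support [-1,1]; used for p(y) in (1.3)."""
    u = np.clip(u, -1.0, 1.0)
    return 0.5 + 0.75 * u - 0.25 * u**3

def log_L_hat(h, X, a, b):
    """log of the estimated likelihood (1.4); -inf when some factor vanishes."""
    n = len(X)
    u = (X[:, None] - X[None, :]) / h
    Kmat = epanechnikov(u)
    np.fill_diagonal(Kmat, 0.0)                              # leave one out, (1.2)
    f_loo = np.maximum(Kmat.sum(axis=1) / ((n - 1) * h), 0.0)  # f_j^+(X_j, h)
    p = epanechnikov_cdf((b - X) / h) - epanechnikov_cdf((a - X) / h)  # p(X_j)
    inside = (X >= a) & (X <= b)
    with np.errstate(divide="ignore"):
        log_f = np.log(f_loo)
    return np.sum(np.where(inside, log_f, 0.0) - p)

# choose h by maximizing over a grid (stand-in for "maximize over h > 0")
rng = np.random.default_rng(1)
X = rng.normal(size=200)
a, b = -1.0, 1.0            # an interval on which f is bounded above 0
grid = np.geomspace(0.02, 2.0, 80)
h_hat = grid[np.argmax([log_L_hat(h, X, a, b) for h in grid])]
```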
The error criterion that is optimized by the estimator $\hat f(x,\hat h)$ is a particular version of the Mean Integrated Square Error, defined by

(1.5)    $\mathrm{MISE}(h) = E\int_a^b \left[\hat f(x,h) - f(x)\right]^2 \frac{1}{f(x)}\,dx.$

In Marron (1983a) the proofs that are given require that maximization of $\hat L(h)$ be performed over a restricted range of h's, which is determined asymptotically. This can be disturbing to the experimenter who has a fixed sample size and hence does not know the appropriate range of h's. In this paper, it is seen that the cross-validated $\hat h$ lies in this range. Hence, for sufficiently large sample size, the experimenter need only maximize $\hat L(h)$ over $h > 0$.
Another result of this paper is that the estimator $\hat f(x,\hat h)$ is consistent uniformly over the entire real line. This is comforting to the experimenter because, unlike the integral norm (1.5), the uniform norm guarantees that the estimator $\hat f(x,\hat h)$ is well-behaved both at each individual point and also for x not in the interval $[a,b]$.
2. ASSUMPTIONS AND THEOREMS

The density, f, is assumed to satisfy:

(f.1)  f is a bounded probability density on $\mathbb{R}$,

(f.2)  f is bounded above 0 on $[a,b]$, that is, $\inf_{x\in[a,b]} f(x) > 0$,

(f.3)  there are constants $M > 0$ and $\gamma \in (0,1]$ so that $|f(x) - f(y)| \le M|x-y|^{\gamma}$ for all $x, y \in \mathbb{R}$.
The kernel, K, is assumed to satisfy:

(K.1)  $\int K(x)\,dx = 1$,

(K.2)  K is bounded and supported inside $[-1,1]$,

(K.3)  K is of bounded variation,

(K.4)  K is uniformly continuous and, letting $w(u)$ denote the square root of the modulus of continuity of K,
$$\int_0^1 [-\log u]^{1/2}\,dw(u) < \infty.$$
The theorems of this paper can now be stated.
Theorem 1: Assume (f.1)-(f.3) and (K.1)-(K.4). If $\hat h = \hat h(n)$ is any sequence of maxima of $\hat L(h)$, then
$$\lim_{c\downarrow 0}\ \lim_{n\to\infty}\ P\!\left[\hat h < c\,n^{-1/(2\gamma+1)}\right] = 0.$$

Theorem 2: Under the assumptions of Theorem 1,
$$\hat h \to 0 \quad \text{a.s.}$$
Note that a consequence of Theorem 1, Theorem 2, and Theorem A of Silverman (1978) is the uniform consistency result:

Theorem 3: Under the assumptions of Theorem 1,
$$\sup_{x\in\mathbb{R}} |\hat f(x,\hat h) - f(x)| \to 0 \quad \text{in probability.}$$
3. PROOF OF THEOREM 1
Using the familiar (see, for example, Rosenblatt (1971) or (3.6) of Marron (1983a)) variance and bias² decomposition of (1.5), it is easily seen that
$$\mathrm{MISE} = \frac{b-a}{nh}\left(\int K(u)^2\,du\right) + o\!\left(\frac{1}{nh}\right) + \int_a^b \left[\int K(u)\,f(y-hu)\,du - f(y)\right]^2 \frac{1}{f(y)}\,dy.$$
It will be convenient to denote the bias² term by $B_f(h)$. Note that, by (f.2), (f.3), (K.1) and (K.2), as $h \to 0$,
$$B_f(h) = O(h^{2\gamma}).$$
The above may be summarized as

(3.1)    $\mathrm{MISE} = \frac{b-a}{nh}\int K(u)^2\,du + o\!\left(\frac{1}{nh}\right) + O(h^{2\gamma}) = \frac{C}{nh} + o\!\left(\frac{1}{nh}\right) + O(h^{2\gamma}),$

for a constant C. Balancing the first and last terms of (3.1) shows that the bandwidth minimizing the MISE is of order $n^{-1/(2\gamma+1)}$, the rate appearing in Theorem 1.
For sequences $\{a_n\}$ and $\{b_n\}$ it will be convenient to let the phrase "$h = h(n)$ is between $a_n$ and $b_n$" mean:
$$\lim_{n\to\infty} a_n h^{-1} = 0 \quad \text{and} \quad \lim_{n\to\infty} b_n h^{-1} = \infty.$$
It will also be useful to define, for $j = 1,\dots,n$,

(3.2)    $\Delta_j = \frac{\hat f_j(X_j,h) - f(X_j)}{f(X_j)}, \qquad \Delta_j^+ = \frac{\hat f_j^+(X_j,h) - f(X_j)}{f(X_j)}.$
By Theorem A of Silverman (1978), for h between $n^{-1}\log n$ and 1,

(3.3)    $\sup_x\, |\hat f(x,h) - f(x)| \to 0 \quad \text{a.s.}$
But, by (1.1), (1.2) and (K.2), uniformly in x and j,

(3.4)    $\hat f_j(x,h) - \hat f(x,h) = \frac{1}{n(n-1)h}\sum_{i\ne j} K\!\left(\frac{x - X_i}{h}\right) - \frac{1}{nh}\,K\!\left(\frac{x - X_j}{h}\right) = O\!\left(\frac{1}{nh}\right).$
Hence, by (f.2), for h between $n^{-1}\log n$ and 1,

(3.5)    $\sup_{j=1,\dots,n} \left|1_{[a,b]}(X_j)\,\Delta_j^+\right| \le \sup_{j=1,\dots,n} \left|1_{[a,b]}(X_j)\,\Delta_j\right| \to 0,$

in probability.
Now, for $n = 1, 2, \dots$ define the event
$$U_n = \left\{1_{[a,b]}(X_j)\,\Delta_j^+ = 1_{[a,b]}(X_j)\,\Delta_j \ \text{ for } j = 1,\dots,n\right\}.$$
It follows from the above that, for h between $n^{-1}\log n$ and 1,
$$\lim_{n\to\infty} P[U_n] = 1.$$
It will be convenient to define (analogous to (1.4))
$$L = \prod_{j=1}^n \left(f(X_j)\,e^{-1}\right)^{1_{[a,b]}(X_j)}.$$
Note that maximizing $\hat L(h)$ by choice of h is the same as maximizing $n^{-1}\log(\hat L(h)/L)$.
From the above it follows that, for h between $n^{-1}\log n$ and 1, on the event $U_n$,

(3.6)
$$n^{-1}\log(\hat L(h)/L) = n^{-1}\sum_{j=1}^n \left[1_{[a,b]}(X_j)\log(1+\Delta_j) - p(X_j) + 1_{[a,b]}(X_j)\right]$$
$$= n^{-1}\sum_{j=1}^n \left[1_{[a,b]}(X_j)\left(1 + \Delta_j - \tfrac{1}{2}\Delta_j^2 + o_P(\Delta_j^2)\right) - p(X_j)\right]$$
$$= n^{-1}\sum_{j=1}^n \left[1_{[a,b]}(X_j)(1+\Delta_j) - p(X_j)\right] - (2n)^{-1}\sum_{j=1}^n 1_{[a,b]}(X_j)\,\Delta_j^2 + o_P\!\left(n^{-1}\sum_{j=1}^n 1_{[a,b]}(X_j)\,\Delta_j^2\right).$$

The terms of this expansion may be handled by the following lemmas:

Lemma 1.1: For h between $n^{-1}$ and 1,
$$n^{-1}\sum_{j=1}^n \left[1_{[a,b]}(X_j)(1+\Delta_j) - p(X_j)\right] = o_P(\mathrm{MISE}).$$

Lemma 1.2: For h between $n^{-1}$ and 1,
$$n^{-1}\sum_{j=1}^n 1_{[a,b]}(X_j)\,\Delta_j^2 = \mathrm{MISE} + o_P(\mathrm{MISE}).$$
Lemma 1.1 is a consequence of (3.1) and Lemma 1 of Marron (1983a). Lemma 1.2 is Theorem 2 of Marron (1983b) in the special case $d = 1$, $w(x) = 1_{[a,b]}(x)\,f(x)^{-2}$. Theorem 1 would follow from Lemma 1.1, Lemma 1.2 and (3.1) if h were restricted to be between $n^{-1}\log n$ and $\infty$. The fact that this is really no restriction at all is a consequence of (3.1) and the following lemmas.
Lemma 1.3: There exists $\varepsilon > 0$ so that, for $h \le \varepsilon n^{-1}\log n$,
$$\lim_{n\to\infty} P[\hat L(h) = 0] = 1.$$
Lemma 1.4: There exists $\delta > 0$ so that, for $\varepsilon n^{-1}\log n \le h \le n^{-1/(2\gamma+1)}$,
$$\lim_{n\to\infty} P\!\left[n^{-1}\log(\hat L(h)/L) \le -\delta\,\mathrm{MISE}\right] = 1.$$

These lemmas will now be established.
Proof of Lemma 1.3

This proof is based on an order statistics result of Cheng (1983) and is very similar to the proof of (ii) in Lemma 1.1 of Chow, Geman and Wu (1983). It will be convenient to define the set
$$A = \{j = 1,\dots,n : X_j \in [a,b]\},$$
and to let N denote the cardinality of A. Note that N is Binomial(n,p) where

(3.7)    $p = P[X_j \in [a,b]] = \int_a^b f(x)\,dx.$

Next, let $X_{(1)},\dots,X_{(N)}$ denote the order statistics of $\{X_j : j\in A\}$. Define $X_{(0)} = a$, $X_{(N+1)} = b$. By (1.2), (1.4) and (K.2), a sufficient condition for $\hat L(h) = 0$ is: for some $j = 1,\dots,N$, both $X_{(j)} - X_{(j-1)} > h$ and $X_{(j+1)} - X_{(j)} > h$. Hence, for $\hat L(h) = 0$, it is sufficient that
$$\max_{j=1,\dots,N}\ \min\!\left(X_{(j)} - X_{(j-1)},\ X_{(j+1)} - X_{(j)}\right) > h.$$
Now let F denote the c.d.f. of $X_j$ conditioned on $X_j \in [a,b]$. By (f.1) and (f.2) there is a constant $\alpha > 1$ so that, for $x \in (a,b)$,
$$\alpha^{-1} < F'(x) < \alpha.$$
Thus, for $\hat L(h) = 0$, it is sufficient that
$$\max_{j=1,\dots,N}\ \min\!\left(F(X_{(j)}) - F(X_{(j-1)}),\ F(X_{(j+1)}) - F(X_{(j)})\right) > \alpha h.$$
But conditioned on A, $\{F(X_j) : j \in A\}$ have the same distribution as N independent Uniform(0,1) random variables. With this in mind, let $D_{(1)},\dots,D_{(m)}$ denote the order statistics of an iid Uniform(0,1) sample. Let $D_{(0)} = 0$, $D_{(m+1)} = 1$. Define
$$M_m = \max_{j=1,\dots,m}\ \min\!\left(D_{(j)} - D_{(j-1)},\ D_{(j+1)} - D_{(j)}\right).$$
Theorem 4.8 of Cheng (1983) is:
$$P\!\left[\lim_{m\to\infty} \frac{2mM_m}{\log m} = 1\right] = 1.$$
It follows from the above that there is an $\varepsilon > 0$ so that, for $h \le \varepsilon n^{-1}\log n$,
$$\lim_{n\to\infty} P[\hat L(h) = 0 \mid N \ge np/2] \ge \lim_{n\to\infty} P[M_N > \alpha h \mid N \ge np/2] = 1.$$
Lemma 1.3 is now an easy consequence of the fact that N is a Binomial(n,p) variable.
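Cheng's spacings result is easy to check by simulation. The following Python sketch (sample size and seed are arbitrary choices) estimates $2mM_m/\log m$, which should be near 1 for large m:

```python
import numpy as np

# Monte Carlo check of Theorem 4.8 of Cheng (1983): 2 m M_m / log m -> 1 a.s.
rng = np.random.default_rng(0)
m = 200_000
D = np.sort(rng.uniform(size=m))
s = np.diff(np.concatenate(([0.0], D, [1.0])))   # the m+1 spacings D_(j) - D_(j-1)
M_m = np.minimum(s[:-1], s[1:]).max()            # max over j of min of adjacent spacings
print(2 * m * M_m / np.log(m))                   # typically close to 1
```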
Proof of Lemma 1.4

Suppose $\varepsilon > 0$ is given. Using the proof of Theorem A of Silverman (1978), it can easily be shown that there is a constant $B > 0$ so that, for $\varepsilon n^{-1}\log n \le h \le n^{-1/(2\gamma+1)}$,
$$\varlimsup_n\ \sup_{x\in[a,b]} |\hat f(x,h) - f(x)| < B \quad \text{a.s.}$$
Hence, by (f.2), (3.2) and (3.4), there is a constant $B' > 0$ so that, for $\varepsilon n^{-1}\log n \le h \le n^{-1/(2\gamma+1)}$,
$$\varlimsup_n\ \sup_{j=1,\dots,n} \left|1_{[a,b]}(X_j)\,\Delta_j\right| < B' \quad \text{a.s.}$$
Next observe that, by (3.5), (3.6) and Lemma 1.1, for h between $n^{-1}$ and 1, on the event $U_n$,
$$n^{-1}\log\!\left(\frac{\hat L(h)}{L}\right) = n^{-1}\sum_{j=1}^n 1_{[a,b]}(X_j)\left[\log(1+\Delta_j) - \Delta_j\right] + o_P(\mathrm{MISE}).$$
By calculus it is readily verified that there is a $\delta > 0$ so that, for $y \in (-1, B')$,
$$\log(1+y) - y \le -\delta y^2.$$
Thus, for h between $n^{-1}$ and 1, on the event $\left\{\sup_{j=1,\dots,n} |1_{[a,b]}(X_j)\Delta_j| < B'\right\} \cap U_n$,
$$n^{-1}\log\!\left(\frac{\hat L(h)}{L}\right) \le -\delta\, n^{-1}\sum_{j=1}^n 1_{[a,b]}(X_j)\,\Delta_j^2 + o_P(\mathrm{MISE}).$$
Lemma 1.4 is now a consequence of Lemma 1.2 and the fact that $\hat L(h) = 0$ on the complement of $U_n$. This completes the proof of Theorem 1.

4. PROOF OF THEOREM 2
This proof uses techniques developed by Chow, Geman and Wu (1983). The details of the proof here are quite different for two reasons. First, assumption (f.2) avoids many of the difficulties encountered by Chow, Geman and Wu. Second, complications arise here from allowing the kernel, K, to assume negative values.
Note that Theorem 2 will be established when it is shown that, for any $h_0 > 0$,

(4.1)    $\sup_{h > h_0}\ \varlimsup_n\ n^{-1}\log \hat L(h) < \lim_{h\to 0}\ \lim_n\ n^{-1}\log \hat L(h) \quad \text{a.s.}$

As in Chow, Geman and Wu, measurability difficulties are avoided by assuming that the probability measure, P, is complete and noting that all statements are made with probability 0 or 1.
Given $\alpha > 0$, it will be convenient to define

(4.2)    $f_h(x) = E\hat f(x,h) = \frac{1}{h}\int K\!\left(\frac{x-y}{h}\right) f(y)\,dy = \int K(u)\,f(x-hu)\,du,$

together with $f_h^*(x) = \max(f_h(x),\,\alpha)$, and for $j = 1,\dots,n$ to define

(4.3)    $\hat f_j^*(x,h) = \max\!\left(\hat f_j(x,h),\ \alpha\right).$

In the same spirit as (3.7) define

(4.4)    $p_h = \int_a^b f_h(x)\,dx, \qquad p_h^* = \int_a^b f_h^*(x)\,dx.$
Now given $h_0 > 0$, define:

(4.5)    $H_1 = \{h \ge h_0 : \inf_{x\in[a,b]} f_h(x) \ge \alpha\},$
         $H_2 = \{h \ge h_0 : -\alpha \le \inf_{x\in[a,b]} f_h(x) < \alpha\},$
         $H_3 = \{h \ge h_0 : \inf_{x\in[a,b]} f_h(x) < -\alpha\}.$

Note that (4.1), and hence Theorem 2, is a consequence of the following lemmas.

Lemma 2.1: For $\alpha$ sufficiently small,
$$\lim_{h\to 0}\ \lim_n\ n^{-1}\log \hat L(h) = \int_a^b f(x)\log f(x)\,dx - p \quad \text{a.s.},$$
where p was defined in (3.7).
Lemma 2.2:
$$\sup_{h\in H_1}\ \varlimsup_n\ n^{-1}\log \hat L(h) < \int_a^b f(x)\log f(x)\,dx - p \quad \text{a.s.}$$

Lemma 2.3: For $\alpha$ sufficiently small,
$$\sup_{h\in H_2}\ \varlimsup_n\ n^{-1}\log \hat L(h) < \int_a^b f(x)\log f(x)\,dx - p \quad \text{a.s.}$$

Lemma 2.4:
$$\sup_{h\in H_3}\ \varlimsup_n\ n^{-1}\log \hat L(h) < \int_a^b f(x)\log f(x)\,dx - p \quad \text{a.s.}$$
Before these lemmas are proved, three lemmas which will be useful in
several of the proofs will be stated.
Lemma 2.5: Given $h_1 > 0$, as $n \to \infty$,
$$\sup_{h > h_1} \left| n^{-1}\sum_{j=1}^n 1_{[a,b]}(X_j)\log \hat f_j^*(X_j,h) - \int_a^b f(x)\log f_h^*(x)\,dx \right| \to 0 \quad \text{a.s.}$$

Lemma 2.6: As $n \to \infty$,
$$\sup_{h > 0} \left| n^{-1}\sum_{j=1}^n p(X_j) - p_h \right| \to 0 \quad \text{a.s.}$$
Lemma 2.7: Given a measurable set S (on which f is bounded above 0) and an integrable, nonnegative function g(x),
$$\int_S f(x)\log g(x)\,dx - \int_S g(x)\,dx \le \int_S f(x)\log f(x)\,dx - \int_S f(x)\,dx,$$
with equality if and only if $f(x) = g(x)$ a.e. on S. Furthermore, if there is a constant $\xi > 0$, so that
$$\int_S g(x)\,dx \le \int_S f(x)\,dx - \xi,$$
then the above inequality may be sharpened to:
$$\int_S f(x)\log g(x)\,dx - \int_S g(x)\,dx \le \int_S f(x)\log f(x)\,dx - \int_S f(x)\,dx - \xi^2\left[2\int_S f(x)\,dx\right]^{-1}.$$
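The first inequality of Lemma 2.7 can be illustrated numerically. In the Python sketch below, the particular choices of S, f and g are arbitrary; any density f bounded above 0 on S and nonnegative integrable g should do:

```python
import numpy as np

# Check: int_S f log g - int_S g  <=  int_S f log f - int_S f,  with S = [0,1].
x = np.linspace(0.0, 1.0, 10_001)
dx = x[1] - x[0]
f = 0.5 + x                 # a density on [0,1], bounded above 0
g = np.exp(-x)              # an arbitrary nonnegative, integrable function

def integral(y):
    """Plain Riemann sum; accurate enough for this sanity check."""
    return float(y.sum() * dx)

lhs = integral(f * np.log(g)) - integral(g)
rhs = integral(f * np.log(f)) - integral(f)
print(lhs <= rhs)           # True; equality would require g = f a.e. on S
```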
Since Lemmas 2.5, 2.6 and 2.7 are used in the proofs of the other lemmas, they will be proved first.
Proof of Lemma 2.5

First, for $j = 1,\dots,n$, define the empirical distribution functions
$$\hat F(x) = n^{-1}\sum_{i=1}^n 1_{(-\infty,X_i]}(x), \qquad \hat F_j(x) = (n-1)^{-1}\sum_{i\ne j} 1_{(-\infty,X_i]}(x),$$
and define
$$K^*(y) = \frac{1}{h}\,K\!\left(\frac{X_j - y}{h}\right).$$
Note that, by (K.3), $K^*$ is of bounded variation uniformly over $h > h_1$ and over $j = 1,\dots,n$. Using (1.2) and integration by parts, for $h > h_1$ and $j = 1,\dots,n$,
$$\left|\hat f_j(X_j,h) - f_h(X_j)\right| = \left|\int K^*(y)\,d(\hat F_j - F)(y)\right| = \left|\int \left[\hat F_j(y) - F(y)\right] dK^*(y)\right|$$
$$\le \int |dK^*(y)|\cdot \sup_x |\hat F_j(x) - F(x)| \le \int |dK^*(y)|\left[n^{-1} + \sup_x |\hat F(x) - F(x)|\right].$$
It follows from the above that

(4.6)    $\sup_{h > h_1}\ \sup_{j=1,\dots,n} \left|\hat f_j(X_j,h) - f_h(X_j)\right| \to 0 \quad \text{a.s.},$

and hence that
$$\sup_{h > h_1}\ \sup_{j=1,\dots,n} \left|\hat f_j^*(X_j,h) - f_h^*(X_j)\right| \to 0 \quad \text{a.s.},$$
and so,

(4.7)    $\sup_{h > h_1} \left| n^{-1}\sum_{j=1}^n 1_{[a,b]}(X_j)\log \hat f_j^*(X_j,h) - n^{-1}\sum_{j=1}^n 1_{[a,b]}(X_j)\log f_h^*(X_j) \right| \to 0 \quad \text{a.s.}$

But now, since $f_h^*(x)$ is bounded above and below uniformly over $h > h_1$,

(4.8)    $\sup_{h > h_1} \left| n^{-1}\sum_{j=1}^n 1_{[a,b]}(X_j)\log f_h^*(X_j) - \int_a^b f(x)\log f_h^*(x)\,dx \right| = \sup_{h > h_1} \left| \int_a^b \log f_h^*(x)\,d(\hat F - F)(x) \right| \to 0 \quad \text{a.s.}$

Lemma 2.5 follows from (4.7) and (4.8).
Proof of Lemma 2.6

Note that by (1.3), (4.2) and (4.4), for $h > 0$,
$$n^{-1}\sum_{j=1}^n p(X_j) - p_h = n^{-1}\sum_{j=1}^n \int_a^b \frac{1}{h}\,K\!\left(\frac{x-X_j}{h}\right) dx - \int_a^b\left[\int \frac{1}{h}\,K\!\left(\frac{x-y}{h}\right) f(y)\,dy\right] dx$$
$$= n^{-1}\sum_{j=1}^n \int_{(a-X_j)/h}^{(b-X_j)/h} K(u)\,du - \int\left[\int_{(a-y)/h}^{(b-y)/h} K(u)\,du\right] f(y)\,dy.$$
But, by (K.2), K has a bounded antiderivative, $K^{(-1)}$, so the integrals of K above are functions of bounded variation, uniformly in $h > 0$. Hence,
$$\sup_{h>0}\left| n^{-1}\sum_{j=1}^n p(X_j) - p_h \right| \to 0 \quad \text{a.s.}$$
Lemma 2.6 follows easily from this.
Proof of Lemma 2.7

It will be convenient to define
$$Q = \int_S f(x)\,dx, \qquad Q' = \int_S g(x)\,dx.$$
Note that, by Jensen's Inequality,
$$\int_S \frac{f(x)}{Q}\,\log\!\left[\frac{Q\,g(x)}{Q'\,f(x)}\right] dx \le \log\!\left[\int_S \frac{g(x)}{Q'}\,dx\right] = 0,$$
with equality if and only if
$$g(x)/Q' = f(x)/Q \quad \text{a.e. on } S.$$
Hence,

(4.9)    $\int_S f(x)\log g(x)\,dx - \int_S f(x)\log f(x)\,dx \le Q\log Q' - Q\log Q.$
By calculus it is easily verified that, for $x, y > 0$,
$$y\log(x/y) \le x - y,$$
with equality if and only if $x = y$. Thus,
$$Q\log Q' - Q\log Q \le Q' - Q.$$
From the above it follows that
$$\int_S f(x)\log g(x)\,dx - Q' \le \int_S f(x)\log f(x)\,dx - Q,$$
with equality if and only if $g(x) = f(x)$ a.e. on S.
The second part of Lemma 2.7 is established similarly. First, note that, for $x < y$,
$$y\log(x/y) \le (x-y) - (x-y)^2/2y.$$
Thus, for $Q' \le Q - \xi$,
$$Q\log Q' - Q\log Q \le Q' - Q - \xi^2/2Q,$$
and so, by (4.9),
$$\int_S f(x)\log g(x)\,dx - Q' \le \int_S f(x)\log f(x)\,dx - Q - \xi^2/2Q.$$
This completes the proof of Lemma 2.7.
Proof of Lemma 2.1

First note that, by (f.3) and (4.2), for all $x \in \mathbb{R}$,
$$\left|\int K(u)\left[f(x-hu) - f(x)\right] du\right| \le M h^{\gamma} \int |u^{\gamma} K(u)|\,du.$$
Hence, by (K.2),

(4.10)    $\lim_{h\to 0}\ \sup_x\, |f_h(x) - f(x)| = 0,$

and so,
$$\lim_{h\to 0}\ \sup_{x\in[a,b]} |f_h^*(x) - f(x)| = 0.$$
It now follows from (f.1), (f.2), (3.7), (4.4) and the dominated convergence theorem that

(4.11)    $\lim_{h\to 0}\left[\int_a^b f(x)\log f_h^*(x)\,dx - p_h\right] = \int_a^b f(x)\log f(x)\,dx - p.$

Another consequence of (4.10) is that, by (f.2), for h and $\alpha$ sufficiently small,
$$\inf_{x\in[a,b]} f_h(x) \ge \alpha,$$
and so, from (4.2), $f_h^* = f_h$ on $[a,b]$. In a similar spirit, note that by (3.3) and (3.4), for $\alpha$ sufficiently small,
$$\varliminf_{h\to 0}\ \varliminf_{n\to\infty}\ \min_{j=1,\dots,n}\ \inf_{x\in[a,b]} \hat f_j(x,h) \ge 2\alpha,$$
and so, for h sufficiently small, n sufficiently large, and $j = 1,\dots,n$ with $X_j \in [a,b]$,

(4.12)    $\hat f_j^+(X_j,h) = \hat f_j^*(X_j,h).$

Hence by Lemma 2.5,
$$\lim_{h\to 0}\ \varlimsup_n \left| n^{-1}\sum_{j=1}^n 1_{[a,b]}(X_j)\log \hat f_j^+(X_j,h) - \int_a^b f(x)\log f_h(x)\,dx \right| = 0.$$
Lemma 2.1 follows from this, (1.4), (4.11) and Lemma 2.6.
Proof of Lemma 2.2

First define the function

(4.13)    $G(h) = \int_a^b f(x)\log f_h(x)\,dx - p_h.$

Lemma 2.2 is a consequence of the following lemmas.

Lemma 2.2.1: As $n \to \infty$,
$$\sup_{h\in H_1} \left| n^{-1}\log \hat L(h) - G(h) \right| \to 0 \quad \text{a.s.}$$

Lemma 2.2.2:
$$\lim_{h\to\infty} G(h) = -\infty.$$
Lemma 2.2.3: $G(h)$ restricted to $H_1$ is continuous.

Lemma 2.2.4: Given $h_2 > h_0$, the set $\{h \le h_2 : h \in H_1\}$ is compact.

Lemma 2.2.5: For all $h \in H_1$,
$$G(h) < \int_a^b f(x)\log f(x)\,dx - p.$$
These lemmas will now be established.

Proof of Lemma 2.2.1

By the fact that, on $H_1$, $f_h^* \equiv f_h$ on $[a,b]$, and by (4.6), it follows from Lemma 2.5 that, as $n \to \infty$,
$$\sup_{h\in H_1}\left| n^{-1}\sum_{j=1}^n 1_{[a,b]}(X_j)\log \hat f_j^*(X_j,h) - \int_a^b f(x)\log f_h(x)\,dx \right| \to 0 \quad \text{a.s.}$$
Lemma 2.2.1 is a consequence of this together with (1.4) and Lemma 2.6.
Proof of Lemma 2.2.2

Note that, by (K.2) and (4.2),
$$\lim_{h\to\infty}\ \sup_x |f_h(x)| = \lim_{h\to\infty}\ \sup_x \left|\frac{1}{h}\int K\!\left(\frac{x-y}{h}\right) f(y)\,dy\right| = 0,$$
and so
$$\lim_{h\to\infty}\ \sup_x\ \log f_h^+(x) = -\infty.$$
Lemma 2.2.2 follows from this and (4.13).
Proof of Lemma 2.2.3

Note that, by (4.2), (4.4) and (4.5), for $h, h' \in H_1$,
$$\left|\left[\int_a^b f(x)\log f_h(x)\,dx - p_h\right] - \left[\int_a^b f(x)\log f_{h'}(x)\,dx - p_{h'}\right]\right| \le \int_a^b f(x)\left|\log f_h(x) - \log f_{h'}(x)\right| dx + \int_a^b \left|f_h(x) - f_{h'}(x)\right| dx.$$
But, by (f.3),
$$\left|f_h(x) - f_{h'}(x)\right| \le \int |K(u)|\cdot\left|f(x-hu) - f(x-h'u)\right| du \to 0,$$
uniformly as $h \to h'$. Lemma 2.2.3 now follows from (4.13) and the dominated convergence theorem.
Proof of Lemma 2.2.4

Lemma 2.2.4 is an obvious consequence of (4.5), the continuity of $f_h(x)$, and the fact that the given set is contained in $[h_0, h_2]$.

Proof of Lemma 2.2.5

Lemma 2.2.5 is Lemma 2.7 in the special case $S = [a,b]$ and $g(x) = f_h^+(x)$. This completes the proof of Lemma 2.2.
Proof of Lemma 2.3

Note that by (4.2), Lemma 2.5, Lemma 2.6, (4.4) and (4.5),
$$\sup_{h\in H_2}\ \varlimsup_n\ n^{-1}\log \hat L(h) = \sup_{h\in H_2}\ \varlimsup_n\ n^{-1}\sum_{j=1}^n \left[1_{[a,b]}(X_j)\log \hat f_j^+(X_j,h) - p(X_j)\right]$$
$$\le \sup_{h\in H_2}\ \varlimsup_n\ n^{-1}\sum_{j=1}^n \left[1_{[a,b]}(X_j)\log \hat f_j^*(X_j,h) - p(X_j)\right]$$

(4.14)    $\le \sup_{h\in H_2}\left[\int_a^b f(x)\log f_h^*(x)\,dx - p_h^* + 2\alpha(b-a)\right] \quad \text{a.s.}$

Now from (f.2) define
$$\beta = \inf_{x\in[a,b]} f(x).$$
Given $h \in H_2$, define the set
$$V = \{x\in[a,b] : f_h(x) < 2\beta/3\}.$$
Note that, by (f.3) and (4.2), there is a constant $M' > 0$ so that, for all $x, y \in \mathbb{R}$,

(4.15)    $|f_h(x) - f_h(y)| \le M'|x-y|^{\gamma}.$

Now take $\alpha \le \beta/3$. If $\mu$ denotes Lebesgue measure, then it follows from the above that, for $h \in H_2$,

(4.16)    $\mu(V) \ge \frac{1}{2}\left(\frac{\beta}{3M'}\right)^{1/\gamma}.$
Next let $V^c = [a,b]\setminus V$, and note that by Lemma 2.7,

(4.17)    $\int_{V^c} f(x)\log f_h^*(x)\,dx - \int_{V^c} f_h^*(x)\,dx \le \int_{V^c} f(x)\log f(x)\,dx - \int_{V^c} f(x)\,dx.$

Similarly, since
$$\int_V f_h^*(x)\,dx \le \frac{2\beta}{3}\,\mu(V) = \beta\,\mu(V) - \frac{\beta}{3}\,\mu(V) \le \int_V f(x)\,dx - \frac{\beta}{3}\,\mu(V),$$
it follows from Lemma 2.7 that
$$\int_V f(x)\log f_h^*(x)\,dx - \int_V f_h^*(x)\,dx \le \int_V f(x)\log f(x)\,dx - \int_V f(x)\,dx - \xi,$$
for some constant $\xi > 0$. Putting this together with (4.17) yields
$$\int_a^b f(x)\log f_h^*(x)\,dx - p_h^* \le \int_a^b f(x)\log f(x)\,dx - p - \xi.$$
Lemma 2.3 follows from this together with (4.14) and (4.16).
Proof of Lemma 2.4

Given $h \in H_3$, let
$$V' = \{x\in[a,b] : f_h(x) \le -\alpha/2\}.$$
Note that, by (4.15),
$$\mu(V') \ge 2\left(\frac{\alpha}{2M'}\right)^{1/\gamma}.$$
It follows from (f.2) that, for $h \in H_3$,
$$P[\{X_j \in V'\}\ \text{i.o.}] = 1,$$
where "i.o." means infinitely often. Hence
$$P[\{X_j \in [a,b]\ \text{and}\ f_h(X_j) \le -\alpha/2\}\ \text{i.o.}] = 1,$$
and so by (4.6), for $h \in H_3$,
$$P[\{X_j \in [a,b]\ \text{and}\ \hat f_j^+(X_j,h) = 0\}\ \text{i.o.}] = 1.$$
Lemma 2.4 follows easily from this and (1.4).
REFERENCES

Cheng, S.H. (1983). On a problem concerning spacings. Center for Stochastic Processes, Technical Report #27.

Chow, Y.S., Geman, S. and Wu, L.D. (1983). Consistent cross-validated density estimation. Ann. Statist. 11, 25-38.

Habbema, J.D.F., Hermans, J. and van den Broek, K. (1974). A stepwise discrimination analysis program using density estimation. Compstat 1974: Proceedings in Computational Statistics (G. Bruckman, ed.), 101-110. Vienna: Physica Verlag.

Marron, J.S. (1983a). An asymptotically efficient solution to the bandwidth problem of kernel density estimation. North Carolina Institute of Statistics, Mimeo Series #1518.

Marron, J.S. (1983b). Convergence properties of an empirical error criterion for multivariate density estimation. North Carolina Institute of Statistics, Mimeo Series #1520.

Rosenblatt, M. (1971). Curve estimates. Ann. Math. Statist. 42, 1815-1842.

Silverman, B.W. (1978). Weak and strong uniform consistency of the kernel estimate of a density and its derivatives. Ann. Statist. 6, 177-184.