ON THE LIMITING BEHAVIOUR OF THE EMPIRICAL KERNEL DISTRIBUTION FUNCTION*

By PRANAB KUMAR SEN

Department of Biostatistics, University of North Carolina at Chapel Hill

Institute of Statistics Mimeo Series No. 1405, July 1982

For an estimable parameter of degree m (≥ 1), a Glivenko-Cantelli lemma type result for the empirical kernel distribution function and the weak convergence of the related empirical process are studied. Some statistical applications of these results are also considered.

AMS Subject Classifications: 60F17, 62G99.

Key Words & Phrases: Blackman-type estimator; degree; Glivenko-Cantelli lemma; kernel; reverse (sub-)martingales; U-statistics; weak convergence.

* Work partially supported by the National Heart, Lung and Blood Institute, Contract NIH-NHLBI-71-2243-L from the National Institutes of Health.

1. Introduction.

Let {X_i, i ≥ 1} be a sequence of independent and identically distributed random vectors (i.i.d.r.v.) with a distribution function (d.f.) F, defined on the real p-space E^p, for some p ≥ 1. Consider a functional θ(F) of the d.f. F for which there exists a function g(x_1, ..., x_m) such that

(1.1)    \theta(F) = \int \cdots \int g(x_1, \ldots, x_m) \, dF(x_1) \cdots dF(x_m),

for every F belonging to a class \mathcal{F} of d.f.'s on E^p. Without any loss of generality, we may assume that g(·) is a symmetric function of its m arguments. If m (≥ 1) is the minimal sample size for which (1.1) holds, then g(X_1, ..., X_m) is called the kernel and m the degree of θ(F), and an optimal (symmetric) estimator of θ(F) is the U-statistic [viz., Hoeffding (1948)]

(1.2)    U_n = \binom{n}{m}^{-1} \sum_{C_{n,m}} g(X_{i_1}, \ldots, X_{i_m}), \qquad C_{n,m} = \{ 1 \le i_1 < \cdots < i_m \le n \},

whenever n ≥ m. Let us assume that the kernel g(·) is real-valued, and denote by

(1.3)    H(y) = P\{ g(X_1, \ldots, X_m) \le y \}, \qquad y \in E.

We are primarily interested in the estimation of the d.f. H in (1.3). Note that θ(F) (= ∫ y dH(y)) is also a functional of the d.f. H. Analogous to (1.2), we may consider the following estimator of H (to be termed the empirical kernel distribution function (e.k.d.f.)):

(1.4)    H_n(x) = \binom{n}{m}^{-1} \sum_{C_{n,m}} I( g(X_{i_1}, \ldots, X_{i_m}) \le x ), \qquad x \in E, \; n \ge m.

Note that, like U_n in (1.2), H_n does not involve independent summands for m ≥ 2, and hence the classical results on the asymptotic properties of the sample d.f. may not be directly applicable to H_n. Nevertheless, as with U_n, such asymptotic results can be derived by using some (reverse) sub-martingale theory. The main objective of the present study is to consider the e.k.d.f. H_n and the related empirical process {n^{1/2}[H_n(x) − H(x)], x ∈ E}, and to study their asymptotic behaviour. Section 2 is devoted to the study of the Glivenko-Cantelli type almost sure (a.s.) convergence of H_n to H. The weak convergence of the empirical process n^{1/2}(H_n − H) is studied in Section 3. The concluding section deals with some statistical applications of the results of Sections 2 and 3.
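Remark. As a computational illustration, H_n in (1.4) is simply the empirical d.f. of the \binom{n}{m} kernel values, and it may be evaluated by direct enumeration of all m-subsets. The following minimal sketch assumes, purely for concreteness, a standard normal sample and the degree-2 kernel g(x_1, x_2) = (x_1 − x_2)²/2 (the variance kernel used again in Section 4); full enumeration is practical only for small n.

```python
# Minimal sketch of the e.k.d.f. H_n(x) of (1.4): the fraction of the
# C(n, m) kernel evaluations g(X_{i_1}, ..., X_{i_m}) that are <= x.
from itertools import combinations
import numpy as np

def ekdf(sample, g, m, x):
    """Empirical kernel d.f. H_n(x) for a symmetric kernel g of degree m."""
    values = [g(*combo) for combo in combinations(sample, m)]
    return np.mean([v <= x for v in values])

# Illustrative kernel of degree m = 2: g(x1, x2) = (x1 - x2)^2 / 2.
rng = np.random.default_rng(1405)
X = rng.normal(size=50)
print(ekdf(X, lambda a, b: 0.5 * (a - b) ** 2, 2, x=1.0))
```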
2. Glivenko-Cantelli lemma for H_n.

Note that by (1.3) and (1.4),

(2.1)    E H_n(x) = H(x), \quad \text{for every } x.

We are interested in showing that

(2.2)    \sup\{ |H_n(x) - H(x)| : x \in E \} \to 0 \quad \text{a.s., as } n \to \infty.

Towards this, let \mathcal{C}_n be the sigma-field generated by the unordered collection {X_1, ..., X_n} and by X_{n+j}, j ≥ 1, for n ≥ 1. Note that \mathcal{C}_n is monotone nonincreasing in n.

Lemma 2.1. { \sup_{x \in E} |H_n(x) - H(x)|, \mathcal{C}_n; n \ge m } is a reverse sub-martingale.

Proof. Note that for every x ∈ E,

(2.3)    E[ H_n(x) \mid \mathcal{C}_{n+1} ] = \binom{n}{m}^{-1} \sum_{C_{n,m}} E[ I( g(X_{i_1}, \ldots, X_{i_m}) \le x ) \mid \mathcal{C}_{n+1} ].

Now, given \mathcal{C}_{n+1}, (X_{i_1}, ..., X_{i_m}) can be any m of the units X_1, ..., X_{n+1} with the equal conditional probability \binom{n+1}{m}^{-1}, so that for every 1 ≤ i_1 < ··· < i_m ≤ n,

(2.4)    E[ I( g(X_{i_1}, \ldots, X_{i_m}) \le x ) \mid \mathcal{C}_{n+1} ] = \binom{n+1}{m}^{-1} \sum_{C_{n+1,m}} I( g(X_{j_1}, \ldots, X_{j_m}) \le x ) = H_{n+1}(x).

Thus, by (2.3) and (2.4), for every n ≥ m,

(2.5)    E[ \{ H_n(x) - H(x) \} \mid \mathcal{C}_{n+1} ] = H_{n+1}(x) - H(x), \qquad x \in E \;\; \text{(a.e.)}.

Since \sup_{x \in E}(\cdot) is a convex functional, (2.5) ensures the reverse sub-martingale property. Q.E.D.

Let now n* = [n/m] be the largest integer contained in n/m, for n ≥ m. Also, for every n ≥ m, let

(2.6)    H^*_n(x) = (n^*)^{-1} \sum_{i=1}^{n^*} I( g(X_{(i-1)m+1}, \ldots, X_{im}) \le x ), \qquad x \in E.

Note that H^*_n involves independent summands and, as in (2.4),

(2.7)    E[ H^*_n(x) \mid \mathcal{C}_n ] = H_n(x), \quad \text{for every } x \in E.

By Lemma 2.1, (2.7) and the Kolmogorov inequality for reverse sub-martingales, we obtain that for every ε > 0 and n ≥ m,

(2.8)    P\{ \sup_{N \ge n} \sup_{x \in E} |H_N(x) - H(x)| \ge \varepsilon \} \le \varepsilon^{-1} E\{ \sup_{x \in E} |H_n(x) - H(x)| \}
             \le \varepsilon^{-1} E\{ E[ \sup_{x \in E} |H^*_n(x) - H(x)| \mid \mathcal{C}_n ] \}
             = \varepsilon^{-1} E\{ \sup_{x \in E} |H^*_n(x) - H(x)| \}.

Now, H^*_n − H relates to the classical case of independent summands, for which the results of Dvoretzky, Kiefer and Wolfowitz (1956) ensure that for every r > 0,

(2.9)    P\{ (n^*)^{1/2} \sup_{x \in E} |H^*_n(x) - H(x)| \ge r \} \le C e^{-2r^2}, \quad \text{for all } n^* \ge 1,

where C is a finite positive constant, independent of r and n*. Since n* ~ n/m → ∞ as n → ∞, (2.9) ensures that the right hand side of (2.8) converges to 0 as n → ∞. This completes the proof of (2.2). We may also note that { (n^*)^{1/2} [H^*_n(x) - H(x)] / [1 - H(x)], x \in E } is a martingale, and hence, by the Hajek-Renyi-Chow inequality, the right hand side of (2.9) may be replaced by the cruder bound r^{-2}, for every r > 1, so that the convergence of the right hand side of (2.8) (to 0 as n → ∞) remains intact.
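Remark. The role of H^*_n in (2.6) is easy to visualize numerically: it averages over the n* = [n/m] disjoint m-blocks, so its summands are i.i.d. and the Dvoretzky-Kiefer-Wolfowitz bound (2.9) applies to it directly. The sketch below is an illustration under assumed choices (not part of the original development): with F = N(0,1) and the degree-2 kernel g(x_1, x_2) = (x_1 − x_2)²/2, the kernel value is chi-square with one degree of freedom, so H is known exactly; the grid and sample sizes are likewise illustrative.

```python
# H*_n of (2.6): average of indicators over n* = [n/m] disjoint m-blocks;
# the summands are i.i.d., so the DKW bound (2.9) applies to H*_n directly.
import numpy as np
from scipy.stats import chi2

def blocked_ekdf(sample, g, m, x):
    """H*_n(x): the empirical d.f. of the kernel over disjoint m-blocks."""
    n_star = len(sample) // m
    blocks = np.asarray(sample)[: n_star * m].reshape(n_star, m)
    values = np.array([g(*row) for row in blocks])
    return np.mean(values <= x)

# With X_i ~ N(0,1) and g(x1, x2) = (x1 - x2)^2 / 2, the kernel value is
# chi-square(1), so H(y) = P(chi2_1 <= y); the sup-distance over a grid
# shrinks as n grows, in line with (2.2).
rng = np.random.default_rng(0)
g = lambda a, b: 0.5 * (a - b) ** 2
grid = np.linspace(0.0, 8.0, 200)
for n in (100, 1000, 10000):
    X = rng.normal(size=n)
    dist = max(abs(blocked_ekdf(X, g, 2, y) - chi2.cdf(y, df=1)) for y in grid)
    print(n, round(float(dist), 4))
```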
3. Weak convergence of n^{1/2}(H_n − H).

For the sake of simplicity, we assume that H(x) is a continuous function of x ∈ E, and denote by H^{-1}(t) = inf{ x : H(x) ≥ t }, 0 < t < 1. For every n (≥ m), we then introduce a stochastic process W_n = { W_n(t); 0 ≤ t ≤ 1 } by letting

(3.1)    W_n(t) = n^{1/2} \{ H_n(H^{-1}(t)) - t \}, \qquad 0 \le t \le 1.

Then W_n belongs to the space D[0,1]. We intend to study the weak convergence of W_n to some appropriate (tied-down) Gaussian function W = { W(t); 0 ≤ t ≤ 1 }. Towards this, note that for arbitrary r (≥ 1), 0 ≤ t_1 < ··· < t_r ≤ 1 and non-null λ = (λ_1, ..., λ_r), by (1.4) and (3.1),

(3.2)    \sum_{j=1}^r \lambda_j W_n(t_j) = n^{1/2} \sum_{j=1}^r \lambda_j [ H_n(H^{-1}(t_j)) - t_j ] = n^{1/2} \{ V_n - E V_n \}, \text{ say},

where

(3.3)    V_n = \binom{n}{m}^{-1} \sum_{C_{n,m}} \phi(X_{i_1}, \ldots, X_{i_m})

and

(3.4)    \phi(X_{i_1}, \ldots, X_{i_m}) = \sum_{j=1}^r \lambda_j I( g(X_{i_1}, \ldots, X_{i_m}) \le H^{-1}(t_j) ).

Thus, V_n is a U-statistic, and we may borrow the classical results of Hoeffding (1948) to show that the right hand side of (3.2) converges in law to a normal distribution with 0 mean and a finite variance (depending on t_1, ..., t_r and λ). If we denote by

(3.5)    \zeta_c(s,t) = P\{ g(X_1, \ldots, X_m) \le H^{-1}(s), \; g(X_{m-c+1}, \ldots, X_{2m-c}) \le H^{-1}(t) \} - st,

for every (s,t) ∈ [0,1]² and c = 0, 1, ..., m (note that ζ_0(s,t) = 0), then

(3.6)    E\{ [ H_n(H^{-1}(s)) - s ][ H_n(H^{-1}(t)) - t ] \} = \binom{n}{m}^{-1} \sum_{c=1}^m \binom{m}{c} \binom{n-m}{m-c} \zeta_c(s,t),

for every (s,t) ∈ [0,1]². Note that by (3.1) and (3.6), for every (s,t) ∈ [0,1]²,

(3.7)    E W_n(s) W_n(t) \to m^2 \zeta_1(s,t) = \zeta(s,t), \text{ say, as } n \to \infty.

Thus, if we define a Gaussian function W = { W(t); 0 ≤ t ≤ 1 } such that EW = 0 and the covariance function of W is given by { ζ(s,t), (s,t) ∈ [0,1]² }, then from the above discussion it follows that the finite dimensional distributions (f.d.d.) of {W_n} converge to those of W. Further, W_n(0) = 0 with probability 1, and W belongs to the C[0,1] space, in probability. Hence, to establish the weak convergence of {W_n} to W, it suffices to show that {W_n} is tight. For this, it suffices to show that there exist an integer n_0 (≥ m) and a finite constant K such that for every 0 ≤ s_1 < s_2 < s_3 ≤ 1,

(3.8)    E\{ [ W_n(s_2) - W_n(s_1) ]^2 [ W_n(s_3) - W_n(s_2) ]^2 \} \le K (s_2 - s_1)(s_3 - s_2), \quad \text{for every } n \ge n_0;

[see Theorem 15.6 of Billingsley (1968), in this context]. For this, we define W^*_n = { W^*_n(t); 0 ≤ t ≤ 1 } as in (3.1) with H_n being replaced by H^*_n. Then, by (2.7) and the Jensen inequality for conditional expectations,

(3.9)    E\{ [ W_n(s_2) - W_n(s_1) ]^2 [ W_n(s_3) - W_n(s_2) ]^2 \} \le E\{ [ W^*_n(s_2) - W^*_n(s_1) ]^2 [ W^*_n(s_3) - W^*_n(s_2) ]^2 \}.

On the other hand, W^*_n(·) involves n* independent summands, and hence, using the moment generating function of the multinomial distribution, we obtain that

(3.10)    E\{ [ W^*_n(s_2) - W^*_n(s_1) ]^2 [ W^*_n(s_3) - W^*_n(s_2) ]^2 \} \le 5 (n/n^*)^2 (s_2 - s_1)(s_3 - s_2) \le 5 (m+1)^2 (s_2 - s_1)(s_3 - s_2),

for every 0 ≤ s_1 < s_2 < s_3 ≤ 1 and n ≥ m. Thus, (3.8) follows from (3.9) and (3.10). Hence, we arrive at the following.

Theorem 3.1. W_n in (3.1) converges in law to the Gaussian function W with EW = 0 and covariance function ζ(s,t), given by (3.7).

Note that ζ(s,t) = 0 when s or t is equal to 0 or 1, and hence W is tied down at t = 0 and t = 1. However, for m ≥ 2, in general, ζ(s,t) is not equal to min(s,t) − st, so that W is not necessarily a Brownian bridge.
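Remark. The limiting covariance in (3.7) may be checked by simulation. The sketch below is an illustration under assumed choices (not part of the original development): F = N(0,1) and g is the degree-2 variance kernel, so that H is the chi-square(1) d.f.; the two simulated kernel evaluations share exactly one observation, which corresponds to c = 1 in (3.5), and m²ζ_1(s,t) is compared with the Brownian-bridge covariance min(s,t) − st.

```python
# Monte Carlo check of (3.7): zeta(s,t) = m^2 * zeta_1(s,t) need not equal
# the Brownian-bridge covariance min(s,t) - s*t when m >= 2.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(3)
m, s, t = 2, 0.3, 0.7
# For g(x1, x2) = (x1 - x2)^2 / 2 and X_i ~ N(0,1), H is the chi2_1 d.f.
q_s, q_t = chi2.ppf(s, df=1), chi2.ppf(t, df=1)   # H^{-1}(s) and H^{-1}(t)

B = 200_000
X1, X2, X3 = rng.normal(size=(3, B))              # the kernels share only X2 (c = 1)
ind_s = 0.5 * (X1 - X2) ** 2 <= q_s
ind_t = 0.5 * (X2 - X3) ** 2 <= q_t
zeta1 = np.mean(ind_s & ind_t) - s * t            # zeta_1(s,t) of (3.5)
print("zeta(s,t)  =", m**2 * zeta1)
print("bridge cov =", min(s, t) - s * t)
```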
4. Some applications.

Let X_1, ..., X_n be n i.i.d.r.v.'s with a d.f. F(x) = F_0((x − μ)/σ), where F_0 is a specified d.f. and the location and scale parameters μ and σ are unknown. Blackman (1955) considered the estimation of the location parameter (when σ is assumed to be specified) based on the empirical d.f. We consider here a similar estimator of σ² when μ is not specified. Note that if we let g(X_i, X_j) = (X_i − X_j)²/2, then E g(X_i, X_j) = E(X_1 − μ)² = Var(X_1) = c_0 σ², where c_0 is a specified positive constant which depends on the specified d.f. F_0. We assume that F_0 admits a finite variance and, without any loss of generality, we may set c_0 = 1. Thus, we have a kernel of degree 2, and the empirical kernel d.f. H_n may be defined as in (1.4) with m = 2. We define H(y) as in (1.3) and, since F_0 is specified, we may rewrite H(y) as

(4.1)    H(y) = H_0(y/\sigma^2), \qquad y \in E^+ = [0, \infty),

where H_0 depends on F_0 and is of specified form too. Let then

(4.2)    M_n(t) = n \int_0^\infty [ H_n(ty) - H_0(y) ]^2 \, dH_0(y), \qquad t \in E^+.

As an estimator of θ = σ², we consider \hat{\theta}_n, which is a solution of

(4.3)    M_n(\hat{\theta}_n) = \inf_t M_n(t).

Note that for t away from θ, M_n(t) blows up as n → ∞, while, for t close to θ, we may proceed as in Pyke (1970) and, through some routine steps, obtain that as n → ∞,

(4.4)    n^{1/2} (\hat{\theta}_n - \theta) + \theta \Big\{ \int_0^\infty y \, h_0(y) \, W_n(H_0(y)) \, dH_0(y) \Big\} \Big/ \Big\{ \int_0^\infty y^2 h_0^2(y) \, dH_0(y) \Big\} \to 0, \;\; \text{in probability},

where h_0 is the density function corresponding to H_0 and W_n(·) is defined as in (3.1). Hence, the asymptotic normality of n^{1/2}(\hat{\theta}_n − θ) can be obtained from Theorem 3.1 and (4.4). A similar treatment holds for other Blackman-type estimators of estimable parameters when the underlying d.f. is specified (apart from some unknown parameters).

In the context of tests of goodness of fit when some of the parameters are unknown, an alternative procedure may be suggested as follows. Corresponding to the unknown parameters, obtain the kernels and, for these kernels, consider the corresponding empirical kernel d.f.'s. Then, a multivariate version of Theorem 3.1 may be employed for the goodness of fit problem, using either the Kolmogorov-Smirnov or the Cramer-von Mises type statistics. The theory can also be extended to the two-sample case on parallel lines.
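Remark. A computational sketch of (4.2)-(4.3) follows; it is an illustration under assumed choices, not part of the original development. It takes F_0 = N(0,1), so that c_0 = 1 and H_0 is the chi-square(1) d.f.; the dH_0 integral in (4.2) is evaluated by the substitution u = H_0(y), and the quadrature grid and optimizer bounds are arbitrary illustrative choices.

```python
# Blackman-type estimator of sigma^2 via (4.2)-(4.3), assuming F_0 = N(0,1)
# (so c_0 = 1 and H_0 is the chi-square(1) d.f.).
import numpy as np
from itertools import combinations
from scipy.stats import chi2
from scipy.optimize import minimize_scalar

def kernel_values(sample):
    """All C(n,2) values of g(X_i, X_j) = (X_i - X_j)^2 / 2, sorted."""
    return np.sort([0.5 * (a - b) ** 2 for a, b in combinations(sample, 2)])

def M_n(t, kvals, n):
    """(4.2), with the dH_0 integral rewritten over u = H_0(y) in (0, 1)."""
    u = (np.arange(1, 201) - 0.5) / 200        # midpoint quadrature grid
    y = chi2.ppf(u, df=1)                      # y = H_0^{-1}(u)
    Hn_ty = np.searchsorted(kvals, t * y, side="right") / kvals.size
    return n * np.mean((Hn_ty - u) ** 2)       # approximates (4.2)

rng = np.random.default_rng(7)
sigma2 = 2.5
X = np.sqrt(sigma2) * rng.normal(size=400)     # mu = 0 here; mu cancels in the kernel anyway
kv = kernel_values(X)
res = minimize_scalar(M_n, bounds=(0.1, 10.0), args=(kv, X.size), method="bounded")
print("theta_hat_n =", round(float(res.x), 3), " (true sigma^2 =", sigma2, ")")
```

In repeated samples, \hat{\theta}_n concentrates around σ², with n^{1/2}-scale fluctuations governed by (4.4) and Theorem 3.1.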
REFERENCES

BILLINGSLEY, P. (1968). Convergence of Probability Measures. New York: Wiley.

BLACKMAN, J. (1955). On the approximation of a distribution function by an empiric distribution. Ann. Math. Statist. 26, 256-267.

DVORETZKY, A., KIEFER, J. and WOLFOWITZ, J. (1956). Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator. Ann. Math. Statist. 27, 642-669.

HOEFFDING, W. (1948). A class of statistics with asymptotically normal distribution. Ann. Math. Statist. 19, 293-325.

PYKE, R. (1970). Asymptotic results for rank statistics. In Nonparametric Techniques in Statistical Inference (ed. M. L. Puri), New York: Cambridge Univ. Press, pp. 21-37.