ON THE RATES OF CONVERGENCE IN DISTRIBUTION OF FIRST ORDER
STATIONARY U-STATISTICS AND VON MISES STATISTICS
Robert Alan Smith
Department of Statistics
University of North Carolina at Chapel Hill
Chapel Hill. NC 27514
Research supported by the National Science Foundation under Grant MCS78-0l240.
1.
Introduction and notation.
For a sequence X ,X ,X , .•. of independent, identically distributed ranl 2 3
dom variables, and a sYmmetric Borel-measurable function ~: mm ~IR satisfying
E IIi> (X. , ... , X. )
~l
~m
. t egers 1 -< 1'
f or a 11 pOS1·t·1ve 1n
1
I
<
00
-< ..• -<.~m -< m, th e correspon d'1ng sequences 0 f
U-statistics and von Mises statistics are defined by
n
~
m
n
~
m.
and
n
n
I
I
i =1
1
respectively.
i =1
m
Ii>(X. • .... X. )
1
Following Hoeffding (1948),
and for h = l •...• m.
let
and
Hoeffding showed that if
~
m
<
00
1
1
let
m
page 2
e
and if 1;1 > 0, then as n ~
00,
(1.1)
and
where Z is a standard normal random variable.
Callaert and Janssen (1978) found
the rate of convergence in (1.1) to be O(n-~).
When 1;1
cumbersome.
= 0,
however, the asymptotic distributions of Un and Vn are more
Viewing
as a functional on the space F of distributions onlR satisfying
and noting that I;d
= 0 implies I;d_l= 0 for d = 2, .•• ,m, Hoeffding called 8(F)
stationary of order d at FO E F if
Von Mises (1947), Filippova (1961), Gregory (1977), Neuhaus (1977), Eagleson (1979),
Hall (1979) and Janson (1979) have all investigated the limiting behavior of
Un or Vn in this degenerate case, i.e., when the order of stationarity is greater
than zero.
Following Gregory (1977) for the case m = 2, let {A k : k = O,l,2, ••. } denote
the sequence of eigenvalues and {~k: k = 0,1,2, ••• } the corresponding sequence of
orthonormal eigenfunctions of the kernel function
~(x,y)
= ~(x,y)
-
e.
page 3
That is,
E[~k(Xl) ~k' (Xl)] = °kk"
E[~(Xl,X2) ~k(X2) IX l ] = Ak ~k
(Xl) a.s.,
and
K
E[~(Xl,X2)
lim
K-+eo
When
l;1
I
k=O
Ak ~k(Xl) ~k(X2)]
2
= o.
(1. 2)
= 0,
E[~(Xl,X2) IXl ] = 0
a.s.
so that A may be taken to be zero with corresponding eigenfunction
O
Under these assumptions, Gregory showed that if l;2 > 0, then as n ~
n(U
n
- 8) ~
and if in addition
r
k~l
L IAkl<
2
Ak (Zk - 1)
00,
~O
- 1.
00,
(1. 3)
then
k~l
where Zl'Z2'Z3' .•• are independent standard normal random variables.
Gregory
cites several examples.
Gregory's results have been extended to two-sample U-statistics (Neuhaus
(1977) and Eagleson (1979)), to functional limit theorems (Neuhaus (1977) and
Hall (1979)), and to orders of starionarity d > 1 (Eagleson (1977), and Janson
(1979)).
This paper studies the rates of convergence in these results.
Using the
methods of Sazonov (1968b and 1969), Theorems 2.1 and 2.2 estimate the rate of
convergence in (1.4) and (1.3), respectively.
stationary kernel functions of degree m > 2.
Section 3 considers first order
Extensions to functional limit
theorems and two-sample statistics are also possible.
page 4
~
2.
Rates of convergence for first order stationary kernels of degree two.
For m = 2, let
W = n(V - 6). n
n
n
n
-1
n
L L
i=l j=l
If (X. , X. ),
~ J
Hn (x) = P[Wn s xl
and
H(x)
= P[ L Ak
k~l
where Zl,Z2,Z3'
{A k : k = 0,1,2,
THEOREM 2.1
z; S xl
are independent standard normal random variables and
} and
If
{~k:
~l
k = 0,1,2, ... } are as in (1.2).
= 0 and
~2
> 0, if the eigenvalues {A k } satisfy
and
o<
and if the eigenfunctions
{~k}
P S 1,
satisfy
and
then
suplH (x) - H(x)
x
n
PROOF
I = 0(n(p-2)/(22p+4)).
For fixed positive integers K > 8 and n > 2, let
page 5
wCK ) = n- 1
n
.... CK)
W
n
=
n
n
n
L L
o/CK)CX .• X.)
i=l j=l
-1
n
L nL
i=l j=l
1
.... CK)
0/
J
CX .• X.).
1
J
and
Then
suplHnCX) - HCx) I
S
x
~l + ~2 + ~3
where
~l
= suplH
~
= supIHCK)Cx)
n
2
x
n
Cx) - HCK)Cx - L Ak) I.
n
k>K
_ HCK)Cx)
I
x
and
We proceed to estimate each of
Estimation of
~l:
For any
~l' ~2'
£
and
~3
in turn.
> 0.
IH Cx) - HCK)(x - L Ak) I
k>K
n
n
- P[W CK ) s x - LA. Iw CK ) - L A Is £]1+ Ip[WCK)~x_WCK).lwCK)_ I A I> £]
n
k>K k
n
k>K k
n
n
n
k>K k
- P[w CK ) s x - I A • Iw CK)_ L A I > £] Is Ip[w(K)s x- wCK ).
n
k>K k n k>K k
n
n
Iw CK )_ I
n
A I s £]
k>K k
page 6
~
The first term is
sup P[x - E < W(K)s x + E]
n
x
= sup(p[W(K)s
x + E] - p[W(K)s x + E]
n
X
K
where h(K) denotes the probability density function of
I
k=l
A
k
z;,
This and an application of Chebyshev's inequality imply
6 1 s 262
CIE
+
-2
+
E
~(K)
E[Wn
\
-k;KAk]
2
where C is a constant which may be taken to be (A1A2)-~'
1
4It
with respect to
E
yields
6 s 26
2
1
for some constant C ,
2
+
C (E[W(K)_
2
n
I A ]2)1/3
k>K k
Noting that
and
one can show that
where C and C are constants such that
3
4
Minimizing the RHS
page 7
e
and
for sufficiently large k.
Thus,
(2.1)
Estimation of
~2:
Applying Theorem 1 of Sazonov (1968a) with equation (10)
of Sazonov (1968b) to the sequence of K-dimensional sample means
n
-1
n
T
l
(Wl(X.), ... ,WK(X.))
. 1
~
~=
~
yields
K
~2 ~ CSK3 n-~ l E/Wk(X l )1 3 ~ C6 K4 n-~
k=l
where Cs is an absolute constant and C6 = Cs
Estimation of
~3:
. sup Elwk(X l ) 13 .
k
By Lemma XVI,3.2 of Feller (1971)
where
A
h(t) •
f
eitxdH(X)
and
The numerator of the integrand on the right can be written
(2.2)
page 8
where
= E[e
itW(K)
]
Writing
=1
2
+
it
t
l - r-(
l
k>K
A)
k>K k
2
+
2
o(ltl )
yields
where the integral is finite for K ~ 8.
Since
A~ ~ C3Kl - 2/p,
k>K
l
Combining (2.1), (2.2), and (2.3), one obtains
Letting K = n 3p /(22p+4) yields the desired result.
REMARKS
(1)
In the above theorem, the kernel function
have infinitely many positive eigenvalues.
~
is assumed to
If, however, only finitely many
eigenvalues, Al, ••• ,A , say, are non-zero, then the estimation of 62 above shows
K
page 9
that
~l'
where
.•. '~K are the orthonormal eigenfunctions of
and C is an absolute constant.
~
corresponding to Al, ... ,A K
This is the case for the familiar chi-square good-
ness-of-fit test.
(2)
The assumed positivity of the Ak'S is used to ensure the convexity of
k
certain sets inR to which Theorem 1 of Sazonov (1968a) is applied to obtain
(2.2).
(3)
4It
The assumption can be circumvented.
For the Cramer-von Mises goodness-of-fit statistic
where Fn denotes the corresponding empirical distribution function, the eigenvalues have the form
A
k
= (nk) -2
k
~
1
so that p could be taken to be slightly larger
than~.
The resulting rate of
O(nE-l/lO) for arbitrarily small positive E matches the first estimate by
Sazonov (1968b) but falls short of his later (1969) result in which advantage
was taken of the particular kernel function.
(4)
The rate may be improved by assuming L -convergence (r~2) in place
2r
of L -convergence in (1.2). There does not seem to be any practical justifi2
cation for doing so, however.
Now let
Tn
= n(Un -8)
page 10
~
and
~k's
where the Zk's, Ak'S, and
THEOREM 2.2
are as before.
If the hypotheses of Theorem 2.1 hold, then
PROOF
G (x)
n
= P[T
+
$
n
L
I
n -1 n ~(X.,X.)
S x,
i=l
1
P[T S x, n -1 nL ~(X.,X.) n
i=l
1
1
I
P[T
n
+ n-
1 n
I
i=l
+ p[ln-
= H (x +
n
l
~(x.,x.)
1
k~l
$
1
A > E]
k~l k
L Ak
X +
-
1
E]
I
I
k~l
1
I ~(x.,x.)
i=l
L Ak I S
-
1
I
k~l
+ E]
Akl > E]
I
l
Ak+ E) + p[!n~(x.,x.)
k~l
i=l
1
1
I
I
k~l
Akl > E].
Similarly
H (x + L Ak-E)
k~l
n
S
I
l
G (x) + p[/n~(x.,x.) - I Ak / > E].
n
i=l
1
1
k~l
Thus,
supIGn(x) - G(x)1
x
+ sUPIH(x+E) - H(x)
x
S
I
supIHn(x) - H(x)
x
+ P[ln-
1 n
I
i=l
I
~(x.,x.) 1
1
I
k~l
Akl > E]
where the first term on the right is estimated by Theorem 2.1, the second term
page 11
is
x+e:
sup(H(x+e:) - H(x)) = sup J dH(t)
x
x x
:s;
e: sup h (2) (x) = e: I '1. 11. 2 '
x
and the third term is
p[ln- 1i=l
I ~(x.,x.)
- l Akl
~ ~
k~l
> e:]
so that
Choosing
e:
= (2~ E[~(Xl,X1) - l Ak]2
k~l
n- 1 )1/3
to minimize the right hand side yields the desired result.
3.
Rates of convergence for first order stationary kernels of arbitrary degree m.
Again let
= P[Wn
x]
:s;
where
T = n(U -8) = n(~)-l
n
n
l
l
·1<'
-1 < ..• <'1 <
m-n
1
~(X. , ... ,X. )
11
~m
and
W
n
= n(Vn -8)
= n
-m+1
n
n
I ... I
i =1
1
i =1
m
~(X. , ... ,X. ).
~l
~m
page 12
Hoeffding (1961) showed that
U
n
-
r (~)
e=.
u(h)
h=l
(3.1 )
n
where
\
(h)
U(h) = (hn) -1 l\ . ' . •
l. If
(X., ••• , X. ),
n
lSil< .•• <~sn
11
I h
= 1jJ1 (x),
1jJ(l)(x)
= 2, ... ,m,
and for h
Similarly,
V
n
e=
r (~)
V(h)
h=l
(3.2)
n
where
(h)
Vn
= n
-h n
I ... nI
i =1
(h)
i =1
1
When 1;1
If
(x., ... , x. ).
11
h
I
h
= 0,
n
U(l)
n
= Vel) = n- l I
If (x.)
i=l 1
n
1
is almost surely zero so that (3.1) and (3.2) become
and
where
m
R(2) = I~) u(h)
n
h=3
n
and
~2)
(2) 2
m
= I~)
h=3
V(h) •
n
(2) 2
-3
(2)
Since E[Rn ] and E[~ ] are both O(n ), (1.3) and (1.4) applied to Un
and v~2), respectively, yield the following results:
page 13
llIEOREM 3.1
If
~1
=0
and
~2
> O. then as n
~
00.
where {A } is the sequence of eigenvalues of the kernel function
k
is a sequence of independent standard normal random variables.
THEOREM 3.2
~2
If in addition to the hypotheses of Theorem 3.1.
and {Zk}
I
IAkl <
00.
k~l
then as n
~ 00
n (Y -6)
n
k ~) l
A
k~l k
Z~
•
Estimates of the rates of convergence in Theorems 3.1 and 3.2 can be
obtained from applications of Theorems 2.2 and 2.1 to U(2) and y(2). respective1y.
THEOREM 3.3
If ~1
=0
and ~2 > O. if the eigenvalues {~} of ~(2) satisfy
and
and if the corresponding eigenfunctions
{~k}
satisfy
and
then
suplG (x) - G(x)
x
n
I = O(n(P- 21k 22p+4)).
page 14
PROOF
For any e: > 0,
Gn (x)
S
peT
S
peTn - nR(2)
n
n
S
x, InR(2)
n
= p[n~) u~2)
S
e:] + PrlnR (2)
n
I
> e:]
x + e:] + prlnR(2)
n
I
> e:]
I
S
x + e:] + P[lnR (2)
n
S
I
> e:].
Similarly,
P[n 1)
U(n 2)
t~
S
x + e:]
S
Gn (x) + P £I nRn( 2) I > e:].
Thus,
supIGn(x) - G(x)
I
S
x
suplp[n~) U~2)s
+ supIG(x+e:) - G(x)
I
+ P[lnR(2)
n
x
~
x + e:] - G(x+e:)!
x
I
> e:].
(3.3)
The first term on the right hand side of (3.3) is 0(n(p-2)/(22p+4)) by
The second term is O(e:) as e: ~ O. Since E[R ]2 = 0(n- 3), the
n
2
1
third term is 0(n- e:- ). Choosing e: = n(p-2)/(22p+4) completes the proof.
Theorem 2.2.
The proof of the next result is exactly parallel to that above.
THEOREM 3.4
Under the assumptions of Theorem 3.3,
SUpIHn(X) - H(x)
I = 0(n(p-2)/(22p+4)).
x
References
Callaert, H. and Janssen, P. (1978).
Ann. Statist. 6, 417-421.
The Berry-Esseen theorem for U-statistics.
Eagleson, G. K. (1979). Orthogonal expansions and U-statistics. AustraZ. J.
Statist. 21" 221-237.
Feller, W. (1971). An Intl'oduation to ProbabiUty Theory and Its AppUaations"
VoZ. II" Seaond Ed." Wiley, New York.
page 15
Filippova. A. A. (1961). Von Mises' theorem on the asymptotic behavior of
functionals of empirical distribution functions and its statistical
applications. Theor. Frob. Appl. 7~ 24-57.
Gregory. G. G. (1977). Large sample theory for U-statistics and tests of fit.
Ann. Statist. 5~ 110-123.
Hall. P. (1979). On the invariance principle for U-statistics.
Froaesses and their Appliaations e~ 163-174.
Stoahastia
Hoeffding. W. (1948). A class of statistics with asymptotically normal distribution. Ann. Math. Statist. 19~ 293-325.
Hoeffding, w. (1961). The strong law of large numbers for U-statistics.
of North Carolina. Institute of Statistics. Mimeo Series, No. 302.
Univ.
Janson, S. (1979). The asymptotic distribution of degenerate U-statistics.
Uppsala University. Department of Mathematics Report No. 1979:5.
Uppsala University. Uppsala, Sweden.
Mises. R. v. (1947). On the asymptotic distribution of differentiable statistical functions. Ann. Math. Statist. 18~ 309-348.
Neuhaus. G. (1977). Functional limit theorems for U-statistics in the degenerate case. J. MUltivar. Anal. 7~ 424-439.
Sazonov. V. V. (1968a). On the multidimensional central limit theorem.
Sankhya, Sere A 30~ 181-204.
2
Sazonov. V. V. (1968b). On w criterion. Sankhya~ Sere A 30~ 205-210.
Sazonov. V. V. (1969). An improvement of a convergence-rate estimate.
of Frobability and Its Appliaations 14~ 640-651.
Theory
© Copyright 2026 Paperzz