ON THE CENTERING OF A SIMPLE LINEAR RANK STATISTIC
by
Wassily Hoeffding
University of North Carolina
Institute of Statistics Mimeo Series No. 585
June 1968
This research was supported by the Mathematics
Division of the Air Force Office of Scientific Research
Grant No. AF-AFOSR-68-1415.
DEPARTMENT OF STATISTICS
University of North Carolina
Chapel Hill, N. C.
SUMMARY
Hájek [3] proved that under weak conditions the distribution of a simple linear rank statistic S (see (1.1)) is asymptotically normal (ES, σ²), with σ² defined in (1.6). He left open the question whether under the same conditions the centering constant ES may be replaced by the simpler constant μ defined by (1.8), as was found to be true in the two-sample case and under different conditions by Chernoff and Savage [1] and Govindarajulu, LeCam and Raghavachari [2]. In this paper it is shown that the replacement of ES by μ is permissible if one of Hájek's conditions is slightly strengthened. The relation of the problem to one in the theory of polynomial approximation is noted.
1. Introduction and statement of results. Hájek [3] studied the asymptotic distribution of the sum

(1.1)    S = Σ_{i=1}^N c_i a_N(R_i),

called a simple linear rank statistic, where c_1, ..., c_N are real numbers, R_1, ..., R_N are the respective ranks of N independent random variables X_1, ..., X_N whose distribution functions F_1, ..., F_N are continuous, and the so-called scores a_N(i) are generated by a function φ(t), 0 < t < 1, in either of the following two ways:

(1.2)    a_N(i) = φ(i/(N+1)),    i = 1, ..., N,

(1.3)    a_N(i) = Eφ(U_N^(i)),    i = 1, ..., N.

Here U_N^(i) denotes the i-th order statistic in a sample of size N from the uniform distribution on (0,1).
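A minimal numerical sketch (an illustration added here, not part of Hoeffding's text) of the statistic (1.1) and of the two ways (1.2) and (1.3) of generating scores. Every concrete choice in it (the score function φ(t) = t², the two-sample constants c_i, the normal sample) is an assumption made only for the example; the exact scores (1.3) use the fact that U_N^(i) has a Beta(i, N+1−i) distribution.

```python
# Illustrative sketch: compute the simple linear rank statistic S of (1.1)
# with the approximate scores (1.2) and the exact scores (1.3).
import numpy as np
from scipy import stats, integrate

def scores_approx(phi, N):
    # scores (1.2): a_N(i) = phi(i/(N+1))
    i = np.arange(1, N + 1)
    return phi(i / (N + 1))

def scores_exact(phi, N):
    # scores (1.3): a_N(i) = E phi(U_N^(i)), with U_N^(i) ~ Beta(i, N+1-i)
    out = np.empty(N)
    for i in range(1, N + 1):
        dens = stats.beta(i, N + 1 - i).pdf
        out[i - 1] = integrate.quad(lambda u: phi(u) * dens(u), 0, 1)[0]
    return out

def rank_statistic(c, x, a):
    # S = sum_i c_i a_N(R_i), R_i the rank of x_i among x_1, ..., x_N
    ranks = stats.rankdata(x, method="ordinal")
    return float(np.sum(c * a[ranks - 1]))

rng = np.random.default_rng(0)
N = 20
phi = lambda t: t ** 2                          # illustrative monotone score function
c = np.where(np.arange(N) < N // 2, 1.0, 0.0)   # two-sample regression constants
x = rng.normal(size=N)
print(rank_statistic(c, x, scores_approx(phi, N)),
      rank_statistic(c, x, scores_exact(phi, N)))
```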
Hájek proved four theorems asserting the asymptotic normality of S under different conditions, of which we quote Hájek's Theorem 2.3. Let φ(t) = φ_I(t) − φ_II(t), 0 < t < 1, where φ_I(t) and φ_II(t) are both non-decreasing, square integrable, and absolutely continuous inside (0,1). Then for every ε > 0 and η > 0 there exists N(ε,η) such that (writing c̄ = N^{-1} Σ_{j=1}^N c_j)

(1.4)    N > N(ε,η),    var S > ηN max_{1≤i≤N} (c_i − c̄)²

implies

(1.5)    sup_x |P{S − ES < x (var S)^{1/2}} − Φ(x)| < ε.

The assertion remains true if we replace var S in (1.4) and (1.5) by σ², with σ² defined in (1.6); there u(x) equals 1 or 0 according as x ≥ 0 or x < 0, φ' denotes the derivative of φ, and

(1.7)    H(x) = N^{-1} Σ_{i=1}^N F_i(x).
Hájek's Theorem 2.4 states that under the conditions of Theorem 2.3, for every ε > 0 and η > 0 there exist N(ε,η) and δ(ε,η) such that the conclusion of Theorem 2.3 holds with var S replaced by

    d² = Σ_{i=1}^N (c_i − c̄)² ∫_0^1 (φ(t) − φ̄)² dt,

where φ̄ = ∫_0^1 φ(t) dt, provided that the condition max_{i,j} sup_x |F_i(x) − F_j(x)| < δ(ε,η) is added to (1.4).
Hájek's theorems are extensions of the earlier results of Chernoff and Savage [1] and of Govindarajulu, LeCam and Raghavachari [2], which are concerned with the special case c_1 = ... = c_m, c_{m+1} = ... = c_N, F_1 = ... = F_m, F_{m+1} = ... = F_N (two-sample case). Apart from different sets of assumptions (which, in essential parts, are more restrictive than Hájek's), the theorems of [1] and [2] differ from Hájek's theorems in that the centering constant ES is replaced by

(1.8)    μ = Σ_{i=1}^N c_i ∫ φ(H(x)) dF_i(x).

The problem of whether ES may be replaced by μ is of interest since, typically, μ is easier to evaluate. Hájek observed ([3], p. 330) that he did not succeed in showing that this replacement is possible under the conditions of Theorems 2.3 and 2.4.
In this paper it is shown that if the condition of square integrability of φ_I and φ_II is slightly strengthened, then the conclusions of Theorems 2.3 and 2.4 remain true with ES replaced by μ or by

(1.9)    μ' = μ + c̄ (Σ_{i=1}^N φ(i/(N+1)) − N ∫_0^1 φ(t) dt).
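The simpler centering constants are straightforward to evaluate numerically once the F_i are specified. The following sketch (illustrative only; the normal shift alternative and the Wilcoxon-type score function are assumptions, not taken from the paper) computes μ of (1.8) and μ' of (1.9).

```python
# Illustrative sketch: evaluate mu = sum_i c_i int phi(H(x)) dF_i(x) of (1.8),
# with H = N^{-1} sum_i F_i, and mu' of (1.9).
import numpy as np
from scipy import stats, integrate

def centering_constants(phi, c, dists):
    N = len(c)
    H = lambda x: sum(d.cdf(x) for d in dists) / N
    mu = sum(
        c[i] * integrate.quad(lambda x, d=dists[i]: phi(H(x)) * d.pdf(x), -10, 10)[0]
        for i in range(N)
    )
    i = np.arange(1, N + 1)
    correction = phi(i / (N + 1)).sum() - N * integrate.quad(phi, 0, 1)[0]
    return mu, mu + np.mean(c) * correction     # (mu, mu')

# two-sample normal shift alternative with phi(t) = t (Wilcoxon-type scores)
N, m = 10, 5
c = np.array([1.0] * m + [0.0] * (N - m))
dists = [stats.norm(0, 1)] * m + [stats.norm(0.5, 1)] * (N - m)
print(centering_constants(lambda t: t, c, dists))
```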
Explicitly, the following result is proved.

Theorem 1. Let φ(t) = φ_I(t) − φ_II(t) satisfy the conditions of Hájek's Theorem 2.3 and the additional condition

(1.10)    ∫_0^1 |φ_k(t)| t^{-1/2} (1−t)^{-1/2} dt < ∞,    k = I, II.

Then the conclusions of Hájek's Theorems 2.3 and 2.4 remain true with ES replaced by μ' in the case of the scores (1.2) and by μ in the case of the scores (1.3). Moreover, if |c̄|/max_i |c_i − c̄| is bounded, ES may be replaced by μ also in the case (1.2).
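As a hypothetical numerical illustration of what Theorem 1 asserts (it is not part of the paper), one can simulate S with the scores (1.2) under a two-sample alternative and compare the standardization by the simulated mean ES with the standardization by the simple constant μ of (1.8); all concrete choices below are assumptions made for the example.

```python
# Illustrative Monte Carlo sketch: simulate S with scores (1.2) under a
# two-sample normal shift alternative and compare centering by ES and by mu.
import numpy as np
from scipy import stats, integrate

rng = np.random.default_rng(1)
N, m, reps = 60, 30, 4000
phi = lambda t: t                               # Wilcoxon-type score function
c = np.array([1.0] * m + [0.0] * (N - m))
shift = 0.7

# mu of (1.8) with F_1 = ... = F_m = N(0,1) and F_{m+1} = ... = F_N = N(shift,1)
H = lambda x: (m * stats.norm.cdf(x) + (N - m) * stats.norm.cdf(x - shift)) / N
mu = m * integrate.quad(lambda x: phi(H(x)) * stats.norm.pdf(x), -10, 10)[0]

i = np.arange(1, N + 1)
scores = phi(i / (N + 1))                       # scores (1.2)
S = np.empty(reps)
for r in range(reps):
    x = np.concatenate([rng.normal(0.0, 1.0, m), rng.normal(shift, 1.0, N - m)])
    ranks = stats.rankdata(x, method="ordinal")
    S[r] = np.sum(c * scores[ranks - 1])

sd = S.std()
print("KS distance, centered at mu:", stats.kstest((S - mu) / sd, "norm").statistic)
print("KS distance, centered at ES:", stats.kstest((S - S.mean()) / sd, "norm").statistic)
```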
Condition (1.10) implies square integrability of the monotone functions φ_k (see Lemma 1 below). It has been shown in [4] that if φ_k is monotone and

    ∫_0^1 φ_k²(t) {log(1 + |φ_k(t)|)}^{2+δ} dt < ∞

for some δ > 0, then (1.10) is satisfied. In this sense condition (1.10) is not much stronger than square integrability. The author does not know whether square integrability of φ_I and φ_II is sufficient for the conclusion of Theorem 1.
Theorem 1 is proved in Section 5. The proof depends on the following two propositions, which are demonstrated in Sections 3 and 4.

Proposition 1. There is a numerical constant C_1 such that if φ is non-decreasing, then

(1.11)    Σ_{i=1}^N |Eφ(U_N^(i)) − φ(i/(N+1))| ≤ C_1 N^{1/2} ∫_0^1 |φ(t)| t^{-1/2} (1−t)^{-1/2} dt.

Proposition 2. There is a numerical constant C_2 such that if φ is non-decreasing and F_1, ..., F_N are any continuous distribution functions, then

(1.12)    Σ_{i=1}^N |Eφ(R_i/(N+1)) − ∫ φ(H(x)) dF_i(x)| ≤ C_2 N^{1/2} ∫_0^1 |φ(t)| t^{-1/2} (1−t)^{-1/2} dt.

Remark. The integral on the right of (1.11) and (1.12) may be replaced by inf_{−∞<a<∞} ∫_0^1 |φ(t) − a| t^{-1/2} (1−t)^{-1/2} dt. This is due to the fact that the left sides of the inequalities are not changed if φ is replaced by φ + const.

Proposition 1 gives an upper bound for a distance between the sequences of scores (1.2) and (1.3). Proposition 2 has an obvious bearing on the estimation of |ES − μ'| with scores (1.2). Taken together with Proposition 1, it will be seen (in Section 5) to imply an inequality analogous to (1.12) for the scores (1.3).
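The distance between the two score sequences that Proposition 1 controls can be examined numerically. The sketch below (an added illustration; the choice φ(t) = t² is an assumption) evaluates the left side of (1.11) and the quantity N^{1/2} ∫_0^1 |φ(t)| t^{-1/2}(1−t)^{-1/2} dt appearing on the right; the numerical constant C_1 is left unspecified, so only the two quantities are printed.

```python
# Illustrative check related to (1.11): compare sum_i |E phi(U_N^(i)) - phi(i/(N+1))|
# with N^{1/2} * int_0^1 |phi(t)| t^{-1/2} (1-t)^{-1/2} dt for a monotone phi.
import numpy as np
from scipy import stats, integrate

phi = lambda t: t ** 2
N = 50
i = np.arange(1, N + 1)

exact = np.array([
    integrate.quad(lambda u, a=k: phi(u) * stats.beta(a, N + 1 - a).pdf(u), 0, 1)[0]
    for k in i
])                                              # scores (1.3)
approx = phi(i / (N + 1))                       # scores (1.2)

lhs = np.abs(exact - approx).sum()
weighted = integrate.quad(lambda t: abs(phi(t)) / np.sqrt(t * (1 - t)), 0, 1)[0]
print(lhs, np.sqrt(N) * weighted)
```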
The stated results are closely related to a problem in polynomial approximation. For a function φ which is Lebesgue integrable on (0,1) define the (modified) Bernstein polynomial of order N by

    φ_N(t) = Σ_{i=0}^N (N+1) (∫_{i/(N+1)}^{(i+1)/(N+1)} φ(u) du) (N choose i) t^i (1−t)^{N−i}

(see Lorentz [6], Chapter II). It is shown in [4] that if φ is nondecreasing then

    ∫_0^1 |φ_N(t) − φ(t)| dt ≤ C N^{-1/2} ∫_0^1 |φ(t)| t^{-1/2} (1−t)^{-1/2} dt,

where C is a numerical constant. The proof is similar to that of Proposition 2.
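A small sketch (illustrative only, not from the paper) of the modified Bernstein polynomial defined above and of the L1 distance that the quoted bound from [4] controls; the choice φ(u) = u^{1/2} is an assumption made for the example.

```python
# Illustrative sketch of the modified Bernstein polynomial:
# phi_N(t) = sum_{i=0}^N (N+1) * (int_{i/(N+1)}^{(i+1)/(N+1)} phi(u) du)
#            * C(N, i) * t^i * (1-t)^(N-i)
import numpy as np
from scipy import integrate
from scipy.special import comb

def bernstein_mod(phi, N, t):
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    for i in range(N + 1):
        avg = (N + 1) * integrate.quad(phi, i / (N + 1), (i + 1) / (N + 1))[0]
        out += avg * comb(N, i) * t ** i * (1 - t) ** (N - i)
    return out

phi = lambda u: np.sqrt(u)                      # a nondecreasing example
N = 40
grid = np.linspace(1e-4, 1 - 1e-4, 2001)
l1_error = np.mean(np.abs(bernstein_mod(phi, N, grid) - phi(grid)))
print(l1_error)                                 # rough L1 distance on (0,1)
```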
2. Preliminary lemmas. The following lemmas will be used in the proofs. The symbols C_1, C_2, C_3, ... will denote numerical constants.
Lemma 1. If φ is nondecreasing, then (2.1) holds.

Lemma 2. If φ is nondecreasing, then

(2.2)    Σ_{i=1}^{N-1} {φ((i+1)/(N+1)) − φ(i/(N+1))} (i/N)^{1/2} (1 − i/N)^{1/2} ≤ C_3 ∫_0^1 |φ(t)| t^{-1/2} (1−t)^{-1/2} dt.

Lemmas 1 and 2 are identical with Lemmas 1 and 2 of [4] and are proved there.
Let G_{Ni} denote the distribution function of U_N^(i) and let the random variable W_N(u) have the binomial (N, u) distribution. Then for i = 1, ..., N,

(2.3)    G_{Ni}(u) = P{U_N^(i) ≤ u} = P{W_N(u) ≥ i},    0 < u < 1.

Lemma 3. For j = 1, ..., N−1,

(2.4)    Σ_{i=1}^j {1 − G_{Ni}(j/(N+1))} ≤ 1 + C_4 N^{1/2} (j/N)^{1/2} (1 − j/N)^{1/2}.

Proof. By (2.3), for 0 < u < 1,

    Σ_{i=1}^j (1 − G_{Ni}(u)) = Σ_{i=1}^j P{W_N(u) ≤ i−1} = Σ_{k=0}^{j-1} (j − k) P{W_N(u) = k}
        = j P{W_N(u) ≤ j−1} − Σ_{k=0}^{j-1} k P{W_N(u) = k}.

Now Σ_{k=0}^{j-1} k P{W_N(u) = k} = Nu P{W_{N-1}(u) ≤ j−2}. Hence (2.5) follows. For u = j/(N+1) we have j − Nu = j/(N+1) < 1. The maximum for u in (0,1) of the last term in (2.5) is attained at u = j/N. Hence (2.6) follows. By Stirling's formula there is a numerical constant C_4 such that

(2.7)    (N choose j) ≤ C_4 N^{-1/2} (j/N)^{-j-1/2} (1 − j/N)^{-N+j-1/2},    1 ≤ j ≤ N−1.

Inequality (2.4) now follows from (2.6) and (2.7).
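The identity (2.3), which drives the proof of Lemma 3, is easy to verify numerically; the following check (added for illustration, not part of the paper) uses the fact that U_N^(i) has a Beta(i, N+1−i) distribution.

```python
# Illustrative check of (2.3): G_Ni(u) = P{U_N^(i) <= u} = P{W_N(u) >= i},
# where W_N(u) is binomial (N, u) and U_N^(i) ~ Beta(i, N+1-i).
from scipy import stats

N = 12
for i in (1, 4, 9):
    for u in (0.1, 0.5, 0.8):
        lhs = stats.beta(i, N + 1 - i).cdf(u)   # G_Ni(u)
        rhs = stats.binom(N, u).sf(i - 1)       # P{W_N(u) >= i}
        assert abs(lhs - rhs) < 1e-12
print("identity (2.3) verified on a small grid")
```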
3. Proof of Proposition 1. Let φ_j = φ(j/(N+1)), 1 ≤ j ≤ N, and let G_{Ni} be defined by (2.3). Since φ is nondecreasing,

    Eφ(U_N^(i)) = ∫_0^1 φ(u) dG_{Ni}(u)
        = Σ_{j=0}^{N-1} ∫_{j/(N+1)}^{(j+1)/(N+1)} φ(u) dG_{Ni}(u) + ∫_{N/(N+1)}^1 φ(u) dG_{Ni}(u)
        ≤ Σ_{j=0}^{N-1} φ_{j+1} ∫_{j/(N+1)}^{(j+1)/(N+1)} dG_{Ni}(u) + ∫_{N/(N+1)}^1 φ(u) dG_{Ni}(u).

If in the sum in the last line we majorize the j-th term by φ_{j+1} for 1 ≤ j ≤ i−1, we obtain

(3.1)    Eφ(U_N^(i)) − φ_i ≤ a_i + b_i,    i = 1, ..., N,

where a_i and b_i are defined by (3.2). Since a_i ≥ 0 and b_i ≥ 0, (3.1) implies

(3.3)    Σ_{i=1}^N |Eφ(U_N^(i)) − φ_i| ≤ 2A + 2B + D,

where A = Σ_{i=1}^N a_i, B = Σ_{i=1}^N b_i, and D = Σ_{i=1}^N {φ_i − Eφ(U_N^(i))}. By Lemmas 3 and 2,

(3.6)    A ≤ φ_N − φ_1 + C_4 N^{1/2} Σ_{j=1}^{N-1} (φ_{j+1} − φ_j) (j/N)^{1/2} (1 − j/N)^{1/2}
           ≤ φ_N − φ_1 + C_3 C_4 N^{1/2} ∫_0^1 |φ(t)| t^{-1/2} (1−t)^{-1/2} dt.

Since Σ_{i=1}^N G_{Ni}(u) = Nu,

(3.7)    B = N ∫_{N/(N+1)}^1 φ(u) du − N(N+1)^{-1} φ_N.

Since

    Σ_{i=1}^N φ_i ≤ Σ_{i=1}^N (N+1) ∫_{i/(N+1)}^{(i+1)/(N+1)} φ(t) dt = (N+1) ∫_{1/(N+1)}^1 φ(t) dt

and

    Σ_{i=1}^N Eφ(U_N^(i)) = N ∫_0^1 φ(t) dt,

we have

(3.8)    D ≤ (N+1) ∫_{1/(N+1)}^1 φ(t) dt − N ∫_0^1 φ(t) dt.

Now

(3.9)    (N+1) N^{-1} ∫_0^{N/(N+1)} φ(t) dt ≤ φ_N ≤ (N+1) ∫_{N/(N+1)}^1 φ(t) dt,    φ_1 ≥ (N+1) ∫_0^{1/(N+1)} φ(t) dt.

From (3.7) to (3.9) we obtain, after repeated application of Schwarz's inequality, the bound (3.10). Combining (3.3), (3.6) and (3.10) and applying Lemma 1, we obtain inequality (1.11) of Proposition 1 with C_1 = 2C_3C_4 + C_5.
4. Proof of Proposition 2. Let

(4.1)    ψ_i(x) = E{φ(R_i/(N+1)) | X_i = x}.

Then

(4.2)    Σ_{i=1}^N |Eφ(R_i/(N+1)) − ∫_{-∞}^{∞} φ(H(x)) dF_i(x)| ≤ Σ_{i=1}^N ∫_{-∞}^{∞} |ψ_i(x) − φ(H(x))| dF_i(x).

We have

    ψ_i(x) = φ_1 + Σ_{j=1}^{N-1} (φ_{j+1} − φ_j) P{R_i ≥ j+1 | X_i = x}.

Now R_i = Σ_{k=1}^N u(X_i − X_k), where u(·) is defined after (1.6). Hence if we let

(4.3)    V(x) = Σ_{k=1}^N u(x − X_k),

then P{R_i ≥ j+1 | X_i = x} ≤ P{V(x) ≥ j} for all i and x, so that

(4.4)    ψ_i(x) ≤ φ_1 + Σ_{j=1}^{N-1} (φ_{j+1} − φ_j) P{V(x) ≥ j}.

By Theorem 1 of [5],

(4.5)    P{V(x) ≥ j} ≤ (j/N)^{-j} (1 − j/N)^{-N+j} H(x)^j (1 − H(x))^{N−j}    for j ≥ NH(x).

If we majorize P{V(x) ≥ j} in (4.4) by 1 for j ≤ [NH(x)] and apply (4.5) to the terms with j ≥ [NH(x)]+1, we obtain

(4.6)    ψ_i(x) − φ_{[NH(x)]+1} ≤ J(H(x)),    0 < H(x) < 1,

where

(4.7)    J(t) = Σ_{j=[Nt]+1}^{N-1} (φ_{j+1} − φ_j) (j/N)^{-j} (1 − j/N)^{-N+j} t^j (1−t)^{N−j}

for 0 < t ≤ 1 − N^{-1} and J(t) = 0 for 1 − N^{-1} < t < 1.
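The probability bound (4.5), quoted from Theorem 1 of [5], can be checked numerically in the i.i.d. case, where V(x) is binomial. The sketch below (an added illustration; the values of N and H are arbitrary assumptions) does this on a grid of j.

```python
# Illustrative check of (4.5) in the i.i.d. case: with V ~ binomial(N, H),
# P{V >= j} <= (j/N)^(-j) * (1-j/N)^(-(N-j)) * H^j * (1-H)^(N-j) for j >= N*H.
import numpy as np
from scipy import stats

N, H = 30, 0.4
for j in range(int(np.ceil(N * H)), N):
    tail = stats.binom(N, H).sf(j - 1)          # P{V >= j}
    bound = (j / N) ** (-j) * (1 - j / N) ** (-(N - j)) * H ** j * (1 - H) ** (N - j)
    assert tail <= bound * (1 + 1e-12)
print("bound (4.5) verified for the binomial case")
```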
Since J(t) > 0, (4.6) implies

    |ψ_i(x) − φ_{[NH(x)]+1}| ≤ 2J(H(x)) + φ_{[NH(x)]+1} − ψ_i(x).

Therefore, by (1.7), we obtain (4.8). Now

(4.9)    N ∫_0^1 φ_{[Nt]+1} dt = Σ_{k=1}^N φ(k/(N+1)) = Σ_{i=1}^N Eφ(R_i/(N+1)),

and

    ∫_0^1 J(t) dt = Σ_{k=1}^{N-1} ∫_{(k-1)/N}^{k/N} J(t) dt
        = Σ_{k=1}^{N-1} Σ_{j=k}^{N-1} (φ_{j+1} − φ_j) (j/N)^{-j} (1 − j/N)^{-N+j} ∫_{(k-1)/N}^{k/N} t^j (1−t)^{N−j} dt
        = Σ_{j=1}^{N-1} (φ_{j+1} − φ_j) (j/N)^{-j} (1 − j/N)^{-N+j} ∫_0^{j/N} t^j (1−t)^{N−j} dt.

By Stirling's formula, for 1 ≤ j ≤ N−1,

    (j/N)^{-j} (1 − j/N)^{-N+j} ∫_0^{j/N} t^j (1−t)^{N−j} dt ≤ C_6 N^{-1/2} (j/N)^{1/2} (1 − j/N)^{1/2}.

Hence (4.10) follows. It follows from (4.8) to (4.10) that

(4.11)    Σ_{i=1}^N ∫_{-∞}^{∞} |ψ_i(x) − φ_{[NH(x)]+1}| dF_i(x) ≤ 2C_6 N^{1/2} Σ_{j=1}^{N-1} (φ_{j+1} − φ_j) (j/N)^{1/2} (1 − j/N)^{1/2}.

Furthermore, a similar estimate of Σ_{i=1}^N ∫_{-∞}^{∞} |φ_{[NH(x)]+1} − φ(H(x))| dF_i(x) yields (4.12) and (4.13). It now follows from (4.2), (4.11), (4.13) and Lemmas 1 and 2 that inequality (1.12) holds with C_2 = 2C_6 + 4.
5. Proof of Theorem 1. The following lemma will be used.

Lemma 4. If φ satisfies the conditions of Theorem 1, then for every α > 0 there exists a decomposition

(5.1)    φ(t) = ψ(t) + φ^(1)(t) − φ^(2)(t),    0 < t < 1,

such that ψ is a polynomial, φ^(1) and φ^(2) are nondecreasing, and

(5.2)    ∫_0^1 |φ^(1)(t)| t^{-1/2} (1−t)^{-1/2} dt + ∫_0^1 |φ^(2)(t)| t^{-1/2} (1−t)^{-1/2} dt < α.

Lemma 4 is an analog of Lemma 5.1 of Hájek [3], which differs from Lemma 4 in that φ is assumed to satisfy the conditions of Theorem 2.3 and (5.2) is replaced by ∫_0^1 |φ^(1)(t)|² dt + ∫_0^1 |φ^(2)(t)|² dt < α. Hájek's proof of Lemma 5.1 serves without change to prove Lemma 4.
It will be sufficient to prove the assertion of Theorem 1 concerning Theorem 2.3, since for Theorem 2.4 the proof is analogous. First let S be defined with a_N(i) = φ(i/(N+1)). To prove the statement of the theorem with centering constant μ', it is enough to show that for every ε > 0 and η > 0 there exists a number N' = N'(ε,η) such that

(5.3)    N > N'(ε,η),    var S > ηN max_{1≤i≤N} (c_i − c̄)²

implies

(5.4)    |ES − μ'| < ε (var S)^{1/2}.

Indeed, given ε > 0 and η > 0, choose δ = δ(ε) so that max_x |Φ(x + δ(ε)) − Φ(x)| < ε/2. Let N''(ε,η) = max(N'(δ(ε),η), N(ε/2,η)), with N(·,·) defined in Hájek's Theorem 2.3. Then (1.4) with N(ε,η) replaced by N''(ε,η) implies (1.5) with ES replaced by μ'.
We write S(φ), μ'(φ) for S, μ' to exhibit the dependence on φ. Since Σ_{i=1}^N φ(R_i/(N+1)) = Σ_{i=1}^N φ(i/(N+1)) and Σ_{i=1}^N ∫ φ(H(x)) dF_i(x) = N ∫_0^1 φ(t) dt, we have from (1.9)

(5.5)    ES(φ) − μ'(φ) = Σ_{i=1}^N (c_i − c̄) {Eφ(R_i/(N+1)) − ∫ φ(H(x)) dF_i(x)}.

We apply Lemma 4 with α to be determined later. Clearly

(5.6)    ES(φ) − μ'(φ) = {ES(ψ) − μ'(ψ)} + {ES(φ^(1)) − μ'(φ^(1))} − {ES(φ^(2)) − μ'(φ^(2))}.

Since ψ has a bounded second derivative, it follows by a Taylor expansion (see Hájek [3], p. 340) that there is a constant K(ψ) such that

(5.7)    |Eψ(R_i/(N+1)) − ∫ ψ(H(x)) dF_i(x)| < K(ψ) N^{-1},    i = 1, ..., N.

Hence, from (5.5) with φ = ψ,

(5.8)    |ES(ψ) − μ'(ψ)| ≤ K(ψ) max_i |c_i − c̄|.

From (5.5) with φ = φ^(k), Proposition 2, and (5.2),

(5.9)    |ES(φ^(1)) − μ'(φ^(1))| + |ES(φ^(2)) − μ'(φ^(2))| ≤ C_2 N^{1/2} α max_i |c_i − c̄|.

If var S > ηN max_i (c_i − c̄)², it follows from (5.6), (5.8) and (5.9) that

(5.10)    |ES(φ) − μ'(φ)| / (var S)^{1/2} ≤ η^{-1/2} {K(ψ) N^{-1/2} + C_2 α}.

Now, given ε > 0 and η > 0, choose α in Lemma 4 so that C_2 η^{-1/2} α = ε/2. This choice fixes K(ψ) = K_1(ε,η). Define N' = N'(ε,η) by η^{-1/2} K(ψ) (N')^{-1/2} = ε/2. Then (5.3) implies (5.4), as was to be proved.
To prove the last part of Theorem 1 concerning the case (1.2), note that, by (1.9),

    |μ' − μ| ≤ |c̄| |Σ_{i=1}^N φ(i/(N+1)) − N ∫_0^1 φ(t) dt| = |c̄| N |∫_0^1 (φ_{[Nt]+1} − φ(t)) dt|.

Since φ is the difference of two non-decreasing, square integrable functions, it follows from (4.12) that |μ' − μ| ≤ |c̄| N^{1/2} η_N, where η_N = η_N(φ) → 0 as N → ∞. Hence if var S > ηN max_i (c_i − c̄)², then

    |μ' − μ| / (var S)^{1/2} < η^{-1/2} η_N |c̄| / max_i |c_i − c̄|,

which is arbitrarily small for N large enough if |c̄|/max_i |c_i − c̄| is bounded. This implies the last part of the theorem.
Finally consider S with a_N(i) = Eφ(U_N^(i)). In this case Σ_{i=1}^N a_N(i) = N ∫_0^1 φ(t) dt, hence

(5.11)    S(φ) − μ(φ) = Σ_{i=1}^N (c_i − c̄) {a_N(R_i) − ∫ φ(H(x)) dF_i(x)},

and

    |ES(φ) − μ(φ)| ≤ max_i |c_i − c̄| Σ_{i=1}^N |E a_N(R_i) − ∫ φ(H(x)) dF_i(x)|.

Now it is easily seen that

(5.12)    Σ_{i=1}^N |E a_N(R_i) − ∫ φ(H(x)) dF_i(x)| ≤ Σ_{i=1}^N |Eφ(R_i/(N+1)) − ∫ φ(H(x)) dF_i(x)| + Σ_{i=1}^N |Eφ(U_N^(i)) − φ(i/(N+1))|.

For φ = ψ we apply Taylor's formula to the last term. Since EU_N^(i) = i/(N+1) and var U_N^(i) < N^{-1} for all i, we find that there is a constant K'(ψ) such that

    |Eψ(U_N^(i)) − ψ(i/(N+1))| < K'(ψ) N^{-1},    i = 1, ..., N.

Together with (5.7) this implies an inequality analogous to (5.8). Applying Propositions 1 and 2 to (5.12) with φ = φ^(k), k = 1, 2, and using Lemma 4, we obtain an inequality analogous to (5.9). Now the conclusion follows as in the first part of the proof.
REFERENCES

[1] Chernoff, H. and Savage, I.R. (1958). Asymptotic normality and efficiency of certain nonparametric test statistics. Ann. Math. Statist. 29, 972-994.

[2] Govindarajulu, Z., LeCam, L. and Raghavachari, M. (1966). Generalizations of theorems of Chernoff and Savage on the asymptotic normality of test statistics. Proc. Fifth Berkeley Symp. Math. Statist. Prob. 1, 609-638.

[3] Hájek, Jaroslav (1968). Asymptotic normality of simple linear rank statistics under alternatives. Ann. Math. Statist. 39, 325-346.

[4] Hoeffding, Wassily (1968). Approximation of a monotone function by Bernstein polynomials. Inst. of Statist. Mimeo Series No. 581, University of North Carolina.

[5] Hoeffding, Wassily (1963). Probability inequalities for sums of bounded random variables. J. Amer. Statist. Assoc. 58, 13-30.

[6] Lorentz, G.G. (1953). Bernstein Polynomials. University of Toronto Press, Toronto.