Sen, Pranab Kumar; (1983).On a Kolmogorov-Smirnov Type Alighed Test."

-e
ON A KOLMOGOROV-SMIRNOV TYPE ALIGNED TEST
by
Pranab Kumar Sen
Department of Biostatistics
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo Series No. 1442
June 1983
ON A KOLMOGOROV-SMIRNOV TYPE ALIGNED TEST *
PRANAB KUMAR SEN
University of North Carolina,Chapel Hill
For testing the hypothesis that two (symmetric) distributions
differ only in locations, a Kolmogorov-Smirnov type test based on
the aligned observations is considered and its properties studied.
1. Introduction.
Let X "",X be m independent and identically distributed
m
1
random variables (i.i.d.r.v.) with a continuous distribution function (d. f.)
F, defined on the real line R = (_00,00). Also, let Yl""'Y
n
be n i.i.d.r.v.
with a d.f. G , defined on R. It is assumed that
F(x) = Fo(x - 8 ) and
1
(1.1)
where
G(x) = Go(x - 8 2 )
x E: R,
8 = (8 ,8 ) is an unknown vector of the location parameters, and, we
1 2
want to test for
(1. 2)
treating
=
G
o
against
~
8 as a nuisance parameter. If
8
Go
1
= 8 2 , then an omnibus goodness of
fit test for (1.2) is based on the classical Ko1mogorov-Smirnov statistics
(1.3)
(1.4)
sup{
F (x)
= sup{
IFm(x)
=
K
mn
m
x E: R },
G (x)
n
G (x)
n
I
X E:
R },
where F and G are the two sample (empirical) d. f. This test is not sui table
m
n
when 8 and 8 are not equal. One possibility to overcome this problem is to
1
2
estimate ~ by some convenient estimator 8 and to base the test on the aligned
A
A
observations X - 8 , i=l, ••• ,m and Y - 9 , j=l, ••• ,n. Unfortunately, the resulting
i
1
j
2
test may not be distribution-free, even asymptotically. In this note, we consider
AMS 1980 Subject Classification: 62GlO, 62G99.
Key Words
&Phrases:
Alignment; asymptotically distribution-free; empirical d.f.;
empirical process; Kolmogorov-Smirnov statistics; tightness; weak convergence.
* Work partially supported by the National Heart, Lung and Blood Institute,
Contract No. NIH-NHLBI-7l-2243-L from the National Institutes of Health.
-2a variant form of the Kolmogorov-Smirnov
statisti~
where the alignment pro-
cedure works out well when the underlying d.f.'s are sYmmetric. Along with the
preliminary notions, this proposed test is considered in Section 2 and its
properties are studied in the concluding section.
3.
A
!~~_EE~E~~~~_!~~!·
Let 8i ,m
82 ,n be two arbitrary translation-invariant
and
estimators of 8 and 8 , respectively, such that
2
1
2
mk
(2.1)
I8
A
I,m
and
-
n
Consider then the aligned observations
A
A
8
X = X.
i=l, •••• ,m ,
(2.2)
mi
1
I,m
.
k2
A
= 0 p (1).
8 2,n - 82 '
A
= Y.
nj
J
Y
A
8
2,n
,
j=l, ••• ,n,
and let
(2.3)
A
Fm (x)
-l~
= m
A
lIe X . < x)
1 = · m1-
~.
and
A
G ex) = n
n
-1 n
A
l:. 1 I (Y .
.s.
J = . nJ - x) •• x
E
R
be the two empirical d.£. 's based on the aligned observations. Further, let
(2.4)
A*
A
F (x)
m
= Fm(x)
1'\*
A
A
G ex) = G (x) - G (-x-), .x > O.
A
- F (-x-) and
n
m
n
n
Our proposed tests are based on the statistics
A*
A*
A*+
sup{ Fm(x)
x > O}
K
G (x)
(2.5)
=
mn
n
A*
A*
A*
x > o}
sup{ IF ex)
K
(2.6)
G ex) I
=
mn
m
n
We intend to show that the tests based on the statistics in (2.5)-(2.6) are
asymptotically distribution-free and have the same properties as the ones based
on (1.3)-(1.4), when F
o
is sYmmetric about O.
3. Properties of the test. Because of the translation-invariance of the estimators
-8
~---------------------A
and 8
loss of generality, we may take 8 = 8 = 0 and note
2 ,n ' without any
1
2
I,m
that the residuals in (2.2) are translation invariant too. Further, we assume
that there exists a positive
(3.1)
0
< A
0
<
A
0'
such that on letting N= m+n,
< 1 _. A0 < 1, for all N
AN = miN
Also, we assume that the d. £. F
0
(and G ) are symmetric and have uniformly
0
continuous probability density functions (p.d.f.) f
8
o
(and g ) almost everywhere
0
A
A
(a. e.) • Finally, replacing
.
1 ,m
and
8 ,n by 8 and 8 , respectively, in (2.2),
1
2
2
we denote the corresponding empirical d.f.s in (2.4) by F* and G* ' respectively.
n
m
-3-
Then, we have the following
Theorem 1. Under (2.1) , (3.1) and the assumed regularity conditions on F0 -and G0'
~-
sup{
(3.2)
!-<2
A*
!-<
A*
*
I
*
Gn (x) I
m IF (x)
m
n 2 IG (x)
n
Proof.We shall only prove (3.2)
sup{
(3.2)
Let t
~
m
(3.4)
A
o (1) ,
p
x > 0 } ::: o (1), as N -+ 00.
p
and (3.3) follows precisely on the same line.
Fm(x)
,
JC > 0 }
:::
A
::: m (8 ,m - 8 ). Then, by (2.2) and (2.3), for every x :- 0,
l
1
A*
_~
_~
F (x) ::: F (x + m t ) - F (-x + m t
)
m
m
m
m
m
*
!-<
1
::: F (x) + {F (x+m- 2t )-F (x)-F (-x+m-":2t -)+F (-x-)}
m
m
m m
m
m
m
so that
F* (x)}
(3.5)
m
!-<!-<
!-<
::: m2 { F (x+m- 2 t )-F (x)-F (-x+m- 2 t )+F (-x)}
o
moo·
m 0
!-<
+ m2 {F"(x+m
m
_!-<
2
-!-<
)-F Cx+m 2 t) - F (x)+ F c.x)}
tm
om
m
0
-~ t) -F (-x-)+F (-x)}
- m~ {Fm(-x+m -~ t m
)-F o
(-x+m
m
m"
0
!-<
Now, the first term on the right hand side of C.3.5) is equal to t {f (x+am- 2 t )
mom
·e
!,;
-f (-x+bm- 2 t )} , where 0 < a,b < 1, and using the sYmmetry and uniform continuity
o
m
of the pdf f o ' we conclude that under (2.1) (insuring the boundedness of tm' in
probability), this converges to 0, in probability, as m -+~, uniformly in JC 2..
o.
On the other hand, if we make use of the weak convergence of the empirical
process
m~{ F - F } to a Brownian bridge , then, by the tightness part of this
m
0
weak convergence, (2.1) and the uniform continuity of f
o
, each of the other
two terms on the right hand side of (3.5) converges to 0, in probability, as
m -+
00;
the conclusion remains true under the sup-norm metric when we make use
!,;
of the modulus of continuity for the empirical process m2 (F
m
- F ). Hence, the
o
proof of (3.2) is complete.
By virtue of Theorem I, we conclude that under the hypothesis of Theorem 1,
(3.6)
!,;
A*
A*
*
*
sup{ N2 1 F (x)-G (x) - F (JC) + G (x)
m
n
m
n
* is the empirical d.f. of the
Note that Fm
I
:
x > o}
-
Xi - 8
1
P
-+ 0,
as N
-+
00.
whose true d.f. is F* (x):::
o
F (x) -F (-x), x > O. Similarly, G* is the empirical d. f. corresponding to the
o
0
n
*
true d.f. Go(x)
::: Go(x) -Go(-JC), x > O.
Hence, if we define
-4*+
*
*
*
*
*
K = sup{ F (x)-G (x):x > O} and K = sup{IF (x)-G (x) I:x > O},
mn
m
n
mn
m
n
then, these are respectively the one and two-sided Kolmogorov-Smirnov statistics
(3.7)
for the two samples with observations Ix - 811, i=l, ••• ,m and /Y.-8 I, j=l, ••• ,n.
i
J 2
From (2.5), (2.6), (3.6) and (3.7), we conclude that under the hypothesis of
Theorem 1, as N
(3.8)
+
00
p
1
"'*
N~I K +
0
+
mn
and
p
"'*
mn
N~I K
1
+
O.
This exhibits the asymptotic equivalence of the proposed tests and the ones
based on the statistics in (3.7). In particular, under F
o
also the same, and hence, we have for every
(3.9)
(3.10)
G
o
F* and G* are
o
0
d > 0,
h: *+
2
limN+<Jo PH { (mn/N) 2 Kmn > d } = exp(-2d ) ,
0
h: *
00
r-l
2 2
exp (-2r d )
limN+<Jo PH { (mn/N) 2 Kmn > d }= 2L: r =1 (-1)
0
",
V
"...
see for example, Hajek and Sidak (1967, p.190). Thus, under H : F =G and the
O o 0
h:"'*+
regularity conditions of Theorem 1, for (mn/N) 2 Kmn
and
~'"
(mn/N) K
* , the
mn
limiting distributions in (3.9) and (3.10) apply.Also, for contiguous alternatives,
*+
the power properties of K
mn
and
K* are shared by the proposed tests.
mn
In this context, we may use the sample medians for
81 ,m
and
82 ,n
and these
satisfy (2.1) whenever f (0) and g (0) are positive. Other robust estimates
o
0
of location may also be used, and, this may avoid the assumption of a finite
second moment which is usually needed with the sample mean. It is also possible
to improve (3.8) by using the Bahadur-Kiefer representation for the sample
quantiles in (3.5) and the Komlos-Major-Tusnady strong approximation for the
reduced empirical processes. However, for our specific purpose, these are not
really needed.
REFERENCE
,;
v
,;
HAJEK, J. and SIDAK, Z. (1967). Theory of Rank Tests. Academic Press,New York o