UNIVERSITY OF NORTH CAROLINA
Department of Statistics
Chapel Hill, N. C.
ON AN ANALOG OF REGRESSION ANALYSIS
by
P. K. Bhattacharya
August 1962
Contract No. AF 49(638)-213
In a multivariate distribution the nature of dependence of one variate on the others is studied through conditional quantiles. Estimators and test procedures for conditional quantile functions are given and some of their large sample properties investigated.
This research was supported by the Air Force Office of Scientific
Research.
Institute of Statistics
Mimeo Series No. 335
ON AN ANALOG OF REGRESSION ANALYSIS
by
P. K. Bhattacharya

1. Introduction.
Suppose $(X_1, \ldots, X_h, Y)$ follow an unknown multivariate distribution on which independent observations are made. The nature of dependence of $Y$ on $(X_1, \ldots, X_h)$ is fully understood only when we know how the conditional distribution of $Y$ given $X_1 = x_1, \ldots, X_h = x_h$ changes with $(x_1, \ldots, x_h)$. This can be attempted in two different ways. One approach is to make inference about the functional relation between the conditional moments of $Y$ given $X_1 = x_1, \ldots, X_h = x_h$, and $(x_1, \ldots, x_h)$, and a special case of this (when the behavior of only the conditional expectation of $Y$ is studied) is known in statistical literature as regression analysis.
The classical methods of regression analysis are based on the assumption that $E[Y \mid X_1 = x_1, \ldots, X_h = x_h]$ is a linear function of $x_1, \ldots, x_h$. Mahalanobis [3] and Parthasarathy and Bhattacharya [4] have proposed some methods of regression analysis which do not involve any such linearity assumption. The methods proposed by Parthasarathy and Bhattacharya can be generalized in a straightforward manner for estimating and testing hypotheses about higher order conditional moments of $Y$, but in order to prove the consistency of these estimates and tests for the first $m$ conditional moments, one should assume the existence of at least the first $3m$ conditional absolute moments. A second approach to this problem is to make inference about the functional relation between the quantiles of the conditional distribution of $Y$ given $X_1 = x_1, \ldots, X_h = x_h$, and $(x_1, \ldots, x_h)$. This kind of an analog of the variance components analysis has been considered by Roy and Cobb [5], while Sathe [6] has given a test for the conditional median of $Y$ given $X$ under a restricted model.
In this paper, estimates have been proposed for conditional quantile functions of $Y$ given $X_1, \ldots, X_h$, and the simultaneous uniform convergence of any number of such estimates to the corresponding conditional quantile functions has been studied. A large sample test for the hypothesis that certain conditional quantile functions are equal to specified functions has been suggested and proved to be consistent. In order to avoid complicated notations, the methods and their properties will be discussed for the bivariate case in sections 2, 3 and 4, while in section 5, the corresponding methods for the multivariate case will be explained and their properties will be stated without using too many symbols.
2. Problem, assumptions and notations.

$(X, Y)$ has a bivariate distribution. $F$ is the marginal distribution function of $X$ and $G_x$ is the conditional distribution function of $Y$ given $X = x$. For any $0 < p < 1$, $\phi_p(x)$ is the solution of the equation

$$G_x(\phi_p(x)) = p. \tag{1}$$

On the basis of independent observations on $(X, Y)$, we want to estimate the function $\phi_p$ for a given $p$, and for a specified real valued function $\mu$ defined on the range of $X$, we want to test the hypothesis $H_0: \phi_p = \mu$. In what follows, $0 < p < 1$ is always a specified number for which we are interested in $\phi_p$.
We make the following assumptions about $F$, $\{G_x\}$ and $\phi_p$:

(i) The range of $X$ is bounded; for simplicity $0 \le X \le 1$.
(ii) $F$ is continuous and strictly increasing.
(iia) For each $x$, $0 < x < 1$, $G_x$ is continuous and strictly increasing.
(iii) $\phi_p$ is continuous (hence uniformly continuous).
(iv) For any given $\epsilon > 0$, there exists $\delta > 0$ (not depending on $x$) such that $|p' - p| < \delta$ implies $|\phi_p(x) - \phi_{p'}(x)| < \epsilon$ for all $x$.

It should be noted that when we say that a univariate distribution function $H$ is strictly increasing, we mean thereby that $H$ is strictly increasing on an interval $(a, b)$, where

$$a = \begin{cases} \text{the l.u.b. of the set } \{x : H(x) = 0\} & \text{if this set is non-empty,} \\ -\infty & \text{otherwise,} \end{cases}$$

and

$$b = \begin{cases} \text{the g.l.b. of the set } \{x : H(x) = 1\} & \text{if this set is non-empty,} \\ \infty & \text{otherwise.} \end{cases}$$
Another point to note is that condition (iv) is satisfied if there exists a function $\psi$ on $[0, 1]$ such that $G_x(y) = G_0(y + \psi(x))$.
$(X_1, Y_1), \ldots, (X_{nk}, Y_{nk})$ are independent observations on $(X, Y)$. Let $X_{(1)} < \cdots < X_{(nk)}$ be the ordered values of $X_1, \ldots, X_{nk}$, and let $Y_{(i)} = Y_j$ if $X_{(i)} = X_j$. For $r = 1, \ldots, k$ and for $s = 1, \ldots, n$, write

$$X_{((r-1)n+s)} = X_{rs}, \qquad Y_{((r-1)n+s)} = Y_{rs}.$$

For any given integer $k$, define a set of random intervals as follows:

$$I_{k1} = [0, X_{1n}], \quad I_{kr} = (X_{r-1,n}, X_{rn}], \ r = 2, \ldots, k-1, \quad \text{and} \quad I_{kk} = (X_{k-1,n}, 1].$$

Next let $Y_{r(1)} < \cdots < Y_{r(n)}$ be the ordered values of $Y_{r1}, \ldots, Y_{rn}$, and denote by $[a]$ the largest integer $\le a$. Now define a random step-function $f_{nk}$ on $[0, 1]$ as follows:

$$f_{nk}(x) = Y_{r([np])} \quad \text{if } x \in I_{kr}, \quad r = 1, \ldots, k.$$
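The construction of $f_{nk}$ can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original paper: the function names `fractile_quantile_estimator` and `evaluate` are hypothetical, NumPy is assumed to be available, and the sample size is assumed to be an exact multiple of $k$ with $[np] \ge 1$.

```python
import numpy as np

def fractile_quantile_estimator(x, y, k, p):
    """Return (edges, values) describing the step function f_nk.

    edges  : X_{1n}, ..., X_{k-1,n}, the right end points of I_k1, ..., I_k,k-1
    values : Y_{r([np])} for r = 1, ..., k
    Hypothetical helper; assumes len(x) == n * k and [np] >= 1.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, rem = divmod(len(x), k)
    if rem != 0:
        raise ValueError("sample size must be a multiple of k")
    j = int(np.floor(n * p))            # [np], the largest integer <= np
    if j < 1:
        raise ValueError("need [np] >= 1")
    order = np.argsort(x)               # concomitants Y_(i) follow the ordering of X
    xs = x[order].reshape(k, n)         # row r holds X_{r1}, ..., X_{rn}
    ys = np.sort(y[order].reshape(k, n), axis=1)   # row r holds Y_{r(1)} < ... < Y_{r(n)}
    edges = xs[:-1, -1]
    values = ys[:, j - 1]
    return edges, values

def evaluate(edges, values, x_new):
    """Evaluate f_nk at x_new; intervals are right-closed as in the text."""
    idx = np.searchsorted(edges, x_new, side="left")
    return values[idx]
```

For example, with $X$ uniform on $[0, 1]$ and $Y = 2X$ plus a small normal error, `evaluate(*fractile_quantile_estimator(x, y, 20, 0.5), grid)` should track the conditional median $2x$ up to the within-interval variation of $\phi_p$.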
In course of the analysis carried out in the next two sections, we shall make use of an upper bound for the tail probabilities of sums of independent and bounded random variables due to Hoeffding [2]. We shall also make use of an upper bound for the error of approximation by the Central Limit Theorem and an upper bound for the error involved in the usual approximation of the distribution function of the "frequency $\chi^2$" for a simple hypothesis regarding a multinomial distribution, both due to Esseen [1]. For the sake of completeness, these are stated below.
Theorem (Hoeffding). If $X_1, \ldots, X_n$ are independent random variables, $0 \le X_i \le 1$, $EX_i = \mu$, $S_n = X_1 + \cdots + X_n$, and if $0 < t < 1 - \mu$, then

$$P[S_n - ES_n \ge nt] \le e^{-2nt^2}. \tag{2}$$

Theorem (Esseen). If $X_1, \ldots, X_n$ are a sequence of independent random variables with the same distribution function $F$, the mean value zero, the variance $\sigma^2 \ne 0$ and the finite absolute moments $\beta_1, \ldots, \beta_\nu$ ($\nu$ is an integer $\ge 3$), then for $|x| \le \sqrt{(1 + \delta)(\nu - 2)\log n}$,

$$|F_n(x) - \Phi(x)| \le C_1(\delta, \beta)\,(1 + |x|^3)\,n^{-1/2}e^{-x^2/2} + C_2(\delta, \beta)\,n^{-(\nu-2)/2}, \tag{3}$$

where $F_n$ is the distribution function of $(X_1 + \cdots + X_n)/(\sigma\sqrt{n})$,

$$\Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\,e^{-t^2/2}\,dt,$$

and $C_1(\delta, \beta)$ and $C_2(\delta, \beta)$ are finite constants, depending only on $0 \le \delta < 1$ and the moments $\beta_1, \ldots, \beta_\nu$.

Theorem (Esseen). If $n$ independent observations are taken on a multinomial distribution with $(m + 1)$ classes and with class probabilities $q_1, \ldots, q_{m+1}$, $\sum_{i=1}^{m+1} q_i = 1$, and if $n_1, \ldots, n_{m+1}$ are the respective observed frequencies, $\sum_{i=1}^{m+1} n_i = n$, then

$$\left| P\left[\sum_{i=1}^{m+1} \frac{(n_i - nq_i)^2}{nq_i} \le x\right] - \frac{1}{2^{m/2}\,\Gamma(m/2)} \int_0^x e^{-w/2}\,w^{m/2-1}\,dw \right| \le \frac{C(q_1, \ldots, q_{m+1})}{n^{m/(m+1)}}, \tag{4}$$

where $C(q_1, \ldots, q_{m+1})$ is a finite constant, depending only on $q_1, \ldots, q_{m+1}$.
3. Convergence of $f_{nk}$.

In this section we shall study the convergence in probability and almost sure convergence of $f_{nk}$ to $\phi_p$, uniformly. It is obvious that both $n$ and $k$ should tend to infinity for such convergences to take place, but the crucial point in our analysis is to find out how $n$ and $k$ should depend on each other as they tend to infinity.

We shall first prove a probability inequality for the event that the random variables $\{X_{rn}\}$, $r = 1, \ldots, k-1$, lie in some neighborhoods of $F^{-1}(r/k)$, $r = 1, \ldots, k-1$ respectively. The following lemma is a modification of a similar probability inequality given by Parthasarathy and Bhattacharya [4].
Lemma 1. Let $0 < a_k < 1$, $k = 1, 2, \ldots$ ad inf., be a sequence converging to zero. Under assumption (ii),

$$P\left[F^{-1}\!\left(\tfrac{r}{k} - a_k\right) \le X_{rn} \le F^{-1}\!\left(\tfrac{r}{k} + a_k\right),\ r = 1, \ldots, k-1\right] > 1 - 2ke^{-2nka_k^2}, \tag{5}$$

where $F^{-1}(a)$ for $a < 0$ is defined to be 0 and $F^{-1}(a)$ for $a > 1$ is defined to be 1.
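Proof. The left side of (5) is

$$\ge 1 - \sum_{r=1}^{k-1}\ \sum_{s \ge rn} \binom{kn}{s}\left(\tfrac{r}{k} - a_k\right)^{s}\left(1 - \tfrac{r}{k} + a_k\right)^{kn-s} - \sum_{r=1}^{k-1}\ \sum_{s > kn - rn} \binom{kn}{s}\left(1 - \tfrac{r}{k} - a_k\right)^{s}\left(\tfrac{r}{k} + a_k\right)^{kn-s}$$

$$= 1 - \sum_{r=1}^{k-1}\ \sum_{s \ge kn[(r/k - a_k) + a_k]} \binom{kn}{s}\left(\tfrac{r}{k} - a_k\right)^{s}\left(1 - \tfrac{r}{k} + a_k\right)^{kn-s} - \sum_{r=1}^{k-1}\ \sum_{s > kn[(1 - r/k - a_k) + a_k]} \binom{kn}{s}\left(1 - \tfrac{r}{k} - a_k\right)^{s}\left(\tfrac{r}{k} + a_k\right)^{kn-s}$$

$$\ge 1 - 2(k-1)e^{-2nka_k^2} > 1 - 2ke^{-2nka_k^2}, \quad \text{by (2)}.$$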
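The inequality of lemma 1 can be examined numerically. The following is a small Monte Carlo sketch under stated assumptions, not part of the original argument: $X$ is taken to be uniform on $[0, 1]$, so that $F^{-1}(a) = a$ clipped to $[0, 1]$, and the function names, the number of replications and the seed are all hypothetical choices.

```python
import numpy as np

def lemma1_bound(n, k, a_k):
    """Lower bound 1 - 2k exp(-2 n k a_k^2) from inequality (5)."""
    return 1.0 - 2.0 * k * np.exp(-2.0 * n * k * a_k**2)

def lemma1_frequency(n, k, a_k, reps=2000, seed=0):
    """Relative frequency of the event in (5) for uniform X (an assumption)."""
    rng = np.random.default_rng(seed)
    r = np.arange(1, k)
    lo = np.clip(r / k - a_k, 0.0, 1.0)   # F^{-1}(r/k - a_k), with the convention of lemma 1
    hi = np.clip(r / k + a_k, 0.0, 1.0)   # F^{-1}(r/k + a_k)
    hits = 0
    for _ in range(reps):
        x = np.sort(rng.random(n * k))
        x_rn = x[r * n - 1]               # X_{rn} is the (rn)-th order statistic
        hits += int(np.all((lo <= x_rn) & (x_rn <= hi)))
    return hits / reps
```

With, say, `n=50`, `k=10` and `a_k=0.08`, the simulated frequency should not fall below `lemma1_bound(50, 10, 0.08)`; the bound is typically far from sharp.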
We now find a probability inequality for the event that the length of each of the intervals $I_{k1}, \ldots, I_{kk}$ is less than a specified positive number. Let $L(I)$ denote the length of an interval $I$ on the real line.
Lemma 2. Under assumptions (i) and (ii), for any $\delta > 0$,

$$P[L(I_{kr}) < \delta,\ r = 1, \ldots, k] > 1 - 2ke^{-2kna_k^2}$$

for sufficiently large $k$.
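Proof. We first note that under assumptions (i) and (ii), for any given $\delta > 0$, there exists an integer $k_0$ such that for $k > k_0$,

$$\max\left[F^{-1}\!\left(\tfrac{1}{k} + a_k\right),\ F^{-1}\!\left(\tfrac{r+1}{k} + a_k\right) - F^{-1}\!\left(\tfrac{r}{k} - a_k\right),\ r = 1, \ldots, k-2,\ 1 - F^{-1}\!\left(\tfrac{k-1}{k} - a_k\right)\right] < \delta. \tag{6}$$

We shall show that (6) along with

$$F^{-1}\!\left(\tfrac{r}{k} - a_k\right) \le X_{rn} \le F^{-1}\!\left(\tfrac{r}{k} + a_k\right), \quad r = 1, \ldots, k-1, \tag{7}$$

implies $L(I_{kr}) < \delta$, $r = 1, \ldots, k$. For $r = 2, \ldots, k-1$,

$$L(I_{kr}) = X_{rn} - X_{r-1,n} \le F^{-1}\!\left(\tfrac{r}{k} + a_k\right) - F^{-1}\!\left(\tfrac{r-1}{k} - a_k\right), \text{ by (7)}, \quad < \delta, \text{ by (6)}.$$

Also,

$$L(I_{k1}) = X_{1n} \le F^{-1}\!\left(\tfrac{1}{k} + a_k\right), \text{ by (7)}, \quad < \delta, \text{ by (6)},$$

and

$$L(I_{kk}) = 1 - X_{k-1,n} \le 1 - F^{-1}\!\left(\tfrac{k-1}{k} - a_k\right), \text{ by (7)}, \quad < \delta, \text{ by (6)}.$$

An application of lemma 1 now completes the proof.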
Lemma 3. Under assumptions (iia), (iii) and (iv), for sufficiently large $k$ and for any given $\eta > 0$,

$$P\left[\sup_{x \in I_{kr}} \phi_{p+\epsilon}(x) - \inf_{x \in I_{kr}} \phi_{p-\epsilon}(x) < \eta,\ r = 1, \ldots, k\right] > 1 - 2ke^{-2nka_k^2}$$

if $\epsilon > 0$ is such that $|p' - p| < \epsilon$ implies $|\phi_p(x) - \phi_{p'}(x)| < \eta/3$ for all $x$. (By assumption (iv) such an $\epsilon > 0$ exists for any given $\eta > 0$.)
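Proof. Choose $\delta > 0$ such that $x \in I$, $x' \in I$ and $L(I) < \delta$ together imply

$$|\phi_p(x) - \phi_p(x')| < \eta/3.$$

By assumption (iii) such a $\delta > 0$ exists for any given $\eta > 0$. Now,

$$P\left[\sup_{x \in I_{kr}} \phi_{p+\epsilon}(x) - \inf_{x \in I_{kr}} \phi_{p-\epsilon}(x) < \eta,\ r = 1, \ldots, k\right]$$

$$\ge P\left[\sup_{x \in I_{kr}} \phi_{p+\epsilon}(x) - \inf_{x \in I_{kr}} \phi_{p-\epsilon}(x) < \eta,\ L(I_{kr}) < \delta,\ r = 1, \ldots, k\right]$$

$$= P[L(I_{kr}) < \delta,\ r = 1, \ldots, k] \cdot P\left[\sup_{x \in I_{kr}} \phi_{p+\epsilon}(x) - \inf_{x \in I_{kr}} \phi_{p-\epsilon}(x) < \eta,\ r = 1, \ldots, k \,\Big|\, L(I_{kr}) < \delta,\ r = 1, \ldots, k\right].$$

But

$$P\left[\sup_{x \in I_{kr}} \phi_{p+\epsilon}(x) - \inf_{x \in I_{kr}} \phi_{p-\epsilon}(x) < \eta,\ r = 1, \ldots, k \,\Big|\, L(I_{kr}) < \delta,\ r = 1, \ldots, k\right]$$

$$\ge P\left[\sup_{x \in I_{kr}} |\phi_{p+\epsilon}(x) - \phi_p(x)| < \eta/3,\ \sup_{x, x' \in I_{kr}} |\phi_p(x) - \phi_p(x')| < \eta/3,\ \sup_{x \in I_{kr}} |\phi_{p-\epsilon}(x) - \phi_p(x)| < \eta/3,\ r = 1, \ldots, k \,\Big|\, L(I_{kr}) < \delta,\ r = 1, \ldots, k\right] = 1,$$

by the choice of $\delta$ and $\epsilon$. An application of lemma 2 now completes the proof.

Lemma 4. Under assumptions (iia), (iii) and (iv), for arbitrary $\epsilon > 0$ and for large $n$,

$$P\left[\inf_{x \in I_{kr}} \phi_{p-\epsilon}(x) < Y_{r([np])} < \sup_{x \in I_{kr}} \phi_{p+\epsilon}(x),\ r = 1, \ldots, k \,\Big|\, X_1 = x_1, \ldots, X_{nk} = x_{nk}\right] > 1 - 2ke^{-n\epsilon^2}.$$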
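Proof. Suppose $x_1, \ldots, x_n$ are points in an interval $[a, b] \subset [0, 1]$. Let $Y_1, \ldots, Y_n$ be mutually independent random variables, $Y_i$ having distribution function $G_{x_i}$, $i = 1, \ldots, n$. Also let $Y_{(1)} < \cdots < Y_{(n)}$ be the ordered values of $Y_1, \ldots, Y_n$. Then

$$P\left[Y_{([np])} \le \inf_{x \in [a, b]} \phi_{p-\epsilon}(x)\right] = P\left[\text{at least } [np] \text{ of } Y_1, \ldots, Y_n \text{ are } \le \inf_{x \in [a, b]} \phi_{p-\epsilon}(x)\right] = P\left[\sum_{i=1}^{n} U_i \ge [np]\right],$$

where $U_1, \ldots, U_n$ are mutually independent random variables with

$$P[U_i = 1] = G_{x_i}\!\left(\inf_{x \in [a, b]} \phi_{p-\epsilon}(x)\right) \quad \text{and} \quad P[U_i = 0] = 1 - G_{x_i}\!\left(\inf_{x \in [a, b]} \phi_{p-\epsilon}(x)\right), \quad i = 1, \ldots, n.$$

Now,

$$G_{x_i}\!\left(\inf_{x \in [a, b]} \phi_{p-\epsilon}(x)\right) \le G_{x_i}(\phi_{p-\epsilon}(x_i)), \text{ since } x_i \in [a, b], \quad = p - \epsilon, \text{ by (1)}.$$

Hence

$$[np] - E\sum_{i=1}^{n} U_i \ge np - 1 - n(p - \epsilon) = n\epsilon - 1 > n\epsilon/\sqrt{2}$$

for large $n$. Thus,

$$P\left[Y_{([np])} \le \inf_{x \in [a, b]} \phi_{p-\epsilon}(x)\right] \le e^{-n\epsilon^2}, \quad \text{by (2)}.$$

Similarly,

$$P\left[Y_{([np])} \ge \sup_{x \in [a, b]} \phi_{p+\epsilon}(x)\right] < e^{-2n\epsilon^2}.$$

The desired probability inequality is now obtained.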
We are now in a position to prove the following theorem about uniform convergence of $f_{nk}$ to $\phi_p$.

Theorem 1. If $F$, $\{G_x\}$ and $\phi_p$ satisfy conditions (i) - (iv), then for $n \ge k^\gamma$, $\gamma > 0$,

$$d_{nk} = \sup_{0 \le x \le 1} |f_{nk}(x) - \phi_p(x)|$$

converges to zero in probability as $k \to \infty$, and for $n \ge k$, $d_{nk}$ converges to zero with probability one as $k \to \infty$.

Proof. Let $\eta > 0$ be given. For arbitrary $\epsilon > 0$,

$$P[d_{nk} < \eta] \ge P\left[\inf_{x \in I_{kr}} \phi_{p-\epsilon}(x) < Y_{r([np])} < \sup_{x \in I_{kr}} \phi_{p+\epsilon}(x),\ \sup_{x \in I_{kr}} \phi_{p+\epsilon}(x) - \inf_{x \in I_{kr}} \phi_{p-\epsilon}(x) < \eta,\ r = 1, \ldots, k\right]$$

$$= P\left[\inf_{x \in I_{kr}} \phi_{p-\epsilon}(x) < Y_{r([np])} < \sup_{x \in I_{kr}} \phi_{p+\epsilon}(x),\ r = 1, \ldots, k \,\Big|\, \sup_{x \in I_{kr}} \phi_{p+\epsilon}(x) - \inf_{x \in I_{kr}} \phi_{p-\epsilon}(x) < \eta,\ r = 1, \ldots, k\right]$$

$$\times\ P\left[\sup_{x \in I_{kr}} \phi_{p+\epsilon}(x) - \inf_{x \in I_{kr}} \phi_{p-\epsilon}(x) < \eta,\ r = 1, \ldots, k\right]$$

$$> \left(1 - 2ke^{-n\epsilon^2}\right)\left(1 - 2ke^{-2nka_k^2}\right),$$

by lemmas 3 and 4 if $\epsilon > 0$ is so chosen as to satisfy the condition of lemma 3. Both the factors in the last term tend to one as $k \to \infty$ if $n \ge k^\gamma$, $\gamma > 0$ and $a_k = k^{-\gamma/2}$, which proves the first part of the theorem. Again, the series $\sum_k P[d_{nk} \ge \eta]$ converges if $n \ge k$ and $a_k = k^{-1/2}$. The second part of the theorem now follows from the Borel-Cantelli lemma.
If we have $0 < p_1 < \cdots < p_m < 1$, and if we define

$$f^{(i)}_{nk}(x) = Y_{r([np_i])} \quad \text{for } x \in I_{kr}, \quad r = 1, \ldots, k, \quad i = 1, \ldots, m,$$

then theorem 1 can be immediately extended for the simultaneous uniform convergence of $f^{(1)}_{nk}, \ldots, f^{(m)}_{nk}$ to $\phi_{p_1}, \ldots, \phi_{p_m}$ respectively.

4. Large Sample Tests for Specified Conditional Quantile Functions.
Let $\mu$ be a specified real-valued function on $[0, 1]$ and consider the problem of testing the hypothesis $H_0: \phi_p = \mu$ against the alternative $H_1: \phi_p \ne \mu$. We define random variables $\{U_{rs}\}$, $r = 1, \ldots, k$; $s = 1, \ldots, n$, as follows:

$$U_{rs} = \begin{cases} 1 & \text{if } Y_{rs} \le \mu(X_{rs}), \\ 0 & \text{otherwise.} \end{cases}$$

These are mutually independent random variables and under $H_0$, each of them takes values 1 and 0 with probabilities $p$ and $1 - p$ respectively. Let $U_r = \sum_{s=1}^{n} U_{rs}/n$, $r = 1, \ldots, k$, and

$$T_{nk} = \sup_{r = 1, \ldots, k} \frac{|U_r - p|}{\sqrt{p(1 - p)/n}}.$$
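If we can find the limiting distribution of $T_{nk}$ (suitably standardized) under $H_0$, then we can test $H_0$ at any given level of significance $0 < \alpha < 1$, as follows: Reject $H_0$ if and only if $T_{nk} > \tau_{nk}(\alpha)$, where $\tau_{nk}(\alpha)$ is such that

$$\lim_{k \to \infty} P[T_{nk} \le \tau_{nk}(\alpha) \mid H_0] = 1 - \alpha.$$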
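The statistic $T_{nk}$ is straightforward to compute from the fractile grouping of the sample. A minimal sketch follows, assuming NumPy, a sample size that is an exact multiple of $k$, and a vectorized callable `mu` for the hypothesized function; the name `t_nk` is hypothetical and not from the paper.

```python
import numpy as np

def t_nk(x, y, k, p, mu):
    """Sup over fractile groups of |U_r - p| / sqrt(p(1-p)/n).

    mu is a vectorized callable giving the hypothesized quantile function.
    Hypothetical helper; assumes len(x) == n * k.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, rem = divmod(len(x), k)
    if rem != 0:
        raise ValueError("sample size must be a multiple of k")
    order = np.argsort(x)                          # group by fractiles of X
    u = (y[order] <= mu(x[order])).reshape(k, n)   # U_rs = 1 if Y_rs <= mu(X_rs)
    u_bar = u.mean(axis=1)                         # U_r, r = 1, ..., k
    return float(np.max(np.abs(u_bar - p)) / np.sqrt(p * (1.0 - p) / n))
```

For instance, if every observation lies below the hypothesized curve, each $U_r$ equals 1 and the statistic reduces to $(1 - p)/\sqrt{p(1 - p)/n}$.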
Again, suppose $0 < p_1 < \cdots < p_m < 1$ are given numbers and $\mu_1, \ldots, \mu_m$ are specified real-valued functions on $[0, 1]$, satisfying the condition that $\mu_1(x) < \cdots < \mu_m(x)$ for all $x \in [0, 1]$. Consider the problem of testing the hypothesis $H_0^{(m)}: (\phi_{p_1} = \mu_1, \ldots, \phi_{p_m} = \mu_m)$ against the alternative $H_1^{(m)}$ that $\phi_{p_i} \ne \mu_i$ for at least one $i$, $i = 1, \ldots, m$. Define

$$U^{(1)}_{rs} = \begin{cases} 1 & \text{if } Y_{rs} \le \mu_1(X_{rs}), \\ 0 & \text{otherwise,} \end{cases}$$

$$U^{(i)}_{rs} = \begin{cases} 1 & \text{if } \mu_{i-1}(X_{rs}) < Y_{rs} \le \mu_i(X_{rs}), \\ 0 & \text{otherwise,} \end{cases} \quad i = 2, \ldots, m,$$

$$U^{(m+1)}_{rs} = \begin{cases} 1 & \text{if } Y_{rs} > \mu_m(X_{rs}), \\ 0 & \text{otherwise,} \end{cases}$$

and

$$U^{(i)}_{r0} = \sum_{s=1}^{n} U^{(i)}_{rs}, \quad r = 1, \ldots, k; \quad i = 1, \ldots, m+1.$$

Let $q_1 = p_1$, $q_i = p_i - p_{i-1}$, $i = 2, \ldots, m$; $q_{m+1} = 1 - p_m$, and consider the statistic

$$T^{(m)}_{nk} = \sup_{r = 1, \ldots, k}\ \sum_{i=1}^{m+1} \frac{(U^{(i)}_{r0} - nq_i)^2}{nq_i}.$$

If we can find the limiting distribution of $T^{(m)}_{nk}$ (suitably standardized) under $H_0^{(m)}$, then we can test $H_0^{(m)}$ with the help of $T^{(m)}_{nk}$ in the same way as we hope to test $H_0$ with the help of $T_{nk}$.
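The statistic $T^{(m)}_{nk}$ is the largest of $k$ frequency chi-squares, one per fractile group. The sketch below is illustrative only: the name `t_nk_multi` is hypothetical, NumPy is assumed, the sample size is assumed to be a multiple of $k$, and the callables in `mus` are assumed to be vectorized and ordered so that $\mu_1 < \cdots < \mu_m$.

```python
import numpy as np

def t_nk_multi(x, y, k, ps, mus):
    """Sup over fractile groups of sum_i (U_r0^(i) - n q_i)^2 / (n q_i).

    ps  : increasing probabilities p_1 < ... < p_m
    mus : list of m vectorized callables with mu_1 < ... < mu_m (assumed)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, rem = divmod(len(x), k)
    if rem != 0:
        raise ValueError("sample size must be a multiple of k")
    m = len(ps)
    q = np.diff(np.concatenate(([0.0], np.asarray(ps, dtype=float), [1.0])))  # q_1..q_{m+1}
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    cuts = np.stack([mu(xs) for mu in mus], axis=1)     # shape (nk, m)
    # class 0: Y <= mu_1; class i: mu_i < Y <= mu_{i+1}; class m: Y > mu_m
    cls = (ys[:, None] > cuts).sum(axis=1)
    stats = []
    for r in range(k):
        counts = np.bincount(cls[r * n:(r + 1) * n], minlength=m + 1)  # U_r0^(i)
        stats.append(np.sum((counts - n * q) ** 2 / (n * q)))
    return float(max(stats))
```

For $m = 1$ the value returned is the square of $T_{nk}$, which gives a convenient consistency check between the two statistics.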
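The following theorem gives us the limiting distributions of $T_{nk}$ and $T^{(m)}_{nk}$ under $H_0$ and $H_0^{(m)}$ respectively. For any real $\theta$ and for any integer $k$, define

$$\lambda_k(\theta) = \begin{cases} \left[2\left(\theta + \log k - \tfrac{1}{2}\log\log k\right)\right]^{1/2} & \text{if } \theta + \log k - \tfrac{1}{2}\log\log k > 0, \\ 0 & \text{otherwise,} \end{cases}$$

$$\lambda^{(m)}_k(\theta) = \begin{cases} 2\{\theta + \log k + (m/2 - 1)\log\log k\} & \text{if } \theta + \log k + (m/2 - 1)\log\log k > 0, \\ 0 & \text{otherwise.} \end{cases}$$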
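Theorem 2.
(a) If $n \ge k^\gamma$, $\gamma > 0$, then

$$\lim_{k \to \infty} P[T_{nk} \le \lambda_k(\theta) \mid H_0] = \exp\left[-\frac{1}{\sqrt{\pi}}\,e^{-\theta}\right].$$

(b) If $n \ge (k \log k)^{1 + 1/m}$, then

$$\lim_{k \to \infty} P[T^{(m)}_{nk} \le \lambda^{(m)}_k(\theta) \mid H_0^{(m)}] = \exp\left[-\frac{1}{\Gamma(m/2)}\,e^{-\theta}\right].$$

Proof. (a) Under $H_0$, $\{U_{rs}\}$ are independently and identically distributed with mean $p$, variance $p(1 - p)$, and with finite moments of all orders which are functions of $p$ only. Let $\nu$ be so large that

$$\gamma(\nu - 2) > 2, \tag{8}$$

and let $0 < \delta < 1$. By (8) and (3), for sufficiently large $k$,

$$P\left[\frac{|U_r - p|}{\sqrt{p(1 - p)/n}} \le \lambda_k(\theta) \,\Big|\, H_0\right] = \Phi(\lambda_k(\theta)) - \Phi(-\lambda_k(\theta)) + e_{kr},$$

where

$$|e_{kr}| \le C\left[\{1 + (\lambda_k(\theta))^3\}\,n^{-1/2}\,e^{-(\lambda_k(\theta))^2/2} + n^{-(\nu-2)/2}\right] \le C'\left[e^{-\theta}(\log k)^2\,k^{-(1 + \gamma/2)} + k^{-\gamma(\nu-2)/2}\right],$$

since $n \ge k^\gamma$, $C'$ being a finite constant, depending on $p$, $\nu$ and $\delta$. Now,

$$\log P[T_{nk} \le \lambda_k(\theta) \mid H_0] = \sum_{r=1}^{k} \log P\left[\frac{|U_r - p|}{\sqrt{p(1 - p)/n}} \le \lambda_k(\theta) \,\Big|\, H_0\right] = \sum_{r=1}^{k} \log\left[\Phi(\lambda_k(\theta)) - \Phi(-\lambda_k(\theta)) + e_{kr}\right]$$

$$= k \log\left[\Phi(\lambda_k(\theta)) - \Phi(-\lambda_k(\theta))\right] + z_k, \tag{9}$$

where

$$z_k = \sum_{r=1}^{k} \log\left[1 + \frac{e_{kr}}{\Phi(\lambda_k(\theta)) - \Phi(-\lambda_k(\theta))}\right].$$

Since $e_{kr} \to 0$ and $\Phi(\lambda_k(\theta)) - \Phi(-\lambda_k(\theta)) \to 1$ as $k \to \infty$, for sufficiently large $k$,

$$\left|\frac{e_{kr}}{\Phi(\lambda_k(\theta)) - \Phi(-\lambda_k(\theta))}\right| < \frac{1}{2}.$$

Now $\log(1 + x) = x + vx^2$, $|v| < 1$ for $|x| < 1/2$. Hence,

$$|z_k| \le C''\left[e^{-\theta}(\log k)^2\,k^{-\gamma/2} + k^{1 - \gamma(\nu-2)/2}\right],$$

where $C''$ is a finite constant depending on $p$ and $\delta$, so that by (8), $z_k \to 0$ as $k \to \infty$. It now follows from (9) that

$$\lim_{k \to \infty} \log P[T_{nk} \le \lambda_k(\theta) \mid H_0] = \lim_{k \to \infty} k \log\left[\Phi(\lambda_k(\theta)) - \Phi(-\lambda_k(\theta))\right].$$

To complete the proof we have only to verify that

$$\lim_{k \to \infty} k \log\left[\Phi(\lambda_k(\theta)) - \Phi(-\lambda_k(\theta))\right] = -\frac{1}{\sqrt{\pi}}\,e^{-\theta}.$$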
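(b) It can be shown in the same way as in proving (a) that

$$\log P[T^{(m)}_{nk} \le \lambda^{(m)}_k(\theta) \mid H_0^{(m)}] = k \log\left[\frac{1}{2^{m/2}\,\Gamma(m/2)} \int_0^{\lambda^{(m)}_k(\theta)} e^{-w/2}\,w^{m/2-1}\,dw\right] + z^{(m)}_k,$$

where $\lim_{k \to \infty} z^{(m)}_k$ can be shown to be zero if $n \ge (k \log k)^{1 + 1/m}$, by an application of (4). The rest of the proof follows from the fact that

$$\lim_{k \to \infty} k \log\left[\frac{1}{2^{m/2}\,\Gamma(m/2)} \int_0^{\lambda^{(m)}_k(\theta)} e^{-w/2}\,w^{m/2-1}\,dw\right] = -\frac{1}{\Gamma(m/2)}\,e^{-\theta}.$$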
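The two limits on which the proof of theorem 2 rests can be checked numerically for a large value of $k$. The sketch below is illustrative only and uses nothing beyond the Python standard library; the function names are hypothetical, the normal probability is obtained from the identity $1 - [\Phi(\lambda) - \Phi(-\lambda)] = \operatorname{erfc}(\lambda/\sqrt{2})$, and case (b) is checked only for $m = 2$, where the chi-square upper tail is $e^{-w/2}$ in closed form.

```python
import math

def lambda_k(theta, k):
    """lambda_k(theta) as defined before theorem 2 (requires k > 1)."""
    inner = theta + math.log(k) - 0.5 * math.log(math.log(k))
    return math.sqrt(2.0 * inner) if inner > 0 else 0.0

def k_log_normal_mass(theta, k):
    """k * log[Phi(lambda) - Phi(-lambda)] with lambda = lambda_k(theta)."""
    lam = lambda_k(theta, k)
    tail = math.erfc(lam / math.sqrt(2.0))   # two-sided normal tail probability
    return k * math.log1p(-tail)

def k_log_chi2_mass_m2(theta, k):
    """k * log P[chi-square(2) <= lambda_k^(2)(theta)]; assumes theta + log k > 0."""
    lam = 2.0 * (theta + math.log(k))        # (m/2 - 1) log log k vanishes for m = 2
    tail = math.exp(-lam / 2.0)              # chi-square(2) upper tail
    return k * math.log1p(-tail)
```

With `theta = 0` and `k = 10**6`, the first function should be close to $-1/\sqrt{\pi}$ and the second close to $-1/\Gamma(1) = -1$. The approach to the limit in the normal case is slow, since the neglected terms are of order $\log\log k / \log k$.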
For any $0 < \alpha < 1$, let

$$\tau_{nk}(\alpha) = \sqrt{2}\left\{\log(k/\sqrt{\pi}) - \tfrac{1}{2}\log\log k - \log\log\left(\frac{1}{1 - \alpha}\right)\right\}^{1/2}$$

and

$$\tau^{(m)}_{nk}(\alpha) = 2\left\{\log(k/\Gamma(m/2)) + (m/2 - 1)\log\log k - \log\log\left(\frac{1}{1 - \alpha}\right)\right\}.$$

Theorem 3. Under assumptions (i) - (iv),

(a) If $n \ge k^\gamma$, $\gamma > 0$, then the test with critical region

$$T_{nk} > \tau_{nk}(\alpha), \quad 0 < \alpha < 1,$$

is a large sample size $\alpha$ test for the null hypothesis $H_0$, and is consistent against the alternative $H_1$.

(b) If $n \ge (k \log k)^{1 + 1/m}$, then the test with critical region

$$T^{(m)}_{nk} > \tau^{(m)}_{nk}(\alpha), \quad 0 < \alpha < 1,$$

is a large sample size $\alpha$ test for the null hypothesis $H_0^{(m)}$, and is consistent against the alternative $H_1^{(m)}$.
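The critical values $\tau_{nk}(\alpha)$ and $\tau^{(m)}_{nk}(\alpha)$ depend only on $k$, $m$ and $\alpha$, and can be tabulated directly. A minimal sketch follows, using only the Python standard library. The function names are hypothetical, and returning 0 when the bracketed quantity is not positive is an assumption borrowed from the definition of $\lambda_k(\theta)$ rather than something stated for $\tau_{nk}(\alpha)$.

```python
import math

def tau_nk(k, alpha):
    """Critical value for T_nk (requires k > 1 and 0 < alpha < 1)."""
    inner = (math.log(k / math.sqrt(math.pi))
             - 0.5 * math.log(math.log(k))
             - math.log(math.log(1.0 / (1.0 - alpha))))
    # Assumption: fall back to 0 when the bracket is not positive.
    return math.sqrt(2.0 * inner) if inner > 0 else 0.0

def tau_nk_m(k, m, alpha):
    """Critical value for T_nk^(m) (requires k > 1 and 0 < alpha < 1)."""
    inner = (math.log(k / math.gamma(m / 2.0))
             + (m / 2.0 - 1.0) * math.log(math.log(k))
             - math.log(math.log(1.0 / (1.0 - alpha))))
    # Assumption: fall back to 0 when the bracket is not positive.
    return 2.0 * inner if inner > 0 else 0.0
```

Since $\Gamma(1/2) = \sqrt{\pi}$, the case $m = 1$ gives $\tau^{(1)}_{nk}(\alpha) = \{\tau_{nk}(\alpha)\}^2$, in agreement with the fact that $T^{(1)}_{nk}$ is the square of $T_{nk}$.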
Proof. The first parts of (a) and (b) are immediate consequences of theorem 2. We shall prove here the second part of (b), and the proof for the second part of (a) will follow on exactly similar lines.

Since we are assuming $\phi_{p_i}$, $i = 1, \ldots, m$ to be continuous, we need consider only the case when $\mu_1, \ldots, \mu_m$ are continuous. Suppose $H_1^{(m)}$ holds. Then $\phi_{p_i} \ne \mu_i$ for at least one integer $i$ between 1 and $m$. Let $j$ be the smallest integer for which $\phi_{p_j} = \mu_j$ does not hold. Since $\phi_{p_j}$ and $\mu_j$ are both continuous, there exists some $\delta > 0$ and an interval $(c_1, c_2) \subset [0, 1]$, such that for all $x \in (c_1, c_2)$, either

$$\mu_j(x) > \phi_{p_j}(x) + \delta$$

or

$$\mu_j(x) < \phi_{p_j}(x) - \delta.$$

Suppose the first is true. (The other case can be treated similarly.) By assumption (iv), for given $\delta > 0$, there exists an $\epsilon > 0$ such that $p_j < p < p_j + 2\epsilon$ implies

$$\phi_p(x) - \phi_{p_j}(x) < \delta \quad \text{for all } x.$$

Choose $p = p_j + \epsilon$. Then

$$\phi_{p_j + \epsilon}(x) < \phi_{p_j}(x) + \delta.$$

Hence,

$$G_x(\phi_{p_j}(x) + \delta) > G_x(\phi_{p_j + \epsilon}(x)) = p_j + \epsilon.$$

For convenience of notation, let $p_0 = 0$, and $\phi_{p_0}(x) = \mu_0(x) = -\infty$ for all $x$. Now for any $x \in (c_1, c_2)$,

$$P[U^{(j)}_{rs} = 1 \mid X_{rs} = x] = P[\mu_{j-1}(x) < Y_{rs} \le \mu_j(x) \mid X_{rs} = x]$$

$$= P[\phi_{p_{j-1}}(x) < Y_{rs} \le \mu_j(x) \mid X_{rs} = x], \text{ since } j \text{ is the smallest integer for which } \phi_{p_j} \ne \mu_j,$$

$$> p_j + \epsilon - p_{j-1} = q_j + \epsilon.$$
Now let $c_1'' < c_1' < c_2' < c_2''$ be points in $(c_1, (c_1 + c_2)/2)$ and choose $k_0$ sufficiently large such that for $k \ge k_0$, $F(c_1) > 1/k$, $1 - F(c_2) > 1/k$, $F(c_2') - F(c_1') > 1/k$, $F(c_1') - F(c_1'') > a_k$ and $F(c_2'') - F(c_2') > a_k$, where $\{a_k\}$ is a sequence as in lemma 1. Then for $k \ge k_0$, $F^{-1}(r/k) \le c_1' < c_2' \le F^{-1}((r+1)/k)$ implies $F(c_2') - F(c_1') \le 1/k$, which is a contradiction. Hence for each $k \ge k_0$ there exists at least one integer $r(k)$ such that $c_1' < F^{-1}(r(k)/k) < c_2'$. Also for $k \ge k_0$, $F^{-1}(r(k)/k - a_k) \le c_1''$ implies $F(c_1') - F(c_1'') \le a_k$, which is a contradiction, and $F^{-1}(r(k)/k + a_k) \ge c_2''$ implies $F(c_2'') - F(c_2') \le a_k$, which is a contradiction. Hence for each $k \ge k_0$, there exists a smallest integer $r_1(k)$ such that

$$c_1'' < F^{-1}\!\left(\frac{r_1(k)}{k} - a_k\right).$$

Similarly, for each value of $k$ greater than some integer, there exists a largest integer $r_2(k)$ such that

$$F^{-1}\!\left(\frac{r_2(k)}{k} + a_k\right) < c_2''.$$

Hence for such large values of $k$,

$$F^{-1}\!\left(\frac{r_1(k)}{k} - a_k\right) < X_{r_1(k),n} < F^{-1}\!\left(\frac{r_1(k)}{k} + a_k\right)$$

and

$$F^{-1}\!\left(\frac{r_2(k)}{k} - a_k\right) < X_{r_2(k),n} < F^{-1}\!\left(\frac{r_2(k)}{k} + a_k\right)$$

imply that $I_{kr} \subset (c_1, c_2)$ for $r = r_1(k) + 1, \ldots, r_2(k)$, and this set of integers is non-empty. Thus for sufficiently large values of $k$,

$$P[I_{kr} \subset (c_1, c_2),\ r = r_1(k) + 1, \ldots, r_2(k)] > 1 - 2ke^{-2nka_k^2}, \quad \text{by lemma 1}.$$

For each $k$ greater than some sufficiently large integer, now choose and fix an integer $r(k)$ such that $r_1(k) + 1 \le r(k) \le r_2(k)$. Then

$$P[T^{(m)}_{nk} > \tau^{(m)}_{nk}(\alpha) \mid H_1^{(m)}] \ge P\left[\frac{(U^{(j)}_{r(k)0} - nq_j)^2}{nq_j} > \tau^{(m)}_{nk}(\alpha)\right]$$

$$\ge P\left[\frac{(U^{(j)}_{r(k)0} - nq_j)^2}{nq_j} > \tau^{(m)}_{nk}(\alpha) \,\Big|\, I_{k,r(k)} \subset (c_1, c_2)\right] \times \left[1 - 2ke^{-2nka_k^2}\right]$$

$$\ge \left[1 - \exp\left\{-2n\left(\epsilon - \sqrt{q_j\,\tau^{(m)}_{nk}(\alpha)/n}\right)^2\right\}\right] \times \left[1 - 2ke^{-2nka_k^2}\right], \quad \text{by (2)}.$$

To complete the proof we have only to note that if $n \ge (k \log k)^{1 + 1/m}$, then with the choice of $a_k = k^{-1/2}$, both factors in the final expression tend to 1.

5. Generalization to the case when X is vector-valued.
Suppose $(X, Y) = (X_1, \ldots, X_h, Y)$ follows an unknown multivariate distribution. Let $F$ be the distribution function of $X = (X_1, \ldots, X_h)$ and for any $x = (x_1, \ldots, x_h)$, let $G_x$ be the conditional distribution of $Y$ given $X = x$. For given $0 < p < 1$, let $\phi_p(x)$ be the $p$-quantile of $G_x$.

The assumptions we make about $F$, $\{G_x\}$ and $\phi_p$ are the same as (i) - (iv) given in section 2, with some modifications. We shall only state the modified version of assumption (ii), the modifications on the other assumptions being obvious.

Modified assumption (ii). The marginal distribution of $X_1$ and the conditional distribution of $X_i$ given $X_1 = x_1, \ldots, X_{i-1} = x_{i-1}$ ($0 \le x_j \le 1$, $j = 1, \ldots, i-1$), $i = 2, \ldots, h$, are all continuous and strictly increasing.
In sections 2, 3 and 4, for a sample of size $nk$ from a bivariate $(X, Y)$, we defined a random division of $[0, 1]$ into $k$ sub-intervals with the help of the fractiles of the $X$ observations. Studying the convergence of the sample fractiles to the corresponding population fractiles, we were able to give a probability inequality for the event that the length of each of these random intervals is less than a specified quantity. Due to the uniform continuity and some other regular nature of $\phi_p$, the rest was accomplished by investigating the behavior of the $p$-quantile of the $Y$-observations corresponding to the $X$-observations belonging to each one of these random intervals, separately.

When $X$ is vector-valued, there is only a partial ordering among the $X$'s. We therefore modify our procedure as follows.
Let $(X_{11}, \ldots, X_{h1}, Y_1), \ldots, (X_{1N}, \ldots, X_{hN}, Y_N)$ be $N$ independent observations on $(X_1, \ldots, X_h, Y)$, where $N = nk^h$. Let $N_j = nk^{h-j}$, $j = 1, \ldots, h$. First, let us arrange the $X_1$-coordinates of all observations in increasing order of magnitude and let $X_{1(i)}$ be the $i$-th order statistic obtained from $X_{11}, \ldots, X_{1N}$. Let us divide the interval $[0, 1]$ into random sub-intervals as follows:

$$I_1 = [0, X_{1(N_1)}], \quad I_{r_1} = (X_{1((r_1 - 1)N_1)}, X_{1(r_1 N_1)}], \ r_1 = 2, \ldots, k-1, \quad \text{and} \quad I_k = (X_{1((k-1)N_1)}, 1].$$

For each $r_1 = 1, \ldots, k$, consider all samples $(X_{1i}, \ldots, X_{hi}, Y_i)$ for which $X_{1i} \in I_{r_1}$ and call them the samples belonging to the $r_1$-th fractile group. Now for each $j = 1, \ldots, h-1$ and for each $r_1 = 1, \ldots, k$; $\ldots$; $r_j = 1, \ldots, k$, arrange the $X_{j+1}$-coordinates of all samples belonging to the $(r_1, \ldots, r_j)$-th fractile group in increasing order of magnitude, and with the $N_{j+1}$th, $2N_{j+1}$th, $\ldots$, $(k-1)N_{j+1}$th order statistics obtained from the $X_{j+1}$-coordinates of these samples, divide the interval $[0, 1]$ into random sub-intervals in the same way as $I_1, \ldots, I_k$ were defined, and call these intervals $I_{r_1 \cdots r_j 1}, \ldots, I_{r_1 \cdots r_j k}$. For each $r_1 = 1, \ldots, k$; $\ldots$; $r_{j+1} = 1, \ldots, k$, consider all samples $(X_{1i}, \ldots, X_{hi}, Y_i)$ which belong to the $(r_1, \ldots, r_j)$-th fractile group and for which $X_{j+1,i} \in I_{r_1 \cdots r_j r_{j+1}}$, as belonging to the $(r_1, \ldots, r_j, r_{j+1})$-th fractile group. Finally, when for each $r_1 = 1, \ldots, k$; $\ldots$; $r_h = 1, \ldots, k$, we have exactly $n$ observations belonging to the $(r_1, \ldots, r_h)$-th fractile group, let us arrange the $Y$-coordinates of all samples belonging to the $(r_1, \ldots, r_h)$-th fractile group in increasing order of magnitude, and suppose the order statistics obtained from the $Y$-coordinates of these samples are $Y_{r_1 \cdots r_h(1)} < \cdots < Y_{r_1 \cdots r_h(n)}$. Also, for any specified real valued function $\mu$ defined on the $h$ times Cartesian product of $[0, 1]$ with itself, let

$$U_i(r_1, \ldots, r_h) = \begin{cases} 1 & \text{if the } i\text{-th sample belongs to the } (r_1, \ldots, r_h)\text{-th fractile group and } Y_i \le \mu(X_{1i}, \ldots, X_{hi}), \\ 0 & \text{otherwise,} \end{cases}$$

and let $U(r_1, \ldots, r_h) = \sum_{i=1}^{N} U_i(r_1, \ldots, r_h)/n$. Now we define a random function

$$f_{nk}(x_1, \ldots, x_h) = Y_{r_1 \cdots r_h([np])} \quad \text{if } x_1 \in I_{r_1}, \ldots, x_h \in I_{r_1 \cdots r_h}, \quad r_1 = 1, \ldots, k; \ \ldots; \ r_h = 1, \ldots, k,$$

and a statistic

$$T_{nk} = \sup_{r_1 = 1, \ldots, k; \ \ldots; \ r_h = 1, \ldots, k} \frac{|U(r_1, \ldots, r_h) - p|}{\sqrt{p(1 - p)/n}}.$$

It can be easily verified that statements regarding the convergence of $f_{nk}$ to $\phi_p$, the limiting distribution of $T_{nk}$ (properly standardized) under the null hypothesis $H_0: (\phi_p = \mu)$ and the consistency of the test "reject $H_0$ if and only if $T_{nk} > \tau_{nk}(\alpha)$, where $\tau_{nk}(\alpha)$, $0 < \alpha < 1$, is such that $\lim_{k \to \infty} P[T_{nk} \le \tau_{nk}(\alpha) \mid H_0] = 1 - \alpha$", made as in theorems 1, 2 and 3 with $k$ replaced by $K = k^h$, are valid.
For the case of several conditional quantile functions, the results of
sections 3 and 4 can be extended for the multivariate case in the same way.
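The nested construction of fractile groups described in this section can be sketched as follows. This is an illustrative sketch under stated assumptions, not the author's own procedure in code: NumPy is assumed, the function names `fractile_groups` and `multivariate_quantile_estimate` are hypothetical, labels are 0-based (so label $r - 1$ stands for $r$), and the sample size is assumed to equal $nk^h$ exactly with $[np] \ge 1$.

```python
import numpy as np

def fractile_groups(X, k):
    """Assign each row of X (shape N x h, N = n * k**h) to a nested fractile group.

    Returns an integer array of shape (N, h); column j holds r_{j+1} - 1.
    """
    X = np.asarray(X, dtype=float)
    N, h = X.shape
    if N % (k ** h) != 0:
        raise ValueError("N must be a multiple of k**h")
    labels = np.zeros((N, h), dtype=int)
    groups = [np.arange(N)]                  # start with one group holding all samples
    for j in range(h):
        next_groups = []
        for idx in groups:
            order = idx[np.argsort(X[idx, j])]      # order group on coordinate j+1
            for r, part in enumerate(np.split(order, k)):   # k equal-sized fractile groups
                labels[part, j] = r
                next_groups.append(part)
        groups = next_groups
    return labels

def multivariate_quantile_estimate(X, y, k, p):
    """Map each group (r_1 - 1, ..., r_h - 1) to Y_{r_1...r_h([np])}."""
    X = np.asarray(X, dtype=float)
    labels = fractile_groups(X, k)
    n = X.shape[0] // (k ** X.shape[1])
    j = int(np.floor(n * p))                 # [np]; assumed to be at least 1
    cells = {}
    for lab, yi in zip(map(tuple, labels.tolist()), y):
        cells.setdefault(lab, []).append(float(yi))
    return {lab: sorted(v)[j - 1] for lab, v in cells.items()}
```

Each of the $k^h$ final groups then contains exactly $n$ observations, as required in the text.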
Thanks are due to S. N. Roy and Sudhindra N. Ray for making some helpful
suggestions.
REFERENCES.

[1] Esseen, C. G. (1945). Fourier analysis of distribution functions. A mathematical study of the Laplace-Gaussian law. Acta Math. 77, 1-125.

[2] Hoeffding, Wassily (1962). Probability inequalities for sums of bounded random variables. Institute of Statistics, University of North Carolina, Mimeograph Series No. 326.

[3] Mahalanobis, P. C. (1958). Lectures in Japan: Fractile graphical analysis. Indian Statistical Institute Monograph.

[4] Parthasarathy, K. R. and Bhattacharya, P. K. (1961). Some limit theorems in regression theory. Sankhya, Series A. 23, 91-102.

[5] Roy, S. N. and Cobb, Whitfield (1960). Mixed model variance analysis with normal error and possibly non-normal other random effects: Part I: The univariate case. Ann. Math. Statist. 31, 939-957.

[6] Sathe, Y. S. (1962). Studies of some problems in nonparametric inference. Institute of Statistics, University of North Carolina, Mimeograph Series No. 325.