ON ASYMPTOTIC REPRESENTATIONS FOR REDUCED QUANTILES IN SAMPLING
FROM A LENGTH-BIAS~D DISTRIBUTION
hy
Pranab Kumar Sen
Department of Biostatistics
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo Series No. 1419
October 1982
ON ASYMPTOTIC REPRESENTATIONS FOR REDUCED QUANTILES IN SAMPLING
FROM A LENGTH-BIASED DISTRIBUTION *
BY
PRANAB KUMAR SEN
University of North
CaroZina~
ChapeZHiZZ.
Nonparametric estimation of the quantiles of a distribution based
on a sample from the corresponding length-biased distribution is considered. Along with some representations of this estimator in terms of
averages of independent random variables, some limiting results are
," e
established. The case of the reduced quantile process is treated briefly.
AMS Subject CZassification
: 60F05, 62E20, 62G05
KeyWords and Phrases: Asymptotic normality, Bahadur-representation, empirical
process, invariance principles, reduced quantiles and quantile process, weak
convergence.
* Work partially supported by the National Heart, Lung and Blood Institute,
Contract NIH-NHLBI-71-2243-L from the National Institutes of Health.
1
~
Let YI""'Y n be n independent and identically distributed
(iGied.) nonnegative random variables (r.v.) from a distribution G, defined on
R+
==
[0,(0). G = Gp is called a lensth-biased distribution
+
corresponding to a
.
distribution F (alsa defined on R ), 1f
(1.1)
Gp(Y) = }J-l f~ xdP(x) , for every y
E:
R+ ,
where
}J
(1.2)
00
= faxdF(x)
is assumed to be finite.
The distribution function (d. f.) Gp arises naturally in many fields; see for
example, Gox(1969), Patil and Rao (1977,1978) and Coleman (1979) where many
life applications are considered. We assume that the underlying d.f. F admits
a
unique p-quantile
(1. 3)
~
p
= ~ p (F),i.e.,
F(x) = phas a unique solution
~p'
where
a
< p < 1.
Our main interest is to provide a nonparametric estimate of
~p
based on Y , ••• ,
l
Yn , and to study its various properties. For a somewhat related problem
(based on a mixture of an ordinary and a length-biased distribution), we may
refer to Vardi (l982a, b) •
Let Yn: 1· -< •••
. -< Yn:n be the order statistics corresponding to Yl, ••• ,Yn •
Then, the sample p-quantile corresponding to
~p
is defined by Y : where k
n k
is a suitably chosen (random) integer, depending on all the order statistics.
This estimator is formally defined in Section 2, where the basic regularity
conditions are also introduced. Section 3 is devoted to the study of the
large sample properties of this estimator based on (i) the weak convergence
of some related empirical processes, (li) Bahadur (1966) representation of
sample quantiles and (iii) some strong invariance principles for the empirical
distributions. Some general remarks are appended in the concluding section.
~!~~vs~etl
..~~!..~~.
(2.1)
Note that by (1.1) and (1. 2),
-1
-1 ~
-1 ~p. -1
~p
-1
}J P =}J
fOP dF(x) =}J f
x
xdF(x) = f~ x dG(x),
O
so that
(2.2)
}J
-1
= fox -1 dG(x) •
00
2
Let G (y) = n-lE~ 1 I(Y
n
1=
i
~
y), Y
+
R be the empirical d.f. based on Yl, ••• ,Y •
n
In (2.1) and (2.2), we replace the true d.f. G by its nonparametric estimator
G
n
and for
~
-1
A_I
(2.3)
, we obtain the estimator
= n- 1
~n
E
A_I
~n
given by
1
E~
Y-.
n:l
1=1
Noting that the d.f. G is a step function, we now define a (random) integer
n
k (;:; K )
n
(2.4)
by
K
= max{
k
k :
-1
E.1= 1 Yn:l.
Note that depending on the value of p (O<p<l) and the Y.,i;:;l, ••• ,n, the inequality
n
1
in (2.4) may not hold even for k=l • This is particularly true when n may be
small and p is close to zero. We may,however, remove this technical difficulty
-1
n-l
by letting Kn ;:; 1 whenever Y : l > p( Ei=lYn : i ). Thus, K is a positive integer
n
n
valued randoln variable, and, for every n , p{ 1 < K < n} = 1. The sample
-
estimator of
~p
n-
is then taken as
A
~n,p = Yn:K
(2.5)
, where K is defined by (2.4).
n
n
e-
We may remark that in the classical case, the sample p-quantile is defined
by the kth largest order statistics, where k is a positive integer such that
kin is closest to p. However, in (2.5), the estimator comes out as an order
statistic
with a random index K whose properties are not all that known.In
n
fact, K in (2.4) is defined in terins of partial sums of reciprocals of order
n
statistics which are neither independent nor identically distributed •. Hence,
the classical renewal theory results may not apply directly to K
n
"'_I
-1 n
-1
-1
.
-1 2
an d var1ance
n
y , where
Note that by (2.3), ~n = n ri=lY i has expectation ~
2
00
-2
-2
00
-1
)
= f O x dF(x
O y dG(y) - ~
where we assume that E y- 2 = E X-I = v 2 < 00
G
F
(2.6)
Y = f
limit theorem, as n
(2.7)
1
n~(
"'_I
~n
+00
~-l
,
)1 y
~
x
so that
(2.8)
~ -1)
=0
(1) •
P
~
-2
Then, by the classical central
(0, 1),
3
Further, note that a(x) = x -1 has a continuous derivative al(x) = -x -2 for all
X E (0,00), so that by (2.7) and (2.8),
A
n (
~
(2.9)
~- ~n
) =
~
2
n
~
A_I
(~n
-
~
-1
~
) + Opel) ~
"
2 2
" (0, ~ y ).
These results will be useful in our subsequent analysis. We also assume that the
d.f. F has a continuous probability density function (p.d.f.) f in some
neighbourhood of t,;
p
and
f(~')
P
is strictly positive. Note that g, the p.d. f. of
~
G, is given by g(x) = xf(x), so that noting that
p
E
(0, 00), we have
00
(2.10)
We may need some other regularity conditions which will be introduced as the
occasions arise.
A
,,~.v!y~"a!,J::.~p..:r:.e~~!.1t~t:~?~, '?f ~n~p • Let us denote by
Hn(x) = ~ y-ldGn(y)
(3.1)
H(x) = f~ y-ldG(y) , for x
and
£ (0,00 ).
Then, note that both Hand Hare nondecreasing, and, by (2.1)-(2.4),
n
) = p/~ = p 1I(00) and Hn(Y : ) ~ pHn(OO) < Hn(Y : +1) ,
n K
n Kn
P
n
where, conventionally, we let Y : n + = 00 • Further, for every x £ (0,00 ), we
n
l
-1
-1
.
let Y.1 (x) = Y.1 I(Y.1 < x), l=l, •• .,n , so that
H(~
(3.2)
~
~
(3.3)
n [Hn(x) - H(x)] = n
-~ n
~
-1
l:i=l[ Yi (x)
.)('(0, . Yx2 ),
where
2
l
Yx -_ fXo y-2 dG (y) - (fxo y- dG(y))2
(3.4)
Then~
.
f or every x
eXIsts
£
( 0,00.
)
we have the following
Theorem 1. If v
<
00
and (2.10) holds, then, as
- ~ } ='pn -~ Ln 1 {-l
Y.
P
1=
1
(3.5)
~
n
~
00
{-1
-l} -n -~l:.n 1 Y.. ~ )-p~ - I
} .(1).
. +0
1=
1
P
P
Proof. Note that for every x > 0,
A
A
A
~nHn(x) - ~H(x) = ~{Hn(x)-H(;X)} +(~n-~){HnCx)-H(;X)} +(}1n -~)H(x).
_1:
Also, if we let x = ~p ±.Cn 2 , for some finite CC > 0), then, it can be shown
(3.6)
that (3.3) holds with the further simplification that Y~ may be replaced by Y~ •
P
As such, by (2.9), (3.3) and (3.6), we conclude that for every n >0, there exist
4
a c
n «
00) and an integer n , such that for every (fixed) C and n > n ,
0
I~
-
0
±Cn- k2 ) - ~H(~ ±Cn- k2 ) I> c } < n/2.
p
p
n
k2
On the other hand, if we let J = {x: ~ - Cn< x < ~ +Cn -!:2 }, then
n
p
- P
k
1
~ H (~ -Cn- 2) < ~ H. (x) < ~ H (~ +Cn -~ ),
x s J ,
(3.8)
nn p
nn
nn p
n
while, by (2.10), as n + 00
(3.7)
A
P{nk2
H (s
n n
A
(3.9)
A
nk2
{~H(~
± Cn -!:2 )_
p
V
A
~H(~')} +
P
Hence, if we choose C so large that
± C~f(~ ).
P
~Cf(~) > c
p
n .. we conclude from (3.7),
(3.8) and (3.9) that, by virtue of (3.2), namely,
A
A
llnHn(Yn : K ):: p<
~nHn(Yn:K
n
n
+1) ,
.
n K S I n } ~ 1 - n ,for every n ~ no •
n
Having obtained this weaker bound forY : K ,we like to extend a result of
n n
Ghosh(1971) on (a weaker) Bahadur representation of sample quantiles to our
(3.10)
p{ Y :
Hn and employ the same for the proof of (3.5). Note that by (3.6), for every x,y
'"
(3.11)
~
A
nHn (x)- ~H(x) - ~ nHn (y) + ~H(y)
A
= ]Jn { Hn (x)- H(x)
. - Hn (y) + H(y)}
A
+ (J1n
-]J ){
H(x)- H(y)} ,
where, by (2.9), (2.10) and the definition of I '
n
sup{ I( ~n -]J ){H(x) - H(y)}1 : x,y S I } = Open-I),
(3.12)
n
.while, using the identity that for x >- y,
(3.13)
0~1 =Op(l),
H (x) - H (y) - H(x) + H(y) = fXu-ld[G (u) - G(u)]
n
n·
y
n
= x-l{G (x)-G(x)} _y-l{G (y) _ G(y)} + fX
n
.
n
y
u-
2{G (u)-G(u)} du
n
along with (2.10) and the Bahadur (1966) representation:
(3.14)
°
3 4
sup{ IG n (x)-G n (y) - G(x) + G(y)1 : x,y s J n } = p (n- / log n),
we conclude that
(3.15)
sup {
x,y sJ
n
n~ IHn(X) - H(x) -. Hn(Y) + H(y) I} = 0p(n- 1/ 410g n)
By virtue of (3.10) .and (3.15J, ,we.c(1)c1ude that
k
k
n2{Hn (Yn: K) - H(Yn: K)} = n 2{ Hn (~ p ) n
n
so that, by (3.6), (3.15) and (3.16), we have
(3.16)
H(~
(1).
p )} + 0p
'
E:
I
n,
- ,
e·
5
k
(3.17)
A
h.l nHn (Yn:K ) - j..IH (Yn : K ) }
n.2
n
k
= n.2 {Hn (~ p ) where
H(~
P
) =
k
n.2{
(3.18)
j..I
j..I
~{
= n~
-1
H(~
p
A
p and
j..InHn(Yn:K)
-1
p - H(Yn : K )
n
H (~ ) - H(~ )}
n
n
+ H(~
)}
p
p
=
k
p
p
A
)n.2 (j..I
+
n - j..I) + Opel)
open
-1
). Hence, as n
+
00
,
n
}
+
k
A
pn.2( j..In -j..I)
+ 0 (1).
p
Finally, (3.5) follows from (3.18), (2.7), (2.9) and the fact that for x
H I (x) =
x
-1 .
£
J ,
n
) as n + 00. Q.E.D.
P
{py~l.:.p j..I-l _ y~l(~ ) + pj..l-l = py~l_y~l(~ ), i > l} form a
].
1
P
IIp
g(x)
Note that
=
f(x)
+ f(~
sequence of i.i.d.r.v.'s with mean zero and variance
2
a
(3.19)
2
~p -2
(l-p) f O Y dG(y)
=
+
p
2
00
J~
-2
Y dG(y)
P
central limit theorem, we arrive at the following.
and hence, by (3.5) and the
Theorem 2. Under the hypothesis of Theorem 1,
(3.20)
This result as well as (3.5) can be extended to several quantiles under no
extra regularity conditions.
4. Some general remarks. In (3.5), we have considered a weak representation for
. , • • • ._ . . - ....... ,~ •. _~.T
..... _
-
ot-.............. ' .......
_· ......
'- ..... ....,~ •• ' ..._ J . . .
A
~
n,p
where the precise order of the remainder term has not been studied. If, in
addition to (2.10), we assume that in some neighbourhood of
~
p
, the p.d.f. f has
a bounded first derivative f', then, noting that g'(x) = (d/dx){xf(x)} = f(x)
+
xf'(x), we conclude that g' is also bounded in the same neighbourhood. As such,
we may let J * = {x : I x - .~
n
p
I -<Cn- k.2 (logn) ~ }
and parallel to (3.10), we may
claim that Y : K £ I n* almost surely (a.s.), as n + 00 , while in (3.14), we
n n
may not only use the order as n- 3/ 4 log n a.s., but also, by Taylor's expansion,
we obtain that as n--7 oo ,
*}
-3/4 log n) a.s.
= O(n
n
As such, we obtain that under this additional regularity condition, in (3.5),
(4.1)
sup{ Ig(~ ){x p
~}-
p
Gn(x)
+ Gn(~p)
I:
x
£
I
we have an almost sure representation with a remainder term O(n- 3/ 4 log n).
6
Secondly, instead of the behaviour of a single quantile, if we like to study
!,;
the same for the entire process {n 2(
A
~
n,p
stringent moment condition. Note that v <
~
00
) ; O<p<l}, we may need a more
p
implies that as x f 0, x-2 G(x) ~ 0,
though the rate of this convergence is not precisely known. If we assume that
oo
-2-0
=f a Y
dG(y)
(4.2)
<
for some 0 > 0,
00
then, we obtain that
x- 2- O G(x)
(4.3)
~ a as
x
O.
f
Also, by (3.1), for every x > 0,
!,;
(4.4)
n 2{H (x) - Hex)} =
=x
n
-1
n
~
IGn (x) -
Further, it follows from O'Reilly (1974) that for every
(4.5)
!,; €
!,;
sup{ n 2IGn
(x)-G(x)]/{G(x)
[l-G(x)]} 2·
: x
>
0,
+
€
so that by (4.3) and (4.5), for every x £ (0, n), as
(4.6)
€
R } = 0 p (1),
.
n ~ 0,
x-ln~[Gn(X) - G(x)] = (n~[Gn(X)-G(X)J/{G(X)[l-G(X)]}~-f)G(X)[l-G(X)J}~-E/x
converges to 0, in probability,
where we choose
€
so small that
(2+0)(~
-£ ) > 1. As such, the representation
in (4.4) is valid for every x > 0, and, using (3.6) and (4.4), we have
(4.7)
1
1 1
x -2!';
n~[J..lnHn (x). - J..lH(x)J = J..lx- n~[G n (x)-G(x)J +J..lf O y n 2IG (y)-G(y)]dy
n
A.
1
A
+ H(x)n~( J..l n - J..l) + opel) a.e.
-I!,;
x -2!';·
= J..lX n 2 IG (x)-G(x)] + }lf y n 2 [G (y)-G(y)]dy +
n
n
O
- H(x)J..l
If we let WO(G(x»
n
(4.8)
p (x,y)
q
2
n
!,;
+
!,;
= n 2 [G (x)-G(x)] , x
-1
00
fa y d{n 2 IGn (y) - G(y)]} + opel) a.e.
€
R and define the
= sup{ Ix(t)-y(t)l/q(t); O<t<l}
;"q(t)
p
metric
q
by
={t(1-t)}~-£,
then, the weak convergence of WO = {Wo(t), O<t<l} to a tied-down Wiener process
n
n
O
W , in the p metric in (4.8), follows from O'Reilly(1974) , and hence, using
q
.
!,;
+
A
(4.3) and (4.7), the weak convergence of n 2[ll H (x) -. pH (x) Lx
n n
.
£
R to a Gaussian
process follows readily. Thus, if we define Wn* = {wn* (t);O<t<l} by letting
!,;
2
+
*
*
[ II H (x) - llH(x)], x £ Rand W = {W (t);O<t<l} by letting
W*(G(x»
=
n
n
n n
A
e.
7
W*(t)
=
o .
bet) 2·
0
2 1 0
2·
.
}Jb(t)W (t) +}Jf
b (s)W (s)db(s) - H(b(t))}J fa W (s)b (s)db(s) ,
o
for O<t<l where bet) = G-l(t), t £ (0,1), then, we have
(4.9)
W* ~.
~
W* ,in the Skorokhod Jl-topology on D[O,l].
n
Note that the covariance function of W* is given by
(4.10)
EW * (s)W * (t)
=
(1-5) (l-t)
v sAt + st (V l - v Sf\t) .
l.J
-(Sl\t}(V svt - VSf\t)'
(s,t)
£
2
(0,1),
where
v
(4.11)
t
t.:t
fa
=
y-
2
dG(y) ,for
t
£
(0,1).
Also, in (2.4), we denote the solution K by K (p) and assume that f(t.: ) is
.
n
p
n
strictly positive (and finite) for all p :O<p<l. Then, instead of the Bahadurrepresentation in (3.14), if we use its generalization by Kiefer(l967) and
use (4.9), then, we are able to show that for every c > 0,
0_
( 412)
•
sup·
p£(O , 1)
!,,; Iy
I
!,,;
n2
r:
j(logn)2
. n:K (p) - '""p
<
c
as n
+
00
,
a.s.,
n
and further, (3.18) holds simultaneously for all p
1
E
(0,1). As such, the weak
A
invariance principle for { n~[ t.:
- ~ ]; O<p<l} follows from this extended
n,p
p
version of (3.18), (4.9) and the random change of time results in Billingsley
(1968,pp.144-lS0). Further, using the results of Csaki (1977), (4.7) may be
strengthened to an a.s. representation,and hence, strong invariance principles
(as well as the law of iterated logarithm) for the reduced quantile process
also follow under the additional regularity conditions (4.2) and
sup{
Iff (x) I
:x
E
R+}
<
00
that
REFERENCES
[1]
[2]
[3]
[4]
[5]
[6]
.-
BAHADUR, R.R.· (1966). A note on quantiles in large samples. Ann. Math.
Statist. 37 , 577-580.
BILLINGSLEY,P. (1968). Convergence of Probability Measures. John Wiley,
New York.
COLEMAN, R. (1979). An Introduction to Mathematical Stereology. Memoirs No.
3, Dept. of Theoretical Statistic.,Univ. of Aarhus, Denmark.
COX, D.R. (1969)~ Some sampling problems in technology. In New Developments
in Survey Sampling{ eds: N.L.Johnson and H. Smith), Wiley, New York.
CSAKI, E. (1977). The law of iterated logarithm for normalized empirical
distribution function. Zeit. Wahrsch. ve'P1J). Geb. 38 J 147-167.
GHOSH, J •.K. (1971). A new proof of the Bahadur representation of quantiles
and an application. Ann. Math. Statist. 42J 1957-1961.
[7] . KIEFER, J. (1967). On Bahadur's representation of sample quantiles.Ann.
Math. Statist. 38, 1323-1342.
[8] PATIL, G.p. and RAO, C.R. (1977). The weighted distributions: A survey of
their applications. In Applications of Statistics Ced: P.R. Krishnaiah)
North Holland, Amsterdam, pp. 383-405.
[9] PATIL, G.P. and RAO, C.R.(l978). Weighted distributions and size-biased
sampling with applications to wildlife populations and human families.
Biometrics 34 J 179-189.
.
[10] O'Reilly, N.E. (1974). On the weak convergence of empirical processes in
sup-norm metrics. Ann.Probability 2J 642-651.
[ll] VARDI, Y. (1982a). Nonparametric estimation in the presence of length-bias.
Ann. Statist. lOJ. 616-620.
[12] VARDI, Y. (1982b). Nonparametric estimation in renewal processes. Ann.
Statist. lOJ 772-785.
© Copyright 2026 Paperzz