Nonparametric Estimation of a Regression Function by
Means of Concomitants of Order Statistics
by
Gordon Johnston*
University of North Carolina at Chapel Hill
and
Texas Instruments Corporation
Abstract
We study the properties of nonparametric estimates of a regression
function based on concomitants of order statistics (Yang (1977)).
Key Words:
Nonparametric regression, density estimation, concomitants of order
statistics
*The work of this author was supported by the Air Force Office of Scientific Research Contract AFOSR-75-2796.
1. Introduction.

S. S. Yang (1977) proposed as an estimator of the regression function $m(u) = E[Y \mid X = u]$ of a bivariate random vector $(X,Y)$ the statistic $M_n$ defined by
$$M_n(u) = (n\varepsilon_n)^{-1} \sum_{i=1}^{n} K\!\left(\frac{i/n - F_n(u)}{\varepsilon_n}\right) Y_{[i:n]}.$$
Here $\{\varepsilon_n^{-1}K(x/\varepsilon_n)\}$ is a $\delta$-function sequence of kernel type (Watson and Leadbetter (1964)), $(X_i, Y_i)$, $i = 1, \ldots, n$, are i.i.d. observations on $(X,Y)$, $F_n$ is the empirical distribution function (EDF) of the $X$-observations, and $Y_{[i:n]}$ is the $Y$-observation corresponding to the $i$-th order statistic of the $X$-observations, i.e., the $i$-th concomitant (see, e.g., Yang (1977)).
Our purpose here is to find conditions under which
$$(1.1)\qquad (2\delta \log n)^{1/2}\left\{\sup_{a \le u \le b} \frac{(n\varepsilon_n)^{1/2}\,|M_n(u) - m(u)|}{\left[s(u)\int K^2(t)\,dt\right]^{1/2}} - d_n\right\} \stackrel{L}{\longrightarrow} E$$
as $n \to \infty$, where $E$ is a random variable with distribution function $\exp(-2e^{-x})$, $\delta$, $a$, and $b$ are constants, $\{\varepsilon_n\}$ and $\{d_n\}$ are appropriate real sequences, and $s(u) = E[Y^2 \mid X = u]$. Bickel and Rosenblatt (1973) proved a similar result for kernel estimates of a density function. A large sample confidence interval for $m(u)$, based on $M_n(u)$, is given using (1.1).
We also give conditions under which
$$(1.2)\qquad (n\varepsilon_n)^{1/2}\,[M_n(u) - m(u)] \stackrel{L}{\longrightarrow} N\!\left(0,\; s(u)\int K^2(t)\,dt\right) \quad \text{as } n \to \infty$$
for appropriate points $u$ and sequences $\{\varepsilon_n\}$. Our method of proof is to show that
$$(1.3)\qquad (n\varepsilon_n \log n)^{1/2} \sup_{a \le u \le b} \left|M_n(u) - M_n^{**}(u)\right| \stackrel{p}{\longrightarrow} 0,$$
where $M_n^{**}$ is defined by
$$(1.4)\qquad M_n^{**}(u) = (n\varepsilon_n)^{-1} \sum_{i=1}^{n} Y_i\, K\!\left((F(X_i) - F(u))/\varepsilon_n\right).$$
$M_n^{**}$ is a special case of the regression function estimator proposed by Watson (1964). Johnston (1979) gives conditions under which (1.1) and (1.2) hold for $M_n^{**}$ in place of $M_n$, and (1.1) and (1.2) will thus hold by virtue of (1.3).
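To fix ideas, the estimator $M_n$ is straightforward to compute from data. The following sketch is an illustration only, not part of the paper: the choice of the quartic (biweight) kernel, the bandwidth, and all function names are assumptions made here for the example.

```python
import numpy as np

def biweight_kernel(t):
    """Quartic (biweight) kernel: support [-1, 1], integrates to 1."""
    return np.where(np.abs(t) <= 1, 15.0 / 16.0 * (1 - t**2) ** 2, 0.0)

def m_hat(u, x, y, eps):
    """Yang-type estimator M_n(u): a kernel-weighted average of the
    concomitants Y_[i:n], with weights K((i/n - F_n(u)) / eps)."""
    n = len(x)
    y_conc = np.asarray(y)[np.argsort(x)]   # concomitants Y_[i:n]
    Fn_u = np.mean(np.asarray(x) <= u)      # EDF of the X-sample at u
    ranks = np.arange(1, n + 1) / n         # i/n, the EDF at the order statistics
    w = biweight_kernel((ranks - Fn_u) / eps)
    return w @ y_conc / (n * eps)

# Example with m(u) = 2u: Y = 2X + noise, X uniform on (0, 1)
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=4000)
y = 2.0 * x + rng.normal(0.0, 0.1, size=4000)
estimate = m_hat(0.5, x, y, eps=0.1)        # should be close to m(0.5) = 1
```

Since the weights form a Riemann sum of $K$, they total approximately $n\varepsilon_n$, so $M_n(u)$ is essentially a local average of those concomitants whose ranks $i/n$ fall within $\varepsilon_n$ of $F_n(u)$.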
2. Asymptotic Equivalence of $M_n$ and $M_n^{**}$.

In this section we verify (1.3). The proof is given in the Appendix since it is rather technical and lengthy. Define
$$M_n^{*}(u) = (n\varepsilon_n)^{-1} \sum_{i=1}^{n} Y_i\, K\!\left((F_n(X_i) - F(u))/\varepsilon_n\right).$$
Then Lemma 2.1 gives conditions under which
$$(2.1)\qquad (n\varepsilon_n \log n)^{1/2} \sup_{a \le u \le b} \left|M_n^{*}(u) - M_n(u)\right| \stackrel{p}{\longrightarrow} 0,$$
$$(2.2)\qquad (n\varepsilon_n \log n)^{1/2} \sup_{a \le u \le b} \left|M_n^{**}(u) - M_n^{*}(u)\right| \stackrel{p}{\longrightarrow} 0,$$
which together imply (1.3).
Lemma 2.1. Suppose $\{\varepsilon_n^{-1}K(x/\varepsilon_n)\}$ is a $\delta$-function sequence such that $(\log n)^{-1}\,n\varepsilon_n^{7/2} \to \infty$, that $K$ has bounded support and 3 bounded continuous derivatives on the support, that $\int |K''(t)|\,dt < \infty$, and that $K$ and $K'$ are of bounded variation. Let $(X,Y)$ be such that $E|Y| < \infty$, $g(u) = E[Y \mid X = F^{-1}(u)]$ has 2 bounded derivatives on $[0,1]$, and $h(u) = E[Y^2 \mid X = F^{-1}(u)]$ is bounded on $[0,1]$. Assume there exists a real sequence $\{a_n\}$ such that $a_n \to \infty$, $a_n^2 \log n/(n\varepsilon_n^3) \to 0$, and
$$n^{1/2} \int_{|y| > a_n} |y|\,dF^Y(y) \to 0 \quad \text{as } n \to \infty.$$
Then, for $0 < F(a) < F(b) < 1$, (2.1) and (2.2) hold.
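The asymptotic equivalence that Lemma 2.1 formalizes can be checked numerically. The simulation below is a sketch constructed here for illustration (not from the paper): for $X \sim U(0,1)$ the true distribution function is the identity on $[0,1]$, so the oracle estimator $M_n^{**}$ is computable, and $\sup_u |M_n(u) - M_n^{**}(u)|$ over a grid of interior points is small relative to the scale of $m$.

```python
import numpy as np

def quartic(t):
    # biweight kernel: support [-1, 1], integrates to 1
    return np.where(np.abs(t) <= 1, 15.0 / 16.0 * (1 - t**2) ** 2, 0.0)

def M_n(u, x, y, eps):
    """EDF form: weights K((i/n - F_n(u))/eps) on the concomitants Y_[i:n]."""
    n = len(x)
    ranks = np.arange(1, n + 1) / n
    w = quartic((ranks - np.mean(x <= u)) / eps)
    return w @ y[np.argsort(x)] / (n * eps)

def M_starstar(u, x, y, eps, F):
    """Oracle form M_n**: the true distribution function F replaces the EDF."""
    n = len(x)
    w = quartic((F(x) - F(u)) / eps)
    return w @ y / (n * eps)

rng = np.random.default_rng(1)
n, eps = 5000, 0.1
x = rng.uniform(0.0, 1.0, size=n)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, size=n)
F = lambda t: np.clip(t, 0.0, 1.0)           # true d.f. of U(0, 1)
grid = np.linspace(0.2, 0.8, 25)             # interior points, away from the boundary
gap = max(abs(M_n(u, x, y, eps) - M_starstar(u, x, y, eps, F)) for u in grid)
```

Here `gap` is driven by the fluctuation $F_n - F$, of order $n^{-1/2}$, and is an order of magnitude smaller than the range of $m(u) = \sin 2\pi u$.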
3. Applications.

We will assume throughout this section that the assumptions of Lemma 2.1 are in force. We first note that $M_n^{**}$ may be written as
$$M_n^{**}(u) = (n\varepsilon_n)^{-1} \sum_{i=1}^{n} Y_i\, K\!\left((Z_i - F(u))/\varepsilon_n\right),$$
where $Z_i = F(X_i) \sim U(0,1)$. According to Theorem 2.5.2 of Johnston (1979), under certain conditions,
$$(n\varepsilon_n)^{1/2}\left[M_n^{**}(u) - E(Y \mid Z = F(u))\right] \stackrel{L}{\longrightarrow} N\!\left(0,\; E(Y^2 \mid Z = F(u)) \int K^2(t)\,dt\right).$$
If we assume $F$ to be strictly increasing, then
$$E(Y \mid Z = F(u)) = m(u) \quad \text{and} \quad E(Y^2 \mid Z = F(u)) = s(u).$$
Thus we have, by virtue of (1.3),
$$(n\varepsilon_n)^{1/2}\,[M_n(u) - m(u)] \stackrel{L}{\longrightarrow} N\!\left(0,\; s(u)\int K^2(t)\,dt\right),$$
which completes the proof of normality of $M_n$. We note that this asymptotic variance differs from that of Yang (1977), Theorem 6.
If the conditions of Corollary 3.2.9 of Johnston (1979) hold, then
$$(3.1)\qquad (2\delta \log n)^{1/2}\left\{\sup_{a \le u \le b} \frac{(n\varepsilon_n)^{1/2}\,|M_n^{**}(u) - m(u)|}{\left[s(u)\int K^2(t)\,dt\right]^{1/2}} - d_n\right\} \stackrel{L}{\longrightarrow} E,$$
where $E$ is a random variable with distribution function $\exp(-2e^{-x})$. Here $\varepsilon_n = n^{-\delta}$, $1/5 < \delta < 1$, and $\{d_n\}$ is the sequence of centering constants specified in Bickel and Rosenblatt (1973). By virtue of (1.3), (3.1) holds with $M_n$ replacing $M_n^{**}$, as we wished to prove. Inverting (3.1) in the usual way yields an approximate $(1-\alpha) \times 100\%$ confidence band for $m(u)$ over the interval $(a,b)$, based on $M_n(u)$:
$$M_n(u) \;\pm\; (n\varepsilon_n)^{-1/2}\left[s(u)\int K^2(t)\,dt\right]^{1/2}\left(d_n + \frac{c(\alpha)}{(2\delta \log n)^{1/2}}\right),$$
where $c(\alpha) = \log 2 - \log\left|\log(1-\alpha)\right|$.
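For numerical use, the band half-width above is simple to evaluate once plug-in values are chosen. In the sketch below (constructed here for illustration, not from the paper), the centering constant $d_n$ from Bickel and Rosenblatt (1973), the variance value $s(u)$, and $\int K^2(t)\,dt$ are all treated as inputs supplied by the caller; none are computed from data.

```python
import numpy as np

def band_halfwidth(n, eps, delta, alpha, s_u, K_sq_int, d_n):
    """Half-width of the approximate (1 - alpha) confidence band
    M_n(u) +/- (n*eps)^(-1/2) [s(u) Int K^2]^(1/2) (d_n + c(alpha)/(2 delta log n)^(1/2)),
    with c(alpha) = log 2 - log|log(1 - alpha)|.  The centering constant
    d_n must be obtained from Bickel and Rosenblatt (1973)."""
    c_alpha = np.log(2.0) - np.log(abs(np.log(1.0 - alpha)))
    scale = np.sqrt(s_u * K_sq_int / (n * eps))
    return scale * (d_n + c_alpha / np.sqrt(2.0 * delta * np.log(n)))

# Example inputs (all assumed): biweight kernel has Int K^2 = 5/7
halfwidth = band_halfwidth(n=1000, eps=0.1, delta=0.3, alpha=0.05,
                           s_u=1.0, K_sq_int=5.0 / 7.0, d_n=2.0)
```

Since $c(\alpha)$ increases as $\alpha$ decreases, the band widens as the confidence level rises, as it should.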
APPENDIX

Proof of Lemma 2.1. We begin with the following preliminary lemma, which is very similar to Lemma 1 of Bhattacharya (1967).

Lemma A1. Assume that $g(u) = E[Y \mid X = F^{-1}(u)]$ has $r$ continuous derivatives on $[0,1]$, $r > 0$, and that $K$ has bounded support and $r$ bounded derivatives on the support. Then, for $a$, $b$ such that $0 < F(a) < F(b) < 1$,
$$\left|\varepsilon_n^{-(r+1)} \iint y\,K^{(r)}\!\left((F(x) - F(z))/\varepsilon_n\right) dF(x,y)\right| = O(1)$$
uniformly in $z \in [a,b]$ as $n \to \infty$.

Proof. Note that
$$\varepsilon_n^{-(r+1)} \iint y\,K^{(r)}\!\left((F(x) - F(z))/\varepsilon_n\right) dF(x,y) = \varepsilon_n^{-(r+1)}\,E\!\left[Y K^{(r)}\!\left((F(X) - F(z))/\varepsilon_n\right)\right]$$
$$= \varepsilon_n^{-(r+1)} \int m(x)\,K^{(r)}\!\left((F(x) - F(z))/\varepsilon_n\right) dF(x) = \varepsilon_n^{-(r+1)} \int_0^1 g(u)\,K^{(r)}\!\left((u - F(z))/\varepsilon_n\right) du.$$
Now write
$$\varepsilon_n^{-(r+1)} g(u)\,K^{(r)}\!\left((u - F(z))/\varepsilon_n\right) = (-1)^r\,\varepsilon_n^{-1} g^{(r)}(u)\,K\!\left((u - F(z))/\varepsilon_n\right) + \frac{d}{du} \sum_{s=0}^{r-1} (-1)^{r-1-s}\,\varepsilon_n^{-(s+1)} g^{(r-s-1)}(u)\,K^{(s)}\!\left((u - F(z))/\varepsilon_n\right).$$
Hence
$$\sup_z \left|\varepsilon_n^{-(r+1)} \int_0^1 g(u)\,K^{(r)}\!\left((u - F(z))/\varepsilon_n\right) du\right| \le \sup_z \left|\varepsilon_n^{-1} \int_0^1 g^{(r)}(u)\,K\!\left((u - F(z))/\varepsilon_n\right) du\right|$$
$$+\; \sup_z \left|\left[\sum_{s=0}^{r-1} (-1)^{r-1-s}\,\varepsilon_n^{-(s+1)} g^{(r-s-1)}(u)\,K^{(s)}\!\left((u - F(z))/\varepsilon_n\right)\right]_{u=0}^{u=1}\right|.$$
The second term above is zero for large $n$, since for $z \in [a,b]$ the argument of $K^{(s)}$ at $u = 0$ and $u = 1$ is eventually outside the support of $K$. For the first term, write
$$\sup_z \left|\varepsilon_n^{-1} \int_0^1 g^{(r)}(u)\,K\!\left((u - F(z))/\varepsilon_n\right) du\right| = \sup_z \left|\int_{-F(z)/\varepsilon_n}^{(1-F(z))/\varepsilon_n} K(v)\,g^{(r)}(\varepsilon_n v + F(z))\,dv\right| \le \sup_t \left|g^{(r)}(t)\right| \int |K(v)|\,dv < \infty. \qquad \square$$
We now proceed with the proof of Lemma 2.1. It is convenient to rewrite
$$M_n(u) = \varepsilon_n^{-1} \iint y\,K\!\left((F_n(x) - F_n(u))/\varepsilon_n\right) dF_n(x,y),$$
and similarly for $M_n^{*}$ and $M_n^{**}$. Thus, letting $Z_n(x,y) = F_n(x,y) - F(x,y)$, we may write
$$M_n^{*}(u) - M_n(u) = \varepsilon_n^{-1} \iint y\left[K\!\left(\frac{F_n(x) - F(u)}{\varepsilon_n}\right) - K\!\left(\frac{F_n(x) - F_n(u)}{\varepsilon_n}\right)\right] dZ_n(x,y)$$
$$+\; \varepsilon_n^{-1} \iint y\left[K\!\left(\frac{F_n(x) - F(u)}{\varepsilon_n}\right) - K\!\left(\frac{F_n(x) - F_n(u)}{\varepsilon_n}\right)\right] dF(x,y) = J_1 + J_2, \text{ say.}$$
We first show $(n\varepsilon_n \log n)^{1/2}\,|J_2| \stackrel{p}{\to} 0$. Since, by assumption, $K$ has 3 continuous derivatives, we may write, by expanding $K((F_n(x) - F_n(u))/\varepsilon_n)$ about $(F_n(x) - F(u))/\varepsilon_n$,
$$J_2 = \varepsilon_n^{-2}\,[F_n(u) - F(u)] \iint y\,K'\!\left(\frac{F_n(x) - F(u)}{\varepsilon_n}\right) dF(x,y)$$
$$-\; \tfrac{1}{2}\,\varepsilon_n^{-3}\,[F_n(u) - F(u)]^2 \iint y\,K''\!\left(\frac{F_n(x) - F(u)}{\varepsilon_n}\right) dF(x,y)$$
$$+\; \tfrac{1}{6}\,\varepsilon_n^{-4}\,[F_n(u) - F(u)]^3 \iint y\,K'''\!\left(\frac{w_n(x,u)}{\varepsilon_n}\right) dF(x,y) = J_2^{(1)} + J_2^{(2)} + J_2^{(3)}, \text{ say,}$$
where $w_n(x,u)$ is between $F_n(x) - F_n(u)$ and $F_n(x) - F(u)$. Now, expanding $K'((F_n(x) - F(u))/\varepsilon_n)$ about $(F(x) - F(u))/\varepsilon_n$ yields
$$(A1)\qquad (n\varepsilon_n \log n)^{1/2} \sup_u \left|J_2^{(1)}\right| \le (n\varepsilon_n \log n)^{1/2}\,\varepsilon_n^{-2} \sup_u \left|F_n(u) - F(u)\right|$$
$$\times\; \sup_u \left\{\left|\iint y\,K'\!\left(\frac{F(x) - F(u)}{\varepsilon_n}\right) dF(x,y)\right| + \iint \left|\frac{F_n(x) - F(x)}{\varepsilon_n}\right| \left|y\,K''\!\left(\frac{F(x) - F(u)}{\varepsilon_n}\right)\right| dF(x,y)\right.$$
$$\left. +\; \tfrac{1}{2} \iint \left(\frac{F_n(x) - F(x)}{\varepsilon_n}\right)^{2} \left|y\,K'''\!\left(\frac{v_n(x,u)}{\varepsilon_n}\right)\right| dF(x,y)\right\},$$
where $v_n(x,u)$ is between $F_n(x) - F(u)$ and $F(x) - F(u)$.
Using the fact that $\sup_u |F_n(u) - F(u)| = O_p(n^{-1/2})$ and applying Lemma A1 (with $r = 1$), the first term on the right-hand side of (A1) goes to zero. For the second term, note that
$$\varepsilon_n^{-1} \iint \left|y\,K''\!\left(\frac{F(x) - F(u)}{\varepsilon_n}\right)\right| dF(x,y) = \int_{-F(u)/\varepsilon_n}^{(1-F(u))/\varepsilon_n} |K''(v)|\,h(\varepsilon_n v + F(u))\,dv,$$
which is a bounded sequence since $h$ is bounded and $K''$ has bounded support. Thus the second term on the right-hand side of (A1) is equal to $(n\varepsilon_n \log n)^{1/2}\,\varepsilon_n^{-2}\,O_p(n^{-1})\,O(1)$, which converges to zero in probability if $(n\varepsilon_n \log n)^{1/2}/(n\varepsilon_n^2) \to 0$, i.e., if $(n\varepsilon_n^3)(\log n)^{-1} \to \infty$, which is true by assumption. For the third term on the right-hand side of (A1), note that
$$\iint \left|y\,K'''\!\left(v_n(x,u)/\varepsilon_n\right)\right| dF(x,y) \le \sup_v |K'''(v)|\;E|Y| < \infty.$$
Thus the third term is a $(n\varepsilon_n \log n)^{1/2}\,\varepsilon_n^{-4}\,O_p(n^{-3/2})$ sequence, and converges to zero in probability since $(\log n)^{-1}\,n\varepsilon_n^{7/2} \to \infty$. Similar arguments apply to $J_2^{(2)}$ and $J_2^{(3)}$, and we have shown $(n\varepsilon_n \log n)^{1/2} \sup_u |J_2| \stackrel{p}{\to} 0$.
We now turn to $J_1$. Let $\{a_n\}$ be a sequence as specified in the hypotheses, and write $J_1 = J_1^{(1)} + J_1^{(2)}$, say, where $J_1^{(1)}$ is the contribution to $J_1$ from $\{|y| > a_n\}$ and $J_1^{(2)}$ that from $\{|y| \le a_n\}$, and where, for convenience, we write
$$G_n(x,u) = K\!\left(\frac{F_n(x) - F(u)}{\varepsilon_n}\right) - K\!\left(\frac{F_n(x) - F_n(u)}{\varepsilon_n}\right).$$
Using integration by parts, write
$$J_1^{(2)} = \varepsilon_n^{-1} \iint_{|y| \le a_n} y\,G_n(x,u)\,dZ_n(x,y) = I_1 + I_2 + I_3 + I_4 + I_5,$$
where
$$I_1 = \varepsilon_n^{-1} \iint_{|y| \le a_n} Z_n(x,y)\,dy\,G_n(dx,u),$$
$$I_2 = \lim_{t \to \infty}\,\varepsilon_n^{-1} \int_{-a_n}^{a_n} G_n(t,u)\,y\,Z_n(t,dy), \qquad I_3 = -\lim_{t \to -\infty}\,\varepsilon_n^{-1} \int_{-a_n}^{a_n} G_n(t,u)\,y\,Z_n(t,dy),$$
$$I_4 = \varepsilon_n^{-1} a_n \int Z_n(x,a_n)\,G_n(dx,u), \qquad I_5 = \varepsilon_n^{-1} a_n \int Z_n(x,-a_n)\,G_n(dx,u)$$
(the signs of the boundary terms are immaterial for the bounds that follow). Since $Z_n(-\infty, y) = 0$ for each $n$ and $y$, it is easily ascertained that $I_3 = 0$ for each $n$ (e.g., Natanson (1964), p. 233). Similarly,
$$I_2 = \lim_{t \to \infty} G_n(t,u)\;\varepsilon_n^{-1} \int_{-a_n}^{a_n} y\,d\tilde{Z}_n(y),$$
where $\tilde{Z}_n(y) = \lim_{t \to \infty} Z_n(t,y) = F_n^Y(y) - F^Y(y)$. Now
$$\int_{-a_n}^{a_n} y\,d\tilde{Z}_n(y) = n^{-1} \sum_{i=1}^{n} \left\{Y_i\,1_{[-a_n,a_n]}(Y_i) - E\,Y\,1_{[-a_n,a_n]}(Y)\right\} = O_p(n^{-1/2})$$
as $n \to \infty$, by standard central limit theorem arguments. Further, using the mean value theorem,
$$\lim_{t \to \infty} G_n(t,u) = K\!\left(\frac{1 - F(u)}{\varepsilon_n}\right) - K\!\left(\frac{1 - F_n(u)}{\varepsilon_n}\right) = \varepsilon_n^{-1}\,[F_n(u) - F(u)]\,K'\!\left(\frac{1 - q_n(u)}{\varepsilon_n}\right) = O_p(n^{-1/2}\varepsilon_n^{-1})$$
uniformly in $u$, where $q_n(u)$ is between $F_n(u)$ and $F(u)$. Thus we have
$$(n\varepsilon_n \log n)^{1/2} \sup_u |I_2(u)| = (n\varepsilon_n \log n)^{1/2}\,\varepsilon_n^{-2}\,O_p(n^{-1}) \to 0,$$
since $n\varepsilon_n^3/\log n \to \infty$.
For $I_4$, note that
$$\left|\int Z_n(x,a_n)\,G_n(dx,u)\right| \le \sup_x |Z_n(x,a_n)|\;V[G_n(\cdot\,,u)],$$
where $V[\cdot]$ denotes total variation over $\mathbb{R}$. Now
$$\sup_x |Z_n(x,a_n)| = O_p(n^{-1/2}),$$
and it is easily verified, using the mean value theorem, that $V[G_n(\cdot\,,u)] = O_p(n^{-1/2}\varepsilon_n^{-1})$ uniformly in $u$. Thus
$$(n\varepsilon_n \log n)^{1/2} \sup_u |I_4(u)| = a_n\,(n\varepsilon_n \log n)^{1/2}\,\varepsilon_n^{-2}\,O_p(n^{-1}) \stackrel{p}{\to} 0,$$
since $a_n^2 \log n/(n\varepsilon_n^3) \to 0$ by assumption. A similar argument applies to show $(n\varepsilon_n \log n)^{1/2} \sup_u |I_5(u)| \stackrel{p}{\to} 0$. For $I_1$, note that
$$|I_1| = \left|\varepsilon_n^{-1} \iint_{|y| \le a_n} Z_n(x,y)\,dy\,G_n(dx,u)\right| \le \varepsilon_n^{-1} \sup_{x,y} |Z_n(x,y)|\;V[y\,G_n(x,u)],$$
where $V$ denotes here the total variation in $(x,y)$ over $\mathbb{R} \times [-a_n, a_n]$. As before,
$$\sup_{x,y} |Z_n(x,y)| = O_p(n^{-1/2}),$$
and $V[y\,G_n(x,u)] = a_n\,O_p(n^{-1/2}\varepsilon_n^{-1})$ uniformly in $u$. Thus
$$(n\varepsilon_n \log n)^{1/2} \sup_u |I_1(u)| = a_n\,\varepsilon_n^{-2}\,(n\varepsilon_n \log n)^{1/2}\,O_p(n^{-1}) \stackrel{p}{\to} 0,$$
since $a_n^2 \log n/(n\varepsilon_n^3) \to 0$ by assumption.
As the final step in the proof, we must verify that $(n\varepsilon_n \log n)^{1/2} \sup_u |J_1^{(1)}(u)| \stackrel{p}{\to} 0$. Note that
$$(A2)\qquad |J_1^{(1)}(u)| \le \varepsilon_n^{-1} \left|\iint_{|y| > a_n} y\,G_n(x,u)\,dF_n(x,y)\right| + \varepsilon_n^{-1} \left|\iint_{|y| > a_n} y\,G_n(x,u)\,dF(x,y)\right|.$$
For the first term, note that
$$\varepsilon_n^{-1} \left|\iint_{|y| > a_n} y\,G_n(x,u)\,dF_n(x,y)\right| \le \varepsilon_n^{-1} \sup_{x,u} |G_n(x,u)| \int_{|y| > a_n} |y|\,dF_n^Y(y).$$
As before,
$$\sup_{x,u} |G_n(x,u)| = O_p(n^{-1/2}\varepsilon_n^{-1}).$$
Now, by the Markov inequality, for any $\eta > 0$,
$$P\!\left(n^{1/2} \int_{|y| > a_n} |y|\,dF_n^Y(y) > \eta\right) \le \eta^{-1}\,n^{1/2} \int_{|y| > a_n} |y|\,dF^Y(y) \to 0$$
by assumption, and thus $\int_{|y| > a_n} |y|\,dF_n^Y(y) = o_p(n^{-1/2})$. Hence the first term on the right-hand side of (A2), multiplied by $(n\varepsilon_n \log n)^{1/2}$, is of order $(n\varepsilon_n \log n)^{1/2}\,\varepsilon_n^{-2}\,o_p(n^{-1}) \stackrel{p}{\to} 0$, since $n\varepsilon_n^3/\log n \to \infty$ by assumption. A similar argument applies to the second term on the right-hand side of (A2), and we thus have
$$(n\varepsilon_n \log n)^{1/2} \sup_u |J_1^{(1)}(u)| \stackrel{p}{\to} 0. \qquad \square$$
The proof of (2.2) follows a similar pattern, and we omit the details.
ACKNOWLEDGEMENT
The author thanks R. J. Carroll for many helpful suggestions.
REFERENCES

Bhattacharya, P. K. (1967), "Estimation of a probability density function and its derivatives", Sankhya, Series A, 29, pp. 373-382.

Bickel, P. J. and Rosenblatt, M. (1973), "On some global measures of the deviations of density function estimates", Ann. Statist., 1, pp. 1071-1095.

Johnston, G. J. (1979), Ph.D. Dissertation, University of North Carolina at Chapel Hill, Dept. of Statistics, unpublished.

Natanson, I. P. (1964), "Theory of Functions of a Real Variable", Vol. I, Ungar.

Watson, G. S. (1964), "Smooth regression analysis", Sankhya, Series A, 26, pp. 359-372.

Watson, G. S. and Leadbetter, M. R. (1964), "Hazard analysis II", Sankhya, Series A, 26, pp. 101-116.

Yang, S. S. (1977), "Linear functions of concomitants of order statistics", Technical Report No. 7, Dept. of Mathematics, M.I.T.