
A NOTE ON THE ASYMPTOTIC BEHAVIOR OF
CONDITIONAL EXTREMES
Ashis K. Gangopadhyay
University of North Carolina at Chapel Hill
ABSTRACT
Suppose (X_1,Z_1), ..., (X_n,Z_n) are i.i.d. as (X,Z). Let M_{n,k} be the maximum of all
Z_i's for which X_i is among the k nearest neighbors of X = x_0. It is shown that M_{n,k} has the
same asymptotic distribution as the maximum of k i.i.d. copies of the random variable Z
given X = x_0. Hence, the asymptotic results for i.i.d. observations are applicable to this
situation as well.
Key words: nearest neighbor, conditional extremes.
1. INTRODUCTION AND FORMULATION OF THE PROBLEM
In recent years, extreme value theory has generated considerable interest. Not
only have the classical results for i.i.d. observations been revisited, but these results have
also been extended to handle some types of dependent observations. A review of the classical
and current results in this area may be found in Leadbetter, Lindgren and Rootzén (1983).
Our objective, however, is to study the asymptotic behavior of local extremes of a
conditional distribution.
More specifically, suppose (X_1,Z_1), ..., (X_n,Z_n) are i.i.d. as (X,Z). Let f and F be the
density and the distribution of the random variable X, and let g and G be the conditional
density and the conditional distribution of Z given X. Define, at X = x_0,

    Z_F = sup{z : G(z|x_0) < 1} ;

Z_F is called the right end point of the conditional distribution of Z given X = x_0. For
brevity, denote G(z|x_0) simply by G(z).
Now, choose k = k(n) = O(n^α), where 1/2 < α < 2/3. More specifically, let
k = k(n) = [φ(n)b], where φ(n) = n^α, 1/2 < α < 2/3, and b is a constant, 0 < b < ∞. For
simplicity of argument, assume that φ(n)b is an integer.
Furthermore, define Y_i = |X_i - x_0|, i = 1,2,...,n, and Y = |X - x_0|. Also let
0 ≤ Y_{n,1} < Y_{n,2} < ... < Y_{n,n} < ∞ denote the order statistics of Y_1,...,Y_n, and let
Z_{n,1},...,Z_{n,n} be the induced order statistics of (Y_1,Z_1),...,(Y_n,Z_n); i.e.,
Z_{n,i} = Z_j if Y_{n,i} = Y_j, i = 1,2,...,n.
Now, define

    M_{n,k} = max{Z_{n,1}, ..., Z_{n,k}} .

Our objective is to study the asymptotic behavior of M_{n,k}.
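In computational terms, M_{n,k} is obtained by sorting the distances |X_i - x_0| and taking the maximum of the Z-values induced by the k smallest. A minimal sketch in Python (the helper name knn_conditional_max and the toy data are ours, not from the paper):

```python
import numpy as np

def knn_conditional_max(x, z, x0, k):
    """M_{n,k}: the maximum of the Z_i's whose X_i is among the
    k nearest neighbors of x0 (ties broken arbitrarily by argsort)."""
    order = np.argsort(np.abs(np.asarray(x, dtype=float) - x0))  # order statistics of Y_i = |X_i - x0|
    return float(np.asarray(z, dtype=float)[order[:k]].max())    # max of the first k induced order statistics

# toy data: the three x's nearest to x0 = 0 are 0.05, 0.1 and -0.2
x = [0.1, -0.2, 0.9, 0.05, -0.5]
z = [1.0, 3.0, 10.0, 2.0, 7.0]
print(knn_conditional_max(x, z, x0=0.0, k=3))  # max(1.0, 3.0, 2.0) = 3.0
```

Only the ordering of the distances matters, so `np.argpartition` could replace the full sort when k is much smaller than n.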
Note that if Z_F < ∞, then M_{n,k} may be thought of as a k-nearest-neighbor
estimate of Z_F. However, as far as the theoretical treatment of the problem is concerned,
we do not need to assume that Z_F < ∞.
At this point, we make the following assumptions:

1. (a) f(x_0) > 0 and f″ is continuous at x = x_0.
   (b) f′(x) is bounded at x = x_0.

2. (a) G_x(z|x) and G_xx(z|x) exist and are bounded for (x,z) lying in a
       neighborhood of (x_0, Z_F).
   (b) G_xx(z|x) is equicontinuous in x at x = x_0 for z lying in a neighborhood of
       Z_F; i.e.,

           lim_{x→x_0} G_xx(z|x) = G_xx(z|x_0)

       holds uniformly for z lying in a neighborhood of Z_F.
2. ASYMPTOTIC DISTRIBUTION OF M_{n,k}

Let Z̃_1, Z̃_2, ..., Z̃_k be k i.i.d. observations having the distribution G(z|x_0); i.e.,
Z̃_1, ..., Z̃_k are i.i.d. as Z given X = x_0. Define

    M̃_{n,k} = max(Z̃_1, ..., Z̃_k) .
The following theorem describes the relationship between the asymptotic
distributions of M_{n,k} and M̃_{n,k}.
Theorem 1

Under assumptions 1 and 2, we have

    P(M_{n,k} ≤ z) = G^k(z|x_0) + R_{n,k} ,

where the bias term R_{n,k} is such that R_{n,k} → 0 as n (and, hence, k) → ∞.
A detailed proof of the theorem is given in Section 3. The following corollary is a
simple consequence of Theorem 1.
Corollary 1

Under the assumptions of Theorem 1,

    P(M_{n,k} ≤ z) - P(M̃_{n,k} ≤ z) → 0 .
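This can be checked by simulation. A quick Monte Carlo sketch for a bivariate normal (X,Z) with correlation ρ, where Z given X = x_0 is N(ρx_0, 1-ρ²): the empirical cdf of M_{n,k} is compared with G^k(z|x_0) = Φ((z - ρx_0)/√(1-ρ²))^k. The sample sizes, grid and tolerance below are our choices, not the paper's:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)

def norm_cdf(u):
    # standard normal cdf via the error function
    return 0.5 * (1.0 + erf(u / sqrt(2.0)))

n, k, x0, rho, reps = 4000, 100, 0.0, 0.5, 1500   # k = 100 ~ n^0.55, within (n^{1/2}, n^{2/3})
z_grid = np.array([1.5, 2.0, 2.5, 3.0])

hits = np.zeros(len(z_grid))
for _ in range(reps):
    x = rng.standard_normal(n)
    z = rho * x + sqrt(1.0 - rho**2) * rng.standard_normal(n)  # Z | X=x ~ N(rho*x, 1-rho^2)
    nearest = np.argsort(np.abs(x - x0))[:k]                   # k nearest neighbors of x0
    hits += (z[nearest].max() <= z_grid)                       # indicator M_{n,k} <= z, per grid point

emp = hits / reps                                              # empirical P(M_{n,k} <= z)
sd = sqrt(1.0 - rho**2)
theo = np.array([norm_cdf(v) ** k for v in (z_grid - rho * x0) / sd])  # G^k(z | x0)
print(np.round(emp, 3), np.round(theo, 3))
```

The two rows agree within Monte Carlo error, illustrating that the bias term R_{n,k} is already small at moderate n.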
Corollary 2

Under the assumptions of Theorem 1,

    M_{n,k} → Z_F   almost surely.

Proof: Let ε > 0. By Theorem 1,

(2.1)    P(M_{n,k} ≤ Z_F - ε) → 0   as n (and k) → ∞ .

Similarly,

(2.2)    P(M_{n,k} ≥ Z_F + ε) → 0   as n → ∞ .

So, by (2.1) and (2.2), we have M_{n,k} → Z_F in probability. However, M_{n,k} is a
monotone increasing function of n and k; hence

    M_{n,k} → Z_F   a.s.

Q.E.D.
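The corollary can be illustrated numerically under a hypothetical model with a finite conditional end point (our choice, not an example from the paper): X ~ N(0,1) and Z | X = x ~ Uniform(0, 1 + x²), so that at x_0 = 0 the right end point is Z_F = 1, and M_{n,k} should approach 1 as n grows:

```python
import numpy as np

rng = np.random.default_rng(1)
x0 = 0.0
estimates = []
for n in (10**3, 10**4, 10**5):
    k = int(n ** 0.55)                         # k = n^alpha for an alpha in (1/2, 2/3)
    x = rng.standard_normal(n)
    z = rng.uniform(0.0, 1.0 + x**2)           # Z | X=x ~ Uniform(0, 1 + x^2); Z_F = 1 at x0 = 0
    nearest = np.argsort(np.abs(x - x0))[:k]
    estimates.append(float(z[nearest].max()))  # M_{n,k}, a k-NN estimate of Z_F
print([round(m, 3) for m in estimates])        # values climbing toward Z_F = 1
```

The estimates sit just below (or negligibly above) 1, since the selected neighbors have conditional end points 1 + Y_{n,i}² ≈ 1.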
Theorem 2

Under assumptions 1 and 2, if for some constants a_k > 0 and b_k we have

    P(a_k(M_{n,k} - b_k) ≤ z) → H(z|x_0)

for some non-degenerate H, then H is one of the following three extreme value types:

Type I:    H(z|x_0) = exp(-e^{-z}) ,   -∞ < z < ∞ .

Type II:   H(z|x_0) = 0 for z ≤ 0, and H(z|x_0) = exp(-z^{-α}) for z > 0,
           for some α > 0 .

Type III:  H(z|x_0) = exp(-(-z)^α) for z ≤ 0, and H(z|x_0) = 1 for z > 0,
           for some α > 0 .

Proof: By Theorem 1,

    P(a_k(M_{n,k} - b_k) ≤ z) → H(z|x_0)

if and only if

    P(a_k(M̃_{n,k} - b_k) ≤ z) → H(z|x_0) ;

hence the result is an easy consequence of Theorem 1.4.2 of (4) for i.i.d. sequences.

Q.E.D.
The following theorem gives a necessary and sufficient condition for the asymptotic
distribution to belong to each of the three types.
Theorem 3

Necessary and sufficient conditions for the conditional distribution G of the r.v.'s of
the i.i.d. sequence {Z̃_k} given X = x_0 to belong to each of the three types are:

Type I:    There exists some strictly positive function g(t,x_0) such that

    lim_{t→Z_F} [1 - G(t + z g(t,x_0))] / [1 - G(t)] = e^{-z}

for all real z.

Type II:   Z_F = ∞ and

    lim_{t→∞} [1 - G(tz)] / [1 - G(t)] = z^{-α} ,   α > 0 ,

for each z > 0.

Type III:  Z_F < ∞ and

    lim_{h↓0} [1 - G(Z_F - zh)] / [1 - G(Z_F - h)] = z^α ,   α > 0 ,

for each z > 0.

Proof of this theorem is exactly the same as that of Theorem 1.6.2 in (4). A simpler
sufficient condition for G to belong to each of the three possible types under additional
conditions is given in Theorem 1.6.1 of (4).
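As a sanity check of the Type I criterion, take G to be a unit exponential conditional law (a hypothetical choice for illustration, not the paper's example). With g(t,x_0) ≡ 1 the survival ratio equals e^{-z} for every t, exactly as required:

```python
import math

def survival_ratio(t, z, g=1.0):
    """(1 - G(t + z*g)) / (1 - G(t)) for the unit exponential G(u) = 1 - exp(-u)."""
    surv = lambda u: math.exp(-u)       # 1 - G(u)
    return surv(t + z * g) / surv(t)

# the ratio is e^{-z} regardless of t, so the Type I condition holds with g(t, x0) = 1
print(abs(survival_ratio(2.0, 1.3) - math.exp(-1.3)) < 1e-12,
      abs(survival_ratio(50.0, 1.3) - math.exp(-1.3)) < 1e-12)  # prints: True True
```

The memoryless property of the exponential is what makes the auxiliary function g constant here; for other Type I laws g(t,x_0) genuinely varies with t.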
Example:

Let (X,Z) have a bivariate normal distribution with μ_X = 0, μ_Z = 0, σ_X² = 1, σ_Z² = 1
and Corr(X,Z) = ρ. Then the conditional distribution of Z given X = x_0 is normal with
mean ρx_0 and variance (1-ρ²). So, by Corollary 1 and Theorem 1.5.3 of (4), we have

    P(a_k(M_{n,k} - b_k) ≤ z) → exp(-e^{-z}) ,

where

    a_k = (2 log k)^{1/2} (1-ρ²)^{-1/2}

and

    b_k = ρx_0 + [(2 log k)^{1/2} - (1/2)(2 log k)^{-1/2}(log log k + log 4π)] (1-ρ²)^{1/2} .
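The normalization can be illustrated by Monte Carlo on the i.i.d. surrogate M̃_{n,k} of Corollary 1, using the standard Gumbel norming constants for normal maxima (Leadbetter et al., Theorem 1.5.3) rescaled to N(ρx_0, 1-ρ²). The values of k, the replicate count and the acceptance band are our choices; convergence for normal maxima is slow, of order 1/log k, so the fraction below sits slightly above e^{-1} ≈ 0.368:

```python
import numpy as np
from math import log, sqrt, pi

rng = np.random.default_rng(2)
k, reps, rho, x0 = 2000, 2000, 0.5, 0.3
sd = sqrt(1.0 - rho**2)

# norming constants: c_k for the standard normal, then rescaled to N(rho*x0, 1-rho^2)
c_k = sqrt(2 * log(k)) - (log(log(k)) + log(4 * pi)) / (2 * sqrt(2 * log(k)))
a_k = sqrt(2 * log(k)) / sd
b_k = rho * x0 + sd * c_k

# M~_{n,k}: maxima of k i.i.d. draws from the conditional law N(rho*x0, 1-rho^2)
maxima = rho * x0 + sd * rng.standard_normal((reps, k)).max(axis=1)
frac = float(np.mean(a_k * (maxima - b_k) <= 0.0))
print(frac)   # near exp(-e^0) = e^{-1} ~ 0.37, mildly inflated at finite k
```

Evaluating the normalized maxima at other z and comparing with exp(-e^{-z}) gives the same picture across the Gumbel cdf.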
3. PROOF OF THEOREM 1
First, we will establish two important results in Lemma 1 and Lemma 2. Lemma 1
deals with the order statistics of the random variable Y = |X - x_0|, and Lemma 2 gives a
representation of the conditional distribution of Z given Y = Y_{n,i} (for i = 1,2,...,k) in
terms of the conditional distribution of Z given X = x_0.
Lemma 1

Let φ(n) → ∞ and n^{-1}φ(n) → 0 as n → ∞. Also assume f(x_0) > 0. Then for
B > b/f(x_0) and C ≥ 0, we have

    P(Y_{n,φ(n)b} > n^{-1}φ(n)B + C)
        ≤ exp[-2 n^{-1}φ²(n)(B f(x_0) - b)² - 2nC²(f(x_0))² - 4Cφ(n) f(x_0)(B f(x_0) - b)] .

Proof: Bhattacharya and Mack (1987) proved the result for C = 0. Exactly the same line
of argument may be used to obtain the result for any C ≥ 0.

Q.E.D.
Now, let f_Y and F_Y denote the pdf and cdf of Y, and let G*(·|y) be the conditional cdf of Z
given Y = y. Note that G*(·|0) = G(·|x_0) = G(·). Denote G*(·|Y_{n,i}) (i.e., the
conditional distribution of Z given Y = Y_{n,i}) by G_{n,i}(·).
Lemma 2

Under assumptions 1 and 2,

    G_{n,i}(z) = G(z) + δ_{n,i}(Q(z) + R(z,δ_{n,i})) ,   i = 1,2,...,n ,

where

    δ_{n,i} = (1/2) Y_{n,i}² ,   Q(z) = {f(x_0) G_xx(z|x_0) + 2 f′(x_0) G_x(z|x_0)}/f(x_0) ,

and

    lim_{δ↓0} R(z,δ) = 0

uniformly in z lying in a neighborhood of Z_F.
Proof: This result is very similar to Lemma 2.3 in Gangopadhyay (1987). Here we will
sketch the line of argument. Since Y = |X - x_0|,

    f_Y(y) = f(x_0+y) + f(x_0-y) .

Also,

    G*(z|y) = [f(x_0+y) G(z|x_0+y) + f(x_0-y) G(z|x_0-y)] / [f(x_0+y) + f(x_0-y)] .

Expand f(x_0+y), f(x_0-y), G(z|x_0+y) and G(z|x_0-y) around x_0 to obtain

    f(x_0+y) + f(x_0-y) = 2 f(x_0) + (1/2) y² {f″(x_0+θ_1 y) + f″(x_0+θ_2 y)}

and

    G(z|x_0±y) = G(z|x_0) ± y G_x(z|x_0) + (1/2) y² G_xx(z|x_0+θ_3 y)   (resp. θ_4) ,

where |θ_i| ≤ 1, i = 1,2,3,4. Since by virtue of our regularity conditions

    lim_{y→0} [f″(x_0+θ_i y) - f″(x_0)] = 0

and

    lim_{y→0} [G_xx(z|x_0+θ_i y) - G_xx(z|x_0)] = 0

uniformly in z lying in a neighborhood of Z_F, we can write

    f(x_0+y) + f(x_0-y) = 2 f(x_0)[1 + (1/2) y² {f″(x_0)/f(x_0)} + y² ε_1(y)] ,

where ε_1(y) → 0 as y → 0, so that

    [f(x_0+y) + f(x_0-y)]^{-1} = (2 f(x_0))^{-1}[1 - (1/2) y² {f″(x_0)/f(x_0)} + y² ε_2(y)] ,

where ε_2(y) is also o(1) as y → 0, and

    f(x_0+y) G(z|x_0+y) + f(x_0-y) G(z|x_0-y)
        = 2 f(x_0)[G(z|x_0) + (1/2) y² {f(x_0) G_xx(z|x_0) + G(z|x_0) f″(x_0)
          + 2 f′(x_0) G_x(z|x_0)}/f(x_0) + y² ε_3(z,y)] ,

where

    ε_3(z,y) = (1/4) {G(z|x_0)/f(x_0)} Σ_{i=1}^{2} {f″(x_0+θ_i y) - f″(x_0)}
             + (1/4) Σ_{i=3}^{4} {G_xx(z|x_0+θ_i y) - G_xx(z|x_0)}

is o(1) as y → 0, uniformly in z lying in a neighborhood of Z_F.

Multiplying the last two expressions, we now have

    G*(z|y) = [1 - (1/2) y² {f″(x_0)/f(x_0)} + y² ε_2(y)]
              × [G(z|x_0) + (1/2) y² {f(x_0) G_xx(z|x_0) + G(z|x_0) f″(x_0)
                + 2 f′(x_0) G_x(z|x_0)}/f(x_0) + y² ε_3(z,y)]
            = G(z|x_0) + (1/2) y² {Q(z) + R(z,(1/2)y²)}
            = G(z|x_0) + δ {Q(z) + R(z,δ)} ,   δ = (1/2) y² ,

where

    Q(z) = {f(x_0) G_xx(z|x_0) + 2 f′(x_0) G_x(z|x_0)}/f(x_0)

and R(z,(1/2)y²) stands for the remainder term; i.e., R(z,(1/2)y²) is of the form

    R(z,(1/2)y²) = 2 {ε_3(z,y) + G(z|x_0) ε_2(y)} + y² A(z) ,

where A(z) is a linear combination of G(z|x_0), G_x(z|x_0) and G_xx(z|x_0), which are
bounded in a neighborhood of Z_F. Together with the convergence properties of ε_2(y) and
ε_3(z,y), as observed earlier, this implies that

    lim_{δ↓0} R(z,δ) = lim_{y→0} R(z,(1/2)y²) = 0

uniformly in z lying in a neighborhood of Z_F. Hence,

    G_{n,i}(z) = G*(z|Y_{n,i}) = G(z) + δ_{n,i}(Q(z) + R(z,δ_{n,i})) ,

where δ_{n,i} = (1/2) Y_{n,i}². This completes the proof of Lemma 2.
Proof of Theorem 1

Let A = σ(Y_1,...,Y_n). We use the fact that the induced order statistics
Z_{n,1},...,Z_{n,n} are conditionally independent given Y_1,...,Y_n with the conditional
cdf's G_{n,1},...,G_{n,n} respectively (Bhattacharya, 1974) to obtain

    P(M_{n,k} ≤ z) = P(Z_{n,1} ≤ z, ..., Z_{n,k} ≤ z)
                   = E(P(Z_{n,1} ≤ z, ..., Z_{n,k} ≤ z | A))
                   = E[Π_{i=1}^{k} G_{n,i}(z)]
                   = E[Π_{i=1}^{k} {G(z) + δ_{n,i}(Q(z) + R(z,δ_{n,i}))}]
                   = E[Π_{i=1}^{k} (G(z) + V_{n,k,i})]
                   = G^k(z) + R_{n,k} ,

by Lemma 2, where

    V_{n,k,i} = δ_{n,i}(Q(z) + R(z,δ_{n,i}))

and

    R_{n,k} = E[Π_{i=1}^{k} (G(z) + V_{n,k,i})] - G^k(z) .

The objective is to show that R_{n,k} → 0 as n (and hence k) → ∞.
First, note that

(3.1)    -W′_{n,k} ≤ Π_{i=1}^{k} (G(z) + V_{n,k,i}) - G^k(z) ≤ W_{n,k} ,

where W_{n,k} and W′_{n,k} are the nonnegative random variables defined below, so that
|R_{n,k}| ≤ E(W_{n,k}) + E(W′_{n,k}). So, it is enough to show

(3.2)    E(W_{n,k}) → 0

and

(3.3)    E(W′_{n,k}) → 0

as n (and hence k) → ∞.
We will prove (3.2); the proof of (3.3) is similar. We start by defining

    W_{n,k} = Π_{i=1}^{k} {G(z) + δ_{n,i}(|Q(z)| + |R(z,δ_{n,i})|)} - G^k(z)

and

    W′_{n,k} = G^k(z) - Π_{i=1}^{k} {G(z) - δ_{n,i}(|Q(z)| + |R(z,δ_{n,i})|)} .

Furthermore, define

    W*_{n,k} = W_{n,k}   if Y_{n,φ(n)b} ≤ n^{-1}φ(n)B ,
             = 0         if Y_{n,φ(n)b} > n^{-1}φ(n)B .

So,

(3.4)    0 ≤ E(W*_{n,k}) = E[W_{n,k} I(Y_{n,φ(n)b} ≤ n^{-1}φ(n)B)] .
However, on the set S_n = {Y_{n,φ(n)b} ≤ n^{-1}φ(n)B},

    0 ≤ W_{n,k} = Π_{i=1}^{k} {G(z) + (1/2) Y_{n,i}² (|Q(z)| + |R(z,δ_{n,i})|)} - G^k(z)
               ≤ Π_{i=1}^{k} {G(z) + (1/2) n^{-2}φ²(n)B² (|Q(z)| + sup_{0≤δ≤n^{-2}φ²(n)B²} |R(z,δ)|)} - G^k(z) .

But Q(z) is a function of f′(x_0), G_xx(z|x_0) and G_x(z|x_0), which are bounded in a
neighborhood of Z_F. Now, using the fact that lim_{δ↓0} R(z,δ) = 0 for z lying in a
neighborhood of Z_F, we have for n > N_1

    sup_{0≤δ≤n^{-2}φ²(n)B²} |R(z,δ)| ≤ |Q(z)| .

So, on the set S_n and for n > N_1,

(3.5)    0 ≤ W_{n,k} ≤ {G(z) + n^{-2}φ²(n)B² |Q(z)|}^k - G^k(z)
              = Σ_{l=1}^{k} (k choose l) [n^{-2}φ²(n)B² |Q(z)|]^l G^{k-l}(z) ,

which is O(n^{-2}φ³(n)). Thus, for any arbitrary constant M > 0, there exists N_2 such that
for n > N_2,

    0 ≤ W_{n,k} ≤ M   on S_n ,

since φ(n) is chosen to be n^α where 1/2 < α < 2/3, so that n^{-2}φ³(n) = n^{3α-2} → 0.
This implies that for n ≥ max(N_1,N_2) = N,

    0 ≤ E[W_{n,k} I(S_n)] ≤ M .
Now, substituting (3.5) in (3.4), we have, as n → ∞,

(3.6)    E(W*_{n,k}) → 0 .

Finally, we will show that (3.6) implies

    E(W_{n,k}) → 0 .
Choose ε_n = n^{-1}φ(n)B. Then

    0 ≤ E(W_{n,k}) - E(W*_{n,k}) = E[W_{n,k} I(Y_{n,φ(n)b} > ε_n)] ≤ Σ_{i=1}^{∞} f_n(i) ,

where, using Lemma 1 (with C = i-1),

    f_n(i) = (ε_n + i)³ exp[-2 n^{-1}φ²(n)(B f(x_0) - b)²
             - 2n(i-1)²(f(x_0))² - 4φ(n)(i-1) f(x_0)(B f(x_0) - b)] .

So, except possibly for finitely many n, f_n(i) is uniformly bounded for each i. Now,
taking the limit on both sides and noting that ε_n → 0 as n → ∞, we have

    lim_{n→∞} Σ_{i=1}^{∞} f_n(i) = 0 ,

showing

    E(W_{n,k}) → 0 .

So, by (3.1), (3.2) and (3.3), we have

    R_{n,k} → 0 .

This completes the proof of Theorem 1.
Acknowledgement: I am deeply grateful to Professor P.K. Bhattacharya for many of his
helpful suggestions, which are incorporated in the present work.
REFERENCES

1. Bhattacharya, P.K. (1974). Convergence of sample paths of normalized sums of
   induced order statistics. Ann. Statist. 2, 1034-1039.

2. Bhattacharya, P.K. and Mack, Y.P. (1987). Weak convergence of k-NN density and
   regression estimators with varying k and applications. Ann. Statist. 15, 976-994.

3. Gangopadhyay, A.K. (1987). Nonparametric estimation of conditional quantile
   function. Ph.D. dissertation, University of California at Davis.

4. Leadbetter, M.R., Lindgren, G. and Rootzén, H. (1983). Extremes and Related
   Properties of Random Sequences and Processes. Springer-Verlag, New York.