A general class of estimators of the extreme value index

ELSEVIER
Journal of Statistical Planning and
Inference 66 (1998) 95 112
journal of
statistical planning
and inference
A general class of estimators of the extreme value index
Holger Drees *
Mathematisches Institut, Universit~t :u K?En, Weyertal 86-90, 50931 Kgln, German)'
Abstract
'We consider the class of estimators of the extreme value index [~ that can be represented as
a scale invariant functional T applied to the empirical tail quantile function Q,. From an approximation of Q,, first asymptotic normality of T(Q~) is derived under quite natural smoothness
conditions on 7" if/q is positive. As a consequence, a widely applicable method lbr the construction of estimators with a prescribed asymptotic behavior is introduced. If [¢ ~<0 then either
T :must be location invariant or it has to satisfy a certain regularity condition on a neighborhood
of a constant function to ensure asymptotic normality. It turns out that in this situation location
invariant estimators are clearly preferable. © 1998 Elsevier Science B.V
A M S class!fication: primary 62G05; secondary 62G30, 60F17
Keyword~': Extreme value distribution; Extreme value index; Statistical tail functional; Empirical
tail quantile function; Hadamard differentiability; Location invariance; Adaptive estimator
1. Introduction
Suppose n i.i.d, random variables ( r . v . ' s ) 8 , l<~i<~n, with common distribution
function (d.f.) F are observed. It is assumed that F belongs to the weak domain o f
attraction of some unknown extreme value d.f. G (in short, F ~ D(G)), that is,
U ( a\ , , '
(--m a x X i - b , , ) ) - - . G
weakly
(1.1)
for s o m e normalizing constants an > 0 and bn C R.
Since the famous work of Gnedenko (11943) it is well known that up to a scale and
location parameter a nondegenerate limiting d.f. has to be of the form
Gl~(X)= e x p ( - ( l @[Jx)-l/fi),
1 ÷fix>O, [JC~,
which is interpreted as e x p ( - e -x) if [1 = O.
E-mail: [email protected].
0378-3758/98/$19.00 @ 1998 Elsevier Science B.V. All rights reserved
PH S 0 3 7 8 - 3 7 5 8 ( 9 7 ) 0 0 0 7 6 - 1
96
H. Drees/Journal of Statistical Planning and InJerence 66 (1998) 95 112
The problem of estimating the so-called extreme value index/3, which determines the
behavior of the underlying d.f. F in its upper tail, has received much attention in the
last twenty years; see, e.g., Hill (1975), Pickands (1975), Hall and Welsh (1984, 1985),
Cs6rg6 et al. (1985), Smith (1987), Dekkers et al. (1989) and Drees (1995a, b, 1996).
An extensive motivation of this estimation problem can be found in Galambos (1978);
more recently de Haan and Rootz~n (1993) demonstrated how to use an estimate of/3
to construct estimators of extreme quantiles.
In view of (1.1) it is evident that an estimator of/3 should utilize only large observations. Essentially, there are two possible interpretations o f what is to be considered
as large in this context. The Gumbel or annual maxima method utilizes the maxima of
disjointed subsamples. Such an approach suggests itself, for example, if the data are
collected over a period that splits up into several subperiods in a natural way. On the
other hand, the method seems quite arbitrary if there is no such natural partition of the
sample.
For that reason, since the paper o f Pickands (1975) more attention has been paid to
estimators /~n that are based on a certain (deterministic or random) number, say k~ + 1,
o f upper order statistics X~-i+l:~, 1 <~i<~k~ + 1. Then one can find a functional T,
such that
L = r.(Q°),
where the empirical tail quantile function (q.f.) Q, is defined as
Qn(t):=Fnl(1-~t)=Xn_[kntj:n,
t E [0,1].
If one wants to investigate the asymptotic behavior of such a sequence o f estimators,
then clearly, one has to assume that the sequence of functionals T, converges to a
limiting functional T in some sense. Thus, it seems reasonable to assume that the
sequence is even constant, that is,
/~ = T(Qn)
(1.2)
for a functional T and all n C ~. In fact, almost all known estimators of the
extreme value index that are based on largest order statistics are o f this type, for
example, the Hill estimator, the Pickands estimator and the kernel estimators considered by Cs6rg6 et al. (1985). Observe that estimators o f the form (1.2) can be
regarded as an extreme value counterpart of the classical statistical functionals (see
Fernholz, 1983). Therefore, they will be addressed as statistical tail functionals in the
sequel.
In the following, assume that the number of order statistics used for the estimation constitutes a deterministic intermediate sequence, i.e., kn ---+w and k,/n ~ O. Note,
H. Drees/Journal o[' Statistical Phznning and ln[~'rence 66 f1998) 95 112
97
however, that all results established below carry over to estimators that are based on
the exceedances over a given deterministic threshold if the average number of these
exc,eedances equals k,.
In a foregoing paper (Drees, 1995b) we investigated the asymptotic behavior of
statistical tail functionals T(Qn) under the assumption that T is scale and location
invariant. Indeed there are mathematical, as well as practical reasons to require such
invariance properties. Since an affine transformation of the r.v.'s Xi merely leads to
a change of the normalizing constants an and b,,, it influences neither the extreme
value index nor the accuracy of the approximation ( l . l ) . Moreover, in practice, it is
reasonable to demand that a change of the unit of measure, which implies a scale
transformation of the data, will not change the estimate of [L Finally, in some applications, the observations (e.g., sea levels or temperatures) depend on an arbitrarily
chosen zero-point, which equally should not affect the estimator.
,On the other hand, while all the well-known estimators of the extreme value index
are scale invariant, there are some that are not invariant under a shift of the data.
Certainly, the best-known estimator of this type is the Hill estimator, which is only
consistent for nonnegative//, but there are also estimators that work for all real extreme
value indices, e.g., the moment estimator proposed by Dekkers et al. (1989).
Therefore, in the present paper we aim at extending the results given in Drees
(1995b) to the general class of statistical tail functionals that are scale but not location
invariant. It will turn out that there are only limited changes necessary if/;~>0, but
that the situation alters completely if the extreme value index is negative. Of course,
this is due to the fact that a shift of the maximum of the observations leads to a different limit if the underlying d.f. has a finite right endpoint whereas, roughly speaking,
(X;,:,, +t~)/X, .... - O ( n -p) if [~ is positive. Hence, we have to deal with these two cases
seperately.
In Section 2 an approximation of the empirical tail q.f., which was established in
Drees (1995b), is prepared for an easy application in the absence of location invariancc.
As; a consequence, in Section 3 we obtain asymptotic normality under a differentiability
condition on the functional T that is comparable to the one imposed in our fbregoing
paper. Two examples, one of which concerns Hill's estimator, demonstrate the applicability of the result. In particular, a general method is proposed to design estimators of
the extreme value index with prescribed asymptotic behavior. The section closes with
a discussion of the asymptotic efficiency of the estimators under consideration, particularly in comparison with the location invariant statistical functionals. In Section 4
we go on to the case where the extreme value index is not positive. It turns out that
a condition on the local behavior of the functional r in a neighborhood of the constant function 1 that seems less natural than the differentiability condition in the case
[4 > 0 is necessary to obtain asymptotic normality. However, there are examples of this
type of statistical functionals including the moment estimator of Dekkers et al. (1989).
An examination of the limiting normal distribution leads to the conclusion that at least
asymptotically location invariant estimators are preferable if the extreme value index
is expected to be negative.
98
H. Drees / Journal of Statistical Planning and Inference 66 (1998) 95-112
2. A limit theorem for the empirical tail quantile function
If one wants to derive an approximation of the empirical tail q.f. that is based on the
convergence (1.1), then first, one has to quantify the rate of that convergence. This can
be done by means of the so-called second-order conditions. In literature, a lot of such
conditions can be found, see, e.g., Hall (1982), Cs6rg6 et al. (1985), Smith (1987),
Reiss (1989) and Dekkers and de Haan (1989, 1993). However, due to its simplicity
and generality, the following expansion of the underlying q.f., which was extensively
studied by de Haan and Stadtmfiller (1996), seems most attractive.
Condition 1. There exist measurable, locally bounded functions a, 4~:(0, 1 ) ~ (0, cx~)
and qJ : (0, cx~) --+ R such that
F-l(1 - tx)-
F - l ( 1 - t)
x -Is - 1
-
a(t )
-
-
fl
+ ~(t)q'(x) + R(t,x)
for all t E (0, 1) and x > 0. (By convention, (x -f3 - 1 )/fl := - log(x) if fl = 0.) Here
(i) qJ ~ 0 and R ( t , x ) = o ( 1 ) as tJ. 0 for all x > 0 or
(ii) x ~ te(x)/(x-# - 1) is not constant, 4~(t)= o(1) and R ( t , x ) = o ( ~ ( t ) ) as t l 0 for
all x > 0 .
Note that Condition l(i) is equivalent to our basic assumption F E D(G#) (de Haan,
1984, Theorem 1). The stronger Condition l(ii) implies that ~b is 6-varying at 0 for
some 6 ~>0, i.e., ~(tx)/Cb(t) --+ x ~ as t l 0, and by a suitable choice of the functions a
and ~b one can achieve that
{
xa-fl- 1
qJ(x) = c~e •
fl¢6>0,
log(x)
x -l~log(x)
log2(x)
fl = 6 > 0,
if / / # 6 = 0 ,
fl = 6 = 0.
For a detailed discussion of Condition l(ii) we refer to de Haan and Stadtm/iller (1996)
and Drees (1995b).
Next we need some notation. Let h be some function belonging to
[0,~)
.~:=(h:[0,1]
hcontinuous,
limh(t)(l°gl°g(1/t))
t+0
'/2
~0
Then the function space
Dish, := { z : [0, 1] ~ ~
limit0t~h(t)z(t) =- O, (t#h(t)z(t)),c[o, ll E D[0, 1] }
is equipped with the weighted supremum seminorm
Ilzll~,h:--
sup
t C [0,1]
t/~h(t)lz(t)l.
}
.
H. Drees/ Journal of Statistical Plannin,q and In[~,rence 66 (1998) 95 112
~,
Furthermore, we introduce the functions
{
t -~
z/~(t) :--
/~>0,
- log(t)
if fl = 0,
t p
fl<O,
: = 7j + cq, • 1{/~#¢~>o},
flF ~(1 - t)/a(t) - 1 - flceqg(t), l{/~g,~>0}
-~(t):=
where c,~ = F
F
I(1-t)/a(t)-cq,~(t).l{o>0}
l ( l ) denotes the right endpoint o f F . Finally, let
A,, := I/~l(x,,:~- ~o)/a(k,,/n)'l{l~<
e/~ : =
/~>0,
iffl=0,
IPI
1
if
i.'2},
# o,
/~=0,
and define a Gaussian process by
Vl¢(t) : = e/~t-~/~+l)W(t)
where W denotes a standard Brownian motion.
Then the following limit theorem for the empirical tail q.f. is a reformulation of
Corollary 2.1 o f Drees (1995b).
Theorem 2.1. There are constants c,, such that the following assertions hoM Jbr all
h ~i ,~:
(i) [/Condition l ( i ) a n d
sup
xl~+l/21R(k,/n,x)l = o ( k n 1'2)
x E (o,i ~ ]
(2.1)
for some c > O, then
k"~': ( c'~ - (z/~ + ~(k~/n) + A~ )) ~ VI~,
weakly in D/~,h.
(ii) Under Condition l ( i i ) one has
~(kn/n) = O(kn 1/2) ::~ kn1/2 \ cn
/
H. Drees/Journalof StatisticalPlanningand Inference66 (1998) 95-112
100
and
~21/2 = O('~(kn/n))
~b(k,/n)-' ( Qn _ (z~ + Z(kn/n) + An + e~O(kn/n)~)~
/
\ Cn
0,
weakly in D~,h.
Proof. In Drees (1995b) it was shown that if Condition l(ii) is satisfied and
O(k~-1/2), then
knl/2(an-O
n
a(kn~
~(kn/n)=
(Z~fl + qb(kn/n ) ~ ) ) ---+ e~'
Vfl
weakly in D[Lh where
[ F-l(1 -
kn/n) - a(kn/n)(l{~¢o}/fl + c~,qb(kn/n). 1{/~¢~>o})
On= {Xn:n
//~> - ½,
1
i f / / < _ ~"
Hence, the assertion is immediate in this case if one takes into account that by Drees
(1995b), Lemma 2.1, for - ~1< / / < 0
(9 - F - l ( 1 -
a(kn/n)
kn/n)
1
+ fl +c~pO(kn/n)l{5>°} =°(q~(k"/n))=°(knl/2)"
The other cases can be treated similarly.
[]
Recall that according to Drees (1995b), Lemma 2.1, for a suitably chosen normalizing function a, the left-hand side of (2.1) tends to 0 for any intermediate sequence kn
so that condition (2.1) is satisfied for some sequence. Note that though An = op(k~-l/z)
it has to be taken into account to ensure that the process under consideration belongs
1 In the case f l > 0 a somewhat related result was established by
to D/~,h if fl < -- ~.
Einmahl (1992).
Observe that 3 depends on the parameter of interest but also on a scale parameter
and the right endpoint co so that it seems impossible to extract information about fl
from S(kn/n). Therefore, we consider it as an additional source o f error, which should
vanish as the sample size increases. Since ~(t) --+ 0 i f / / > 0 but IS(t)l -~ ~ if fl~<0
and ~o ~ 0, these two cases have to be treated seperately.
3. Asymtotic normality in the case ~>0
While it is obvious that the constant term 3(kn/n) in the approximation o f Qn does
not influence location invariant estimators, at first glance it is not clear whether or not
asymptotically this also holds true for statistical tail functionals without this invariance
property. It will turn out that both cases are possible if merely Condition l(i) holds.
H Drees I Journal of Statistical Planning and lnfi, rence 66 (1998) 95 112
101
Under Condition l(ii), in general, the term E ( k . / n ) is negligible if f l > 0 and dominates cb(k./n) otherwise. However, there is one exception to this rule. De Haan and
Stadtmfiller (1996) proved in their Theorem 2 that Condition l(ii) holds with f l < 5 if
and only if for some constants dl > 0 and d2 E
g(t) : = F - I ( 1
(3.1)
- t) - d l t -l~ - d2
is (6 - fi)-varying at 0. In this case Z ( k . / n ) is negligible if and only if d2 = 0. So
in contrast to the situation in Drees (1995b), the conditions on k~ must also take into
account the rate of convergence of .~(kn/n).
Condition 2. Assume that ( k . ) . ~ is an intermediate sequence satisfying
(i) m a x ( k l / Z , z ( k . / n ) 1)SUPxc(0.1+.] x/¢~l/ZIR(k./n,x)]---~0 for some C > 0
3 ( k . / n ) --+ p C [0, co] or
(ii) k2"2ch(k~/n) ~ 2 E [0, oc].
and k2 : :
With the exception of the location invariance the smoothness conditions on T, which
is defined on Did, h, are essentially the same as in the foregoing paper:
Condition 3. For f l > 0 and some h ~ . # the functional T:D~.h -~ ~ fulfills
(i) T is ~(Dfl, h),~(R)-measurable,
(ii) T ( a z ) : T ( z )
VzCDfl, h, a > 0 ,
(iii) T(z/~) = fl,
(iv) T is Hadamard differentiable tangentially to Cl~.h at z/3 with derivative T/~.
Here ,~(X) denotes the Borel-a-field of a metric space X and Cf~,h is the space of
continuous functions belonging to Dlj, h.
Remark. Sometimes it is useful to specify T only on some scale invariant subset of
Di~,h that contains z/~ and the paths of Q, with a probability converging to 1. Then it
is sufficient to require Condition 3 (ii) and (iv) on this subset only. An example will
be: given in connection with the Hill estimator.
Recall that, by definition, (iv) is equivalent to
T(zl~ + ZnYn) - T(zfl)
2.
---+ T ~ ( y )
for all 2. ~ 0 and all sequences Yn ~ Y c Cl~,h where T/~ : DI~,~, ---* ~ has to be linear and
continuous. By the Riesz representation theorem there exists a unique signed measure
vT:[~ on 2 ( [ 0 , 1]) such that
• ~
Ivr, fll(dt)<oc
and
T~(z) = ,f z(t) VF,l~(dt)
(3.2)
H. Drees/Journal of Statistical Planning and Inference 66 (1998) 95 112
102
for all z C C~,h. Let
•__ 0-~,vr,/~
2
ff2,fl .__
with a~,,. := e}
f(st)
min(s, t) v2(ds, dt)
and
/*r,/~,÷ := e/~ /
~ ( t ) vr,/~(dt).
Now we are ready for stating the main result of this section.
Theorem 3.1. Assume that T satisfies Condition 3.
(i) Suppose that Condition 2(i) is satisfied and
Condition l(i)
or
Condition l(ii) with f l = 6 or with f i < 6 and d 2 ¢ 0
(cf (3.l)).
Then
p<oo
~
~ ( k l / 2 ( T ( Q n ) - fi))---+ ,~U(pvT, e[O, 1],0-2,/~) weakly
p = oo
~
5 ~ ( E ( k n / n ) - l ( T ( Q , ) - fl))-+ vr,~[O, 1]
and
in probability.
(ii) I f Condition l(ii) holds with fi > 6 or fl <6 and d2 = O, then under Condition 2(ii)
2<00
2
~(k~/2(T(Qn) - fl)) ---+Ar(2#T,/~, ~, ar,/~)
2=o(
5~(q~(kn/n)-l(T(Qn) - fl)) ---+#r,#,q,
weakly
and
in probability.
Note that (3.2) guarantees that vT,/~[0, 1], a2,/~ and/tr,/~,÷ are well-defined real numbers.
ProoL If the second-order Condition l(ii) holds with fl<6, then by (3.1) one can
choose a(t) = dlflt -~, c~4~(t) = g(t)t~/(dlfl) and hence ~ ( t ) = d2/(dlt~). In particular,
• (t) = o ( ~ ( t ) ) as t .L0 if d 2 ~ 0 and E ~ 0 if d2 = 0.
Condition l(ii) with fl = 6 is equivalent to g ( t ) : = F - l ( 1 - t) - dt -~ E 11o for some
d > 0 with some auxiliary function L (de Haan and Stadtmfiller, 1996, Theorem 2).
Since L ( t ) = o ( g ( t ) ) one obtains cv/Cb(t)=L(t)tl~/(dfl) and ~ ( t ) ~ g(t)t~/d and thus
• (t) = o ( S ( t ) ) .
Finally, if fi > 6, then for a suitable choice of the normalizing function a one has
~0.
To sum up, under the conditions of assertion (i) E dominates ~b and the converse holds true if the conditions of part (ii) are satisfied. Now the assertions can be
proved in a similar way as Theorem 3.2 of Drees (1995b) by using the well-known
6-method. []
H. Drees I Journal of Statistical Planning and In/erence 66 (1998) 95 112
103
In general, the rate of convergence of statistical tail functionals T with vr,/~[O, l] ¢ 0
is slower than the rate for location invariant estimators if fi~<6. On the other hand,
the rates are the same if f l > 6 , but the variance cr~;/~ of the limit distribution (or its
2
mean-squared error ILr,/~,~
+ O'2[~) is smaller if a suitable functional T is chosen that
is not location invariant.
To see this, recall from Drees (1995b) that the regularity conditions on 7' imply
certain restrictions on the measure vr,/~. In the present situation, together with the
location invariance, one of the conditions on v/,/~ is dropped, too.
L e m m a 3.1, l,f f satiaJies Condition 3 for all fi of some open interval I c(0, ~ ) , then
(i) .]'z/~dvr, t~=O VI3EI,
(iii) .]'Zl~ • logdvr.l~ = - 1 V i l l i .
Since the set of admissible measures is smaller if a third condition corresponding to
the location invariance is added, one may expect that the minimal asymptotic variance
is smaller in the present situation. In fact, whereas in the foregoing paper it was shown
that the minimal asymptotic variance under location invariance equals ([~ + 1 )2, using
the same methods one can easily see that
inf
,'¢ #*(11)
a ILv
2 = fi2,
where .#*(fi) denotes the set of all signed measures satisfying (3.2) and the conditions
given in Lemma 3.1(i) and (ii). Moreover, the infimum is attained at
~'~ :=Z--I~
*;-l(O,l) -- ~1,
where 21~0,1) denotes the uniform distribution on (0,1) and e~ the Dirac measure at 1.
Observe that f12 is the asymptotic variance of the Hill estimator. Indeed in Reiss
(1989), Section 9.4, it is proved that under more restrictive conditions the Hill estimator
is asymptotically optimal if the bias is negligible. In the following example it is shown
that the asymptotic normality of Hill's estimator, which was already established in
several papers before (see, e.g., Hall, 1982; Davis and Resnick, 1984; and Haeusler
and Teugels, 1985), follows from Theorem 3.1.
Example 3.1. For any measurable function z : [0, 1] ---+R let
TH(Z) := f log+(z(t)/z(1 )) dt
if the right-hand side is defined and finite and r H ( Z ) = 0 otherwise. Then TH(Qn) is
the Hill estimator.
Obviously, TH is scale invariant and TH(Z/~)=fi. Now fix some h c,*g that is in¿
creasing and regularly varying at 0 with exponent ~ (e.g., h(t)= (t/(l - l o g ( t ) ) 1:2) and
define
E)I~,h := {z E Dl~,h t z positive and nonincreasing}.
104
H. Drees / Journal of Statistical Planning and Inference 66 (1998) 9~112
Since P{Qn ~/5~,h} =P{Xn-k,, :. >0} ~ 1 it suffices to verify that THI~,h is Hadamard
differentiable at z/3 tangentially to C#,h (cf. the remark following Condition 3).
For this let 2.---+0 and yn ED~,h denote a sequence converging to some yEC~,h
such that z~ + 2~y~ C/5/~,h for all n E N. Then
TH(z~ +
2nYn)
--
log(1 + )onx.(t)) dt,
TH(Z/~) ----
where
x.(t):=
tl~y.(t) - yn(1)
1 + 2.y.(1)
The monotonicity of z/~+ )t.y. implies log(1 + 2.x.(t)) >~ - fl log(s/t) + log(1 + )~.xn(s))
for all t<<.s and hence,
fo
s log(1 + 2.Xn(t))dt>~s(-fl + log(1 + 2nXn(S))).
(3.3)
_ ~ 7/6
Moreover, for sn-.~n
one has
sup log(l+A.x.(t))~<2,
tC[sn,l]
sup Ix.(t)l=O(),./h(s.))=o(1).
(3.4)
tC[s,.1]
Using (3.3), (3.4) and log(1 +x)<~x one obtains
/0 s" log(1 + 2~x.(t))
I
2.x.(t) dt = O(s. ) = o(2.).
Furthermore, a combination of (3.4) and log(1 + x ) = x + O(x 2) as x--~ 0 shows that
f l log(1 + 2 . x . ( t ) ) - 2 n x . ( t )
dt = 0
=O
()mxn(t)) 2 dt
(
22 sup [Xn(t)l
tG[s.,I]
/o
1/h(t) dt)
= o(,~.).
To sum up, we have proved that
f0
1log(1 +2nx.(t))--2nx.(t) dt = o ( 2 . ) ,
which in turn is equivalent to the Hadamard differentiability of TRID~,~ at z~ with
T~,~(y) = f l t~y(t ) _ y(1)dt, because obviously ]f2Xn(t)dt-T~,p(y)[ =o(1). Notice
that the signed measure pertaining to T~,/j is the one that minimizes the asymptotic
variance.
In a similar way, one may check that the moment estimator proposed by Dekkers
et al. (1989) and kernel-type estimators (Cs6rg6 et al., 1985) satisfy Condition 3 and
H. Drees/Journal of Statistical Planning and In[brence 66 (1998) 95 112
105
are thereIbre asymptotically normal. This was already proved in the above-mentioned
papers under comparable second-order conditions.
Furthermore, Theorem 3.1 allows the construction of statistical tail functionals with
prescribed asymptotic behavior.
Example 3.2. In this example we want to design a functional T satisfying Condition 3
for all fi > 0 such that the signed measures corresponding to the ttadamard derivatives
equal a given family of measures v[~. Of course, it has to be assumed that vii satisfies
(3.2) and the conditions given in L e m m a 3.1, Moreover, it is plausible that the family
(vil)[~>0 should be smooth in some sense. More precisely, we assume that for all fi,, ~ fi
f
1
IVl~, - v/~l(dt ) --~ 0.
(3.5)
In particular, for some c > 0
f t
(1~+1,,2+~)
Iv[~l(dt)<~c.
(3.6)
For instance, the continuity Condition (3.5) is fulfilled by the family of measures v/~
that leads to a minimal asymptotic variance.
We proceed as follows: first we construct a functional with the given Hadamard
derivative for a fixed fl > 0 and then an adaptive version of this functional is introduced.
For simplicity, it is assumed that limtl0 h(t)t ~1/2+~)= oc for all e > 0 .
Let
Ht~(t ) := tl~ ~(0,d s-13 vl~(ds )
and define a functional
f z(t) dHl~(t )
T/~(z) := f z(t)Hi~(t)/t dt
(with the convention x/O = 0 for all x E ~). Note that TI~(Q~) is a generalized probability
weighted moment estimator (cfi Drees, 1995b, Example 1.1).
Integration by parts yields
dHl~(t ) = Vl~(dt ) + (flHi~(t)/t) dt
(3.7)
and a combination with (3.6) shows that T/~ is well defined. Obviously, it is scale
invariant and because of L e m m a 3.1(i) Condition 3(iii) is also satisfied. Finally,
./'zl~(t)H~(t)/tdt=-/log(t)z~(t)v~(dt)=l
and hence it is readily verified that for 2n---+ 0 and y n - ~ y c C[~,h
2~ y~(t) v/~(dt)
Tl~(zp + )..Yn) - fl= 1 + 2. f y.(t)Hl~(t)/t dt = ~o, J[ ' y(t) vl~(dt) + o(.i, ),
(3.8)
H. Drees/JournaloJ'Statistical Planning and Inference 66 (1998) 9~112
106
that is, T# is Hadamard differentiable at z~ with measure v/~ pertaining to its derivative.
Now define an "adaptive functional" by
T(z)
:=
T:(~)(z)
if the right-hand side is well defined and T(z):= 0 else. Here 7~ is some functional
that is continuous at zfl and satisfies the Conditions 3(i)-(iii) for all f l > 0 (e.g., the
Hill functional TH) so that T(Qn) is a consistent initial estimate of ft.
Obviously, T fulfills Condition 3(ii) and (iii) and it remains to prove (iv). To
this end, consider sequences 2,---+0 and Yn---~yE CI~,h and let fl(n):=T(z~ + 2ny.).
Since fl(n)---+fl, one has limt~0 t ~H/~(,)(t)=0 for sufficiently large n and thus using
integration by parts one obtains
Next, observe that the continuity property (3.5) is passed on from (v/~)/~>0 to (dHfl)fl>0.
Consequently,
T(z a + 2 . y . ) - fl
f z~(t) + 2,y,(t)dH/~(,)(t)
f(z~(t) + 2ny,(t))H/s(,)(t)/t dt
f z~(t)dH/s(,)(t)
f z~(t)H~(,)(t)/t dt
2 . ( f y(t) drift(t) f z#(t)H~(t)/t dt - f z~(t) drip(t) f y(t)Hl3(t)/t d t ) + o(2.)
f (zl~(t) + 2ny.(t))H#(n)(t)/t dt f z~(t)H~(,)(t)/t dt
= 2, fy(t)
v~(dt) +
O(2n),
where for the last equation (3.7), (3.8) and 3.l(i) have been utilized. Hence, T is
Hadamard differentiable at z/s with the preassigned derivatives for all fl > 0.
It should be mentioned that, in a similar way, one can construct statistical tail functionals with given Hadamard derivatives that are also location invariant.
Example 3.2 describes a method for designing statistical tail functionals with madeto-order asymptotic behavior. If one only considers the asymptotic variance, then such
a construction is superfluous since there is a simple estimator, namely, the Hill estimator, with minimal asymptotic variance. Yet if the bias is taken into account, i.e.,
if one considers the asymptotic mean-squared error (MSE), then Hill's estimator is
not optimal (cf. Smith, 1987, Section 4). In fact, there does not exist a statistical tail
functional that has a minimal asymptotic MSE simultaneously for all underlying d.f.'s
F satisfying Condition l(ii) (cf. Drees, 1995b, Section 4). The best one can do is to
use a functional such that the leading term of the bias vanishes if the underlying d.f.
F satisfies Condition l(ii) with 6 > 0 belonging to a given finite set. For this again
one has to distinguish the two cases of Theorem 3.1.
In the situation of Theorem 3.1(i) T~(~(kn/n)) is the dominating part of the bias of
the estimator. So if one chooses T in such a way that this term vanishes asymptotically,
H. Drees / Journal ~?f Statistical Planning and lnfi, rence 66 (1998) 95 112
I [)7
then T has to be 'almost location invariant' locally at z& Essentially, in this case we
are in the situation of Drees (1997) since T } ( Y ( k , / n ) ) = 0 implies vr, l~[O, 1] = 0, which
is exactly the additional condition that follows from location invariance.
Therefore, in the following, we assume that the conditions of Theorem 3.1(ii) are
fulfilled. Using Example 3.2, for each positive 6o one may construct functionals T such
that the bias term T}(7j) vanishes asymptotically if the second-order Condition l(il)
is satisfied for 6 = c50. Within this class of functionals the minimal asymptotic variance
cry.,, equals ((60 + 1)/30)2/~ 2 and it is attained for
+ 1 [3 ( 6 o J - l ( t l ~ - (260 + l)t/J+a')dt - e , ( d t ) ) .
Vl~"s"(dt) - 6 o3o
Oo
The situation is quite different if Condition l(ii) holds with 6=:0, because then the
asymptotic MSE of all statistical tail functionals is the same if the number of upper
order statistics used for estimation is chosen optimally. So in this case all statistical
tail functionals have the same asymptotic efficiency. For a proof' of these results and
an extensive discussion of the choice of T if the bias is taken into account we refer
to Drees (1995b), Section 4.
4. Asymptotics for /] ~<0
If/~ < 0 and the right endpoint co of the underlying d.f. equals 0, then Z - 0 , and
consequently, one can establish asymptotic normality by following the lines of Section 3. In contrast to that, i f / 3 = 0 or /~<0 and c o ¢ 0 , then limt+0 Iff(t)[ = ~ so that
no longer zl~ but Z(k~/n) is the dominating term in the approximation of Q,. Therefore,
the smoothness condition on the functional T must determine its local behavior in a
neighborhood of a constant function instead of z/~.
Since for [:~< - ½, in general, D/~,h does not include the constant functions, 7" has to
be defined on span(Di3,h, 1). This function space will be equipped with the a-field ,~t/~,h
that is generated by all sets of the form {z + c I z C B, c ~ [cl,c2]} with B C :~(D/~,t,)
and - o c < c l < c: < oc. For the sake of definiteness, in the sequel we will assume that
co > 0, but the case co < 0 can be treated in the same way.
Condition 4. Assume that T : span(Did, h, 1)---+ R satisfies for fl~<0 and some h ~ ;¢
(i) T is ,r,/l~,h,~ ( ~ ) - m e a s u r a b l e ,
(ii) T ( a z ) - T(z) Vz E span(Did, h, 1 ), a > 0 ,
(iii) T( 1 + pn(zl~ + 2nyn)) = /J + pncz:~ + )~T}(y) + o(p, + )~,,) for all Pn,),, l 0 and
yn --+ y ~ Cl~.h with T} : D~,h -~ ~ denoting some continuous, linear functional.
Note that despite the notation T} is not a proper derivative. At first glance, Condition
4(iii) seems rather peculiar but Theorem 2.1 suggests that, nevertheless, it is appropriate
in the present situation. Below we will give two examples of such functionals including
the moment estimator due to Dekkers et al. (1989). It is worth mentioning that there
are straightforward generalizations of Condition 4(iii) that lead to similar results on the
108
H. Drees / Journal of Statistical Planning and Inference 66 (1998) 95-112
asymptotic behavior of T(Q,). For example, on the right-hand side of the expansion,
one may replace p, with g[~(Pn) where g#(t) is some function converging to 0 as t
tends to 0.
In view of Condition 4(iii) for the convergence rate of T(Q,) the rates of S(kn/n) -l
and ~(kn/n) are decisive. In analogy to the case i > 0 , ~(k,/n) -l is dominating if
li[ < 6 and ~(kn/n) -1 =o(~(k~/n)) if Ill > 6 but there is no plain answer in the case
li[ = 6. So to keep the conditions on k. as simple as possible we restrict ourselves
to the case of a nondegenerate limiting distribution of the standardized estimator and
leave the other cases to the reader, though this excludes the optimal choice of k~ if
i = O or ~ = 0 .
Condition 5. Let (kn)nC N be some intermediate sequence satisfying
(i) k 1/2 SUPxc(O,l+e]X/~+1/2[R(k./n,x)[ --+ O,
(ii) kl/2/y.(k./n) ~ p E [0, oc),
(iii) k~/2q~(k./n) ~ 2 E [0, ec).
Using the notation of Section 3 we have the following limit theorem.
Theorem 4.1. Suppose T satisfies Condition 4.
(i) I f Condition l(i) holds or (ii) with l t l < 6 , then under Conditions 5(i) and (ii)
~ ( k l / 2 ( T ( Q n ) - fl)) ~ JV'(pCT, B, Cr2,[I) weakly.
(ii) Under Condition l(ii) with lil >6 and 5(iii) one has
2
5F(k~/Z(T(Q,) - i ) ) --~ ~Ar(2PT,/~,q, ar,/~)
weakly.
(iii) Condition l(ii) with [l[ = 6 in combination with Conditions 5(ii) and (iii) implies
5('(kln/2(r(Qn) - fl)) ~ ~A/'(pcr,~+ 2#r,~ ' ~, ar,~)
2
weakly.
Proofl One has to examine the cases considered in Theorem 2 of de Haan and
Stadtmiiller (1996) seperately. If i = 0 and 6 > 0 , then ~ ( t ) ~ - l o g ( t )
and hence
q ~ ( t ) = o ( ~ ( t ) - l ) . If f l < 0 , then Z(t) -~ is Jill-varying at 0 so that ~ b ( t ) = o ( ~ ( t ) -1) if
I/~1 < 6 and ~ ( t ) -1 = o ( ~ ( t ) ) if Ill > 6, Moreover, An = Op(a(1/n)/a(kn/n)) = op(k71/2)
for f l < - 5 1 because a is Ill-varying at 0.
Thus, according to the Skorohod representation theorem under the conditions of the
theorem there are versions of Qn and V~ such that almost surely
Qn
Cn~(kn/n)
- 1 +3(kn/n)-l(z~ + kn-l/2(2~ + V/~+ o(1))),
with 2 --0 if Ill < 6 or merely Condition l(i) is assumed. Consequently, by Condition 4
T(Qn) = t + 3(k,/n)-lcr, B + knl/2T~( 2 ~ + V~) + o(3(kn/n) -1 + k£-U2),
which in turn implies the assertion.
109
H. Drees / Journal of Statistical Planning and lnfi, rence 66 (1998) 95-112
Example 4.1. For a measurable function z C [0, 1] --+ R let
TM(Z):=T~(z)+I-(2(1-----
Vz5//
with
Ti(z) :=
,•01
(log+(z(t)/z(1))) i dt
if the right-hand side is well defined and finite and TM(Z) := 0 else. Then TM(Qn) is
the moment estimator introduced by Dekkers et al. (1989).
Obviously TM is scale invariant. Next we give a sketch of a proof of Condition
4(iii) under the additional assumption that the supremum of y,, is bounded, which
holds always true if fl < - ½ and h c ~ ' is chosen appropriately. The general case can
be treated in a similar way as Example 3.1.
A Taylor expansion of the logarithm yields
T~(I + pn(zl~ + ZnYn))
= p~
(/ol
Zl~(t ) -- Zl~(l ) d t + ).,
/o
y(t) - y(1 )dt
)
P~2,fo z ~ ( t ) - z~(1)dt + o(p~ + 2n)
and
T2(] + pn(z~ + ,~ny.))
= p:n
(/o
(z[~(t) - z/61)) 2 + 2Zn
(@(t) - zl~(1 ))
× ( y ( O - y ( 1 ) ) dt - P"t_
[
- z~j(1))(~(t) - ~ ( l ) )
dt + o(0,, + ;., )
]
D
/
Straightforward but lengthy computations show that, for fl < 0 ,
fi(2 - 5fl + fi2)
TM(I+P"(Z'+2"Y~))=[I+P"(1--fi)(l--3fi)
+2"
(1 - 3)(1 - 2fi)
fl
(1 -- (1 -- 2fi)t-~)z(t)dt
x
- z(1)
+o(p. + L,)
and
TM(l+p~(z0+2~yn))=2pn+),~
(/0L( 2 + l o g ( t ) ) z ( t ) d t - z ( l ) )
+ o( p,, + 2, ).
H. Drees/Journal of Statistical Planning and Inference 66 (1998) 95-112
110
Hence, Condition 4(iii) is satisfied and
I~-~
(Lr~(1-(1-2fl)t-l~)dt-e,(dt))
vrM,~(dt) =
fl<0,
if
[,(2 + log(t)) dt - ei(dt)
/~--0.
By Theorem 4.1 the asymptotic normality of the moment estimator follows under
slightly more general assumptions than Dekkers et al. (1989) used in their paper.
Note that vr~,0 is the signed measure that minimizes the asymptotic variance if
f l = 0 and location invariance is assumed (Drees, 1995b, Theorem 4.1). Generally,
vvM,~ satisfies the conditions given in Lemma 4.1 of Drees (1995b) for location invariant statistical tail functionals, i.e., the conditions of Lemma 3.1 and, in addition,
Vr,#[O, 1] = 0 if 3 1 > - 51. Later on we will see that under a very weak extra condition
this always holds true for functionals satisfying Condition 4.
The following example indicates a general method to construct functionals satisfying
Condition 4.
Example 4.2. Let Ti, i = 1,2, be scale invariant measurable functionals satisfying the
expansion
~(1 + 2,z) = ,~nT/(z) + ;~L(z) + o(,~2)
uniformly for I]zl[/~,h~< 1 where T/ is a continuous and linear and Ti some continuous
functional on D/~,h. (A typical example of such a functional is T ( z ) = f f(z(t)/z(1))
/~(dt) where f is some sufficiently smooth function with f ( 1 ) = 0 and tt is some signed
measure on ~'([0, 1]) that does not concentrate too much mass in a neighborhood
of 0.)
If one has T((z[~)/T~(z/3)=fl, then it is easily seen that
T(z) := TI(Z)/T2(z)
satisfies Condition 4 with
71(z~) TJ(z e ) - T2(z~) 7"[(z~)
cT,~ =
T~(z~) 2
and
T~(y) = r((Y)~(ze)
- r~(y)T((z~)
According to Theorem 4.1 the rate of convergence of estimators T(Qn) with T satisfying Condition 4 and cr,/j # 0 is slower than the rate for location invariant estimation
functionals if Condition l(ii) holds with [//1<6. This disadvantage is most clearly
demonstrated by an exponential sample where any rate of convergence slower than
n --1/2 is achieved by a smooth location invariant statistical functional based on a suitably chosen number of order statistics whereas under Condition 4 with cr, o # 0 the best
H. Drees / Journal of,' Statistical Planning and In[i'rence 66 (1998) 95 112
Ill
attainable rate is 1/log(n). Yet, in contrast to the situation examined in Section 3, the
following lemma shows that asymptotically there is no advantage in using functionals
that are not location invariant in the case 1/31> (5 either.
4.1. f i T satisfies Condition 4, then
(i) .[' zr; dvT:/~ = O,
(ii) vr4[O, 1] = O.
(iii) lJl in addition,
Lemma
T(1 + p,zl~,, ) =/3n + pncr.I;,, + o(p,,)
(4.1)
Jor all Pn j.O and/3~ - / 3 = O(p~), then one has
/ z f l log dvv,/j =
•
{
-l
/3¢0,
if
-2
/3 = O.
Proof. By Conditions 4(ii) and (iii) one has
/3 + p,CT.f~ + o(pn) = T(1 + p,z/~)
= T(I + p,( 1 + p2n)(zfi -F pn/(1 + p~)))
=/3 + PnC'T,l~+ p,T~(1) + o(pn )
and hence T/~(1)=0, i.e., assertion (ii). In a similar way, Condition 4(iii) applied to
kny~ = p~zl~ yields (i).
Now assume that (4.1) holds and /3 <: 0. Then (zfi,,- zl~)/(/3n-/3)-*-zl~ log in Di~,j~
and hence an application of 4(iii) to ,;.~Yn-zl;,,- zf~ gives
(/3,,
/3)(1 + T}(zfs log)) + p,(CT4~,, -- CT,fi) = O(pn)
fi)r all /3,,-/3 = O(pn). So the choice / 3 , - / 3 = o(p~) shows that the map /3~-~ c'~;l~ has
to be continuous and thus
(/3,, - / 3 ) ( 1 + T}(zfl log)) = o(p,,)
for all /3~ - / 3 = O ( p n ) , which in turn implies (iii).
Likewise, assertion (iii) can be proved in the case / 3 : 0 by utilizing 2/322(zl;,, - I
fi~zo) --~ -zo log. %
The additional condition (4.1) is very mild since, roughly speaking, it means that in
case .vn -= 0 the Condition 4 holds locally uniformly for/3 ~<0. Note that for/3 >~ - 5 the
conditions given in L e m m a 4.1 (i)-(iii) are equivalent to the conditions established
in Drees (1995b), L e m m a 4.1, for location invariant functionals, while in the case
/3 < - ~ they are even stronger. Hence, for each functional satisfying Condition 4 there
exists a location invariant statistical tail functional with the same limit distribution if
1/31>6 and a faster rate of convergence towards the true parameter if 1/31<<J. In view
112
H. Drees/ Journal of Statistical Planning and Inference 66 (1998) 95-112
o f this r e s u l t w e s t r o n g l y r e c o m m e n d
to u s e statistical tail f i m c t i o n a l s t h a t a r e l o c a t i o n
i n v a r i a n t i f o n e e x p e c t s fl to b e n o n p o s i t i v e .
Acknowledgements
Part of the work was completed while the author was visiting the Erasmus University R o t t e r d a m , s u p p o r t e d b y a D F G r e s e a r c h grant. T h e a u t h o r w o u l d like to t h a n k
L. d e H a a n f o r his h o s p i t a l i t y .
References
Cs6rgr, S., Deheuvels, P., Mason, D., 1985. Kernel estimates of the tail index of a distribution. Ann. Statist.
13, 1050-1077.
Davis, R.A., Resnick, S., 1984. Tail estimates motivated by extreme value theory. Ann. Statist. 12,
1467-1487.
Dekkers, A i . M . , Einmahl, J.H.J., de Haan, L., 1989. A moment estimator for the index of an extreme value
distribution. Ann. Statist. 17, 1833-1855.
Dekkers, A.L.M., de Haan, L., 1989. On the estimation of the extreme-value index and large quantile
estimation. Ann. Statist. 17, 1795-1832.
Dekkers, A.L.M., de Haan, L., 1993. Optimal choice of sample fraction in extreme value estimation.
J. Multivariate Anal. 47, 173-195.
Drees, H., 1995a. Refined Pickands estimators of the extreme value index. Ann. Statist. 23, 2059-2080.
Drees, H., 1995b. On smooth statistical tail functionals. Preprint, submitted.
Drees, H., 1996. Refined Pickands estimators with bias correction. Commun. Statist. Theory Methods 25,
837-852.
Einmahl, J.H.J., 1992. Limit theorems for tail processes with application to intermediate quantile estimation.
J. Statist. Planning Infer. 32, 137-145.
Fernholz, L.T., 1983. Von Mises Calculus for Statistical Functionals. Lecture Notes in Statistics, vol. 19,
Springer, New York.
Galambos, J., 1978. The Asymptotic Theory of Extreme Order Statistics. Wiley, New York.
Gnedenko, B., 1943. Sur la distribution limit du terme maximum d'une srrie al~atoire. Ann. Math. 44,
423 453.
Haan, L. de, 1984. Slow variation and characterization of domains of attraction. In: Tiago de Oliveira,
J., (Ed.), Statistical Extremes and Applications. Reidel Publishing, Dordrecht, pp. 31-38.
Haan, L. de, Rootzrn, H., 1993. On the estimation of high quantiles. J. Statist. Harm. Infer. 35, 1-13.
Haan, L. de, Stadtmfiller, U., 1996. Generalized regular variation of second order. J. Austr. Math. Soc., 61,
381-395.
Haeusler, E., Teugels, J.L., 1985. On asymptotic normality of Hill's estimator for the exponent of regular
variation. Ann. Statist. 13, 743-756.
Hall, P., 1982. On some simple estimates of an exponent of regular variation. J. Roy. Statist. Soc. Ser.
B 44, 37 42.
Hall, P., Welsh, A.H., 1984. Best attainable rates of convergence for estimates of parameters of regular
variation. Ann. Statist. 12, 1079-1084.
Hall, P., Welsh, A.H., 1985. Adaptive estimates of parameters of regular variation. Ann. Statist. 13, 331-341.
Hill, B.M., 1975. A simple general approach to inference about the tail of a distribution. Arm. Statist. 3,
1163-1174.
Pickands II1, J., 1975. Statistical inference using extreme order statistics. Ann. Statist. 3, 119-131.
Reiss, R.-D., 1989. Approximate Distributions of Order Statistics, Springer Series in Statistics. Springer,
New York.
Smith, R.L., 1987. Estimating tails of probability distributions. Ann. Statist. 15, 1174-1207.