J. Appl. Prob. 29, 37-45 (1992)
Printed in Israel
© Applied Probability Trust 1992
THE EXTREMAL INDEX FOR A MARKOV CHAIN
RICHARD L. SMITH,* University of Surrey
Abstract
The paper presents a method of computing the extremal index for a discrete-time
stationary Markov chain in continuous state space. The method is based on the
assumption that bivariate margins of the process are in the domain of attraction of a
bivariate extreme value distribution. Scaling properties of bivariate extremes then
lead to a random walk representation for the tail behaviour of the process, and hence
to computation of the extremal index in terms of the fluctuation properties of that
random walk. The result may then be used to determine the asymptotic distribution
of extreme values from the Markov chain.
HARRIS CHAINS; MULTIVARIATE EXTREME VALUE DISTRIBUTIONS; WIENER-HOPF
INTEGRAL EQUATION
AMS 1991 SUBJECT CLASSIFICATION: PRIMARY 60G70
SECONDARY 60J10
1. Introduction
Let {X_n} denote a stationary sequence with marginal distribution function F. For large n and u, it is typically the case that

(1.1)  P\{\max(X_1, \dots, X_n) \le u\} \approx F^{n\theta}(u),
where θ ∈ [0, 1] is a constant for the process known as the extremal index. This concept was developed in a series of papers including Newell (1964), Loynes (1965), O'Brien (1974) and Leadbetter (1983); for a review see Leadbetter and Rootzen (1988). In view of the simplicity of the approximation (1.1), it is the key parameter for calculating extreme value properties of stationary sequences. However, there are as yet very few general methods for calculating an extremal index. The purpose of the present paper is to
present such a method for Harris chains, i.e. positive recurrent Markov chains in
discrete time with continuous state space. O'Brien (1987) and Rootzen (1988) have
given general theorems for extremes in Harris chains, but their results stop short of an
explicit calculation. For a good exposition of the general theory of Harris chains, see
Asmussen (1987).
Received 21 September 1988; revision received 11 December 1990.
* Present address: Department of Statistics, University of North Carolina, Chapel Hill, NC 27599-3260, USA.
Downloaded from https:/www.cambridge.org/core. IP address: 88.99.165.207, on 13 Jul 2017 at 00:53:24, subject to the Cambridge Core terms of use,
available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/S0021900200106606
In the present paper we assume that F is the standard Gumbel d.f., F(x) = \exp(-e^{-x}). There is no loss of generality in this, since the general case may be derived by a transformation of the margins. In addition, however, we assume that the joint distribution of (X_1, X_2) is in the domain of attraction of a bivariate extreme value distribution. Combined with the Gumbel margins, this means that

(1.2)  P^n\{X_1 \le x_1 + \log n,\ X_2 \le x_2 + \log n\} \to G(x_1, x_2) \quad \text{as } n \to \infty,

where G is a bivariate extreme value distribution function, also with Gumbel margins. For the general theory of bivariate and multivariate extreme value distributions see Resnick (1987).
Under mild additional conditions it can be shown that, given X_1 > u, the differences X_2 - X_1, X_3 - X_2, \dots, X_p - X_{p-1} (as u → ∞ for fixed p) are approximately independent of each other and of u. In other words, the Markov chain in the tails looks like a random walk. This implies a connection between the extremal index and fluctuation theory for random walks, the key part of which is solving a Wiener-Hopf integral equation.
The assumption that a bivariate extreme limiting distribution exists is more restrictive than the general theory of O'Brien and Rootzen, but seems natural for
statistical applications. Papers such as Tawn (1988), Coles and Tawn (1991) and Joe et al. (1992) have been concerned with the estimation of bivariate extreme limits, and it is a
natural extension of those papers to apply the same techniques to pairs of successive
values in a time series, assuming the Markov property. In order to use such results to
calculate the extreme value properties of the time series, however, we need a method of
characterising the extremal index. It is this problem which the present paper seeks to
solve.
2. Technical preliminaries
For stationary sequences satisfying a suitable mixing condition, O'Brien (1987)
characterised the extremal index as
(2.1)  \theta = \lim_{n\to\infty} P\{X_i \le u_n,\ 2 \le i \le p_n \mid X_1 > u_n\},

where {u_n} are such that n\{1 - F(u_n)\} is O(1), and {p_n} is a suitably chosen increasing sequence with p_n = o(n). In the case of a strong mixing process, which automatically includes any aperiodic Harris chain, a sufficient condition is that n g(p_n) = o(p_n), where g is the mixing function.
For the present paper, we work with a variant of this: under the additional condition

(2.2)  \lim_{p\to\infty} \limsup_{n\to\infty} \sum_{k=p}^{p_n} P\{X_k > u_n \mid X_1 > u_n\} = 0,

condition (2.1) is equivalent to
(2.3)  \theta = \lim_{p\to\infty} \lim_{u\to\infty} P\{X_i \le u,\ 2 \le i \le p \mid X_1 > u\}.
In fact this was the original characterisation of O'Brien (1974). Our approach will be to
assume (2.2) and concentrate on (2.3).
The assumption that (X_1, X_2) is in the domain of attraction of a bivariate extreme value distribution is equivalent, upon adapting Proposition 5.15 of Resnick (1987) to the Gumbel case, to the statement that the ratios

\frac{1 - P\{X_1 \le u + x_1,\ X_2 \le u + x_2\}}{1 - P\{X_1 \le u,\ X_2 \le u\}}

converge to a limit as u → ∞. From this, it follows for any δ > 0 and x_2 that the limit

(2.4)  \lim_{u\to\infty} P\{X_2 > u + x_2 \mid u - \delta < X_1 < u + \delta\}

exists, though it may be 0 for all x_2 > -∞. In fact this case is particularly important since it corresponds to asymptotic independence in the tails of the bivariate distribution.
The principal assumption of this paper is that a version of (2.4) for densities should hold. Specifically, if q(x, y) denotes the transition density of the Markov chain, then we assume

(2.5)  \lim_{u\to\infty} q(u, u + x) = h(x)

for some limiting function h(x) such that h(x) \ge 0 for all x and \int_{-\infty}^{\infty} h(x)\,dx \le 1.

The reason that the integral of h may be less than 1 stems from the possibility that some or all of the limiting probability mass of X_2 - u, given X_1 = u, may be at -∞. In particular, this happens in the case of asymptotic independence already noted. To encompass this possibility, we define a limiting distribution function

H(x) = \lim_{u\to\infty} P\{X_2 \le u + x \mid X_1 = u\},

which exists and has density h for all x, but where we may also have H(-∞) > 0. Now consider

(2.6)  \phi(p, x, u) = \int_0^{\infty} \cdots \int_0^{\infty} \int_{-\infty}^{0} q(u - x, u - x_2)\, q(u - x_2, u - x_3) \cdots q(u - x_{p-1}, u - x_p)\, dx_p\, dx_{p-1} \cdots dx_2,

the conditional probability that X_2 < u, \dots, X_{p-1} < u and X_p \ge u, given X_1 = u - x. However, under (2.5), and provided we can justify taking limits under the integral sign, we will have
(2.7)  \phi(p, x, u) \to \phi(p, x) = \int_0^{\infty} \cdots \int_0^{\infty} \int_{-\infty}^{0} h(x - x_2)\, h(x_2 - x_3) \cdots h(x_{p-1} - x_p)\, dx_p\, dx_{p-1} \cdots dx_2.
Now, Equation (2.3) implies that

1 - \theta = \lim_{p\to\infty} \lim_{u\to\infty} \sum_{j=2}^{p} \int_{-\infty}^{0} \phi(j, x, u)\, \frac{f(u - x)}{1 - F(u)}\, dx,

where f(x) = F'(x) = \exp(-x - e^{-x}). But f(u - x)/\{1 - F(u)\} \to e^{x} as u → ∞ so, provided we can again justify the interchange of limits, we have

(2.8)  1 - \theta = \lim_{p\to\infty} \sum_{j=2}^{p} \int_{-\infty}^{0} \phi(j, x)\, e^{x}\, dx.
Proposition 1. Suppose, in addition to (2.2) and (2.5), we have:

(i) there exists u* such that, for all M, q(u, u + y) is uniformly bounded over u \ge u*, y \ge -M;

(2.9)  (ii) \lim_{M\to\infty} \lim_{u\to\infty} \sup_{x \le u - M} P\{X_2 > u \mid X_1 = x\} = 0.

Then the limit in (2.7) is valid, and hence also (2.8).

Proof. Fix some large positive M. If the ranges of x_2, \dots, x_{p-1} in (2.6) and (2.7) are restricted to (0, M), then the result is immediate by assumption (i) and the dominated convergence theorem. This extends to (2.8) as well, since f(u - x)/\{1 - F(u)\} is bounded by Ce^{x} for some C > 1, provided u \ge u*, and e^{x} is integrable on (-∞, 0). Hence it suffices to show that, for each j = 2, \dots, p - 1,

(2.10)  \lim_{M\to\infty} \limsup_{u\to\infty} \sup_{x \le 0} P\{X_j < u - M,\ X_p > u \mid X_1 = u - x\} = 0.

However this follows from (2.9) because, given X_j < u - M, each of the inequalities X_{j+1} < u - (p - j - 1)M/(p - j), X_{j+2} < u - (p - j - 2)M/(p - j), \dots, X_p < u has conditional probability tending to 1. This completes the proof.
To summarise this section: if (2.2), (2.5) and conditions (i) and (ii) of Proposition 1 are satisfied, then θ is given by Equations (2.7) and (2.8).
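For instance, since every term in the sum (2.8) is non-negative and, at continuity points of H, φ(2, x) = P\{Y_2 \ge 0 \mid Y_1 = -x\} = 1 - H(x) for the tail random walk introduced below, the j = 2 term alone already gives a simple upper bound for the extremal index:

```latex
% j = 2 term of (2.8): all terms are non-negative and \phi(2, x) = 1 - H(x)
1 - \theta \;\ge\; \int_{-\infty}^{0} \{1 - H(x)\}\, e^{x}\, dx
\qquad\Longrightarrow\qquad
\theta \;\le\; \int_{-\infty}^{0} H(x)\, e^{x}\, dx ,
```

using \int_{-\infty}^{0} e^{x}\,dx = 1. In Example 1 below the bound is attained, since there the tail random walk is strictly decreasing and every term with j \ge 3 vanishes.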
3. Calculating the extremal index
From the point of view of practical calculation, it is more satisfactory to recast the
result of Section 2 into a form that makes clear the connection with fluctuation theory
and the Wiener-Hopf equation. Let

Q_p^{(u)}(x) = P\{X_2 < u, \dots, X_p < u \mid X_1 = u - x\}.

Then we may define Q_1^{(u)} \equiv 1 and
Q_p^{(u)}(x) = \int_0^{\infty} q(u - x, u - y)\, Q_{p-1}^{(u)}(y)\, dy, \qquad p > 1.
However Q_p^{(u)}(x) = 1 - \sum_{j=2}^{p} \phi(j, x, u) \to 1 - \sum_{j=2}^{p} \phi(j, x) as u → ∞, and the latter is just P\{Y_2 < 0, \dots, Y_p < 0 \mid Y_1 = -x\}, where Y_1, \dots, Y_p form a random walk in which Y_2 - Y_1, Y_3 - Y_2, \dots are independent with distribution function H. Thus, as u → ∞, Q_p^{(u)} \to Q_p, where Q_1 \equiv 1 and

(3.1)  Q_p(x) = \int_0^{\infty} Q_{p-1}(y)\, H(x - dy),

where the Stieltjes form of integral is adopted because of the possibility H(-∞) > 0. As p → ∞, Q_p \to Q where Q is a solution of the Wiener-Hopf equation
(3.2)  Q(x) = \int_0^{\infty} Q(y)\, H(x - dy),

subject to the normalising condition

(3.3)  \lim_{x\to\infty} Q(x) = 1.
In terms of Q, the extremal index is then given by

(3.4)  \theta = \int_{-\infty}^{0} Q(x)\, e^{x}\, dx.

Note that in the case of asymptotically independent bivariate distributions, H(-∞) = 1, so (3.2) gives Q(x) \equiv Q(∞) = 1 and (3.4) implies θ = 1.
Convergence of Qp to Q follows by applying the monotone convergence theorem
to (3.1), as in Section 11.5 of Grimmett and Stirzaker (1982). The rather harder
question, of whether the resulting solution to (3.2) and (3.3) is unique, does not seem to
be resolved by this argument, and requires the general theory of Smithies (1940) and
Krein (1958).
In numerical calculations, I have iterated Equation (3.1) to convergence, evaluating H and the Q_p's on a discrete lattice of up to 2^{14} points, and using a Fast Fourier Transform for the convolution.
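This computation can be sketched in a few lines; the code below is an illustrative reimplementation, not the original program, and the function names, lattice size and truncation are arbitrary choices. It uses the limiting increment distribution H(x) = (1 + e^{-rx})^{1/r-1} of the logistic model treated in Example 2 below (Equation (4.3)), evaluates the convolution in (3.1) by FFT, and then computes θ from (3.4):

```python
import numpy as np

def H_logistic(x, r):
    """Limiting increment d.f. for the logistic model, Equation (4.3):
    H(x) = (1 + exp(-r x))**(1/r - 1), for r > 1."""
    return (1.0 + np.exp(-r * x)) ** (1.0 / r - 1.0)

def h_logistic(x, r):
    """Density of H: h(x) = (r - 1) exp(-r x) (1 + exp(-r x))**(1/r - 2)."""
    e = np.exp(-r * x)
    return (r - 1.0) * e * (1.0 + e) ** (1.0 / r - 2.0)

def extremal_index_logistic(r, L=30.0, n=1201, max_iter=3000, tol=1e-9):
    """Iterate (3.1) to its fixed point Q, then return theta via (3.4)."""
    x = np.linspace(-L, L, n)
    dx = x[1] - x[0]
    # Kernel h on all lattice differences x_i - y_j, for the convolution
    # Q_p(x) = int_0^inf Q_{p-1}(y) h(x - y) dy.
    hg = h_logistic(np.arange(-(n - 1), n) * dx, r)
    m = 1 << int(np.ceil(np.log2(3 * n)))   # FFT size for linear convolution
    Hf = np.fft.rfft(hg, m)
    tail = H_logistic(x - L, r)             # mass from y > L, where Q(y) ~ 1
    Q = np.ones(n)                          # Q_1 == 1; Q_p decreases to Q
    for _ in range(max_iter):
        Qm = np.where(x > 0.0, Q, 0.0)      # restrict integration to y > 0
        conv = np.fft.irfft(np.fft.rfft(Qm, m) * Hf, m)
        Q_new = conv[n - 1:2 * n - 1] * dx + tail
        if np.max(np.abs(Q_new - Q)) < tol:
            Q = Q_new
            break
        Q = Q_new
    # (3.4): theta = int_{-inf}^{0} Q(x) e^x dx
    return float(np.sum(np.where(x <= 0.0, Q * np.exp(x), 0.0)) * dx)

for r in (2.0, 3.0, 4.0, 5.0):
    print(r, extremal_index_logistic(r))
```

On this lattice the computed values should be close to the limiting θ column of Table 1 below, with accuracy governed by the lattice spacing and the truncation point L.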
4. Examples
Example 1. The following example, although not absolutely continuous and therefore outside the scope of Section 2 as formulated, illustrates the main idea in elementary
fashion. Suppose {Yn } are independent standard Gumbel and X n = max(Xn - 1 - a, Yn + P)
for fixed a, p. An equivalent representation is X n = P + maxj~O (F, l - ja) and it is easy
to see that the marginal distribution of X n is also standard Gumbel provided P =
log(1 - e -a).
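The value of β can be checked directly from the moving-maximum representation:

```latex
P\{X_n \le x\}
  = \prod_{j \ge 0} P\{Y_{n-j} \le x - \beta + ja\}
  = \exp\Bigl(-e^{-(x-\beta)} \sum_{j \ge 0} e^{-ja}\Bigr)
  = \exp\Bigl(-\frac{e^{\beta}}{1 - e^{-a}}\, e^{-x}\Bigr),
```

which equals the standard Gumbel d.f. \exp(-e^{-x}) precisely when e^{\beta} = 1 - e^{-a}, i.e. \beta = \log(1 - e^{-a}).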
For large n, we also have

P\Bigl\{\max_{1 \le i \le n} X_i \le x\Bigr\} \approx P\Bigl\{\beta + \max_{1 \le i \le n} Y_i \le x\Bigr\} = \exp(-n e^{\beta} e^{-x}),
so the extremal index is \theta = e^{\beta} = 1 - e^{-a}. Alpuim (1989) gives a number of properties of models with this kind of structure.

If X_n is large, then there is very high probability that X_{n+1} = X_n - a. Hence H(x) is 0 for x < -a, 1 for x \ge -a, and (3.2) reduces to
Q(x) = \begin{cases} Q(x + a), & \text{if } x \ge -a, \\ 0, & \text{if } x < -a. \end{cases}

Hence Q(x) is 1 for x \ge -a, 0 otherwise, and (3.4) shows

\theta = \int_{-a}^{0} e^{x}\, dx = 1 - e^{-a}.
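This value is easy to confirm by simulation (a sketch with illustrative parameter choices, not part of the original computations). Because exceedance runs of this chain are strictly decreasing, θ is also the limit of P\{X_2 \le u \mid X_1 > u\}, which suggests the simple runs estimator used below:

```python
import numpy as np

rng = np.random.default_rng(42)
a = 1.0
beta = np.log(1.0 - np.exp(-a))      # beta = log(1 - e^{-a}) keeps the margin Gumbel
n = 500_000

# Simulate X_n = max(X_{n-1} - a, Y_n + beta) with {Y_n} i.i.d. standard Gumbel.
Y = rng.gumbel(size=n).tolist()      # numpy's standard gumbel has d.f. exp(-exp(-x))
X = [Y[0]]
for i in range(1, n):
    X.append(max(X[-1] - a, Y[i] + beta))
X = np.array(X[1000:])               # drop a burn-in so the start is forgotten

u = np.quantile(X, 0.995)            # high threshold
exceed = X[:-1] > u
theta_hat = float(np.mean(X[1:][exceed] <= u))
print(theta_hat, 1.0 - np.exp(-a))   # theta_hat should be near 1 - e^{-1} ~ 0.632
```

With a = 1 the estimate typically lands within a couple of standard errors of 1 - e^{-1} ≈ 0.632; sharper agreement requires a higher threshold and a longer run.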
Example 2. Consider a stationary Markov chain in which the joint distribution of consecutive observations follows the logistic bivariate extreme value distribution with Gumbel margins:

(4.1)  P\{X_n \le x,\ X_{n+1} \le y\} = \exp\{-(e^{-rx} + e^{-ry})^{1/r}\},

where 1 \le r < ∞. This has proved useful in modelling bivariate extremes in oceanography (Tawn (1988)), though Tawn also considered an asymmetric version of (4.1) and other parametric models are currently under investigation. Therefore this example should be regarded as illustrating the general technique in the jointly continuous case.
It is readily checked from (4.1) that

(4.2)  P\{X_{n+1} \le x + z \mid X_n = x\} = \exp[e^{-x}\{1 - (1 + e^{-rz})^{1/r}\}]\,(1 + e^{-rz})^{1/r - 1} \to (1 + e^{-rz})^{1/r - 1} \quad \text{as } x \to \infty.

Thus the last expression gives the distribution function of the random walk which the process follows in the tails. Hence, for r > 1,

(4.3)  H(x) = (1 + e^{-rx})^{1/r - 1}.

The case r = 1 is the independent case and therefore trivial.
To verify (2.2) for this process, we first note that a random variable with density h has negative mean and finite moment generating function in a neighbourhood of 0. Continuity arguments in (4.2) then show that there exist t ∈ (0, 1), η < 1 and finite x* such that

(4.4)  E\{e^{t(X_{n+1} - X_n)} \mid X_n = x\} \le \eta \quad \text{for } x \ge x^*.

Let τ denote the first hitting time of (-∞, x*) for the Markov chain. Then

(4.5)  P\{X_{k+1} > u,\ \tau > k \mid X_1 > u\} \le \eta^{k}\, E\{e^{t(X_1 - u)} \mid X_1 > u\},

which is summable in k, so that in particular the sum from k = p_n' to p_n tends to 0 whenever p_n', p_n both tend to infinity.
We also have, for u_n such that 1 - F(u_n) = O(n^{-1}), P\{X_{k+1} > u_n \mid X_1 = x\} = O(n^{-1}) uniformly in k \ge 1 and x \le x*: uniformity in x essentially follows from the equicontinuity of the family of conditional distributions of X_{n+1} given X_n = x. Since p_n' \le p_n = o(n) we have

(4.6)  \sum_{k=p_n'}^{p_n} P\{X_k > u_n,\ \tau \le k \mid X_1 > u_n\} \to 0.

Putting (4.5) and (4.6) together gives (2.2). This argument has been given in some detail because it illustrates how (2.2) may be verified in nicely behaved cases.

Conditions (i) and (ii) of Proposition 1 may be verified directly. The details are elementary and are therefore omitted.
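The negative mean and finite moment generating function of h asserted above are easy to confirm by numerical integration (an illustrative check; the grid and the value t = 0.25 are arbitrary choices):

```python
import numpy as np

def h_logistic(z, r):
    """Density of the increment d.f. H(z) = (1 + exp(-r z))**(1/r - 1) of (4.3)."""
    e = np.exp(-r * z)
    return (r - 1.0) * e * (1.0 + e) ** (1.0 / r - 2.0)

z = np.linspace(-40.0, 40.0, 160_001)
dz = z[1] - z[0]
for r in (2.0, 3.0, 4.0, 5.0):
    h = h_logistic(z, r)
    mass = h.sum() * dz                       # ~1, since H(-inf) = 0 when r > 1
    mean = (z * h).sum() * dz                 # negative drift of the tail walk
    mgf = (np.exp(0.25 * z) * h).sum() * dz   # E exp(tZ) at t = 0.25, finite
    print(r, round(mass, 4), round(mean, 4), round(mgf, 4))
```

The drift is close to zero for large r, which is consistent with the slow decrease of θ as r grows in Table 1.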
Table 1 gives numerical results for r = 2, 3, 4, 5. These were calculated using the
numerical method described at the end of Section 3. Typically 40-120 iterations were
required and comparison of results for different truncation points and discretisations
suggests that the computed results are correct to three significant figures.
TABLE 1
Extremal index θ (calculated) and θ_n (simulated) for the logistic model with various values of r

        n        r = 2     r = 3     r = 4     r = 5
       10        0.414     0.271     0.212     0.190
       20        0.372     0.220     0.160     0.135
       50        0.350     0.180     0.116     0.087
      100        0.340     0.167     0.103     0.072
      200        0.335     0.166     0.100     0.063
      500        0.338     0.161       —         —
  Limiting θ     0.328     0.158     0.0925    0.0616
Also shown in Table 1 are approximations for finite n. Define u_n such that F^n(u_n) = \frac{1}{2}; then let \theta_n = -\log P\{\max(X_1, \dots, X_n) \le u_n\}/\log 2. Thus θ_n → θ as n → ∞, and the table gives an indication of the rate of convergence. These values are each based on 10 000 simulations. It should be noted that some sequences are not monotonic, for example with r = 2 we have θ_{200} < θ_{500}, but it may be checked that the standard error of the estimate θ_{200} = 0.335, based on 10 000 simulations, is about 0.007. We conjecture that in fact θ_n is monotonically decreasing to θ as n → ∞, for each r in this model.
Example 3. This example, essentially due to a referee, illustrates a case where (2.8) is not satisfied and the result is incorrect. Let {U_n} denote a sequence of independent random variables, each uniformly distributed on (0, 1), and let V_1 = U_1, with each V_{n+1} obtained from V_n by one of two constructions, taken with probability \frac{1}{2} each,
these choices being mutually independent. Then {V_n} also have uniform marginal distributions but V_n = V_{n+2} with probability 1. Finally, let X_n = -\log(-\log V_n) to transform to Gumbel margins. Then (X_n, X_{n+1}) are asymptotically independent in the tails, so H(-∞) = 1 and the method of Section 3 would lead to extremal index 1, but in fact there is clustering and the true extremal index is \frac{1}{2}.
5. Summary and conclusions
The main purpose of this paper has been to present an explicit computational
technique for the extremal index of a Markov chain. Although the method depends on
certain assumptions, these do not seem too restrictive for practical application.
A natural extension is to ask whether the method is applicable to kth-order Markov chains, i.e. processes in which the conditional distribution of X_n given the past is a function of the k \ge 1 immediate past values. In this case, multivariate extreme value theory indicates that ratios of joint probabilities of the form

\frac{1 - P\{X_1 \le u + x_1, \dots, X_k \le u + x_k\}}{1 - P\{X_1 \le u, \dots, X_k \le u\}}

are asymptotically independent of u, and this might point to a similar technique being applicable in which the distribution function H is replaced by the limiting distribution of X_k - \max(X_1, \dots, X_{k-1}), given \max(X_1, \dots, X_{k-1}) > u. However, it is clear that the computational details will be much harder.
Acknowledgments
The first version of this paper was written during a visit to the University of British
Columbia, Vancouver, Canada and supported in part by the Natural Sciences and
Engineering Research Council of Canada, and by the Wolfson Foundation. I am grateful
to Harry Joe, Jonathan Tawn, Sammy Yuen, Seokhoon Yun and Nick Bingham for
comments and references. I would also like to thank a referee for pointing out an error in
the earlier version.
References
ALPUIM, M. T. (1989) An extremal Markovian sequence. J. Appl. Prob. 26, 219-232.
ASMUSSEN, S. (1987) Applied Probability and Queues. Wiley, Chichester.
COLES, S. G. AND TAWN, J. A. (1991) Modelling multivariate extreme events. J. R. Statist. Soc.
B. 53, 377-392.
GRIMMETT, G. AND STIRZAKER, D. (1982) Probability and Random Processes. Oxford University
Press.
JOE, H., SMITH, R. L. AND WEISSMAN, I. (1992) Bivariate threshold methods for extremes. J. R.
Statist. Soc. B. 54, 171-183.
KREIN, M. G. (1958) Integral equations on the half-line with a kernel depending on the difference of the arguments (Russian). Uspehi Mat. Nauk 13, 3-120. (English translation: AMS Transl. Ser. 2, 22, 163-288.)
LEADBETTER, M. R. (1983) Extremes and local dependence in stationary sequences. Z. Wahrscheinlichkeitsth. 65, 291-306.
LEADBETTER, M. R. AND ROOTZEN, H. (1988) Extremal theory for stochastic processes. Ann. Prob. 16, 431-478.
LOYNES, R. M. (1965) Extreme values in uniformly mixing stationary stochastic processes. Ann.
Math. Statist. 36, 993-999.
NEWELL, G. F. (1964) Asymptotic extremes for m-dependent random variables. Ann. Math. Statist.
35, 1322-1325.
O'BRIEN, G. L. (1974) The maximum term of uniformly mixing stationary processes. Z.
Wahrscheinlichkeitsth. 30, 57-63.
O'BRIEN, G. L. (1987) Extreme values for stationary and Markov sequences. Ann. Prob. 15,
281-291.
RESNICK, S. (1987) Extreme Values, Point Processes and Regular Variation. Springer Verlag,
New York.
ROOTZEN, H. (1988) Maxima and exceedances of stationary Markov chains. Adv. Appl. Prob. 20,
371-390.
SMITHIES, F. (1940) Singular integral equations. Proc. Lond. Math. Soc. 46, 409-466.
TAWN, J. A. (1988) Bivariate extreme value theory models and estimation. Biometrika 75,
397-415.