Lancaster University
STOR603: PhD Proposal
Physically-Based Statistical Models of
Extremes arising from Extratropical
Cyclones
Supervisors:
Jonathan Tawn
Jenny Wadsworth
Simon Brown
Author:
Paul Sharkey
Abstract
Extratropical cyclones are low pressure weather systems in the mid-latitudes that
are associated with strong winds and heavy rainfall. Extreme value theory is a statistical field that has often been used to analyse extreme rainfall accumulations and wind
speeds, but without incorporating the physical characteristics of extratropical cyclones
that generate these extremes. These features are generally spatially heterogeneous and
non-stationary in time, so this presents a unique modelling challenge from both a statistical and climatological perspective. This report gives an outline of the various methods
that will be used to tackle these problems during PhD research, which aims to combine
numerous aspects from extreme value analysis and atmospheric science. The goal is to
present a statistically consistent framework for representing extratropical cyclones, one that
incorporates aspects of cyclone evolution, movement and structure and that can be used to
predict aspects of future cyclone behaviour.
August 30, 2014
Contents
1 Introduction
2 Univariate extreme value theory
    2.1 Block maxima approach
    2.2 Threshold methods
    2.3 Modelling extremes of stationary processes
    2.4 Modelling extremes of non-stationary processes
        2.4.1 Generalised additive models
        2.4.2 Random effects
    2.5 Simulation study
        2.5.1 Simulating a Poisson process
        2.5.2 Model fitting
3 Bivariate extreme value theory
    3.1 Measures of dependence
        3.1.1 Testing dependence of simulated data
    3.2 Componentwise block maxima
    3.3 Threshold methods
    3.4 Conditional approach
4 Extratropical cyclones
    4.1 Formation
        4.1.1 Airmasses
        4.1.2 The Norwegian Cyclone Model
        4.1.3 The Shapiro-Keyser Model
    4.2 Key features
    4.3 Statistical modelling of extratropical cyclones
5 Exploratory data analysis
    5.1 Data availability
    5.2 Covariate analysis
    5.3 Dependence structure
6 Further Work
    6.1 Short-term goals
        6.1.1 Data collection
        6.1.2 Spatial extremes
        6.1.3 Random effects
    6.2 Long-term goals
1 Introduction
The prevalence of extratropical cyclones in the mid-latitudes is a dominant feature of the
weather landscape affecting the United Kingdom. The UK has come to expect a consistent
pattern of temperate summers and mild winters on a yearly basis. However, in recent years,
this country has been a focus of extreme weather events. This is exemplified by major flood
events (as recently as February 2014 in Devon and Cornwall), damaging windstorms (Cyclone
Christian, 2013) and other events that have caused mass infrastructural damage, transport
chaos and, in some instances, even human fatalities. The ongoing threat of weather systems
associated with extratropical cyclones is of great concern to the Met Office and its clients.
Accurate modelling and forecasting of extreme weather events related to these cyclones is
essential to minimise the potential damage caused, to aid the design of appropriate defence
mechanisms that protect against the threat to human life, and to limit the economic
difficulties such an event may cause.
Such events usually manifest in the form of strong winds and heavy rainfall. Such instances
are both examples of an extreme event. In this context, an extreme event is one that is very
rare, with the consequence that datasets of extreme observations are usually quite small.
The statistical field of extreme value theory is focused on modelling such rare events, with
the aim of extrapolating the physical process of interest from the observed data to
unobserved levels. This allows a rigorous statistical modelling procedure to be followed in
spite of the data constraints.
Features of extratropical cyclones have already been analysed using extreme value methods. Wind speed and rainfall accumulations are commonly used datasets for extreme value
analysis. However, many of these statistical models fail to incorporate the large scale characteristics of extratropical cyclones that shape the rate of occurrence and magnitude of
extreme wind and rain. In particular, the physical processes and atmospheric dynamics that
drive the evolution and movement of cyclonic behaviour are largely ignored, and as a result,
existing models are divorced from the atmospheric activity that is generating the extremes.
This is further complicated by the fact that the features of cyclonic behaviour that have the
most damaging consequences can be small-scale in nature. Sting jets, for example, produce
localised gusts that can cause mass damage on short time scales. However, the presence of
many of these small-scale features is difficult to observe through both empirical and
model-generated measurements, with the consequence that they are difficult to model in practice.
It is essential that these physical characteristics of cyclone behaviour are further investigated
and incorporated into an extreme value model in order to produce a more realistic, consistent
statistical representation of the underlying physical processes. The goal of this PhD research
is to develop such a model.
In addition to the atmospheric complexity of extratropical cyclones, model specification is
complicated by the fact that these weather systems feature irregularly occurring phenomena
with rates and magnitudes that are spatially heterogeneous and non-stationary in time.
Previous extreme value analyses have focused on modelling variables such as rainfall at a
single location. From a modelling perspective, such univariate methods will not be sufficient
to capture the full picture of extremal behaviour associated with extratropical cyclones. It
is inevitable that any model for estimating extreme weather will have to incorporate a
dependence structure. This is intuitive considering that, for example, the risks of extreme winds
in two locations are related if they are situated on the same track of a storm. Extending
this to incorporate spatial variability in the model is crucial due to the clearly spatially
heterogeneous behaviour of weather systems. The impact of covariates will also be a significant
area of research. Some covariates such as the North Atlantic Oscillation (NAO) index (see
Section 6) are known to be associated with the behaviour of weather systems in the North
Atlantic region. However, to develop a model that fully represents the physical processes of
the cyclones, an investigation must be carried out into discovering relevant covariates related
to the structure of the cyclone itself. Spatial modelling is essential as covariates which may
impact the overall extreme behaviour of the cyclone may not be significant when analysing
remote sites. Random effects modelling over space may also be necessary if the form of
the covariate is not yet clear. Other factors to consider would be seasonality and long-term
climate change. It is hoped that the complete model will provide a more robust assessment
of how the risk of extreme events arising from extratropical cyclones will change over time.
A major short-term component of the PhD research will focus on gathering meaningful data
relevant to modelling extremal behaviour of extratropical cyclones. Naturally, the extreme
value methods used in such a model will depend on the relevance and quality of the data
obtained. With any analysis involving raw weather data, it is natural to expect inaccuracies
and gaps in the data due to factors such as unreliable recording instruments. Measurements
recorded at irregularly distributed sites may also distort the spatial structure of the data
somewhat. With this in mind, a data assimilation scheme known as reanalysis is used to
generate weather observations over fixed time intervals. Observational data are combined
with prior information from a forecast model to produce estimates of the state of weather
systems. Examples of such reanalysis projects include ERA-40 and ERA-Interim, which are
introduced in Section 4. These reanalyses often comprise a system of millions of observations,
but it is important not to equate these datasets with reality. These datasets are generated
with spatial and temporal resolutions that are constrained by limits of computational power.
Model bias may also cause spurious variability and trends to appear in the data. This is
further discussed in Section 4.
The structure of the report is as follows. Section 2 consists of an overview of extreme
value methods in a univariate context, discussing block maxima and threshold exceedance
approaches for modelling the tails of a distribution. In addition, extensions of these models
to stationary processes are described. Various methods of incorporating non-stationary
components into an extreme value model are also addressed, which is likely to be key
to modelling extreme cyclonic behaviour. Section 3 introduces the notion of dependence
modelling, describing methods of determining the joint risk of extreme events over multiple
variables of interest. Section 4 describes the physical context of the problem in greater detail.
An overview is presented of the physical processes that drive and shape the evolution and
movement of extratropical cyclones, which is key to developing a statistical representation of
the physics that generate extremes from these weather systems. A brief description and
exploratory analysis of reanalysis datasets is presented in Section 5. Lastly, Section 6 describes
the future direction of PhD research, detailing potential short-term and long-term avenues
of interest.
2 Univariate extreme value theory
In practical terms, the importance of analysing and predicting extreme events creates a necessity for a statistically rigorous model of the tail of the distribution of interest. Often
of interest are events that occur perhaps once every 100 or 200 years, such as a particularly damaging flood event, for example. However, by definition, observations in the tails
are scarce (see Figure 1), and so it is often required that information regarding unobserved
scenarios is gained using observed data. Extreme value theory focuses essentially on using
asymptotic models to extrapolate from observed to unobserved levels.
Many problems arise from estimating the tails using standard modelling approaches. As data
are concentrated towards the centre of the distribution, parameter estimates and model fit
are driven by these central values. In addition, different models that fit the body of the data
well can have very different extrapolations. These issues create the need for a tail model that
is not compromised by having to be fitted to the body of the distribution simultaneously.
Figure 1: Density of a normal distribution with few observations in the tails
2.1 Block maxima approach
Consider a set of observations from independent and identically distributed (IID) random
variables $X_1, \ldots, X_n$ with an unknown distribution function $F$. Define
$M_{X,n} = \max(X_1, \ldots, X_n)$ to be the maximum of this sequence of random variables.
The distribution function of the sample maxima can then be expressed as:
\[
\begin{aligned}
P(M_{X,n} \le x) &= P(X_1 \le x, \ldots, X_n \le x) \\
&= P(X_1 \le x) \cdots P(X_n \le x) \\
&= \{F(x)\}^n.
\end{aligned}
\]
An analogous result for minima can be obtained by defining:
\[
\begin{aligned}
m_{X,n} &= \min(X_1, \ldots, X_n) \\
&= -\max(-X_1, \ldots, -X_n) \\
&= -M_{-X,n}.
\end{aligned}
\]
This report focuses on application of extreme value analysis to sample maxima. Results for
minima can be obtained using the above identity. Henceforth, Mn and mn will be used in
place of MX,n and mX,n respectively.
The formula for the distribution of maxima is unhelpful in practice as the distributional form
of F is typically unknown. One approach is to search for families of models for which the
expression $F^n$ converges for the tails of the distribution of $F$. However,
\[ M_n \to x^F \quad \text{as } n \to \infty, \]
where
\[ x^F = \sup\{x : F(x) < 1\}. \]
In other words, the distribution of Mn degenerates to a point mass on the upper end point
of F . A method of overcoming this difficulty is to obtain a linear renormalisation of Mn to
give a non-degenerate limit distribution. Let $M_n^*$ be defined as:
\[ M_n^* = \frac{M_n - b_n}{a_n}, \]
for sequences of constants $a_n > 0$ and $b_n$, which stabilise the location and scale of
$M_n^*$ as $n$ increases, avoiding the issues that arise with the distribution of $M_n$. The
Extremal Types Theorem (Leadbetter et al., 1983) states that given appropriate choices of
these normalising constants, as $n \to \infty$:
\[ P\left( \frac{M_n - b_n}{a_n} \le x \right) \to G(x), \]
where $G$ is non-degenerate and is of the same type as one of the following distributions:
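• Gumbel: $G(x) = \exp\{-\exp(-x)\}$, $-\infty < x < \infty$;
• Fréchet: $G(x) = \begin{cases} 0 & x \le 0 \\ \exp\{-x^{-\alpha}\} & x > 0, \ \alpha > 0; \end{cases}$
• Negative Weibull: $G(x) = \begin{cases} \exp\{-(-x)^{\alpha}\} & x < 0, \ \alpha > 0 \\ 1 & x \ge 0. \end{cases}$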
The Unified Extremal Types Theorem (UETT) unites these distributions under one parameterisation, the Generalised Extreme Value (GEV) distribution, with distribution function
\[ G(x) = \exp\left\{ -\left[ 1 + \xi \left( \frac{x - \mu}{\sigma} \right) \right]_+^{-1/\xi} \right\}, \]
where $x_+ = \max(x, 0)$ and $\sigma > 0$. The parameters $\mu$, $\sigma$ and $\xi$ are interpreted as the location,
scale and shape parameters respectively. The distribution of $M_n^*$ is of the same type as a
GEV distribution as n → ∞, for some value of ξ. A Gumbel distribution corresponds to
ξ = 0, with the feature of an exponential upper tail. A Fréchet distribution corresponds to
ξ > 0, with a heavy upper tail. A Negative Weibull distribution, for which ξ < 0, has the
property of a finite upper end point.
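As an illustration of the three cases, the GEV distribution function can be evaluated directly. The following is a minimal Python sketch, not part of the original report; the function name `gev_cdf` and the numerical tolerance used to switch to the Gumbel limit are assumptions made for this example.

```python
import math


def gev_cdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """Evaluate the GEV distribution function G(x) for location mu,
    scale sigma > 0 and shape xi (hypothetical helper for illustration)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:
        # Gumbel case, taken as the limit xi -> 0.
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # Outside the support: below the lower end point when xi > 0,
        # above the finite upper end point when xi < 0.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


if __name__ == "__main__":
    for xi in (-0.5, 0.0, 0.5):
        print(xi, [round(gev_cdf(x, xi=xi), 4) for x in (-1.0, 0.0, 1.0, 3.0)])
```

With ξ < 0 the function returns 1 beyond the upper end point µ − σ/ξ, which reflects the finite upper end point of the Negative Weibull case.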
Substantial research has gone into the characterisation of the domains of attraction of
extreme value limits. Essentially, this involves characterising the set of distributions $F$ for
which the normalised maxima converge to an extreme value limit. Alternatively, given a
distribution $F$, it involves evaluating the form of the normalising sequences $a_n$ and $b_n$ such
that the distribution of normalised maxima converges. The reciprocal hazard function $h$ is
defined by:
\[ h(x) = \frac{1 - F(x)}{f(x)}, \quad x_F < x < x^F, \]
where $f(x)$ is the density function, and $x_F$ and $x^F$ are the lower and upper end points of the
distribution respectively. Expressions for $a_n$, $b_n$ and the shape parameter $\xi$ can be formulated
as follows:
• $h'(y) \to \xi$ as $y \to x^F$, assuming $h$ is differentiable.
• $b_n$ is such that $1 - F(b_n) = 1/n$.
• $a_n = h(b_n)$.
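As a worked example of these three expressions (an illustration added here, not taken from the original report), consider the unit exponential distribution, F(x) = 1 − exp(−x). Its reciprocal hazard is h(x) = 1, so h'(y) = 0 and ξ = 0, while b_n = log n and a_n = 1. The Python sketch below, with hypothetical function names, checks numerically that the distribution of the normalised maxima approaches the Gumbel limit as n grows.

```python
import math


def exp_max_cdf(x, n):
    """P((M_n - b_n) / a_n <= x) for n IID Exp(1) variables,
    using b_n = log(n) and a_n = h(b_n) = 1."""
    a_n, b_n = 1.0, math.log(n)
    y = a_n * x + b_n
    if y <= 0.0:
        return 0.0
    return (1.0 - math.exp(-y)) ** n


def gumbel_cdf(x):
    """Gumbel limit G(x) = exp{-exp(-x)}."""
    return math.exp(-math.exp(-x))


def max_abs_error(n):
    """Largest gap between the exact and limiting distribution functions
    on a grid of x values from -3 to 6."""
    grid = [k / 10.0 for k in range(-30, 61)]
    return max(abs(exp_max_cdf(x, n) - gumbel_cdf(x)) for x in grid)


if __name__ == "__main__":
    for n in (10, 100, 10000):
        print(n, max_abs_error(n))
```

The printed error shrinks as n increases, which is the behaviour the limit result describes for this particular choice of F.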
The GEV distribution is used to model the distribution of maxima. This asymptotic model
is used to approximate the distribution of extreme values for finitely many observations
n, provided Mn is constructed by taking the maximum of sufficiently many observations.
The procedure involves partitioning the data into blocks and analysing the maximum observation from each block. The choice of block size is critical for model performance. The
structure and size of the dataset may indicate natural choices for block size. For example, a
rainfall dataset containing 150 years of observations may be partitioned into annual blocks.
However, the block size must be large enough for the limit model approximation to hold,
yet small enough that there are sufficient block maxima to keep the estimation variance low. The
applicability of the GEV distribution is also determined by the flatness of the derivative of
the reciprocal hazard. Numerical methods are required to solve for the maximum likelihood
estimates of θ = (µ, σ, ξ). For more details on the asymptotic properties of these maximum
likelihood estimates, see Smith (1985).
In practical applications, interest lies in estimating levels that are exceeded only with a
sufficiently small probability. The return period of level $z$ is defined as the expected waiting time
until the level $z$ is next exceeded. The $T$-year return level is defined as the level for which the
expected waiting time between exceedances is $T$ years. The $1/p$ return level $z_p$ is the $1 - p$
quantile of the GEV distribution for $0 < p < 1$. By the invariance property, the maximum
likelihood estimates can be substituted for the parameters of the GEV distribution to give
an MLE for $z_p$, defined as:
\[
\hat{z}_p =
\begin{cases}
\hat{\mu} - \dfrac{\hat{\sigma}}{\hat{\xi}} \left[ 1 - \{-\log(1 - p)\}^{-\hat{\xi}} \right] & \text{for } \hat{\xi} \neq 0 \\
\hat{\mu} - \hat{\sigma} \log\{-\log(1 - p)\} & \text{for } \hat{\xi} = 0.
\end{cases}
\]
Because of asymptotic normality, the delta method can be used to determine the uncertainty
of these estimates. However, this approximation performs poorly when considering return
levels corresponding to long return periods that fall beyond the scope of the data. Profile
likelihood-based confidence intervals provide a more accurate representation of uncertainty
when a strong degree of extrapolation is required.
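The return level formula can be sketched in a few lines of Python. This is an illustrative implementation only; the function name and the parameter values in the usage example are hypothetical and are not estimates from any dataset in this report.

```python
import math


def gev_return_level(p, mu, sigma, xi):
    """Return the 1/p return level z_p, i.e. the (1 - p) quantile of
    GEV(mu, sigma, xi), for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    y = -math.log(1.0 - p)
    if abs(xi) < 1e-12:
        # Gumbel case (xi = 0).
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


if __name__ == "__main__":
    # Hypothetical annual-maximum parameters, for illustration only.
    mu, sigma, xi = 30.0, 5.0, 0.1
    for years in (10, 100, 1000):
        print(years, round(gev_return_level(1.0 / years, mu, sigma, xi), 2))
```

In practice the maximum likelihood estimates would be plugged in for `mu`, `sigma` and `xi`, with uncertainty assessed separately by the delta method or profile likelihood as described above.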
2.2 Threshold methods
While the block maxima approach is useful and easily interpretable, one of its drawbacks
is its failure to capture the full behaviour of the tail of a distribution. The model is limited to analysing data selected as the maximum of a pre-selected block, despite the strong
possibility of there being other observations in the same block that may be characterised as
extreme (see Figure 2). Threshold methods account for the extra tail information in these
observations by analysing data above a pre-determined level u. This leads to a more efficient
modelling procedure.
Let X1 , X2 , . . . , Xn be a sequence of independent and identically distributed random variables, with common marginal distribution function F . Considering some high threshold u,
the behaviour of extreme events can be characterised by the conditional probability:
\[ P(X > u + y \mid X > u) = \frac{1 - F(u + y)}{1 - F(u)}, \quad y > 0. \]
If block maxima of the process are approximately GEV distributed, then for large $u$, the
distribution function of $Y_u \mid Y_u > 0$, where $Y_u = X - u$, is approximately
\[ H(y) = 1 - \left( 1 + \frac{\xi y}{\sigma_u} \right)_+^{-1/\xi}, \quad y > 0. \]
It follows that $Y_u \mid Y_u > 0$ follows a Generalised Pareto (GP) distribution (Pickands III, 1975)
with scale parameter $\sigma_u$ and shape parameter $\xi$. Complete and outline proofs of this result
can be found in Leadbetter et al. (1983) and Coles (2001) respectively.
Figure 2: Scatterplot of rainfall accumulations in southwest England (1956-62), showing
the data used in the block maxima and threshold exceedance approaches. For the latter, a
threshold of u = 30 is selected.
Threshold models are alternatively characterised by limiting results from the theory of point
processes. Assuming that F is in the domain of attraction of a GEV(0, 1, ξ) distribution and
the required normalising constants are $a_n$ and $b_n$, then a sequence of point processes $P_n$ can
be constructed on $[0, 1] \times \mathbb{R}$ by
\[ P_n = \left\{ \left( \frac{i}{n+1}, \frac{X_i - b_n}{a_n} \right) ; \ i = 1, \ldots, n \right\} \]
and its behaviour examined as $n \to \infty$. The limit process is non-degenerate as the
distribution of the normalised maxima is non-degenerate. Large points of the process are retained
in the limit process while small points are normalised to the same value $b_l$, with
\[ b_l = \lim_{n \to \infty} \frac{x_F - b_n}{a_n}, \]
where $x_F$ is the lower end point of $F$. Under these conditions on $P_n$, on the set $[0, 1] \times (b_l, \infty)$
\[ P_n \to P \quad \text{as } n \to \infty, \]
where $P$ is a non-homogeneous Poisson process with intensity function
\[ \lambda(t, x) = (1 + \xi x)_+^{-1 - 1/\xi}. \]
For a proof of this limit result, the reader is referred to Kallenberg (1983). This result
motivates the idea that the behaviour of all threshold exceedances is determined asymptotically by the characteristics of an , bn and ξ, as with the block maxima approach. However,
with the same number of parameters to estimate and a greater availability of extreme data,
this suggests the model could benefit from potential efficiency gains. Most importantly, it
motivates the use of the GP distribution as a conditional limit model for excesses of a high
threshold. The focus lies on the distribution of threshold exceedances in the process $P_n$. For
any fixed $v > b_l$, let
\[ u_n(v) = a_n v + b_n, \]
then as $u_n(v) \to x^F$, letting $x > 0$:
\[
\begin{aligned}
P(X_i > a_n x + u_n(v) \mid X_i > u_n(v))
&= P\left( \frac{X_i - b_n}{a_n} > x + v \,\middle|\, \frac{X_i - b_n}{a_n} > v \right) \\
&= P(\text{a given point in } P_n > x + v \mid \text{a given point in } P_n > v) \\
&\to P(\text{a given point in } P > x + v \mid \text{a given point in } P > v) \\
&= \frac{(1 + \xi(x + v))_+^{-1/\xi}}{(1 + \xi v)_+^{-1/\xi}} \\
&= \left( 1 + \xi \frac{x}{\sigma_v} \right)_+^{-1/\xi},
\end{aligned}
\]
where $\sigma_v = 1 + \xi v$. Hence, in the limit, the scaled excess
\[ \frac{[X_i - u_n(v)]_+}{a_n} \,\Big|\, X_i > u_n(v) \]
follows a generalised Pareto distribution, GP($\sigma_v$, $\xi$). This motivates the use of a GP model
for an approximate distribution of excesses above a threshold, $Y_u \mid Y_u > 0$, such that
\[ P(Y_u < y \mid Y_u > 0) = 1 - \left( 1 + \frac{\xi y}{\sigma_u} \right)_+^{-1/\xi}, \quad y > 0. \]
One of the underlying issues in modelling threshold exceedance data is the choice of threshold.
The GP distribution has a threshold stability property. This states that if
Yu |Yu > 0 ∼ GP(σu , ξ),
for some high threshold u, then for a higher threshold v ≥ u
Yv |Yv > 0 ∼ GP(σu + ξ(v − u), ξ).
Thus, ξ is invariant to threshold choice, but σu is not.
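The threshold stability property can be checked numerically from the GP survival function alone. The sketch below is illustrative; the function names and the threshold and parameter values are hypothetical choices made for this example.

```python
import math


def gp_survival(y, sigma, xi):
    """P(Y > y) for Y ~ GP(sigma, xi) and y >= 0."""
    if abs(xi) < 1e-12:
        # Exponential case (xi = 0).
        return math.exp(-y / sigma)
    t = 1.0 + xi * y / sigma
    return t ** (-1.0 / xi) if t > 0.0 else 0.0


def excess_survival_above(v, y, u, sigma_u, xi):
    """P(X - v > y | X > v) when X - u | X > u ~ GP(sigma_u, xi) and v >= u,
    obtained by conditioning the excesses of u on exceeding v - u."""
    return gp_survival(v - u + y, sigma_u, xi) / gp_survival(v - u, sigma_u, xi)


if __name__ == "__main__":
    u, v, sigma_u, xi = 30.0, 40.0, 8.0, 0.2  # hypothetical values
    sigma_v = sigma_u + xi * (v - u)          # scale implied by threshold stability
    for y in (1.0, 5.0, 20.0):
        print(y, excess_survival_above(v, y, u, sigma_u, xi), gp_survival(y, sigma_v, xi))
```

The two printed columns agree, so excesses of the higher threshold v are again GP distributed with the same shape and with scale σ_u + ξ(v − u).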
The GP model of the excess variable Yu is conditional on an exceedance of the threshold having been observed. To
obtain a model for the original variable X, the rate parameter φu is included in the model,
that is
φu = P(Yu > 0) = P(X > u),
the probability of observing an excess over the threshold u. This is estimated as simply the
proportion of data that exceed u.
The asymptotic approximation of a GP model may not be valid if the threshold is too low,
while a threshold that is too high will reduce the size of the dataset, which leads to greater
parameter uncertainty. An ideal threshold choice is based on this trade-off between bias and
variance. While there are no exact methods for threshold selection, graphical techniques are
available to guide selection based on properties of the GP distribution. Such methods include
mean residual life plots and parameter stability plots (Coles, 2001). The former is based on
the idea that if a GP model is a good fit, then the sample mean excess over a threshold
should be linear in the threshold. The latter is based on the idea that
$\xi$ and a reparameterised scale parameter $\sigma^* = \sigma_u - \xi u$ are constant with respect to threshold.
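The quantities behind the rate estimate and the mean residual life plot are simple to compute. The following Python sketch is illustrative only: the helper names are hypothetical, and the data are simulated GP variates rather than observations from this report. For GP(σ, ξ) data with ξ < 1, the mean excess over a threshold u is (σ + ξu)/(1 − ξ), which is linear in u.

```python
import random


def simulate_gp(n, sigma, xi, rng):
    """Draw n GP(sigma, xi) variates by inverting the distribution
    function (valid for xi != 0)."""
    return [sigma / xi * ((1.0 - rng.random()) ** (-xi) - 1.0) for _ in range(n)]


def exceedance_rate(data, u):
    """Empirical estimate of the rate parameter: the proportion of data exceeding u."""
    return sum(1 for x in data if x > u) / len(data)


def mean_excess(data, u):
    """Sample mean excess over u, the quantity plotted against u in a
    mean residual life plot; None if no observation exceeds u."""
    excesses = [x - u for x in data if x > u]
    if not excesses:
        return None
    return sum(excesses) / len(excesses)


if __name__ == "__main__":
    rng = random.Random(1)
    sample = simulate_gp(200_000, sigma=1.0, xi=0.1, rng=rng)
    for u in (0.0, 1.0, 2.0):
        theory = (1.0 + 0.1 * u) / (1.0 - 0.1)
        print(u, exceedance_rate(sample, u), mean_excess(sample, u), theory)
```

For real data the mean excess would be plotted over a range of thresholds, with the threshold chosen as the lowest level above which the plot looks approximately linear.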
The point process framework provides an alternative method to formulate extreme value
limit results that unifies the block maxima and threshold exceedance approaches. Let
$X_1, X_2, \ldots, X_n$ be a series of independent and identically distributed random variables, and
let
\[ N_n = \left\{ \left( \frac{i}{n+1}, X_i \right) : i = 1, \ldots, n \right\}. \]
Then for sufficiently large $u$, on regions of the form $(0, 1) \times [u, \infty)$, $N_n$ is approximately a
Poisson process, with intensity measure on $A = [t_1, t_2] \times (x, \infty)$ given by
\[ \Lambda(A) = (t_2 - t_1) \left[ 1 + \xi \left( \frac{x - \mu}{\sigma} \right) \right]^{-1/\xi}. \]
Assuming the limit process is a reasonable approximation to the behaviour of Nn on A,
an appropriate likelihood can be derived and maximum likelihood estimates of parameters
(µ, σ, ξ) evaluated. Multiplying the intensity measure by a factor ny , the number of years
of observation, means that the parameters of the point process likelihood will correspond to
the GEV distribution of annual maxima. However, because the point process model makes
use of all data that are extreme, inferences are likely to be more accurate than estimates
based on a direct fit of the GEV distribution to the annual maximum data. The shape
parameter of the point process model is equal to that of the threshold exceedance model, while the
scale parameter is related through the identity
\[ \sigma_u = \sigma + \xi(u - \mu). \]
The point process model is advantageous in its parameterisation in terms of the GEV parameters that are invariant to threshold. This is beneficial when adapting the model to account
for non-stationarity by modelling the parameters as functions of covariates. In addition,
because the parameters are not threshold-dependent, the model can be adapted to include
time-varying thresholds.
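The link between the point process parameterisation and the threshold model can be illustrated numerically. The sketch below assumes the intensity measure has been scaled so that (µ, σ, ξ) correspond to annual maxima, in which case the expected number of exceedances of u in one year is [1 + ξ(u − µ)/σ]^(−1/ξ). Function names and parameter values are hypothetical.

```python
import math


def gp_scale_from_gev(mu, sigma, xi, u):
    """GP scale at threshold u implied by the identity sigma_u = sigma + xi * (u - mu)."""
    return sigma + xi * (u - mu)


def annual_exceedance_rate(mu, sigma, xi, u):
    """Expected number of exceedances of u per year under the point process
    model, assuming the annual-maximum scaling of the intensity measure."""
    if abs(xi) < 1e-12:
        return math.exp(-(u - mu) / sigma)
    t = 1.0 + xi * (u - mu) / sigma
    if t <= 0.0:
        raise ValueError("u lies outside the support implied by (mu, sigma, xi)")
    return t ** (-1.0 / xi)


def annual_max_cdf(mu, sigma, xi, z):
    """P(annual maximum <= z): the probability of no points above z in one year."""
    return math.exp(-annual_exceedance_rate(mu, sigma, xi, z))


if __name__ == "__main__":
    mu, sigma, xi, u = 30.0, 5.0, 0.1, 40.0  # hypothetical values
    print(gp_scale_from_gev(mu, sigma, xi, u))
    print(annual_exceedance_rate(mu, sigma, xi, u))
    print(annual_max_cdf(mu, sigma, xi, 50.0))
```

Because `annual_max_cdf` reproduces the GEV distribution function, the same three parameters describe annual maxima and, through σ_u and the exceedance rate, the threshold excesses.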
2.3 Modelling extremes of stationary processes
In the previous sections, the models described work under the assumption that the random
variables of interest are independently and identically distributed. However, in practice,
such an assumption is unrealistic. Rainfall data, for example, exhibit a high degree of
temporal dependence: a day of torrential rain is more likely to follow a day
of rain than a day of sunshine. Hence, there is a need for a statistically rigorous model that
accounts for short-range and long-range temporal dependence between extreme observations.
Rather than independence, the assumption of stationarity is made.
A process $\{X_t\}$ is said to be a stationary process if the joint distributions of $(X_{t_1}, \ldots, X_{t_k})$
and $(X_{t_1 + \tau}, \ldots, X_{t_k + \tau})$ are the same for any $k$, $t_1, \ldots, t_k$ and $\tau$.
There is a need to limit the amount of long-range dependence between extreme observations.
The Asymptotic Independence of Maxima (AIM) condition (O’Brien, 1987) ensures that
separated groups of extreme observations become independent as their separation and level become
sufficiently large. Let $M_{i,j} = \max(X_i, \ldots, X_j)$ and $u_n = a_n x + b_n$ for normalising sequences
$a_n$, $b_n$ and any real number $x$. Under the AIM($u_n$) condition, there exists a sequence $q_n$ of
positive integers with $q_n = o(n)$ such that for all $i$ and $j$
\[ \max \left| P(M_{1,i} \le u_n, M_{i+q_n, i+q_n+j} \le u_n) - P(M_{1,i} \le u_n) P(M_{1,j} \le u_n) \right| \to 0 \quad \text{as } n \to \infty. \tag{1} \]
The Unified Extremal Types Theorem for stationary sequences says that if this condition
holds, and if normalising sequences $a_n$ and $b_n$ exist such that
\[ P\left( \frac{M_n - b_n}{a_n} \le x \right) \to H(x) \quad \text{as } n \to \infty, \]
where $H$ is non-degenerate, then $H$ is a member of the GEV family of distributions.
A measure of short-range extremal dependence, the extremal index $\theta \in (0, 1]$, is defined by
\[ \theta = \lim_{n \to \infty} P(M_{2,p_n} \le u_n \mid X_1 > u_n), \]
where $p_n = o(n)$. Essentially, $\theta$ represents the limiting probability that an exceedance of
the threshold $u_n$ is followed by a run of observations that all fall below $u_n$. Hence, values of $\theta$ close to 1
correspond to weaker dependence, while values closer to 0 correspond to stronger dependence.
For an IID process, $\theta = 1$. Providing equation (1) holds and $\theta$ exists, then
\[ H(x) = \{G(x)\}^{\theta}, \]
where G(x) is the limiting distribution under the IID assumption. More details on the extremal index can be found in Leadbetter (1983).
In the threshold exceedance approach, a cluster is defined as a set of points exceeding a
threshold u that occur within a short time period of one another. The expected number of
exceedances of the threshold $u$ per cluster is $\theta^{-1}$. Cluster maxima are independent and can
be modelled using a GP distribution or point process method. Values within the cluster are
dependent. Numerous methods have been proposed for identifying independent clusters of
extreme values. A selection of these methods can be found in Smith and Weissman (1994),
Ledford and Tawn (2003) and Ferro and Segers (2003).
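To make the idea of cluster identification concrete, the following Python sketch applies a simple runs-type rule: a cluster is closed once a chosen number of consecutive observations fall at or below the threshold. This is one possible scheme, shown under stated assumptions, and is not presented as the method of any of the papers cited above; the run length `r` and the function names are hypothetical.

```python
def runs_decluster(series, u, r):
    """Group exceedances of u into clusters. A cluster ends once r
    consecutive observations fall at or below u."""
    clusters, current, gap = [], [], 0
    for x in series:
        if x > u:
            current.append(x)
            gap = 0
        elif current:
            gap += 1
            if gap >= r:
                clusters.append(current)
                current, gap = [], 0
    if current:
        clusters.append(current)
    return clusters


def extremal_index_estimate(series, u, r):
    """Ratio of the number of clusters to the number of exceedances, i.e. the
    reciprocal of the mean cluster size; None if there are no exceedances."""
    clusters = runs_decluster(series, u, r)
    n_exceedances = sum(len(c) for c in clusters)
    return len(clusters) / n_exceedances if n_exceedances else None


if __name__ == "__main__":
    series = [1, 5, 6, 1, 1, 7, 1, 1, 1, 8, 9, 1]  # toy data
    clusters = runs_decluster(series, u=4, r=2)
    print(clusters)
    print([max(c) for c in clusters])
    print(extremal_index_estimate(series, u=4, r=2))
```

The cluster maxima returned in the second printed line are the values that would then be passed to a GP or point process fit. The estimate is sensitive to the choice of both `u` and `r`.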
2.4 Modelling extremes of non-stationary processes
Because non-stationarity is a prevalent feature of many physical processes modelled using
extreme value methods, a model framework is required that incorporates this feature in a
statistically precise manner. Non-stationarity can manifest in a number of ways, the most
common being trend and seasonal effects. Rainfall, for example, tends to exhibit a seasonal
pattern due to worsening winter weather conditions.
Traditional methods have focused on modelling non-stationary margins directly through the
model parameters. In this way, the parameters become functions of covariates, which are
easily estimated using the likelihood framework and standard model selection techniques such
as the likelihood ratio test. Constraints are imposed such that the scale parameters in both
the GEV and GP approaches are positive and that the rate parameter in the GP approach lies
in (0, 1). Conditional and marginal return levels can then be evaluated for extrapolation. For
a comprehensive overview of this procedure, see Coles (2001). Alternatives to this approach
include nonparametric fitting (Hall and Tajvidi, 2000) and preprocessing (Eastoe and Tawn,
2009), which removes the non-identical margins before applying the traditional approach.
2.4.1 Generalised additive models
The traditional approach is conveniently implemented in a linear framework. A generalised
additive modelling approach expresses the model parameters as linearly dependent on smooth
functions of covariates. Chavez-Demoulin and Davison (2005) fit a non-homogeneous Poisson
process model with parameters $\lambda(t)$, $\xi(t)$ and $\sigma(t)$ such that
\[
\begin{aligned}
\lambda(t) &= \exp\{x^T \alpha + f(t)\} \\
\xi(t) &= x^T \beta + g(t) \\
\sigma(t) &= \exp\{x^T \gamma + s(t)\},
\end{aligned}
\]
where α, β and γ are parameter vectors and f , g and s are smooth functions. Here, the
time covariate is a smooth function of t, but this can be extended to include other covariates.
Estimating the rate λ involves the use of penalised likelihood estimation. The Poisson process
log-likelihood is given by
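\[ l_\lambda = \sum_{j=1}^{n} \log \lambda(t_j) - \int_0^{t_0} \lambda(t) \, dt, \]
where $t_1, \ldots, t_n$ are the event times observed on $[0, t_0]$, which is approximated by
\[ \hat{l}_\lambda = \sum_{k=1}^{m} c_k \log \lambda(k\delta) - \delta \sum_{k=1}^{m} \lambda(k\delta), \]
where the observation period is divided into $m$ intervals of width $\delta$ and $c_k$ is the number of
events in the $k$th interval.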
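The discretised Poisson process log-likelihood can be sketched as follows. This is an illustration under the assumption that c_k counts the events falling in the k-th interval of width δ = t_0/m and that the rate is evaluated at the right end point kδ; the function name and the example event times are hypothetical.

```python
import math


def binned_poisson_loglik(event_times, rate, t0, m):
    """Approximate sum_j log rate(t_j) - integral_0^t0 rate(t) dt using
    m intervals of width delta = t0 / m, with the rate evaluated at the
    right end point k * delta of the k-th interval."""
    delta = t0 / m
    counts = [0] * m
    for t in event_times:
        k = min(int(t / delta), m - 1)  # zero-based interval index
        counts[k] += 1
    log_term = sum(c * math.log(rate(k * delta))
                   for k, c in enumerate(counts, start=1) if c > 0)
    integral_term = delta * sum(rate(k * delta) for k in range(1, m + 1))
    return log_term - integral_term


if __name__ == "__main__":
    events = [1.0, 2.5, 7.2]  # toy event times on [0, 10]
    exact = 0.1 * sum(events) - (math.exp(1.0) - 1.0) / 0.1
    for m in (10, 100, 10000):
        approx = binned_poisson_loglik(events, lambda t: math.exp(0.1 * t), 10.0, m)
        print(m, approx, exact)
```

For the log-linear rate used in the example the exact log-likelihood is available in closed form, so the printed values show the approximation improving as the intervals become narrower.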
The roughness penalised log-likelihood is defined by
\[ l^*_\lambda = l_\lambda + \rho_\lambda R_\lambda, \]
where $R_\lambda$ is a parameter roughness penalty. There are numerous ways to define this penalty,
one being
\[ R_\lambda = -\frac{1}{2} \int_a^b f''(t)^2 \, dt. \tag{2} \]
The value of roughness coefficient ρλ is selected using cross-validation to provide good predictive performance.
Similarly, the Generalised Pareto model is used to estimate the size of threshold exceedance
by maximising the roughness penalised GP likelihood
\[ l^*_{\xi,\sigma} = l_{\xi,\sigma} + \rho_\xi R_\xi + \rho_\sigma R_\sigma, \]
where $R_\xi$ and $R_\sigma$ are parameter roughness penalties for GP shape and scale respectively,
defined similarly to equation (2). Roughness coefficients $\rho_\xi$ and $\rho_\sigma$ are evaluated using
cross-validation.
Jonathan et al. (2014b) presented a similar method for estimating a threshold function φ
above which observations are deemed to be extreme. This is done using quantile regression
(Koenker, 2005). In particular, estimating φ requires minimising the quantile regression lack
of fit criterion
\[ l_\phi = \tau \sum_{i : r_i \ge 0} |r_i| + (1 - \tau) \sum_{i : r_i < 0} |r_i|, \]
for residuals $r_i = z_i - \phi_i$, $i = 1, \ldots, n$, where $\tau$ is the non-exceedance probability given any
combination of covariates. The smoothness of the quantile function is regulated by penalising lack of fit
for parameter roughness $R_\phi$, that is, by minimising the revised penalised criterion
\[ l^*_\phi = l_\phi + \rho_\phi R_\phi. \]
A spline modelling approach is used to evaluate the parameter roughness penalties; see
Chavez-Demoulin and Davison (2005) and Jonathan et al. (2014b) for more details. Spline
representations are also useful in non-stationary conditional extremes modelling based on the
approach of Heffernan and Tawn (2004) (see Section 3.4). Penalised likelihood optimisation
is performed using a backfitting algorithm (see, for example, Davison (2003)).
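The unpenalised quantile regression lack of fit criterion is straightforward to evaluate. The sketch below is a simplified illustration with a constant threshold and no roughness penalty, so it is not the spline-based procedure of the cited papers; the function name and the toy data are hypothetical. It shows that the criterion is minimised near the empirical τ-quantile.

```python
def quantile_lack_of_fit(z, phi, tau):
    """tau * sum of |r_i| over r_i >= 0 plus (1 - tau) * sum of |r_i| over
    r_i < 0, where r_i = z_i - phi_i."""
    total = 0.0
    for z_i, phi_i in zip(z, phi):
        r = z_i - phi_i
        total += tau * r if r >= 0 else (1.0 - tau) * (-r)
    return total


if __name__ == "__main__":
    z = list(range(1, 100))  # toy data: 1, 2, ..., 99
    tau = 0.9
    # Search over constant thresholds; the minimiser is the empirical tau-quantile.
    best = min(z, key=lambda c: quantile_lack_of_fit(z, [c] * len(z), tau))
    print(best)
```

A covariate-dependent threshold would replace the constant `phi` with a fitted function and add the roughness penalty to the criterion.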
2.4.2 Random effects
It is often found that not all of the observed variability in the model parameters is accounted for by traditional regression models or the generalised additive approach. One way
to account for this extra variation is to incorporate a random effect term into the model
parameters. This is particularly useful when no covariate data is available, or even as part
of an investigation into identifying possible covariates that could be of benefit to model fit.
Previous work in the extreme value literature has rarely focused on incorporating random
effects into a statistical model. Eastoe and Tawn (2010) include an annual random effect
component in the rate parameter of the Poisson process used to model the occurrence of
flood events. Under a homogeneous Poisson process, the index of dispersion
D = 1, that is
$$D = \frac{\mathrm{Var}(N)}{\mathrm{E}(N)} = 1,$$
where N is the number of events. However, it has been seen that major flood events are
overdispersed, that is, D > 1, and consequently, extra variation between years cannot be
captured by the homogeneous Poisson process model. This is due to the lack of explanatory
variables in the model, and thus, the model is incapable of capturing the non-identical margins in the data.
Let Ni be the number of events in year i. Then consider the following hierarchical model:
Ni ∼ Poisson(λγi );
γi ∼ Gamma(1/α, 1/α),
where λ > 0, α > 0 and γi are independent and identically distributed. In this model,
E(Ni ) = λ but Var(Ni ) = λ(1 + λα), so the index of dispersion is
D = 1 + λα.
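The overdispersion result can be checked numerically. The following is a minimal sketch, assuming illustrative values of λ and α chosen by the user; the function names are hypothetical and the Poisson sampler is a simple stand-in built from exponential inter-arrival times.

```python
import random


def poisson_draw(mean, rng):
    """Draw a Poisson count by counting unit-rate exponential arrivals in [0, mean]."""
    count = 0
    total = rng.expovariate(1.0)
    while total <= mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def simulate_counts(lam, alpha, n_years, rng):
    """Simulate N_i ~ Poisson(lam * gamma_i) with gamma_i ~ Gamma(1/alpha, 1/alpha).

    random.gammavariate takes (shape, scale), so a rate of 1/alpha
    corresponds to a scale of alpha; gamma_i then has mean 1 and variance alpha.
    """
    counts = []
    for _ in range(n_years):
        gamma_i = rng.gammavariate(1.0 / alpha, alpha)
        counts.append(poisson_draw(lam * gamma_i, rng))
    return counts


def dispersion_index(counts):
    """Sample variance divided by sample mean."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return variance / mean
```

For example, with λ = 5 and α = 0.4 the sample index of dispersion from a long simulated record should be close to 1 + λα = 3.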
Assuming the event peaks are Generalised Pareto distributed, the annual maxima Mi have an
extended Generalised Logistic distribution, see Eastoe and Tawn (2010) for further details.
The model is extended to include within-year variability and covariates:
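$$N_i \sim \text{Poisson}(\lambda_i),$$
where
$$\lambda_i = \sum_{j=1}^{365} \lambda_{ij}, \qquad \lambda_{ij} = \gamma_i \, g(\boldsymbol{\beta} \mathbf{x}_{ij}), \qquad \gamma_i \sim \text{Gamma}(1/\alpha, 1/\alpha),$$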
where Ni is the number of counts in year i, λij is the probability of there being a peak
event on day j in year i, g(βxij ) is a function of covariates and γi is the random effect. The
parameter α quantifies any extra annual variability in the rate which is not explained by the
regression part of the model. Thus, the γi can be interpreted as covariates, on the annual
scale, that are unobserved. The model can be further extended to include year-to-year dependence in the random effects. This, in turn, introduces dependence to the distributions of
the counts Ni and the annual maxima Mi . See Eastoe and Tawn (2010) for more information on model extensions and a detailed overview of the MCMC procedure used for inference.
The random effect component of this model can be interpreted as approximating additional,
unobserved covariates. This is useful in that estimation of these random effects can be used
in the identification of suitable covariates for the model. This can be helpful for learning
about weather extremes in the sense that intense or fast-moving storms may be influenced
by an unobserved variable, whose structure can be analysed and matched with a climate
process whose properties are known and similar to the estimate of the random effect. A major
component of future research will be to uncover ways to extend the concept of random effects
to univariate and multivariate modelling of extremal behaviour in extratropical cyclones (see
Section 6).
2.5 Simulation study
In this section, a study is presented which illustrates the application of the threshold-based
point process model introduced in Section 2.2 to simulated data. An approach for simulating a Poisson process is introduced. This is followed by an overview of the model fitting
procedure, implemented using software developed by the author in R and verified using
the simulation procedure. In addition, an example is presented to illustrate how likelihood
methods can be used to test for nonstationarity in the data using the traditional approach.
2.5.1 Simulating a Poisson process
Consider a two-dimensional non-homogeneous Poisson process with intensity λ(t, x) on the
set A = [0, τ ] × (u, ∞), for some finite u. For a non-stationary limiting point process, the
intensity function λ is of the form
$$\lambda(t, x) = \frac{1}{\sigma_t} \left[ 1 + \xi_t \left( \frac{x - \mu_t}{\sigma_t} \right) \right]^{-1/\xi_t - 1}, \tag{3}$$
for covariate-dependent parameters θ = (µt , σt , ξt ).
Let N (A) be the number of points of the Poisson process in the set A. A key property of a
Poisson process is that
N (A) ∼ Poisson(Λ(A)),
where Λ(A) is the integrated intensity function
$$\Lambda(A) = \int_0^\tau \int_u^\infty \lambda(t, x) \, dx \, dt. \tag{4}$$
The density of points in the set A at the point (t, x) is defined as
$$f(t, x) = \frac{\lambda(t, x)}{\Lambda(A)}, \quad \text{for } t \in [0, \tau], \ x \in [u, \infty).$$
Simulating a Poisson process on A corresponds to simulating N (A) = nu points from this
bivariate density.
Recall from probability theory that to simulate from this bivariate distribution, t is simulated
from the marginal f (t), which can be expressed as
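$$f(t) = \int_u^\infty f(t, x) \, dx = \frac{\left\{ 1 + \xi_t \left( \frac{u - \mu_t}{\sigma_t} \right) \right\}^{-1/\xi_t}}{\int_0^\tau \left\{ 1 + \xi_s \left( \frac{u - \mu_s}{\sigma_s} \right) \right\}^{-1/\xi_s} ds}.$$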
A probability integral transform can be used to achieve this. Defining u ∼ U (0, 1), the
following equation holds:
$$u = F(t) = \int_0^t f(s) \, ds. \tag{5}$$
Simulations of t can be found by solving for t in equation (5) using a standard equation
solver algorithm. Then for realisation T = t, simulate x from the conditional X|T = t,
which is a GP distribution. The set of vectors {(ti , xi ) : i = 1, . . . , nu } then represents a
two-dimensional Poisson process, with parameters depending on covariates.
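The simulation steps above can be sketched as follows. This is an illustration only and not the author's R implementation: it assumes a linear trend µt = µ0 + µ1 t with constant σ and ξ ≠ 0, assumes parameter values for which 1 + ξ(u − µt)/σ stays positive on [0, τ], and replaces the equation solver for (5) with inversion of F on a fine grid. All function names are hypothetical.

```python
import bisect
import random


def exceedance_rate(t, u, mu0, mu1, sigma, xi):
    """Intensity (3) integrated over x in (u, inf): {1 + xi (u - mu_t) / sigma}^(-1/xi)."""
    mu_t = mu0 + mu1 * t
    return (1.0 + xi * (u - mu_t) / sigma) ** (-1.0 / xi)


def poisson_draw(mean, rng):
    """Draw a Poisson count by counting unit-rate exponential arrivals in [0, mean]."""
    count = 0
    total = rng.expovariate(1.0)
    while total <= mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def simulate_process(u, tau, mu0, mu1, sigma, xi, rng, grid_size=2000):
    """Return simulated points (t_i, x_i) and the grid approximation of Lambda(A)."""
    dt = tau / grid_size
    mids = [(k + 0.5) * dt for k in range(grid_size)]
    rates = [exceedance_rate(s, u, mu0, mu1, sigma, xi) for s in mids]
    big_lambda = sum(rates) * dt          # midpoint rule for equation (4)

    # Cumulative distribution F(t) of the marginal density f(t) on the grid
    cdf, running = [], 0.0
    for r in rates:
        running += r * dt / big_lambda
        cdf.append(running)

    n_u = poisson_draw(big_lambda, rng)   # N(A) ~ Poisson(Lambda(A))
    points = []
    for _ in range(n_u):
        # Probability integral transform: solve F(t) = p on the grid
        p = rng.random()
        k = min(bisect.bisect_left(cdf, p), grid_size - 1)
        t = mids[k]
        # X | T = t is GP above u with scale sigma + xi (u - mu_t) and shape xi
        scale = sigma + xi * (u - (mu0 + mu1 * t))
        x = u + scale / xi * ((1.0 - rng.random()) ** (-xi) - 1.0)
        points.append((t, x))
    return points, big_lambda
```

With an increasing trend in µt, the simulated time points concentrate towards the end of the interval, since the rate of threshold exceedances grows with t.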
2.5.2 Model fitting
Standard maximum likelihood techniques are used to compute parameter estimates for θ =
(µ, σ, ξ). The likelihood function for a Poisson process is defined as
$$L(\boldsymbol{\theta}) = \exp\{-\Lambda(A)\} \prod_{i=1}^{n} \lambda(t_i, x_i),$$
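where λ(t, x) and Λ(A) are defined by equations (3) and (4) respectively. Hence, the log-likelihood to be maximised can be expressed as
$$l(\boldsymbol{\theta}) = \sum_{i=1}^{n} \log \left\{ \frac{1}{\sigma_{t_i}} \left[ 1 + \xi_{t_i} \left( \frac{x_i - \mu_{t_i}}{\sigma_{t_i}} \right) \right]^{-1/\xi_{t_i} - 1} \right\} - \int_0^\tau \left[ 1 + \xi_t \left( \frac{u - \mu_t}{\sigma_t} \right) \right]^{-1/\xi_t} dt.$$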
Numerical techniques are required to solve this optimisation problem. Difficulties arise, however, in the numerical estimation of the integral component of the log-likelihood function. A
common numerical method used to compute this integral is a simple Monte Carlo estimator,
specifically
$$\hat{I} = \frac{1}{n} \sum_{i=1}^{n} \left[ 1 + \xi_{t_i} \left( \frac{u - \mu_{t_i}}{\sigma_{t_i}} \right) \right]^{-1/\xi_{t_i}},$$
where ti is the time of exceedance i and n is the number of exceedances. This is sufficient for
the case of stationarity in the data. The absence of covariates in the stationary case means
that time points should be uniformly distributed in the data (see Figure 3), and hence, the
Monte Carlo estimator should be a valid approximation of the integral.
Now, consider the case of a non-stationary trend in the data. For simplicity, assume a linear
time trend in µt . Simulating such a Poisson process gives the plots shown in Figure 4.
Figure 3: A scatterplot of a simulated stationary Poisson process with parameters θ =
(100, 15, 0.05) alongside a histogram illustrating the uniformity of the time points.
Figure 4: A scatterplot of a simulated non-stationary Poisson process with parameters σ =
15, ξ = 0.05 as before, but with µt = 100 + 60t, alongside a histogram illustrating the clear
non-uniformity of the time points.
An adjustment must be made in the formulation of the integral to account for the
non-stationarity component, which in this case causes the density of observations to increase
with respect to time. The assumption of uniformly distributed time points is therefore
no longer valid. The Monte Carlo estimator of the integral is therefore evaluated at user-specified uniform intervals over the time period, which accounts for the trend component of
µt in the correct way. In particular, the new estimator Ĩ of the integral can be expressed as
$$\tilde{I} = \frac{1}{m} \sum_{i=1}^{m} \left[ 1 + \xi_{s_i} \left( \frac{u - \mu_{s_i}}{\sigma_{s_i}} \right) \right]^{-1/\xi_{s_i}},$$
where si represents the ith component of the uniform grid in (0, τ ) and m represents the
number of intervals specified on the grid.
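A minimal sketch of the resulting log-likelihood is given below, assuming a linear trend µt = µ0 + µ1 t with constant σ and ξ ≠ 0. It is not the author's R code. The grid estimator is scaled by τ so that it approximates the integral in (4) for any interval length, and the function names and the default m = 1000 are hypothetical choices.

```python
import math


def integrated_intensity(u, tau, mu0, mu1, sigma, xi, m=1000):
    """Approximate Lambda(A) on a uniform grid s_1, ..., s_m of midpoints in (0, tau)."""
    total = 0.0
    for i in range(m):
        s = (i + 0.5) * tau / m
        total += (1.0 + xi * (u - (mu0 + mu1 * s)) / sigma) ** (-1.0 / xi)
    return tau * total / m


def log_likelihood(params, points, u, tau, m=1000):
    """Poisson process log-likelihood for points (t_i, x_i) above the threshold u."""
    mu0, mu1, sigma, xi = params
    if sigma <= 0.0:
        return -math.inf
    # mu_t is linear in t, so positivity at both endpoints covers the whole grid
    for s in (0.0, tau):
        if 1.0 + xi * (u - (mu0 + mu1 * s)) / sigma <= 0.0:
            return -math.inf
    total = 0.0
    for t, x in points:
        z = 1.0 + xi * (x - (mu0 + mu1 * t)) / sigma
        if z <= 0.0:
            return -math.inf
        # log of the intensity (3) evaluated at (t_i, x_i)
        total += -math.log(sigma) - (1.0 / xi + 1.0) * math.log(z)
    return total - integrated_intensity(u, tau, mu0, mu1, sigma, xi, m)
```

The negative of `log_likelihood` could then be passed to a general-purpose numerical optimiser; returning minus infinity outside the parameter space is one simple way to keep the search within valid values.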
Because the Poisson process of interest is generated by the user, this provides an opportunity
to test the logic of the model fitting arguments outlined in this section. A Poisson process
with parameters θ = (µ0 + µ1 t, σ, ξ) is generated with µ0 = 100, µ1 = 30, σ = 15, ξ = 0.05.
Maximum likelihood estimation is performed on the model parameters using the arguments
outlined above; the resulting parameter estimates and standard errors are given in Table 1.

                 µ0               µ1              σ               ξ
Estimate         86.7553          30.1935         21.225          0.1043
Standard Error   12.8086          1.7231          4.6236          0.0323
95% CI           (61.55,111.86)   (26.82,33.57)   (12.16,30.29)   (0.04,0.17)

Table 1: Table of parameter estimates and standard errors for simulated Poisson process
model

The four true parameter values fall within the 95% confidence intervals of their corresponding
estimates, calculated using the delta method, suggesting that these estimates fall within a
reasonable margin of error.
Developing this point process methodology is important when analysing weather extremes.
As discussed in Section 2.2, the point process model has numerous advantages over both the
block maxima and GP threshold approaches. With weather extremes, using this approach
and incorporating more data into the extreme value model increases efficiency and reduces
variability in the parameter estimates. In particular, reduced variability is an advantage
when modelling extremes of physical processes that are highly variable by their very nature.
3 Bivariate extreme value theory
In many physical applications, the extremal behaviour of one variable may not be sufficient in representing the complexity of the underlying processes. It is therefore prudent to
consider the joint extremal properties of multiple variables in order to better approximate
this complexity. When considering the case of multivariate extreme value theory to model
the behaviour of natural systems, it is important to account for spatial and temporal dependence. For example, when analysing rainfall data from two nearby locations, intuition
suggests that extreme rainfall on one site could be dependent on extreme rainfall on the
other. This dependence may also have a temporal component in the case where the same
weather system impacts on two locations at two different time points. Dependence must
also be modelled when considering the extremes of two variables. In weather data, rainfall
and wind speed are often affected by the same storm event. By introducing a multivariate
dependence structure to the modelling procedure, there is scope to produce a model that is
a more accurate representation of the physical process than a univariate approach can provide. It is expected that this will manifest in an incorporation of dependence over space, but
also perhaps dependence between different physical features of extratropical cyclones (see
Section 4) that can be captured in Met Office data. In this section, the concept of bivariate
extreme value theory is introduced, which can be easily extended to the multivariate setting.
3.1 Measures of dependence
Consider random variables (X, Y ) whose joint distribution function is defined by
F (x, y) = P(X ≤ x, Y ≤ y).
Since this function contains a complete description of dependence between X and Y , a
common method of exploring this further is to remove the effect of the marginal distributions by transforming the variables onto common margins. The copula function describing
dependence between X and Y is given by the function C such that
F (x, y) = C{FX (x), FY (y)},
where FX (x) = F (x, ∞) and FY (y) = F (∞, y) denote the marginal distributions of X and
Y . Analysis of dependence concepts can sometimes be more mathematically convenient on
certain marginal scales. Scales frequently used for transformation include Uniform, Gumbel,
Fréchet and Laplace marginal distributions (see Figure 5). In each case, the copula allows
an analysis of dependence between the two variables.
Figure 5: A bivariate normal distribution with ρ = 0.6, transformed to Uniform, Fréchet
and Gumbel margins.
The study of copulas leads to the formulation of a summary measure of dependence. Assuming a common marginal distribution, this is given by the quantity χ where
$$\chi = \lim_{z \to z^*} P(Y > z \mid X > z), \tag{6}$$
where z ∗ is the upper end point of the distribution. Intuitively, this can be interpreted as the
probability of one variable being extreme given that the other is extreme. This leads to the
concept of asymptotic dependence, a property where the realisations of the tail components
of a random vector occur simultaneously with a high probability. When this scenario is
unlikely, the variables are said to be asymptotically independent. To explore how this idea
relates to χ, consider the following. By a probability integral transformation, (X, Y ) can be
transformed to standard Uniform margins (U, V ) and equation (6) can be rewritten as
$$\chi = \lim_{u \to 1} P(V > u \mid U > u).$$
Then using the laws of conditional probability and the inclusion-exclusion formula, the following result holds:
$$P(V > u \mid U > u) \approx 2 - \frac{\log C(u, u)}{\log u}. \tag{7}$$
Hence, defining
$$\chi(u) = 2 - \frac{\log P(U < u, V < u)}{\log P(U < u)}, \tag{8}$$
it follows that
$$\chi = \lim_{u \to 1} \chi(u).$$
In practice, analysis will often lead to estimates of χ = 0, suggesting asymptotic independence. This result merely captures the behaviour of variables that occur simultaneously and
hence there is a need for a second measure that summarises the degree of finite dependence
under asymptotic independence. Defining the joint survivor function as
F̃ (x, y) = P(X > x, Y > y),
the same reasoning as in equations (7) and (8) can be applied. Define
$$\tilde{\chi}(u) = \frac{2 \log(1 - u)}{\log \tilde{C}(u, u)} - 1 \quad \text{for } 0 \leq u \leq 1,$$
where −1 ≤ χ̃(u) ≤ 1. Then
$$\tilde{\chi} = \lim_{u \to 1} \tilde{\chi}(u).$$
For a complete summary of extremal dependence, the pair (χ, χ̃) is required. The combination (χ > 0, χ̃ = 1) corresponds to asymptotic dependence, where the value of χ determines
the strength of the dependence. Asymptotic independence, in contrast, is given by the combination (χ = 0, χ̃ < 1), where χ̃ signifies the strength of finite dependence within this class.
Having defined measures of extremal dependence based on limiting values of dependence
functions, it is necessary to relate these quantities to the bivariate extreme value theory and
modelling procedure. Ledford and Tawn (1996) formulated a flexible model that provided
a smooth link between the bounding cases of perfect dependence and perfect independence.
Consider a pair of random variables (X, Y ), with unit Fréchet margins. The joint survivor
function of (X, Y ) satisfies the asymptotic condition
$$P(X > z, Y > z) \approx L(z) z^{-1/\eta}, \quad \text{for large } z, \tag{9}$$
where L(z) is a slowly varying function as z → ∞. The parameter η ∈ (0, 1] is the coefficient
of tail dependence. If η = 1 and L(z) → c as z → ∞, with 0 ≤ c ≤ 1, then (χ = c, χ̃ = 1),
and the variables are asymptotically dependent of degree c. If η < 1, then it can be shown
that χ̃ = 2η − 1 and χ = 0, and thus, (X, Y ) are asymptotically independent. The parameter
η has been identified as a pivotal parameter in the characterisation of extremal dependence.
Inference on η can be made by defining T = min(X, Y ) such that
$$P(T > z) = P(X > z, Y > z) \sim L(z) z^{-1/\eta}, \quad z \to \infty.$$
Thus η is the shape parameter of the variable T , and so standard univariate techniques can be used
to estimate η. For example, Ledford and Tawn (1996) use a point process model to analyse
the extremal behaviour of the structure variable T . These techniques are easily extended to
the multivariate case where the number of variables is greater than 2.
3.1.1 Testing dependence of simulated data
This numerical investigation aims to analyse the extremal dependence properties of simulated data through estimation of the parameter η. Following the procedure for estimating η
outlined in Section 3.1, analysis is performed on data simulated from the bivariate normal
distribution (BVN) and the bivariate logistic distribution (BVE).
Bivariate Normal distribution
The bivariate normal distribution has the form
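$$\begin{pmatrix} X \\ Y \end{pmatrix} \sim \text{BVN} \left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right),$$
where ρ is a dependence measure between two variables X and Y . A value of ρ = 0
indicates independence, while values of ρ = −1 and ρ = 1 correspond to perfect negative and positive
dependence respectively. Data can be simulated from a bivariate normal distribution by
simulating X̃ ∼ N (0, 1) and Ỹ ∼ N (0, 1) independently, then setting:
$$X = \tilde{X}, \qquad Y = \rho \tilde{X} + \tilde{Y} (1 - \rho^2)^{1/2}.$$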
First, 1000 samples of size 10000 are simulated. After transformation to Fréchet margins, η
is estimated at a 90% threshold and averaged over each sample. It can be shown that the
true value of η for a bivariate normal distribution is (1 + ρ)/2, which in this case means
that η = 0.9. The mean value of η from the estimation procedure is η̂ = 0.8549, with 95%
confidence bounds of [0.735, 0.965]. This strongly indicates that η < 1, corresponding to
asymptotic independence.
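A single replicate of this experiment can be sketched as follows. The estimator used here is the Hill estimator applied to T = min(X, Y), which is swapped in for simplicity and is not necessarily the estimation method used in this report. The value ρ = 0.8 is the one implied by η = (1 + ρ)/2 = 0.9, the rank transformation to Fréchet margins is an assumption, and the function names are hypothetical.

```python
import math
import random


def simulate_bvn(n, rho, rng):
    """Simulate n pairs from a standard bivariate normal with correlation rho."""
    pairs = []
    for _ in range(n):
        x_tilde = rng.gauss(0.0, 1.0)
        y_tilde = rng.gauss(0.0, 1.0)
        pairs.append((x_tilde, rho * x_tilde + y_tilde * math.sqrt(1.0 - rho ** 2)))
    return pairs


def to_frechet(values):
    """Rank transform to approximately unit Frechet margins: -1 / log(F(x))."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for r, i in enumerate(order):
        out[i] = -1.0 / math.log((r + 1) / (n + 1))
    return out


def hill_eta(pairs, threshold_quantile=0.9):
    """Hill estimate of eta from exceedances of T = min(X, Y) above a high quantile."""
    xs = to_frechet([p[0] for p in pairs])
    ys = to_frechet([p[1] for p in pairs])
    t = sorted(min(a, b) for a, b in zip(xs, ys))
    k0 = int(threshold_quantile * len(t))
    threshold = t[k0]
    exceedances = t[k0 + 1:]
    eta = sum(math.log(v / threshold) for v in exceedances) / len(exceedances)
    return min(eta, 1.0)  # eta is constrained to (0, 1]
```

Repeating `hill_eta(simulate_bvn(10000, 0.8, rng))` over many samples and averaging would mimic the structure of the study described above, although the numerical values need not match those reported, since the estimator differs.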
Bivariate Logistic distribution
Similarly, 1000 samples of size 10000 are simulated from a bivariate logistic distribution,
with distribution function
$$F(x, y) = \exp \left\{ -\left( x^{-1/\alpha} + y^{-1/\alpha} \right)^{\alpha} \right\},$$
where x, y > 0 and α ∈ (0, 1]. Independence corresponds to α → 1 and perfect dependence
corresponds to α → 0. For the purpose of this simulation, a value of α = 0.75 is selected.
As in the previous example, the estimate of η is evaluated at a 90% threshold and averaged
over each sample. The estimate of η from this procedure is η̂ = 0.963, with 95% confidence
bounds of [0.872, 1], which contain the true value of η for the bivariate logistic
distribution, η = 1. All bivariate extreme value distributions, like the logistic distribution,
are asymptotically dependent (other than in the case of exact independence), and this estimation procedure supports this result.
3.2 Componentwise block maxima
In the bivariate case, the componentwise block maxima approach is suitable in the case
where only the annual maximum data are available from two locations, for example. Hence,
this method is the bivariate extension to the approach introduced in Section 2.1, though
extensions to more than two variables are possible. Consider the maxima of a pair of random
variables (X, Y ), and define
MX,n = max{X1 , . . . , Xn } and MY,n = max{Y1 , . . . , Yn },
with Mn = (MX,n , MY,n ). Assume (X, Y ) have Fréchet marginal distributions. The limiting
distribution of the normalised vector Mn /n is non-degenerate, that is
$$P(M_{X,n}/n \leq x, M_{Y,n}/n \leq y) = \{F(nx, ny)\}^n \to G(x, y) \quad \text{as } n \to \infty,$$
where G has the form
$$G(x, y) = \exp\{-V(x, y)\},$$
where
$$V(x, y) = \int_0^1 \max \left( \frac{w}{x}, \frac{1 - w}{y} \right) 2 \, dH(w)$$
and H is a distribution function on [0, 1] satisfying the mean constraint
$$\int_0^1 w \, dH(w) = 1/2.$$
The family of distributions that arise from this limiting result is termed the class of bivariate
extreme value distributions. Although this result provides a complete summary of bivariate
extreme value distributions, the class of possible limits is wide. One method is to use
parametric sub-families of distributions for H, leading to sub-families of distributions for G.
One standard class is the logistic family:
$$G(x, y) = \exp \left\{ -\left( x^{-1/\alpha} + y^{-1/\alpha} \right)^{\alpha} \right\}, \quad x > 0, \ y > 0,$$
where 0 < α ≤ 1. V is said to be homogeneous of order −1 as for any constant a > 0,
$$V(a^{-1} x, a^{-1} y) = a V(x, y).$$
Using the homogeneity property, it can be shown that for the class of bivariate extreme value
distributions
χ = 2 − V (1, 1).
For the logistic family, χ = 2 − 2^α. This gives the parameter α some interpretation as a
measure of dependence. When α = 1, this corresponds to χ = 0, asymptotic independence.
The quantity V (1, 1) can also be used to define another measure relating to extremal dependence. Assuming common Fréchet marginal distributions, the extremal coefficient θ is
defined such that
$$P(X < x, Y < x) = P(X < x)^{\theta}, \quad 1 \leq \theta \leq 2.$$
Consider a pair of bivariate logistic random variables with Fréchet margins. Then
$$P(X \leq u, Y \leq u) = \exp\{-V(1, 1)/u\} = [\exp\{-1/u\}]^{V(1,1)} = P(X \leq u)^{V(1,1)}.$$
Thus, θ = V (1, 1), with θ = 1 corresponding to perfect dependence and θ = 2 corresponding
to independence.
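For the logistic family these quantities have closed forms, which the following sketch evaluates. The function names are hypothetical, and the code simply restates V(x, y) = (x^(−1/α) + y^(−1/α))^α together with χ = 2 − V(1, 1) and θ = V(1, 1).

```python
def logistic_v(x, y, alpha):
    """Exponent measure V(x, y) of the bivariate logistic family, 0 < alpha <= 1."""
    return (x ** (-1.0 / alpha) + y ** (-1.0 / alpha)) ** alpha


def chi_logistic(alpha):
    """chi = 2 - V(1, 1) = 2 - 2^alpha."""
    return 2.0 - logistic_v(1.0, 1.0, alpha)


def extremal_coefficient(alpha):
    """theta = V(1, 1) = 2^alpha, between 1 (perfect dependence) and 2 (independence)."""
    return logistic_v(1.0, 1.0, alpha)
```

At α = 1 this gives χ = 0 and θ = 2, the independence case, and both measures move towards perfect dependence (χ = 1, θ = 1) as α decreases towards 0.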
Like in the univariate case, the theoretical results for componentwise block maxima are
applied to a bivariate extreme value modelling procedure, where the asymptotic arguments
are assumed to be exact for a large number of observations n. This allows estimation of model
parameters χ and χ̃. Because χ(u) is constant for any member of this family of distributions,
evidence of non-constancy is indicative of a lack of model fit. Likelihood-based methods are
commonly used for parameter estimation, while nonparametric methods are also available.
Like in the univariate case, the componentwise block maxima approach suffers from a failure
to capture the full extremal behaviour of the process by only considering the maxima. By
considering a threshold-based model, improvements in efficiency and flexibility can be gained.
3.3 Threshold methods
There are substantial efficiency gains when considering more general point process characterisations of extremal behaviour. Let (X1 , Y1 ), (X2 , Y2 ), . . . be an independent series of
realisations of the random vector (X, Y ) with standard Fréchet marginal distributions. Define a sequence of point processes Pn such that
$$P_n = \left\{ \left( \frac{X_i}{n}, \frac{Y_i}{n} \right) : i = 1, \ldots, n \right\}.$$
As n → ∞, Pn → P , where P is a Poisson process. The intensity function λ of the limiting
process is stated upon transformation of the coordinate system to ‘radial’ and ‘angular’
components (R, W )
$$\lambda(dr \times dw) = \frac{dr}{r^2} \times 2 \, dH(w),$$
where H is the dependence measure of the associated componentwise block maxima vector.
The joint tail probability of events follows immediately from this result. If A ⊂ B, where
B = {(x, y) : x > x0 , y > y0 } for large x0 and y0 , then
$$P\{(X, Y) \in tA\} \approx \frac{1}{t} P\{(X, Y) \in A\}.$$
If (X, Y ) have Gumbel margins, obtained by taking the logarithm of Fréchet margins, then
this translates to:
P{(X, Y ) ∈ t + A} ≈ exp(−t)P{(X, Y ) ∈ A}.
Difficulties arise in this formulation due to the degeneracy of H in the case of asymptotic
independence, which can be avoided by extending the limit process to account for the
degree of dependence within the class of asymptotically independent distributions. Following
Ledford and Tawn (1997), define another sequence of point processes P̃n such that
$$\tilde{P}_n = \left\{ \left( \frac{X_i}{n^{\eta}}, \frac{Y_i}{n^{\eta}} \right) : i = 1, \ldots, n \right\},$$
where η is the coefficient of tail dependence corresponding to (X, Y ) (see equation (9)).
P̃n → P̃ as n → ∞, with intensity function defined on the transformed coordinate system as
$$\tilde{\lambda}(dr \times dw) = \frac{dr}{r^{(1+\eta)/\eta}} \times d\tilde{H}(w).$$
In this case, for A ⊂ C, where C = {(x, y) : x + y > r0 , w0 ≤ x/(x + y) ≤ 1 − w0 }, for large
r0 and small w0 > 0, it follows that
$$P\{(X, Y) \in tA\} \approx \frac{1}{t^{1/\eta}} P\{(X, Y) \in A\}. \tag{10}$$
In Gumbel margins, similarly
$$P\{(X, Y) \in t + A\} \approx \exp(-t/\eta) \, P\{(X, Y) \in A\}. \tag{11}$$
Inference for the limiting Poisson process model as a reasonable approximation for the distribution of observations above a high threshold can proceed in a number of ways. Parametric
estimation can proceed with the formulation of a likelihood defined relative to the corresponding intensity function. Alternative nonparametric procedures have also been proposed
(de Haan and de Ronde, 1998). The aim of these methods is to map nonparametric estimates of probabilities within a set A of observed data to sets t + A, tA that may contain no
observed data. In particular, this extrapolation, using equations (10) and (11), allows the
formulation of probabilities of extreme events.
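The extrapolation step in equations (10) and (11) amounts to rescaling an estimated probability, as in the sketch below. The function name and the `margins` argument are hypothetical, and the probability of the set A and the value of η are assumed to have been estimated beforehand.

```python
import math


def extrapolated_probability(p_a, t, eta, margins="frechet"):
    """Scale an estimate of P{(X, Y) in A} out to the set tA or t + A.

    On Frechet margins the set is tA and the factor is t^(-1/eta), equation (10);
    on Gumbel margins the set is t + A and the factor is exp(-t/eta), equation (11).
    """
    if margins == "frechet":
        return p_a * t ** (-1.0 / eta)
    if margins == "gumbel":
        return p_a * math.exp(-t / eta)
    raise ValueError("margins must be 'frechet' or 'gumbel'")
```

For instance, with an empirical probability of 0.01 for A and t = 10 on Fréchet margins, the extrapolated probability is 0.001 when η = 1 but 0.0001 when η = 0.5, which shows how strongly the assumed dependence class affects estimates of joint extremes.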
3.4 Conditional approach
A key issue with the standard multivariate extreme value approach concerns the estimation
of probabilities for sets A that are not simultaneously extreme in each component, that is, the
random variables of interest are asymptotically independent. This is evident as the empirical
estimate of the mapped probability is likely to be 0 since the mapped data are unlikely to fall
in the sets t + A, tA. The conditional approach of Heffernan and Tawn (2004) is applicable
whether the variables are asymptotically independent or asymptotically dependent. For the
purpose of this report, the bivariate case is considered.
Consider a pair of random variables (X, Y ) transformed onto Gumbel margins and the limiting behaviour of the conditional distribution of Y given X, that is, P(Y ≤ y|X = x).
Assume that there exist normalising functions a(x) and b(x), both R+ → R, such that,
for all fixed z and for any sequence of x-values such that x → ∞,
$$\lim_{x \to \infty} P(Z \leq z \mid X = x) = G(z), \tag{12}$$
for
$$Z = \frac{Y - a(x)}{b(x)},$$
where the limit distribution G is non-degenerate.
Using this assumption, the result follows that, conditionally on X > u, as u → ∞ the
variables X − u and Z are independent in the limit. To illustrate this, let x = u + y with
y > 0 fixed; then
$$P(Z \leq z, X - u = y \mid X > u) = P(Y \leq a(u + y) + b(u + y) z \mid X = u + y) \, \frac{f_X(u + y)}{P(X > u)} \to G(z) \exp(-y) \quad \text{as } u \to \infty.$$
The limiting marginal distributions of X − u and Z are exponential and G respectively.
Like in the univariate case, the normalising functions a(x) and b(x) must be defined in terms
of characteristics of the conditional distribution of Y |X. The limit distribution G is unique
up to type, so the normalising functions are identified up to a constant. The following result
from Heffernan and Tawn (2004) describes how these functions can be derived analytically.
Suppose that a pair of random variables (X, Y ) has an absolutely continuous joint density.
If the functions a(x) and b(x) > 0 satisfy the property (12), then these functions satisfy the
following properties up to type:
$$\lim_{x \to \infty} F\{a(x) \mid x\} = p,$$
where p is a constant in the range (0, 1), and
$$b(x) = h\{a(x) \mid x\}^{-1},$$
where F is the conditional distribution and h is defined as the conditional hazard function. A
detailed proof is given in Heffernan and Tawn (2004), along with some theoretical examples.
In Keef et al. (2013), a number of problems with the Heffernan and Tawn (2004) approach
are identified that have been found to limit the utility of the method. Complications arise
with modelling variables with some components that are positively associated and others
negatively associated. There are also issues with parameter identifiability and inferences
that are inconsistent with the marginal distributions. In order to overcome the first problem,
the variables of interest are transformed onto Laplace margins rather than Gumbel margins,
such that:
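$$X_L = \begin{cases} \log\{2 F_X(X)\} & \text{for } X < F_X^{-1}(0.5); \\ -\log\{2[1 - F_X(X)]\} & \text{for } X \geq F_X^{-1}(0.5), \end{cases}$$
and
$$Y_L = \begin{cases} \log\{2 F_Y(Y)\} & \text{for } Y < F_Y^{-1}(0.5); \\ -\log\{2[1 - F_Y(Y)]\} & \text{for } Y \geq F_Y^{-1}(0.5). \end{cases}$$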
Then (XL , YL ) has Laplace marginal distributions with
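$$P(X_L < x) = \begin{cases} \exp(x)/2 & \text{if } x < 0; \\ 1 - \exp(-x)/2 & \text{if } x \geq 0, \end{cases}$$
and
$$P(Y_L < y) = \begin{cases} \exp(y)/2 & \text{if } y < 0; \\ 1 - \exp(-y)/2 & \text{if } y \geq 0. \end{cases}$$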
A Laplace transformation captures the exponential tails of a Gumbel distribution while also
having a symmetry property that allows for negatively associated variables to be incorporated
into the model parsimoniously. Using these transformations motivates the use of a single class
of normalising functions of the form:
$$a(x) = \alpha x \quad \text{and} \quad b(x) = x^{\beta},$$
with (α, β) ∈ [−1, 1] × (−∞, 1). This is a unified class representing both positively and
negatively associated variables. Values of α = 1, β = 0 indicate asymptotic dependence,
otherwise variables are asymptotically independent. For asymptotically independent distributions, values of 0 < α ≤ 1 correspond to positive extremal dependence, with −1 ≤ α < 0
corresponding to negative extremal dependence.
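The Laplace transformation and the normalisation Z = (Y − a(x))/b(x) can be sketched as below. The marginal probabilities F_X(X) and F_Y(Y) are assumed to be available already, for example from ranks or a fitted marginal model; the function names are hypothetical and the code is not taken from Keef et al. (2013).

```python
import math


def to_laplace(p):
    """Map a marginal probability p = F(X), 0 < p < 1, onto the Laplace scale."""
    if p < 0.5:
        return math.log(2.0 * p)
    return -math.log(2.0 * (1.0 - p))


def laplace_cdf(x):
    """Distribution function of the standard Laplace distribution."""
    if x < 0.0:
        return 0.5 * math.exp(x)
    return 1.0 - 0.5 * math.exp(-x)


def normalised_residual(x, y, alpha, beta):
    """Z = (y - alpha x) / x^beta for a conditioning value x > 0 on the Laplace scale."""
    return (y - alpha * x) / x ** beta
```

Applying `laplace_cdf` to the output of `to_laplace` recovers the original probability, which confirms that the two piecewise definitions above are inverse to one another.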
In Jonathan et al. (2014a), an extension to this model is introduced facilitating general non-stationary conditional extremes inference using spline representations of model parameters
with respect to covariates, similar to the approach described in Section 2.4.1.
4 Extratropical cyclones
In recent years, the occurrence of extratropical cyclones in the North Atlantic Ocean has
caused incidents of severe weather over Western Europe. For example, in October 2013, Cyclone Christian caused 18 fatalities, widespread damage and mass destruction across Western
Europe. This catastrophic event was estimated to have cost the insurance industry an aggregated total of
€1.094 billion in the countries affected. While the Met Office was successful
in forecasting this extreme weather event, there remains a need to relate the complexities
of the physical features of extratropical cyclones to a statistical context. This is due to the
underlying stochasticity in cyclonic behaviour that makes it a mortal and economic threat.
This section introduces the key ideas behind the evolution of extratropical cyclones based on
long-standing physical models, while providing a brief overview of the application of Extreme
Value Theory to characterising the extreme behaviour of this weather phenomenon.
Figure 6: The tracks in the mid-latitudes in the winter of 1987/1988, in which ‘The Great
Storm’ caused mass damage in France, Great Britain and Ireland.
4.1 Formation
Extratropical cyclones are low pressure weather systems that occur in the middle latitudes
of the earth, and are primarily associated with stormy weather with strong winds and heavy
precipitation. A cyclone centre moves along a path known as a track (see Figure 6), which
is often the centre of a region affected by severe cyclonic weather. Many factors affect the
movement of the track, and consequently, the complexity of quantifying the intensity and
severity of extratropical cyclones makes their occurrence difficult to predict. Extreme behaviour in extratropical cyclones is usually identified in data by unusually high observations
of wind speed, rainfall accumulations and/or vorticity. In order to develop statistical methods to estimate these quantities, a knowledge of the physical processes behind the formation
and development of extratropical cyclones is required.
4.1.1 Airmasses
An airmass is defined as a large body of air whose physical properties are approximately
uniform horizontally in a large area of space. Airmasses can be characterised simply by their temperature, as either warm or cold. Airmasses move away from their source region because of
the differences in temperature between the poles and the equator. Because of this, warm
and cold airmasses tend to move and interact, potentially altering their properties as a consequence. A key component of this movement is the jet stream, a narrow band of air where
wind speed is at its maximum. It has been found that many day-to-day weather variations
are associated with the formation and movement of boundaries, or fronts, between different
airmasses.
4.1.2 The Norwegian Cyclone Model
The Norwegian Cyclone Model (Bjerknes and Solberg, 1922) led the meteorological research
into the behaviour of weather systems at fronts between airmasses. These fronts were classified into four types:
• Cold front - Cold air advancing onto warm air.
• Warm front - Warm air advancing onto cold air.
• Stationary front - Neither airmass advances.
• Occluded front - A cold front overtaking a warm front.
An extratropical cyclone forms when the interface between the warm and cold airmasses
develops into a wave form with its apex located at the centre of the low-pressure area
(see Figure 7). Precipitation and wind are characteristic of the locations of the warm and
cold fronts. This idealised model is generally characteristic of oceanic extratropical cyclone
formation, but analyses following cyclone formation over land found substantial departures
from the Norwegian model.
4.1.3 The Shapiro-Keyser Model
The ideology behind the Norwegian model was born from analysis of surface weather maps
over Europe in a time before routine air observations began. In recent years, due to inconsistencies between data and the Norwegian model, revisions have been made to the original
configuration, such as the Shapiro-Keyser model (Shapiro and Keyser, 1990). As with the
Norwegian cyclone model, an incipient cyclone develops cold and warm fronts, but in this
case, the cold front moves roughly perpendicular to the warm front such that the fronts
never meet, the so-called ‘T-bone’ (see Figure 7). This is followed by seclusion, the mature
phase of the cyclone life-cycle, which may result in hurricane winds and torrential rain.
Not all extratropical cyclones originate as frontal waves. Some begin as tropical cyclones
before moving into the mid-latitudes, where different types of behaviour have been observed,
Barry and Chorley (2009) for a detailed overview.
Figure 7: From Schultz et al. (1998): (a) Norwegian cyclone model: (I) incipient frontal
cyclone, (II) and (III) narrowing warm sector, (IV) occlusion; (b) Shapiro-Keyser cyclone
model: (I) incipient frontal cyclone, (II) frontal fracture, (III) frontal T-bone and bent-back
front, (IV) frontal T-bone and warm seclusion.
4.2 Key features
As discussed in Section 4.1.1, the motion of an extratropical cyclone is steered essentially
by a jet stream. In the North Atlantic, the variation in the strength and location of the jet
stream is related to the North Atlantic Oscillation (NAO) (Hurrell et al., 2003), which essentially measures the degree to which tracks shift to the north or south of Western Europe.
The NAO, roughly speaking, is the pressure gradient between two large scale pressure cells
over the Atlantic Ocean, the Icelandic low and the Azores high. A positive NAO index
brings strong westerly winds, pushing the track of precipitation further north, resulting in
cool summers and mild, wet winters in Northern Europe. A negative NAO index brings
cold, dry winters to Northern Europe and cyclonic activity with warm temperatures to the
Mediterranean region. This project aims to explore the dependence structure between the
various measures of cyclonic intensity and the NAO.
A key issue with modelling extratropical cyclones is that some of their small-scale features
can potentially have the most damaging effects. This is apparent in the occurrence of sting
jets. Sting jets are a sequence of forceful winds that occur at the tail of the head of the
cyclone (see Figure 8) in a localised region sometimes only 50 kilometres across
(Baker, 2009). This results in gusts that are generally stronger than those located on the
warm and cold fronts. They originate in the upper air before accelerating downwards at high
speeds. The Great Storm of 1987 is an example of a cyclone with a sting jet. Challenges
to consider when modelling sting jets include the lack of historical data due to the rarity of
the phenomenon. However, it is vital to be able to account for the influence of sting jets
due to the threat that they pose. This project aims to explore ways to incorporate small-scale
features, such as sting jets, into the model without compromising model validity.
Figure 8: From Baker et al. (2014): A conceptual picture of a sting jet cyclone, featuring
warm and cold air conveyor belts and sting jet component.
As with any weather system, a key area of interest is the effect of climate change on the
location and severity of extratropical cyclones. The consensus among climate models is that
tracks will shift slightly poleward in response to increases in greenhouse gases, in line with
the change in jet streams. In addition, it is predicted that while there will be a reduction in
total storm numbers, there will be a higher occurrence of intense cyclone activity (Ulbrich
et al., 2009).
4.3 Statistical modelling of extratropical cyclones
Extreme value theory has been used in recent years to model intense extratropical cyclone
events. Because of the complexity of the cyclone system, authors have used different features of the weather system when performing statistical inference on the process. Lionello
et al. (2008) fit a GEV distribution to monthly pressure minima derived from three different
climate models over the entire North Atlantic domain. In two scenarios, it is projected that
North Atlantic regions will suffer worsening winters and milder summers, which is consistent
with the predicted effects of the northward shift of tracks caused by climate change. Della-Marta and Pinto (2009) fit a GP distribution to extreme wind intensity measurements in
three non-overlapping regions of Europe. By this model, the frequency of cyclone occurrence
is predicted to increase in Western Europe but remain the same in North Atlantic regions.
Neither model accounts for the spatial and temporal variation in the
extremes.
The model of Sienz et al. (2010) uses a GP distribution to fit a tail model to sufficiently
extreme values of geopotential height, mean horizontal gradient, cyclone depth and relative
vorticity, each a measure of cyclone intensity. This model incorporates trends in time and
NAO, finding that the probability of extreme cyclonic activity in the North Atlantic increases
in months with a positive NAO phase. However, this model fails to account for dependence
between measures of cyclone intensity. Spatial variability is also ignored.
Bonazzi et al. (2012) use a bivariate extreme value copula (see Section 3.1) to analyse the
tail dependence of wind intensity between pairs of locations. Four nodes over Europe are
defined, and the measure of dependence χ is interpreted as the probability of a storm hitting node
B given that it hits node A. The tail dependence exhibits stronger coupling in the zonal
direction, which is consistent with the dominant west-east track of extratropical cyclones.
Economou et al. (2014) specify a Bayesian hierarchical procedure (outlined in Davison et al.
(2012)) in which a conditional point process model for pressure minima is used because its
parameters are invariant to the choice of threshold, unlike those of the GP distribution. The model
includes spatial random effects and time-dependent covariates in the model parameters, extending the work of Cooley and Sain (2010). A Bayesian hierarchical framework is preferred
due to its flexibility compared to max-stable processes and the natural inclusion of physical
mechanisms in the model (see Section 6 for a thorough model description).
5 Exploratory data analysis
5.1 Data availability
From a statistical perspective, the methods selected to model extreme behaviour of extratropical cyclones largely depend on the quality of available data. Practical difficulties arise
with weather data collection. For example, data collected at one site is not necessarily reflective of the behaviour of the weather system in a spatial grid centred at that site. To counter
this issue, reanalysis is used to produce reliable datasets for climate modelling and research.
Reanalysis data are produced with a sequential data assimilation scheme, advancing forward
in time cycles of a pre-determined length. In each cycle, available observations are combined
with prior information from a forecast model to estimate the evolving state of the global
atmosphere and its underlying surface (Dee et al., 2011). Variational analysis is performed
on the basic upper-air atmospheric fields and other factors such as soil moisture, soil temperature, snow and ocean waves. These analyses are then used to initialise a short-range
model forecast, which provides the prior state estimates needed for the next analysis cycle.
The strength of this data assimilation means that global datasets are readily available with
consistent spatial and temporal resolutions. In recent years, model resolution and bias correction techniques have steadily improved. Reanalysis also combines millions of observations,
far more than an individual could collect and analyse separately, into a single system.
Despite these clear advantages, care must be taken not to equate reanalysis datasets with
reality. The data assimilation system can introduce spurious variability and trends into output due to model and observational bias, for example.
Examples of reanalysis projects include ERA-40, ERA-Interim, MERRA and JRA. This
report focuses on ERA-40 and ERA-Interim data. ERA-Interim was designed to address
several problems that arose from the ERA-40 project, such as the representation of the hydrological cycle and technical issues such as data selection and quality control techniques.
Hence, ERA-Interim outputs are used for analyses from 1979 onwards, while
ERA-40 measurements are still used for the period before 1979. The difference in data
structure between the two projects is illustrated in the example in Figure 9. This shows the
effect of the new reanalysis project on data output after 1979.
Figure 9: A collection of wind speed measurements (knots) from the ERA-40 project (red)
and the ERA-Interim project (green).
This report will consist of analyses featuring two datasets from the ERA-Interim reanalysis.
The first dataset contains track data over 34 consecutive springs. The identification and
tracking of the cyclones is performed following the approach used in Hoskins and Hodges
(2002). The approach uses relative vorticity to identify and track the cyclones in 6-hourly
intervals. The second dataset contains monthly maxima measurements of wind and rain over
gridded regions in the United Kingdom in the period 1958-2012. Wind speed is measured
in units of knots and rainfall is measured in millimetres. Regression and interpolation are
used to generate values on a regular 25 km × 25 km grid, taking into account factors such as
position, terrain shape and coastal influence among others. A visual representation of this
dataset is shown in Figure 10. This dataset is accompanied by measures of the NAO index
(see Section 4.2) for each month of interest.
The objective of this section is to explore the data and apply methods of extreme value
analysis to these datasets. Firstly, covariate methods are applied to the gridded data to
establish evidence of non-stationarity in the physical processes, using information from the
track data and the NAO. Then, a further analysis is performed on the spatial dependence
structure in the gridded data.
5.2 Covariate analysis
A key feature of this PhD project will be to identify covariates linked to extremal behaviour
in intensity and movement of extratropical cyclones. In this report, an exploratory study is
presented analysing the effect of three covariates on monthly maxima of wind and rain in
the UK. The value of NAO for each month is readily available for analysis. Here, two other
covariates are presented which are constructed from the track data. For the purpose of this
investigation, analysis is performed at one location, specifically the grid box containing the
city of Lancaster.
Figure 10: Maximum wind speed in knots (left) and rainfall accumulations in millimetres
(right) in gridded regions over the UK in December 2012.
Firstly, the data are cleaned to filter the tracks that have a path within a certain radius
of the UK. The first covariate d is defined as the minimum distance between the centre of
the grid box (for which latitude and longitude data are available) and the nearest track for every month. This covariate is selected as, intuitively, one would expect extreme wind speed
and rainfall accumulations to increase as the distance to the storm centre decreases. The
second covariate v is the corresponding vorticity value at the point of minimum distance on
the track. Because these values are only available in 6-hourly intervals, linear interpolation
is used in order to ascertain the value of interest. Again, one would expect that extreme
wind speed and rainfall accumulations would increase as the vorticity at the nearest track
increases. A visual representation of these covariates is shown in Figure 11.
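The construction of d and v described above can be sketched in a few lines. This is only an illustrative sketch under stated assumptions: the great-circle (haversine) distance, the Earth radius constant, the number of interpolation steps and the two-point track are all choices made here for illustration, and the function names are hypothetical rather than taken from the analysis itself.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_approach(site, track, steps=100):
    """Return (d, v): the minimum distance from `site` to a track and the
    linearly interpolated vorticity at the point of closest approach.

    site  : (lat, lon) in degrees
    track : list of (lat, lon, vorticity) at consecutive 6-hourly times
    steps : interpolation steps per 6-hourly segment (an assumed resolution)
    """
    best_d, best_v = float("inf"), None
    for (lat0, lon0, v0), (lat1, lon1, v1) in zip(track, track[1:]):
        for k in range(steps + 1):
            w = k / steps
            # Linear interpolation of position and vorticity along the segment
            lat = lat0 + w * (lat1 - lat0)
            lon = lon0 + w * (lon1 - lon0)
            dist = haversine_km(site[0], site[1], lat, lon)
            if dist < best_d:
                best_d, best_v = dist, v0 + w * (v1 - v0)
    return best_d, best_v


# Hypothetical two-point track passing directly over a hypothetical site
site = (54.05, -2.80)
track = [(54.05, -4.80, 2.0), (54.05, -0.80, 4.0)]
d, v = nearest_approach(site, track)
```

For a month with several tracks, the same function could be applied to each track and the pair (d, v) with the smallest d retained.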
Another factor to consider is the effect of seasonality on the extremal behaviour of wind and
rain. Naturally, one would expect to see more intense storms in the winter months than in
any other season. In any case, it may not be valid to assume a constant effect throughout
the year.
As shown in Figure 12, median rainfall and wind speed are higher in autumn and winter
than in spring and summer, with winter rainfall being more variable. The plots therefore
suggest that a seasonal component in the model is necessary. However, for the
purpose of this exploratory investigation, extremal behaviour in spring months is explored
as an isolated study.
Figure 11: A map of the UK showing the location of interest (in red), the minimum distance
d between this location and a track and the corresponding vorticity value v, obtained by
linear interpolation of the vorticity data over 6-hourly intervals.
Since monthly maxima are being analysed, a GEV model is suitable. For the purpose of this
analysis, a constant shape parameter ξ is assumed. The initial model for both variables is
taken to be GEV(µt , σt , ξt ), where
µt = µ0 + µ1 d + µ2 v + µ3 NAO
σt = exp(σ0 + σ1 d + σ2 v + σ3 NAO)
ξt = ξ
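To make the covariate model concrete, the following sketch writes out the negative log-likelihood implied by the three equations above. It is a minimal illustration, not the fitting code used for the analysis: the function name and parameter ordering are assumptions, and the case ξ = 0 is not handled. Such a function could be passed to a general-purpose numerical optimiser to obtain maximum likelihood estimates.

```python
import math


def gev_nll(params, y, d, v, nao):
    """Negative log-likelihood of GEV(mu_t, sigma_t, xi) with covariates.

    params = (mu0, mu1, mu2, mu3, s0, s1, s2, s3, xi), matching
        mu_t    = mu0 + mu1*d + mu2*v + mu3*NAO
        sigma_t = exp(s0 + s1*d + s2*v + s3*NAO)
    Assumes xi != 0 (the Gumbel limit is not treated in this sketch).
    """
    mu0, mu1, mu2, mu3, s0, s1, s2, s3, xi = params
    total = 0.0
    for yi, di, vi, ni in zip(y, d, v, nao):
        mu = mu0 + mu1 * di + mu2 * vi + mu3 * ni
        log_sigma = s0 + s1 * di + s2 * vi + s3 * ni
        t = 1.0 + xi * (yi - mu) / math.exp(log_sigma)
        if t <= 0.0:
            return math.inf  # observation lies outside the GEV support
        total += log_sigma + (1.0 + 1.0 / xi) * math.log(t) + t ** (-1.0 / xi)
    return total
```

The log link on σt keeps the scale positive for any covariate values, which is why the optimiser can search over unconstrained coefficients.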
Figure 12: Boxplots of monthly maximum rainfall accumulations (top) and wind speeds
(bottom) over all seasons in the period 1980-2012.
A backward selection procedure based on likelihood ratio tests is implemented. Backward selection
involves removing one parameter from the model at a time until all remaining covariates
are statistically significant. Given the log-likelihood l0 of the null model H0 and the log-likelihood lA of the alternative model HA , the likelihood ratio statistic D is defined as
D = 2(lA − l0 ).
While additional parameters cannot decrease the log-likelihood, this test checks that the increase is large enough to justify them, guarding against overfitting. Under H0 , the test statistic approximately follows a χ² distribution with f degrees of freedom, where f
is the difference in the number of parameters of the null and alternative models. By this
process, the following model is deemed the best fit to the wind speed data:
µt = µ0 + µ1 d
σt = σ
ξt = ξ,
with parameter estimates and standard errors summarised in Table 2.
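A single step of the likelihood ratio comparison can be sketched as follows. The sketch is restricted to nested models differing by one parameter, for which the χ² tail probability has a closed form through the complementary error function; the log-likelihood values are hypothetical and serve only to illustrate the calculation.

```python
import math


def lr_test_1df(loglik_null, loglik_alt):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns (D, p_value), where D = 2 * (l_A - l_0) and the p-value is the
    upper tail of a chi-square distribution with 1 degree of freedom,
    P(chi2_1 > D) = erfc(sqrt(D / 2)).
    """
    D = 2.0 * (loglik_alt - loglik_null)
    D = max(D, 0.0)  # guard against tiny negative values from optimiser error
    return D, math.erfc(math.sqrt(D / 2.0))


# Hypothetical maximised log-likelihoods, for illustration only
D, p = lr_test_1df(loglik_null=-250.0, loglik_alt=-246.0)
keep_covariate = p < 0.05  # the 5% level is an assumed significance threshold
```

In backward selection this comparison would be repeated, dropping the least significant term at each step until every remaining term is retained by the test.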
Figure 13 shows the effect of the covariate d on return levels. The graph shows that the
expected time interval between extreme observations of a given size grows as the distance between
the location and the track increases. Hence, the probability of extreme wind speeds is larger
when the location is closer to the track, which is intuitively reasonable.
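The dependence of return levels on d can be reproduced from the fitted model with the standard GEV quantile formula. The sketch below uses the point estimates reported in Table 2; the covariate values at which the return levels are evaluated are hypothetical, since the scale on which d enters the model is not stated here, and the return period is counted in blocks (monthly maxima).

```python
import math


def gev_return_level(m, mu, sigma, xi):
    """Level exceeded on average once every m blocks under GEV(mu, sigma, xi), xi != 0."""
    y = -math.log(1.0 - 1.0 / m)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


# Point estimates from Table 2 (wind speed model)
MU0, MU1, SIGMA, XI = 8.430, -1.056, 3.435, -0.151


def wind_return_level(m, d):
    """m-block return level of wind speed given covariate value d."""
    return gev_return_level(m, MU0 + MU1 * d, SIGMA, XI)


# Hypothetical covariate values, chosen only to show the direction of the effect
levels = {d: wind_return_level(100, d) for d in (-1.0, 0.0, 1.0)}
```

Because only the location parameter depends on d, changing d shifts the whole return level curve by µ1 times the change in d, which is consistent with the parallel curves one would expect in Figure 13.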
Parameter   Estimate   Standard Error   95% CI
µ0            8.430        0.623        (7.21, 9.65)
µ1           -1.056        0.372        (-1.79, -0.33)
σ             3.435        0.274        (2.89, 3.97)
ξ            -0.151        0.072        (-0.29, -0.01)
Table 2: Table of parameter estimates, standard errors and 95% confidence intervals for the
GEV wind speed model
Figure 13: Return levels of wind speed corresponding to the best fitting GEV model, conditional on the minimum (black), mean (green) and maximum (red) values of the covariate
d.
Applying the same procedure to the rainfall data gives a GEV model with constant
parameters as the model of best fit. While this is inconsistent with the results of the
wind speed analysis, it is important to consider the effect of using monthly data. As the
track data show, storms tend to last for only a few days, but with the chance of multiple
large storms taking place in one month, considering monthly maxima will not capture the
full picture of extreme cyclone activity. As explained in Section 6, daily maxima will be
available at the PhD stage, which should result in an analysis that is more representative of
the extremal characteristics of wind and rain. Further analysis will also aim to incorporate
information from other locations in the UK. While covariates are not significant in this analysis, when a broader spatial model is developed, a more systematic relationship is expected.
A brief introduction to spatial extremes modelling is given in Section 6.
5.3 Dependence structure
Because an analysis of wind and rain at one point is not sufficient in representing the extremal behaviour of the overarching weather system, a preliminary analysis is presented here
of the dependence structure between locations. In particular, 10 grid-boxes in a west-east direction and 10 grid-boxes in a north-south direction are chosen and the dependence between
both sets compared (see Figure 14). The aim is to investigate whether prevailing westerly
winds over the UK have any effect on dependence between locations. Because the spatial
resolution of the wind measurements is such that neighbouring grid-boxes in the west-east
direction have identical values for wind speed, this analysis will focus on the rain variable
only.
Figure 14: The grid-boxes chosen for a dependence study in the west-east direction (red)
and north-south direction (yellow) (image courtesy of Google Maps)
An initial check examines how dependence changes as the distance between grid-boxes increases. To illustrate this, Kendall's tau correlation coefficient (Kendall, 1938) is calculated for three pairs of locations in the west-east direction, (1, 2), (1, 4) and (1, 10), which
result in respective correlation coefficients of 0.681, 0.395 and 0.263 (see Figure 15).
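For reference, Kendall's tau for a pair of locations can be computed directly from its definition as the difference between the proportions of concordant and discordant pairs. The sketch below implements the simple tau-a version with a quadratic-time loop; the rainfall values are hypothetical and are not taken from the gridded dataset.

```python
def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Quadratic-time implementation; tied pairs count as neither
    concordant nor discordant.
    """
    n = len(x)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            if prod > 0:
                s += 1  # concordant pair
            elif prod < 0:
                s -= 1  # discordant pair
    return s / (n * (n - 1) / 2)


# Hypothetical monthly rainfall maxima (mm) at two grid-boxes
site_1 = [12.1, 30.4, 8.7, 22.0, 15.3]
site_2 = [10.9, 25.1, 13.0, 27.5, 9.9]
tau = kendall_tau(site_1, site_2)
```

Because tau depends only on ranks, it is unaffected by monotone transformations of the margins, which makes it a convenient first check before the extremal dependence analysis.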
Figure 15: Rainfall accumulations in locations (1, 2) (black), (1, 4) (red) and (1, 10) (green).
To analyse the pattern of extremal dependence, the dependence parameter η is estimated
for pairwise locations in both the west-east and north-south grid-boxes. The estimation
procedure follows the outline in Section 3.1. Figure 16 shows the estimates of η plotted
against distance between locations. Distance is defined in terms of number of grid-boxes.
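As a rough illustration of how such an estimate of η might be obtained, the sketch below assumes that the procedure of Section 3.1 resembles the common approach of transforming both margins to unit Fréchet by ranks, forming the componentwise minimum T, and applying a Hill-type estimator to T above a high threshold. This is an assumption about the method, not a reproduction of it; the threshold quantile, function names and simulated data are all hypothetical.

```python
import math
import random


def to_unit_frechet(x):
    """Rank-transform a sample to approximately unit Frechet margins."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    z = [0.0] * n
    for rank, i in enumerate(order, start=1):
        z[i] = -1.0 / math.log(rank / (n + 1.0))
    return z


def eta_hill(x, y, q=0.9):
    """Hill-type estimate of eta (capped at 1) from T = min(X, Y) on the Frechet scale."""
    zx, zy = to_unit_frechet(x), to_unit_frechet(y)
    t = sorted(min(a, b) for a, b in zip(zx, zy))
    u = t[int(q * len(t))]  # empirical q-quantile of T used as the threshold
    excess = [math.log(ti / u) for ti in t if ti > u]
    return min(1.0, sum(excess) / len(excess))


random.seed(1)
x = [random.random() for _ in range(5000)]
y = [random.random() for _ in range(5000)]
eta_indep = eta_hill(x, y)  # independent samples: theoretical value is 0.5
eta_dep = eta_hill(x, x)    # identical samples: theoretical value is 1
```

At a moderate threshold the estimate for independent data can sit somewhat above 0.5, so in practice the stability of the estimate across thresholds would need to be checked.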
Figure 16: Estimates of dependence parameter η changing with respect to distance for the
north-south grid-boxes (left) and the west-east grid-boxes (right), with estimates testing
negative for asymptotic dependence shown in blue.
For each estimate of η, a hypothesis test is performed in order to check if the parameter
estimate is significantly different from η = 1, the case of asymptotic dependence. As one would expect, evidence for asymptotic dependence tends to weaken as the distance between locations
increases. Also of interest is the generally stronger asymptotic dependence between locations in the west-east direction, as analysis of the locations in the
north-south direction yields more rejected hypotheses, that is, more evidence of asymptotic
independence. This suggests that the prevailing westerly winds have some impact on the
dependence structure between locations. Section 6 gives a brief overview of how this way
of thinking can be extended to the spatial extremes methodology, in particular, a class of
processes that can incorporate both asymptotic independence and asymptotic dependence
in the model.
6 Further Work
The previous chapters have introduced univariate and multivariate extreme value theory,
followed by an overview of the physical processes which define the evolution and movement
of extratropical cyclones and a preliminary data analysis. As outlined in Section 1, the goal
of the PhD research is to apply methods from extreme value theory to develop a consistent
model that accounts for the physical characteristics of these storms. This section features
an outline of the short-term and long-term plans to achieve this.
6.1 Short-term goals
6.1.1 Data collection
One immediate objective is to better explore the Met Office datasets to gain a further insight into the extreme behaviour of extratropical cyclones. This would include incorporating
seasonality into the model to account for weather variations at different times of the year.
Using monthly maxima as in Section 5.2 is problematic when dealing with limited datasets
as not enough data may be available to justify an asymptotic-based model. Further analysis
will take place using daily maximum wind speed and rainfall accumulations.
It is important to consider the change of structure as shown in Figure 9 caused by the
introduction of a new reanalysis scheme and its effect on extreme value analysis. It is
clear from the plot that at least a location or scale change has occurred in the data. As
a preliminary theory as to how to incorporate this change into the model, consider the
following. Assume that pre-change random variables Z1 , . . . , Zn are IID with mean a and
variance b2 and post-change random variables Z̃1 , . . . Z̃n are IID with mean c and variance
d2 . Assume that other features of the distribution, such as skewness, are not subject to
change. Then, in theory, normalised Z should have the same distribution as normalised Z̃, that
is:
\[ \frac{Z - a}{b} \overset{d}{=} \frac{\tilde{Z} - c}{d}. \]
Rearranging this expression gives
\[ \tilde{Z} \overset{d}{=} \frac{d}{b}(Z - a) + c. \]
Assuming that the maximum Mn of Z1 , . . . , Zn is distributed as a GEV(µ, σ, ξ) distribution, it
would be interesting to discover the equivalent distribution of the maximum M̃n of Z̃1 , . . . Z̃n .
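\[
\begin{aligned}
P(\tilde{M}_n \le x) &= P(\tilde{Z}_1 \le x, \ldots, \tilde{Z}_n \le x) \\
&= P\left( \frac{d}{b}(Z_1 - a) + c \le x, \ldots, \frac{d}{b}(Z_n - a) + c \le x \right) \\
&= P\left( Z_1 \le \frac{b(x - c)}{d} + a, \ldots, Z_n \le \frac{b(x - c)}{d} + a \right) \\
&= P\left( M_n \le \frac{b(x - c)}{d} + a \right) \\
&= \exp\left\{ -\left[ 1 + \frac{\xi}{\sigma}\left( \frac{b(x - c)}{d} + a - \mu \right) \right]^{-1/\xi} \right\} \\
&= \exp\left\{ -\left[ 1 + \xi \frac{b}{\sigma d}\left( x - c + \frac{da}{b} - \frac{d\mu}{b} \right) \right]^{-1/\xi} \right\}.
\end{aligned}
\]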
This gives the extreme value parameters of the post-change maxima in terms of the parameters of the pre-change maxima:
ξ̃ = ξ
σ̃ = βσ
µ̃ = α + βµ,
where α = c − da/b and β = d/b. Exploring whether this formulation can be applied to the
Met Office datasets will be a task in the short-term. This will allow the change in structure
to be incorporated into the model, producing more meaningful parameter estimates that
may be distorted by this change otherwise.
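The parameter mapping can be checked numerically with a short sketch: the GEV distribution function with the post-change parameters, evaluated at x, should equal the pre-change distribution function evaluated at b(x − c)/d + a. All numerical values below are hypothetical and the function names are chosen for illustration.

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """GEV distribution function for xi != 0."""
    t = 1.0 + xi * (x - mu) / sigma
    if t <= 0.0:
        # Below the lower endpoint (xi > 0) or above the upper endpoint (xi < 0)
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def post_change_params(mu, sigma, xi, a, b, c, d):
    """Map pre-change GEV parameters to post-change ones.

    alpha = c - d*a/b and beta = d/b, so that
    mu_tilde = alpha + beta*mu, sigma_tilde = beta*sigma, xi_tilde = xi.
    """
    beta = d / b
    alpha = c - d * a / b
    return alpha + beta * mu, beta * sigma, xi


# Hypothetical pre-change GEV parameters and pre/post means and standard deviations
mu, sigma, xi = 20.0, 4.0, -0.1
a, b, c, d = 10.0, 3.0, 12.0, 4.5
mu_t, sigma_t, xi_t = post_change_params(mu, sigma, xi, a, b, c, d)

x = 30.0
lhs = gev_cdf(x, mu_t, sigma_t, xi_t)               # post-change model at x
rhs = gev_cdf(b * (x - c) / d + a, mu, sigma, xi)   # pre-change model at rescaled x
```

The two values agree because the change is a pure location-scale transformation, which leaves the shape parameter untouched.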
Any future analysis should account for spatial variability, in contrast to the analysis in Section 5.2 which dealt with a single location. It would be interesting to discover, in particular,
if covariate significance changes with location. With increased spatial and temporal resolution, more information will be available to draw more definitive conclusions. It would also be
interesting to source some pure observational data from the Met Office on which the Poisson
process methodology developed in Section 2.5 can be applied. In addition, sourcing more
data from the Met Office with regard to the physical processes controlling the evolution and
movement of extratropical cyclones should aid the discovery of covariates related to extreme
wind and rainfall. As a starting point, the procedure to obtain the covariate d in Section 5.2
could be made more exact with use of cubic splines as a means of interpolating over the
tracks. Immediate analysis will focus on quantifying covariates related to the speed and age
of the cyclone at a given point in space. As well as data exploration, further discussion and
collaboration with the Met Office is necessary to determine likely factors linked to extreme
wind speed and rainfall accumulations.
6.1.2 Spatial extremes
Because the physical processes associated with extratropical cyclones are spatial in extent,
spatial modelling of extremes will be necessary to capture the full extremal properties of
these weather systems. In the case where few observations are available, incorporating data
from nearby locations into the model can help in efficient parameter estimation. There are
a number of methods in the extreme value literature to approach this problem.
Bayesian hierarchical models
Bayesian hierarchical modelling is a common approach for specifying extreme value models
over continuous space. In this setting, dependence is introduced by integration over spatial
latent variables or processes. Hence, spatial variation can be introduced in the parameters
(Davison et al., 2012). Following the example in Coles (2001), consider observations of
annual maxima X(ri ) at a set of locations r1 , . . . , rk ∈ R. A simple model is
X(ri )|(µ1 , . . . , µk , σ, ξ) ∼ GEV(µi , σ, ξ),
independently for r1 , . . . , rk , where µ1 , . . . µk are the realisations of a smoothly varying random process µ(r) observed at r1 , . . . , rk respectively. Because µ(r) varies smoothly, nearby
values of the µ(ri ) are more likely to be similar than distant values. Hence, values of X(r)
are more likely to be similar at nearby locations.
An important assumption in the Bayesian hierarchical modelling approach is that of conditional independence of the extremes. An advantage of this approach is its flexibility in terms
of incorporating spatial random effects and covariates into the model. To illustrate this,
the model of Economou et al. (2014) is presented, which extends the work of Cooley and
Sain (2010) to the application of extreme behaviour of extratropical cyclones. In particular,
pressure minima are taken as a variable representing cyclonic intensity. As in Cooley and
Sain (2010), a Poisson process model is used due to strengths with regard to model flexibility
and efficiency.
Let X(s, t) be the depth in grid cell s ∈ S at time t ∈ T , where S and T are the domains of
space and time respectively. The model is described as follows:
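\[
\begin{aligned}
X(s, t) \mid \theta_\psi(s), \beta_2(s) &\sim \mathrm{PP}(\mu(s, t), \sigma(s, t), \xi(s)) \\
\mu(s, t) &= \beta_0^{\mu} + \beta_1^{\mu} z_1(t) + \beta_2(s) z_2(t) + \theta_\mu(s) \\
\log \sigma(s, t) &= \beta_0^{\sigma} + \beta_1^{\sigma} z_1(t) + \theta_\sigma(s) \\
\xi(s) &= \beta_0^{\xi} + \theta_\xi(s),
\end{aligned}
\]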
for ψ = µ, σ, ξ, where z1 is the latitude of the occurrence and z2 is the NAO index and
where θµ (s), θσ (s) and θξ (s) are spatial random effects. Spatial random effects define spatial
variability in µ, log(σ) and ξ across the cells after allowing for covariates. These random
effects are modelled using an intrinsic autoregressive (IAR) spatial model; for more details,
see Cooley and Sain (2010). In this model, the NAO coefficient β2 (s) also varies with space. Prior
distributions for the model parameters are chosen and an MCMC procedure is used to sample from the posterior distributions.
It would be interesting to apply this formulation to other measures of cyclonic intensity,
specifically measures of wind speed and rainfall accumulations. The discovery of covariates
obtained from the physical structure of the cyclone could be implemented in further work
and adapted to investigate if any underlying spatial variability exists.
Max-stable models
A common approach to modelling spatial extremes is using max-stable processes. Max-stable
processes are the extension of multivariate extreme value theory to the infinite dimensional
setting. Consider a random process X(r) having continuous sample paths. Then, if for
sequences of continuous functions an (r) > 0 and bn (r), as n → ∞,
\[ \left\{ \max_{i=1,\ldots,n} \frac{X_i(r) - b_n(r)}{a_n(r)} \right\}_{r \in \mathbb{R}^d} \to \{Z(r)\}_{r \in \mathbb{R}^d}, \]
where Xi are independent replications of X and Z is non-degenerate, then Z(r) is a max-stable process. By this result, max-stable processes are the limits of pointwise maxima in
the same way that the GEV family is the limit distribution of block maxima. There are two
specific ways of characterising max-stable processes.
Smith model
First, consider
\[ Z(r) = \max_i \{ \xi_i f(s_i, r) \}, \]
where {(ξi , si ) : i ≥ 1} are the points of a Poisson process on (0, ∞] × S with intensity
measure ξ⁻² dξ × ν(ds) and f is a probability density function on S. The process Z is a max-stable process with unit Fréchet margins. A physical interpretation of this characterisation
is described in Smith (1991). The set S can be regarded as a space of storm centres, and
ν is a measure which represents the distribution of storms over S. Then each ξi represents
the intensity of the storm and the function f determines the profile of the storm. Hence,
ξi f (si , r) represents the size of the storm at position r from a storm of size ξi centred at
location si .
Letting S = Rd and f be a multivariate normal density with zero mean and covariance matrix
Σ gives the Smith model (Smith, 1991). Based on these assumptions, the joint distribution
function at two sites r1 and r2 is given by
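\[ P(Z(r_1) \le z_1, Z(r_2) \le z_2) = \exp\left\{ -\frac{1}{z_1} \Phi\left( \frac{a}{2} + \frac{1}{a} \log\frac{z_2}{z_1} \right) - \frac{1}{z_2} \Phi\left( \frac{a}{2} + \frac{1}{a} \log\frac{z_1}{z_2} \right) \right\}, \]
where Φ is the standard normal distribution function and \( a = \sqrt{(r_1 - r_2)^T \Sigma^{-1} (r_1 - r_2)} \) is
the Mahalanobis distance between r1 and r2 .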
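The bivariate distribution function of the Smith model is simple to evaluate directly. The sketch below does so for two-dimensional sites, together with the pairwise extremal coefficient 2Φ(a/2) that results from setting z1 = z2 in the formula. The covariance matrix and site coordinates are hypothetical, and the function names are chosen here for illustration.

```python
import math


def std_normal_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mahalanobis_2d(r1, r2, cov):
    """Mahalanobis distance between 2-D points for a 2x2 covariance matrix."""
    (s11, s12), (s21, s22) = cov
    det = s11 * s22 - s12 * s21
    dx, dy = r1[0] - r2[0], r1[1] - r2[1]
    # Quadratic form (dx, dy) Sigma^{-1} (dx, dy)^T using the explicit 2x2 inverse
    q = (s22 * dx * dx - (s12 + s21) * dx * dy + s11 * dy * dy) / det
    return math.sqrt(q)


def smith_bivariate_cdf(z1, z2, a):
    """P(Z(r1) <= z1, Z(r2) <= z2) under the Smith model, for a > 0."""
    term1 = std_normal_cdf(a / 2.0 + math.log(z2 / z1) / a) / z1
    term2 = std_normal_cdf(a / 2.0 + math.log(z1 / z2) / a) / z2
    return math.exp(-term1 - term2)


def smith_extremal_coefficient(a):
    """theta = 2 * Phi(a / 2), obtained by setting z1 = z2 = z."""
    return 2.0 * std_normal_cdf(a / 2.0)


# Hypothetical isotropic covariance and site coordinates
cov = ((1.0, 0.0), (0.0, 1.0))
a = mahalanobis_2d((0.0, 0.0), (3.0, 4.0), cov)
theta = smith_extremal_coefficient(a)
```

The extremal coefficient rises from 1 (complete dependence) towards 2 (independence) as the Mahalanobis distance grows, so Σ controls both the range and the direction of spatial dependence.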
Schlather model
Another useful characterisation of max-stable processes is presented in Schlather (2002). Let
{ξi }, i ≥ 1, be the points of a Poisson process on R+ with intensity measure ξ⁻² dξ. Let {Wi }, i ≥ 1, be
independent replications of a positive random process W having continuous sample paths
such that E[W (r)] = 1 for all r ∈ Rd . Then Schlather defines
\[ Z(r) = \max_i \xi_i W_i(r). \]
Z(r) is a stationary max-stable process on Rd with unit Fréchet margins. The Schlather model
defines \( W(r) = \sqrt{2\pi} \max\{0, \varepsilon(r)\} \), where ε is a standard Gaussian process with correlation function ρ(h). This leads to the
bivariate distribution function
\[ P(Z(r_1) \le z_1, Z(r_2) \le z_2) = \exp\left[ -\frac{1}{2} \left( \frac{1}{z_1} + \frac{1}{z_2} \right) \left( 1 + \sqrt{1 - \frac{2\{1 + \rho(h)\} z_1 z_2}{(z_1 + z_2)^2}} \right) \right], \]
where h is the separation between r1 and r2 .
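The Schlather bivariate distribution function can be evaluated in the same way. In the sketch below the correlation ρ(h) is passed in as a number, so no particular correlation function is assumed, and the pairwise extremal coefficient 1 + sqrt((1 − ρ)/2) follows from setting z1 = z2 in the formula. The function names are chosen for illustration.

```python
import math


def schlather_bivariate_cdf(z1, z2, rho):
    """P(Z(r1) <= z1, Z(r2) <= z2) under the Schlather model.

    rho is the correlation of the underlying Gaussian process at the
    separation between the two sites.
    """
    inner = 1.0 - 2.0 * (1.0 + rho) * z1 * z2 / (z1 + z2) ** 2
    root = math.sqrt(max(0.0, inner))  # guard against rounding below zero
    return math.exp(-0.5 * (1.0 / z1 + 1.0 / z2) * (1.0 + root))


def schlather_extremal_coefficient(rho):
    """theta = 1 + sqrt((1 - rho) / 2), obtained by setting z1 = z2 = z."""
    return 1.0 + math.sqrt((1.0 - rho) / 2.0)


theta_half = schlather_extremal_coefficient(0.5)
theta_zero = schlather_extremal_coefficient(0.0)
```

Even when ρ = 0 the extremal coefficient stays below 2, so under this formula two sites do not become independent as the correlation of the Gaussian process decays to zero.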
Because of the strong spatial component of physical processes associated with extratropical
cyclones, an expansive review of the methodology of spatial extremes is required in the short term. A useful task would be to run simulations of the Smith and Schlather models using
the SpatialExtremes package in R. Further work will also involve investigations into other
max-stable processes, including the Brown-Resnick process (Kabluchko et al., 2009) and the
process used in Davison and Gholamrezaee (2012) to fit models to extreme temperature data.
A key point to note is that the Smith and Schlather models are justified asymptotically
for modelling spatial extremes under asymptotic dependence and perfect independence. In
practice, however, it may be difficult to detect whether a dataset should be modelled using
a model for asymptotic dependence or asymptotic independence. An ideal model is one
that is asymptotically dependent over short distances, with weakening dependence as the
distance increases, and asymptotic independence over longer distances, which would be consistent with findings from the dependence analysis in Section 5.3. Wadsworth and Tawn
(2012) present a hybrid spatial extremes model, which is a mixture model of asymptotically
dependent and asymptotically independent processes.
Let X(x) be a max-stable process, with extremal coefficient function θ(h) defined as
\[ P(X(x_1) < z, X(x_2) < z) = \exp\{-\theta(h)/z\}, \]
where h is the distance between locations x1 and x2 . Let Y (x) be an asymptotically independent spatial process, with coefficient of tail dependence function η(h) defined as
\[ P(Y(x_1) > z, Y(x_2) > z) = L(z; h) z^{-1/\eta(h)}, \]
where L(z; h) is slowly varying in z. Assuming each process has unit Fréchet margins, then for a ∈ [0, 1]
\[ H(x) = \max\{aX(x), (1 - a)Y(x)\} \]
is a spatial process with unit Fréchet margins and bivariate joint survivor function
\[ P(H(x_1) > z, H(x_2) > z) = \frac{a(2 - \theta(h))}{z} + \frac{(1 - a)^{1/\eta(h)}}{z^{1/\eta(h)}} + O(z^{-2}) \quad \text{as } z \to \infty. \]
If there exists finite h∗ = inf{h : θ(h) = 2}, then the process H(x) is asymptotically dependent up to distance h∗ and asymptotically independent for longer distances. While the
model is flexible, the freedom of choice for a could result in a model lacking in structure.
This PhD project aims to impose constraints on the mixing probability a by incorporating
physical knowledge of the extratropical cyclone structure over multiple sites. By incorporating asymptotically independent processes into the model, one hopes to capture the full
picture of extreme cyclonic behaviour over the North Atlantic.
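The role of the mixing parameter a can be illustrated by evaluating the two leading terms of the joint survivor function quoted above. This sketch takes that expansion at face value, drops the O(z⁻²) remainder and treats any slowly varying factor as constant; the parameter values are hypothetical and the function names are chosen for illustration.

```python
def hybrid_joint_survivor(z, a, theta_h, eta_h):
    """Leading-order approximation to P(H(x1) > z, H(x2) > z) for large z.

    Uses only the two leading terms of the expansion; the O(z^-2)
    remainder is dropped.
    """
    return a * (2.0 - theta_h) / z + (1.0 - a) ** (1.0 / eta_h) / z ** (1.0 / eta_h)


def chi_approx(z, a, theta_h, eta_h):
    """Approximate P(H(x2) > z | H(x1) > z), using P(H > z) ~ 1/z for
    unit Frechet margins at large z."""
    return z * hybrid_joint_survivor(z, a, theta_h, eta_h)


# Hypothetical values: a short distance (theta < 2) and a long one (theta = 2)
chi_short = chi_approx(1e6, a=0.5, theta_h=1.4, eta_h=0.7)
chi_long = chi_approx(1e6, a=0.5, theta_h=2.0, eta_h=0.7)
```

When θ(h) < 2 the conditional probability settles near a(2 − θ(h)), whereas when θ(h) = 2 it decays to zero at a rate governed by η(h), which is the mixture of asymptotic dependence and asymptotic independence described in the text.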
6.1.3 Random effects
As discussed in Section 2.4.2, random effects modelling has rarely been used in the extreme
value literature. However, the model of Eastoe and Tawn (2010) shows that this method
can be useful in the absence of available data relating to covariates, and even as a diagnostic
tool for covariate selection. In light of this, short-term research will focus on extending this
principle to models for size of exceedance, through random effects and/or covariates in the
parameters of a GP model. Because weather extremes are associated with high variability
due to the complexity of atmospheric processes, it is expected that random effects will
account for some of the extra variation not captured by the regression model. In addition,
the distribution of the random effect may help to identify new covariates by matching its
behaviour with that of a known climate process. By obtaining data related to this climate
process, model fit can be improved and a more parsimonious representation of weather
extremes can be achieved. It is hoped that this could be further extended to the parameters
of a GEV distribution. Investigating temporally and spatially dependent random effects may
also be of interest in the long-term. In the immediate term, however, it will be necessary to
create a Bayesian modelling framework in order to carry out initial investigations.
6.2 Long-term goals
As discussed in Section 1, the long-term aim of this PhD research is to develop a statistical
model of extremes arising from extratropical cyclones that is a valid representation of the
physical processes that generate these extremes. While short-term goals will focus mainly
on developing extreme value methodology for tackling this problem, it will be necessary to
investigate further the structure of the cyclone from a climate science perspective in the
long term. This should generate an increased understanding of the physics that drives and
shapes the evolution and movement of extratropical cyclones. While the details in Section 4
are a good starting point, a broader knowledge is necessary in order to develop a model that
is fully representative of these atmospheric processes. For example, the UK floods of 2007
(see Figure 17) were made possible by slack large-scale flow and the resulting slow-moving cyclonic
activity. In addition, sting jets (see Section 4) can cause widespread damage but are difficult to
observe in datasets and to model in practice. For now, it is envisaged that simulated sting
jet events can be generated by ensemble-type forecasting (Zhu, 2005) to improve their representation in the data. Part of this investigation will examine whether there are larger-scale
factors influencing the occurrence of sting jets, which may prove easier to include in the
model. The long-term aim is to identify these phenomena on weather charts and incorporate
their effects into a physically and statistically consistent extreme value model. This will involve exploring wind and rain generating processes separately and jointly, using data where
available to draw conclusions regarding their effects on extreme behaviour of these variables.
It would also be interesting to explore any possible differences in the extremal behaviour of
the wind and rain associated with the different types of fronts, and to use theory similar
to that described in Section 6.1.1 to account for these differences in a parsimonious way.
Ways of incorporating macro-characteristics of the storm into the model, such as speed or
shape of the cyclone, will also need to be explored.
From a data perspective, the numerical models discussed in Section 5 give a complete set
of observations on which to build a model. A vital long-term goal will be to calibrate model
output with observational site data.
Once an understanding of the spatial extremes methodology has been developed in the
short term, the long-term aim is to apply these techniques to Met Office datasets in order
to obtain a more accurate representation of the spatial variability of extreme wind speeds
and rainfall accumulations. It is hoped that model output will be of a high resolution in
space and time, which may involve pooling information from the many different reanalysis
projects introduced in Section 5. The extreme features of cyclones could be further modelled
by extending the spatial extremes methodology to the multivariate setting, where extreme
rainfall accumulations and wind speeds can be modelled jointly over space. This would
involve exploring new methods for incorporating covariates into a spatial and multivariate
framework. Spatial random effects modelling could be used to reveal a dependence on covariates that are insignificant in the analysis of a single site. Track data could be used to model
the spatial occurrence of the storm in the North Atlantic, which could be related to the
impact of extreme wind speeds and rainfall. In the long term, it is hoped that a combined
model will be developed that will capture both the broad spatial and temporal aspects of
extratropical cyclones through a framework that will combine components of multivariate
and spatial extreme value theory, incorporating non-stationarity in the form of known covariates and random effects.
Figure 17: Rainfall accumulations (mm) in the UK during the flood events of July 2007.
It is hoped that this developed spatio-temporal model can provide a more robust assessment
of future risk associated with extratropical cyclones. Estimates of future risk can be improved
by jointly assessing changes in intensity, frequency and spatial distribution, with the aim of
improving the signal-to-noise ratio of any future changes. In the long term, the aim is that this
study will provide tools that will allow significantly improved risk assessments that are
required by policy makers and clients of the Met Office for both current and future risk of
wind and rainfall extremes arising from extratropical cyclones.
References
Baker, L. (2009). Sting jets in severe Northern European wind storms. Weather, 64(6):143–148.
Baker, L., Gray, S., and Clark, P. (2014). Idealised simulations of sting-jet cyclones. Quarterly Journal of the Royal Meteorological Society, 140(678):96–110.
Barry, R. G. and Chorley, R. J. (2009). Atmosphere, Weather and Climate. Routledge.
Bjerknes, J. and Solberg, H. (1922). Life Cycle of Cyclones and the Polar Front Theory of
Atmospheric Circulation. Grondahl.
Bonazzi, A., Cusack, S., Mitas, C., and Jewson, S. (2012). The spatial structure of European
wind storms as characterized by bivariate extreme-value copulas. Natural Hazards and
Earth System Science, 12(5):1769–1782.
Chavez-Demoulin, V. and Davison, A. C. (2005). Generalized additive modelling of sample
extremes. Journal of the Royal Statistical Society: Series C (Applied Statistics), 54(1):207–
222.
Coles, S. G. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Cooley, D. and Sain, S. R. (2010). Spatial hierarchical modeling of precipitation extremes
from a regional climate model. Journal of Agricultural, Biological, and Environmental
Statistics, 15(3):381–402.
Davison, A. C. (2003). Statistical Models, volume 11. Cambridge University Press.
Davison, A. C. and Gholamrezaee, M. M. (2012). Geostatistics of extremes. Proceedings of
the Royal Society A: Mathematical, Physical and Engineering Science, 468:581–608.
Davison, A. C., Padoan, S., Ribatet, M., et al. (2012). Statistical modeling of spatial extremes. Statistical Science, 27(2):161–186.
de Haan, L. and de Ronde, J. (1998). Sea and wind: multivariate extremes at work. Extremes,
1(1):7–45.
Dee, D., Uppala, S., Simmons, A., Berrisford, P., Poli, P., Kobayashi, S., Andrae, U., Balmaseda, M., Balsamo, G., Bauer, P., et al. (2011). The ERA-Interim reanalysis: Configuration and performance of the data assimilation system. Quarterly Journal of the Royal
Meteorological Society, 137(656):553–597.
Della-Marta, P. M. and Pinto, J. G. (2009). Statistical uncertainty of changes in winter
storms over the North Atlantic and Europe in an ensemble of transient climate simulations.
Geophysical Research Letters, 36(14).
Eastoe, E. F. and Tawn, J. A. (2009). Modelling non-stationary extremes with application to
surface level ozone. Journal of the Royal Statistical Society: Series C (Applied Statistics),
58(1):25–45.
Eastoe, E. F. and Tawn, J. A. (2010). Statistical models for overdispersion in the frequency
of peaks over threshold data for a flow series. Water Resources Research, 46(2).
Economou, T., Stephenson, D. B., and Ferro, C. A. (2014). Spatio-temporal modelling of
extreme weather events. In submission.
Ferro, C. A. and Segers, J. (2003). Inference for clusters of extreme values. Journal of the
Royal Statistical Society: Series B (Statistical Methodology), 65(2):545–556.
Hall, P. and Tajvidi, N. (2000). Nonparametric analysis of temporal trend when fitting
parametric models to extreme-value data. Statistical Science, pages 153–167.
Heffernan, J. E. and Tawn, J. A. (2004). A conditional approach for multivariate extreme
values (with discussion). Journal of the Royal Statistical Society: Series B (Statistical
Methodology), 66(3):497–546.
Hoskins, B. J. and Hodges, K. I. (2002). New perspectives on the Northern Hemisphere winter
storm tracks. Journal of the Atmospheric Sciences, 59(6):1041–1061.
Hurrell, J. W., Kushnir, Y., Ottersen, G., and Visbeck, M. (2003). An Overview of the North
Atlantic Oscillation. Wiley Online Library.
Jonathan, P., Ewans, K., and Randell, D. (2014a). Non-stationary conditional extremes of
northern North Sea storm characteristics. Environmetrics, 25(3):172–188.
Jonathan, P., Randell, D., Wu, Y., and Ewans, K. (2014b). Return level estimation from non-stationary spatial data exhibiting multidimensional covariate effects. Ocean Engineering,
88(0):520–532.
Kabluchko, Z., Schlather, M., and De Haan, L. (2009). Stationary max-stable fields associated to negative definite functions. The Annals of Probability, pages 2042–2065.
Kallenberg, O. (1983). Random Measures. Akademie-Verlag and Academic Press.
Keef, C., Papastathopoulos, I., and Tawn, J. A. (2013). Estimation of the conditional
distribution of a multivariate variable given that one of its components is large: Additional
constraints for the Heffernan and Tawn model. Journal of Multivariate Analysis, 115:396–
404.
Kendall, M. G. (1938). A new measure of rank correlation. Biometrika, pages 81–93.
Koenker, R. (2005). Quantile Regression. Cambridge University Press.
Leadbetter, M. (1983). Extremes and local dependence in stationary sequences. Probability
Theory and Related Fields, 65(2):291–306.
Leadbetter, M. R., Lindgren, G., and Rootzén, H. (1983). Extremes and Related Properties
of Random Sequences and Processes. Springer.
Leckebusch, G. C. and Ulbrich, U. (2004). On the relationship between cyclones and extreme windstorm events over Europe under climate change. Global and Planetary Change,
44(1):181–193.
Ledford, A. W. and Tawn, J. A. (1996). Statistics for near independence in multivariate
extreme values. Biometrika, 83(1):169–187.
Ledford, A. W. and Tawn, J. A. (1997). Modelling Dependence within Joint Tail Regions.
Journal of the Royal Statistical Society: Series B (Statistical Methodology), 59(2):475–499.
Ledford, A. W. and Tawn, J. A. (2003). Diagnostics for dependence within time series
extremes. Journal of the Royal Statistical Society: Series B (Statistical Methodology),
65(2):521–543.
Lionello, P., Boldrin, U., and Giorgi, F. (2008). Future changes in cyclone climatology over
Europe as inferred from a regional climate simulation. Climate Dynamics, 30(6):657–671.
O’Brien, G. L. (1987). Extreme values for stationary and Markov sequences. The Annals of
Probability, pages 281–291.
Pickands III, J. (1975). Statistical inference using extreme order statistics. The Annals of
Statistics, pages 119–131.
Schlather, M. (2002). Models for stationary max-stable random fields. Extremes, 5(1):33–44.
Schultz, D. M., Keyser, D., and Bosart, L. F. (1998). The effect of large-scale flow on
low-level frontal structure and evolution in midlatitude cyclones. Monthly Weather Review,
126(7):1767–1791.
Shapiro, M. A. and Keyser, D. A. (1990). Fronts, Jet streams, and the Tropopause. US
Department of Commerce, National Oceanic and Atmospheric Administration, Environmental Research Laboratories, Wave Propagation Laboratory.
Sienz, F., Schneidereit, A., Blender, R., Fraedrich, K., and Lunkeit, F. (2010). Extreme
value statistics for North Atlantic cyclones. Tellus A, 62(4):347–360.
Smith, R. L. (1985). Maximum likelihood estimation in a class of nonregular cases.
Biometrika, 72(1):67–90.
Smith, R. L. (1991). Max-stable processes and spatial extremes. Technical Report, University
of North Carolina.
Smith, R. L. and Weissman, I. (1994). Estimating the extremal index. Journal of the Royal
Statistical Society. Series B (Methodological), pages 515–528.
Ulbrich, U., Leckebusch, G., and Pinto, J. (2009). Extra-tropical cyclones in the present and
future climate: a review. Theoretical and Applied Climatology, 96(1-2):117–131.
Wadsworth, J. L. and Tawn, J. A. (2012). Dependence modelling for spatial extremes.
Biometrika, 99(2):253–272.
Woollings, T. (2010). Dynamical influences on European climate: an uncertain future. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering
Sciences, 368(1924):3733–3756.
Zhu, Y. (2005). Ensemble forecast: A new approach to uncertainty and predictability.
Advances in Atmospheric Sciences, 22(6):781–788.