Geophys. J. Int. (2003) 154, 925–946
The linked stress release model for spatio-temporal seismicity:
formulations, procedures and applications
Mark Bebbington1 and David Harte2
1 IIS&T, Massey University, Private Bag 11222, Palmerston North, New Zealand. E-mail: [email protected]
2 Statistics Research Associates, PO Box 12649, Wellington, New Zealand
Accepted 2003 April 23. Received 2003 March 11; in original form 2002 May 7
SUMMARY
The linked stress release model is based on the build-up of stress through elastic rebound and
its dissipation in the form of earthquakes. In addition, stress can be transferred between large-scale geological or seismic features. The model can be statistically fitted to both historical and
synthetic seismicity catalogues and, through simulation, can be used to create probabilistic
forecasts of earthquake risk. We review the genesis of the model, provide some observations
on forecasting using the model, and follow with a comprehensive review of applications to
date. A systematic procedure for identification of the best model is illustrated by data from
the Persian region. We then consider the evaluation of fitted models, using residual point
processes and information gains. Implications of the use of Benioff strain rather than seismic
moment are discussed. The sensitivity of the model to regionalization, magnitude errors,
catalogue incompleteness, catalogue size and declustering/magnitude cut-off is then considered
in detail with reference to data from north China. The latter data are also used to illustrate the
model evaluation techniques introduced earlier. Some technical material on numerical fitting,
simulation and calculation of the information gain is given in an appendix.
Key words: earthquake prediction, regionalization, seismic-event rates, seismic modelling,
statistical methods.
1 INTRODUCTION
Elastic rebound theory (Reid 1910) postulates that elastic stress in
a seismically active region accumulates due to movement of tectonic plates, and is released when the stress exceeds the strength
of the medium. Fixing, for example, strength or residual stress produces time- and slip-predictable models (Shimazaki & Nakata 1980;
Kiremidjian & Anagnos 1984). However, many features of earthquake generation are not captured in these basic models. This results
in a degree of randomness, which must be factored into forecasts
through the medium of stochastic models. Developing the stochastic
(Markov) model for the occurrence of main-sequence earthquakes
suggested by Knopoff (1971), Vere-Jones (1978) proposed the stress
release model, a stochastic version of the elastic rebound theory, incorporating a deterministic build-up of stress within a region and
its stochastic release through earthquakes.
Earthquake interaction by means of stress triggering and stress
shadows is now an accepted phenomenon (see Harris 1998, and references therein). Knopoff (1996) states that ‘modelling of fractures
on the major units of the fault system must take into account coupling between the members of a 2-D web of faults through interactive stress transfer’. Apart from aftershocks and the like, large events
are noticeably often followed by large events quite distant from the
first (Shimazaki 1976; Li & Kisslinger 1985; Hill et al. 1993; King
et al. 1994; Pollitz & Sacks 1995, 1997; Toda et al. 1998). On
the other hand, very large events can delay subsequent events on
the same or other faults (Harris & Simpson 1996). Nalbant et al.
(1998) also find evidence of faults in stress shadows being reloaded
by motion on adjacent faults shortly before slipping. The omission
of this interaction between regions may be responsible for the underestimation of activity by the time- and magnitude-predictable
model of Papadimitriou et al. (2001). Synthetic models, for example Ward (1996) and Bebbington (1997), have also emphasized
interactions between spatially distributed faults. This motivates an
extension of the stress-release model to include interactions between
regions, whatever they may be, by means of stress transfer and reduction. In this paper we will attempt to examine the questions of
formulation of the model, regionalization of the data, and interpretation and significance of the results in a reasonably systematic
fashion.
1.1 Formulation
In the univariate stress release model (SRM) as formulated by
Vere-Jones (1978), the key variable, or state, is the stress level in a
region, which controls the probability of an earthquake occurring
(Vere-Jones & Deng 1988). This stress level X (t) increases deterministically between earthquakes and is reduced stochastically as a
result of earthquakes. The current value X(t) can be represented in
the form
$$X(t) = X(0) + \rho t - S(t), \tag{1}$$
where X(0) is the initial value, ρ is a constant loading rate from
external tectonic forces and S(t) is the accumulated stress release
from earthquakes within the region over the period (0, t), i.e.
$S(t) = \sum_{t_i < t} S_i$, where $t_i$ and $S_i$ are the origin time and the stress release
associated with the ith earthquake. The result is a piecewise linear
Markov process.
In order to fit the model to data we need to estimate the value
of stress released during an earthquake. Hence we need a relation
between the observed quantity, magnitude, and our underlying
variable of ‘stress’. Kanamori & Anderson (1975) show that the
magnitude M is proportional to the logarithm of the seismic energy
E released during an earthquake according to the relation
$M = \frac{2}{3} \log_{10} E + \text{constant}$. If we suppose that the ‘stress drop’ during
an earthquake is proportional to some power 2η/3 of the released
energy, i.e. $S \propto E^{2\eta/3}$, then we have the formula
$$S = 10^{\eta (M - M_0)}, \tag{2}$$
where $M_0$ is the normalized magnitude. The classical stress, or
Benioff strain (Benioff 1951), representation has η = 0.75, the value
used in previous work. A value of η = 1.5, on the other hand, results
in our ‘stress’ corresponding to seismic moment (see also Main
1999, for a discussion of proxy measures of strain).
The probability intensity of an earthquake occurrence is controlled by a hazard function Ψ(x), with the interpretation that the
probability of an event occurring in the time interval (t, t + Δ) is
approximately ΔΨ[X(t)] for small Δ. There is a wide range of possibilities for the function Ψ. Obviously, it must be non-decreasing.
A constant Ψ, independent of x, would result in a random (Poisson)
model of occurrences. Using
$$\Psi(x) = \begin{cases} 0, & x \le x_c \\ \infty, & x > x_c \end{cases} \tag{3}$$
produces the time-predictable model (Shimazaki & Nakata 1980),
supposing a fixed crustal strength $x_c$. An effective compromise
(Zheng & Vere-Jones 1991, 1994) between these extremes of behaviour is the form Ψ(x) = exp(µ + νx). It also represents the
behaviour that might be expected from a region with a locally heterogeneous strength. One might also consider that local stress is
inhomogeneous (Shimazaki 1976). The stochastic nature of the process with an exponential hazard function is compatible with these
sources of ‘noise’. We can interpret the constant µ (or rather the
parameter α that replaces it, see below) as effectively a parameter to
be fitted for the unknown initial value of stress, while the constant ν
is an amalgam of the strength and heterogeneity of the crust in the
region. This can also be understood as a form of sensitivity to risk.
Statistical analysis is made feasible by treating the data in historical earthquake catalogues as a point process in time–stress space
with the conditional intensity function
$$\lambda(t) = \Psi[X(t)] = \exp\{\mu + \nu[X(0) + \rho t - S(t)]\}. \tag{4}$$
Obviously, X(0) can be absorbed into the other parameters, and so
we obtain the form λ(t) = exp{α + ν[ρt − S(t)]}, where α =
µ + νX(0). Estimates of the parameters can then be found by maximizing the log-likelihood function (see, for example, Daley & Vere-Jones 1988, Section 13.1). The form (4) includes the special cases:
λ(t) = exp[α] (the Poisson process) and λ(t) = exp[α + βt] (the
Poisson process with trend). The alternative parametrization used
in the numerical optimization routines (Harte 1999) is
$$\lambda(t) = \exp\{a + b[t - cS(t)]\}, \tag{5}$$
which has implications when considering numerical and statistical
fitting issues (Bebbington & Harte 2001).
Inference and stochastic process properties have been investigated by Ogata & Vere-Jones (1984), Vere-Jones & Ogata (1984),
Vere-Jones (1988), Zheng (1991) and Borovkov & Vere-Jones
(2000). The model is similar in concept to the elastic rebound model
of Knopoff (1971), the major practical advantage being the ability,
through point process methods, to fit the model to data.
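As an illustration only (this is not the authors' code), the magnitude-to-stress conversion of eq. (2) and the conditional intensity of eq. (4), with X(0) absorbed into α, can be sketched in a few lines of Python. The function names, the toy catalogue and the parameter values are hypothetical; η = 0.75 corresponds to the Benioff strain representation described above.

```python
import math

def stress_release(magnitude, m0=6.0, eta=0.75):
    """Eq. (2): S = 10 ** (eta * (M - M0)); eta = 0.75 is Benioff strain."""
    return 10.0 ** (eta * (magnitude - m0))

def accumulated_release(t, times, magnitudes, m0=6.0, eta=0.75):
    """S(t): total stress released by events with origin time strictly before t."""
    return sum(stress_release(m, m0, eta)
               for ti, m in zip(times, magnitudes) if ti < t)

def srm_intensity(t, times, magnitudes, alpha, nu, rho, m0=6.0, eta=0.75):
    """Eq. (4) with X(0) absorbed: lambda(t) = exp(alpha + nu * (rho * t - S(t)))."""
    s = accumulated_release(t, times, magnitudes, m0, eta)
    return math.exp(alpha + nu * (rho * t - s))

# Hypothetical toy catalogue (years since start, magnitudes); not data from the paper.
times = [10.0, 35.0, 60.0]
mags = [6.0, 7.0, 6.4]

# Intensity just before and just after the M 7.0 event at t = 35.
before = srm_intensity(35.0, times, mags, alpha=-4.0, nu=0.1, rho=0.5)
after = srm_intensity(35.0 + 1e-9, times, mags, alpha=-4.0, nu=0.1, rho=0.5)
```

The intensity drops by the factor exp(−νS) at each event and then grows exponentially at rate νρ until the next one, which is the piecewise behaviour the model is built around.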
Obviously, stress transfer and interaction cannot be considered in
the simple stress release model, and the earlier analyses concentrated
on dealing with regional differences. In particular, Zheng & Vere-Jones (1994) found that large geographical regions give better fits
to the stress release model when broken down into subunits, and
further noted some hints of clustering relating to some form of action
at a distance, i.e. stress transfer and interaction. Although Zheng
& Vere-Jones (1991) investigated various multivariate extensions to
the model, the natural and elegant extension that we shall next outline
was not considered until Shi et al. (1998) and Liu et al. (1998). The
evolution of stress $X_i(t)$ in the ith region can be rewritten as
$$X_i(t) = X_i(0) + \rho_i t - \sum_j \theta_{ij} S^{(j)}(t), \tag{6}$$
where $S^{(j)}(t)$ is the accumulated stress release in region j over the
period (0, t), and the coefficient $\theta_{ij}$ measures the fixed proportion
of stress drop initiated in region j which is transferred to region i.
Here, $\theta_{ij}$ may be positive or negative, resulting in damping or excitation, respectively. It is convenient, in dealing with a declustered
catalogue (i.e. with aftershocks removed), to set $\theta_{ii} = 1$ for all i.
The new version we shall call a linked (or coupled) stress release
model (LSRM). If $\theta_{ij} = 0$ for all i ≠ j, the model is reduced to
an independent aggregation of simple forms as in eq. (1). Liu et al.
(1999) use a slightly different form of eq. (6), replacing $S^{(j)}(t)$ by
$$S^{(j)}_{i-}(t) = S^{(j)}\left(\max_k \left\{ t^{(i)}_k : t^{(i)}_k < t \right\}\right), \tag{7}$$
where $\{t^{(i)}_k\}$ are the occurrence times in region i. Thus, events in
other regions j have no transfer effect on region i until the occurrence of a subsequent event in region i. In a somewhat similar fashion, Imoto et al. (1999) introduce a time delay into the transfer by
replacing $S^{(j)}(t)$ by
$$S^{(j)}_D(t) = S^{(j)}(t - t_d), \tag{8}$$
in order to produce periodic-type behaviour.
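The stress evolution of eq. (6) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the event tuple layout (time, region index, magnitude) and the two-region example values are all hypothetical, and stress releases are computed from magnitudes with η = 0.75.

```python
def linked_stress(t, x0, rho, theta, events, eta=0.75, m0=6.0):
    """Eq. (6): X_i(t) = X_i(0) + rho_i * t - sum_j theta[i][j] * S_j(t).

    events is a list of (time, region_index, magnitude) tuples; only events
    with origin time strictly before t contribute to S_j(t).
    """
    n = len(rho)
    released = [0.0] * n
    for time, region, magnitude in events:
        if time < t:
            released[region] += 10.0 ** (eta * (magnitude - m0))
    return [
        x0[i] + rho[i] * t - sum(theta[i][j] * released[j] for j in range(n))
        for i in range(n)
    ]

# Hypothetical two-region example: theta[1][0] < 0, so an event in the first
# region raises the stress in the second region (excitation).
theta = [[1.0, 0.0], [-0.5, 1.0]]
events = [(5.0, 0, 6.0)]  # one M 6.0 event in the first region at t = 5
stress = linked_stress(10.0, [0.0, 0.0], [1.0, 1.0], theta, events)
```

With these values the first region's stress at t = 10 is reduced by its own event, while the second region's stress is raised above its purely tectonic level, matching the sign convention stated above for the transfer coefficients.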
We shall assume each region to have an exponential risk function,
with differing parameters indicating different tectonic properties by
region. In other words, the strength (earthquake triggering condition) and tectonic loading rate can differ in each seismic region.
Thus we obtain a point process conditional intensity function
$$\lambda_i(t) = \Psi[X_i(t)] = \exp\left\{\alpha_i + \nu_i\left[\rho_i t - \sum_j \theta_{ij} S^{(j)}(t)\right]\right\}, \tag{9}$$
for each region i, where $\alpha_i$ (= $\mu_i + \nu_i X_i(0)$), $\nu_i$, $\rho_i$ and $\theta_{ij}$ are the
parameters to be fitted. We choose to parametrize the intensity in
this form because it is more amenable to physical intuition. The
seemingly excess parameters are in response to fixing $\theta_{ii} = 1$ for
all i. A simpler parametrization (Liu et al. 1998) can be recovered
by setting $a_i = \alpha_i$, $b_i = \nu_i \rho_i$, $c_{ij} = \theta_{ij}/\rho_i$, yielding
$$\lambda_i(t) = \exp\left\{a_i + b_i\left[t - \sum_j c_{ij} S^{(j)}(t)\right]\right\}, \tag{10}$$
for each region i. Estimates of the parameters are found by numerically maximizing the log-likelihood
$$\log L = \sum_{i=1}^{N} \log \lambda(t_i) - \int_{T_1}^{T_2} \lambda(t)\,dt, \tag{11}$$
where the interval $(T_1, T_2)$ contains events at times $t_i$ (i = 1, 2, ..., N):
$T_1 < t_1 < t_2 < \dots < t_N < T_2$. See Appendix A1 for more
details. We note that the linked model provides the possibility of
creeping faults, with ρ i < 0, which are reloaded by transfer from
other faults rather than tectonically.
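For the simple (single-region) model, the integral in eq. (11) has a closed form, because S(t) is constant between consecutive events: over an interval (lo, hi) with accumulated release S, the integral of exp{α + ν(ρt − S)} is exp(α − νS)[exp(νρ·hi) − exp(νρ·lo)]/(νρ). The following sketch evaluates the log-likelihood on that basis. It is an illustration only, assuming νρ ≠ 0 and a sorted catalogue; the function name and argument layout are hypothetical and it is not the optimization routine of Harte (1999).

```python
import math

def srm_log_likelihood(times, releases, alpha, nu, rho, t1, t2):
    """Eq. (11) for the simple stress release model.

    times must be sorted with t1 < times[0] and times[-1] < t2; releases[i]
    is the stress release S_i of event i. Assumes nu * rho != 0.
    """
    b = nu * rho
    if b == 0.0:
        raise ValueError("this sketch assumes nu * rho != 0")
    log_sum = 0.0   # sum of log lambda(t_i)
    integral = 0.0  # integral of lambda over (t1, t2)
    s = 0.0         # accumulated release strictly before the current time
    lo = t1
    for t, release in zip(times, releases):
        log_sum += alpha + nu * (rho * t - s)
        integral += math.exp(alpha - nu * s) * (math.exp(b * t) - math.exp(b * lo)) / b
        s += release
        lo = t
    integral += math.exp(alpha - nu * s) * (math.exp(b * t2) - math.exp(b * lo)) / b
    return log_sum - integral
```

A function of this kind could be handed to any general-purpose numerical optimizer to obtain maximum likelihood estimates; the linked model of eq. (9) follows the same pattern, with the release seen by region i replaced by the weighted sum over regions.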
Emphasis has shifted away from an exact prediction of earthquakes to estimation of probabilities of future events (see, for example, Aki 1989; Vere-Jones 1995, 1998; Kagan 1997b). Results
of this type are usually indicated by the term forecasting, and can
be obtained from the (linked) stress release model by means of repeated forward simulation (see Appendix A2) and averaging. Lutz
& Kiremidjian (1995) use this technique with a generalized semi-Markovian process model for the northern San Andreas. Given a
history of events (generally occurrence time, magnitude and region),
one estimates the time-varying intensity from the fitted model and
the history by means of eqs (4) or (9), and generates the time to the
next event. After assigning a magnitude to this event, it is added
to the history, and the next interval found. Note that we do not refit the model parameters. After many repetitions, we will have an
estimate of the probability of occurrence in a time, magnitude and
space (i.e. region) window. This can be compared with ‘null hypothesis’ forecasts (see Stark 1997) obtained in a similar way from the
Poisson, characteristic (Working Group on California Earthquake
Probabilities 1995) or clustering (Kagan & Jackson 1991) models.
Note that our high-magnitude cut-off and deliberate discarding of
aftershocks will minimize the difference between these alternatives.
The result then reflects the coupling and rebound behaviour around
which the linked stress release model is built (see Lu et al. 1999a,
for an example).
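One step of the forward simulation described above, generating the time to the next event, can be sketched for the simple model as follows. Between events the intensity is exp(c + bt) with b = νρ and c = α − νS, so the cumulative hazard can be inverted exactly against a unit exponential variate. This is an illustrative sketch rather than the procedure of Appendix A2: it assumes νρ > 0, covers only the waiting time to the first event (no magnitude assignment), and all names and parameter values are hypothetical.

```python
import math
import random

def next_event_time(t0, s, alpha, nu, rho, rng):
    """Sample the next event time after t0, given accumulated release s.

    Solves (exp(c) / b) * (exp(b * t) - exp(b * t0)) = E for t, where
    E ~ Exp(1), b = nu * rho and c = alpha - nu * s. Assumes b > 0.
    """
    b = nu * rho
    if b <= 0.0:
        raise ValueError("this sketch assumes nu * rho > 0")
    c = alpha - nu * s
    e = rng.expovariate(1.0)
    return math.log(math.exp(b * t0) + b * e * math.exp(-c)) / b

def prob_event_within(t0, horizon, s, alpha, nu, rho, n_sims=20000, seed=1):
    """Monte Carlo estimate of P(at least one event in (t0, t0 + horizon))."""
    rng = random.Random(seed)
    hits = sum(next_event_time(t0, s, alpha, nu, rho, rng) <= t0 + horizon
               for _ in range(n_sims))
    return hits / n_sims
```

For this single-step case the simulated probability can be checked against the exact value 1 − exp(−Λ), where Λ is the integrated intensity over the window. A full forecast would repeat the step, assign a magnitude to each simulated event and update S before drawing the next interval.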
1.2 Review of applications
The stress release model (4) has been applied to historical data
from the Kamakura region of Japan (Vere-Jones 1978), north China
(Vere-Jones & Deng 1988; Zheng & Vere-Jones 1991, 1994; Zhuang
& Ma 1998), southwest China and Taiwan (Zhuang & Ma 1998),
central Japan and Persia (Zheng & Vere-Jones 1994) and southwest
Japan (Imoto 2001). The repeated investigations of north China
were driven by incremental improvements in the quality of data,
and by subtle changes to the identification of regions, which we
shall examine later.
The emphasis in many of these investigations was to identify
statistically distinct regions, in the sense that the best-fitting models, eq. (4), estimate different (as justified by AIC) parameters for
the regions. It was concluded that divisions into four regions (north
China, Japan), and two regions (Taiwan) were justified. Persia proved
more difficult, resulting in a division into three regions based on an
aggregation of smaller regions proposed by Ambraseys & Melville
(1982), but with considerable reservations concerning the fit, particularly to earlier data. Several of the Japanese regions also displayed a
poor fit to the stress release model when compared with the simpler
Poisson with trend model. The conclusion was reached that, provided the data are reasonably complete, the stress release model is
superior statistically to the Poisson and Poisson with trend models.
Zhuang & Ma (1998) also simulated the model forward to forecast
a quiescent period of 30–50 yr for the easternmost part of north
China.
Zheng & Vere-Jones (1991) fitted different magnitude ranges by
independent stress release processes, investigating the possibility of
making the stress drop (i.e. the magnitude) dependent on the level of
stress, with inconclusive results. Consequently, subsequent authors
have assumed that the magnitude can be assigned independently
(see, however, Jaumé & Bebbington 2000).
Imoto (2001) examined the Nankai sequence of eight to ten earthquakes, and showed that the stress release model produced a superior
fit to renewal process models. By formalizing a scale based on average AIC reduction, sensitivity of the model to the value of η in
the stress–magnitude relation (2), missing events and magnitude
perturbation were investigated.
Liu et al. (1999) revisited the north China data using the linked
model (6) with the modification (7). Based on a division into two
regions (the Ordos Plateau and Hebei plain/Tanlu seismic belts),
they concluded that there was statistically significant interaction between them, of an inhibitory nature. Unfortunately, in maximizing
eq. (11), the summation term was calculated using the stress modification (7) while the integral term appears to have been calculated
using the unmodified relationship (6). Lu & Vere-Jones (2000) found
that the linked model, eq. (9), provides almost no improvement here
over a collection of independent stress release models, attributing
this to the intraplate collision seismic zone being of less complexity than a plate boundary subduction zone. We shall revisit this
issue in greater detail later, in our investigation of regionalization
effects.
Lu et al. (1999a) revisited the Japanese data, but were forced to
completely revise the regions in the light of geophysical information.
The regions were again found to be heterogeneous, and not necessarily best fitted by the stress release model. It was found that the
Median Tectonic Line region was independent of the other three
regions, while events in the Fossa Magna/Sagami Trough had an
excitatory effect on the Chubu/Kinki Triangle and Nankai Trough
regions, as did the Chubu/Kinki Triangle region on the Fossa
Magna/Sagami Trough. These results were shown to be consistent
with previous models (see, for example, Kanamori 1972; Shimazaki
1976; Kanaori et al. 1993, 1994; Pollitz & Sacks 1997), and some
additional quantification of the results was provided.
Imoto et al. (1999), using eq. (8), showed that periodic seismicity in the Kanto region of central Japan could best be explained
by a linked stress release model, with a delay in the effect of the
Kasumigaura cluster on the Kinugawa cluster.
Bebbington & Harte (2001) used the Taiwan data of Zhuang
& Ma (1998) to examine the statistical behaviour of the linked
stress release model itself. The two-region model (divided by the Eurasian–
Philippine Sea Plate boundary) with significant interactions
(both inhibitory) between the regions provided an excellent testbed.
They outlined a number of complementary procedures for examining the robustness of the model, and testing the significance of predicted interactions. The study concluded that although the damping
effect of the Philippine Sea Plate events on Eurasian Plate events
was significant, the converse effect was less certain.
The linked stress release model has also served as a basis for a
stochastic model of aftershocks (Borovkov & Bebbington 2003).
Having one ‘region’ generate mainshocks, and a second generating
aftershocks, reproduced the result of Dieterich (1994) for the aftershock decay rate, with the additional property that, as with the stress
release model generally, the parameters can be fitted statistically to
the data.
Lu & Vere-Jones (2001) fit the simple stress release model (4)
to synthetic catalogues generated by the model of Ben-Zion (1996).
Four levels of disorder in distribution of fault zone strength are
considered: uniform properties (U), a Parkfield-type asperity (A),
fractal brittle properties (F) and multisize-scale heterogeneities (M).
As measured by AIC (relative to the number of events) and simulation, the degree of regularity or predictability (Vere-Jones 1998)
in the fitted stress release models follows the order U, F, A, M,
in agreement with the results of the pattern-recognition techniques
used by Eneva & Ben-Zion (1997a,b). The fit was poorest for the
asperity model (A), which possesses a feature (a fixed region of high
strength) not catered for by the simple stress release model. It was
suggested that a two-region linked model might be appropriate.
Lu et al. (1999b,c) fit the linked stress release model to data
obtained from a synthetic model of crack coalescence and an elastic
block lattice, in order to demonstrate the existence of ‘long-range
interactions’ in the medium.
The linked stress release model can also be used as a validation tool for complex synthetic earthquake models such as those
of Rundle (1988a,b), Ben-Zion (1996), Bebbington (1997) and Shi
et al. (1998), since the output from the synthetic model should be
statistically similar to the historical data. The linked stress release
model provides a measure of this similarity through the estimated
parameters, an idea used by Bebbington et al. (1998) to validate
the synthetic seismicity model of Bebbington (1997) for the Middle
America Trench. A possible further step in investigating synthetic
models is to compare the fitted stress with the internal state of the
synthetic model (Shi et al. 1998; Liu et al. 1999).
2 FITTING
Although the linked stress release model has been applied widely,
there has been little attention paid to certain technical and systemic
issues concerning its use. Bebbington & Harte (2001) made a beginning, by investigating the sensitivity and robustness of the fitting
procedure by means of Monte Carlo simulation, numerical analysis
and residual point processes. Below we summarize this work and
the idea of information gain (Vere-Jones 1998), and consider some
additional issues affecting the implementation of, and conclusions
drawn from, the linked stress release model.
We will usually be considering various possible models, with
differing numbers of parameters, for a given set of data. The choice
of the best model in the sense of justified parameters will be based
on the Akaike information criterion (AIC), which is defined as
AIC = −2 log L̂ + 2k,
(12)
where log L̂ is the maximized log-likelihood for a given model and k is
the number of parameters to be fitted in the model (Akaike 1977).
This represents a rough way of compensating for the effect of adding
parameters, and is a useful heuristic measure of the relative effectiveness of different models, in avoiding overfitting. If we systematically test the various families of interactions, the best model is that
with the smallest AIC value. A difference of 1.5–2 in AIC is usually considered significant. However, we should note that the AIC
values obtained here should be used with some caution, since the
amount of historical earthquake data is not very large and the distribution of the log-likelihood is probably not chi-squared (Ogata &
Vere-Jones 1984; Wang et al. 1991). Given the small size of the data
sets suitable for the model, the application of asymptotic results is
problematic.
2.1 Model identification
We shall follow a convention of denoting models by their possible
(unrestricted) parameter set. The full, unrestricted model is denoted
by (α, ν, ρ, Θ : $\theta_{ii} = 1$), where Θ = ($\theta_{ij}$) is the matrix of transfer coefficients. For example, the model with uniform
tectonic input across regions (which is the ‘symmetrical model’ of
Liu et al. 1999) is (α, ν, ρ = ρ1, Θ : $\theta_{ii} = 1$), and the aggregation
of independent stress release models for each region is (α, ν, ρ,
Θ = I).
For a model with J regions, there are potentially J(J + 2) parameters, and hence $2^{J(J+2)}$ possible models. A systematic method
of reduction to the ‘best’ model is obviously desirable. The first step
is to consider the regions individually, using the standard stress release model (4). A ‘baseline’ model (and associated AIC) is taken
as the aggregate of the best individual models. The question then
becomes whether any regional interactions can improve the likelihood sufficiently to offset the cost of the additional parameters on
the AIC. If so, we consider those parameters (and the interaction
they represent) justified by the data. The next step is to consider the
question of tectonic input. Evidence in this regard can be obtained
by comparing the models (α, ν, ρ, Θ = I) and (α, ν, ρ = ρ1, Θ =
I). Only if the former has a superior AIC should differing rates of
tectonic input be accepted. Even in this case there may be groups of
regions that can profitably be assigned equal inputs. Once the number of input parameters in the linked model is identified, it remains
only to consider the matrix of regional transfers, Θ. Here the procedure becomes more of an art, leavened by geophysical intuition.
One question, in the case where not all regions are neighbours, is
to test whether there are long-range interactions. Another is to test
whether individual regions (or groups of regions) are independent
of the remainder. Having thus eliminated certain broad categories
of model, it is then possible to fit the remainder. This should provide
a number of models with close to the smallest AIC, which have
similar interactions (see, for example, Lu et al. 1999a).
For an example of the model selection algorithm outlined above,
let us consider Ambraseys & Melville (1982), who provide extensive data for the Persian region. They suggest that the complicated
tectonics of the region can be broken down into seven seismic zones:
northern (N), eastern (E), Zagros (Z), part of Azerbaijan (A), Kapet
Dagh (K), the central desert (C) and Makran-Baluchistan (M). Unfortunately the data are likely to be incomplete for M < 6.5 until
the Qajar period (1794–1924) (Ambraseys & Melville 1982), and
so we follow Zheng & Vere-Jones (1994) in limiting ourselves to
events of M ≥ 6.0 after 1780, as listed in Table B1, and displayed in
Fig. 1. The data should be a complete record of occurrences for the
time period and magnitude interval, but the location data are less
accurate, particularly outside of region Z. However, our regions are
sufficiently broad to be able to ignore this. The question of magnitude accuracy in the linked stress release model will be examined
later. As a result of the amount of data available, we need to combine regions to reduce their number. Combining regions E, C and K,
and regions N and A, and discarding region M because there are no
events prior to 1934, we are left with three regions. These regions
are mutually adjacent.
[Figure 1: map of the Persian region, longitude (45–60) against latitude (26–40), showing seismic zones A, K, C, N, Z, M and E, with events assigned to Region 1, Region 2 and Region 3; symbol size scales with magnitude.]
Figure 1. Persian earthquakes, 1780–1994. The region boundaries are those of Zheng & Vere-Jones (1994).
Table 1. Persia: aggregate of independent SRMs. ΔAIC = AIC − AIC_P is the improvement in AIC from the Poisson process.
Region     α_i      ν_i     ρ_i     ΔAIC
1 (NA)    −1.041   0.090   0.522    −1.4
2 (ECK)   −4.333   0.055   0.730   −10.2
3 (Z)     −5.223   0.090   0.454   −10.2
Table 2. Persia: full (unconstrained) LSRM.
Region     α_i      ν_i     ρ_i     θ_i1     θ_i2     θ_i3
1 (NA)    −1.152   0.088   0.537    1       −0.174    0.394
2 (ECK)   −4.205   0.088   0.285   −0.221    1       −0.879
3 (Z)     −6.082   0.199   0.052   −0.294   −0.283    1
Zheng & Vere-Jones (1994) found that, for 1780–1980, a combination of stress release models for each of the three regions (ECK,
NA and Z) fit the data reasonably well. Repeating the exercise for the
period 1780–1994, we obtain the results in Table 1. This ‘baseline’
model (α, ν, ρ, Θ = I) has an AIC of 516.8 with nine parameters.
If instead we fit the full model (α, ν, ρ, Θ : $\theta_{ii} = 1$), we obtain
the results in Table 2. The full model has an AIC of 522.2 with
15 parameters, far worse than the baseline model. At this point we
investigate the use of a common tectonic input, with which the nature of the region is not incompatible. The model (α, ν, ρ = ρ1,
Θ) produces an AIC of 523.2 with 13 parameters. The AICs of the
two linked models indicate that we need to remove at least three or
four further parameters in order to match the AIC of our baseline
model in Table 1. Note that the model (α, ν, ρ = ρ1, Θ = I) has
an AIC of 518.9, higher than that of the baseline model. Checking
the Hessian (see Appendix A1) of the matrix C = ($c_{ij}$) in eq. (5), we
find that the coefficients of variation in the two linked models are
[Eq. (13), full-model case: the 3 × 3 layout of this matrix did not survive extraction; its recoverable entries, in extraction order, are 0.25, 2.86, 0.04, 1.37, 0.05, 0.98, 1.85 and 2.27.]
$$\frac{\hat\sigma_{ij}}{|\hat c_{ij}|} = \begin{pmatrix} 0.32 & 1.77 & 2.75 \\ 11.9 & 0.32 & 0.86 \\ \ast & 1.66 & 0.32 \end{pmatrix} \quad \text{(common } \rho_i\text{)}, \tag{13}$$
where $\hat\sigma_{ij}$ is the estimated standard deviation of $c_{ij}$ (the ∗ indicates
that the value could not be calculated). Thus, statistical considerations argue for retaining $\theta_{23}$, $\theta_{32}$ and possibly $\theta_{31}$ among the transfer
terms. Fitting all of the $2^4 = 16$ remaining possible models produced
a best model (α, ν, ρ = ρ1, $\Theta = \Theta_0$), where
$$\Theta_0 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & \theta_{23} \\ 0 & 0 & 1 \end{pmatrix} \tag{14}$$
and the fitted parameter values are given in Table 3. The AIC is
514.0 with eight parameters, which is sufficiently better than the
baseline model. We note that the interactions are limited to events
in region Z exciting those in region ECK. This means there is an
apparent alternation, depending on the activity level in region Z,
in activity between regions ECK and NA. This accords well with
the observation of Ambraseys & Melville (1982) that there is an
apparent alternation of activity between regions E and N. Fig. 2
illustrates the effect, which is less dramatic than the damping effect
across the Taiwan Plate boundary found by Bebbington & Harte
(2001).
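The bookkeeping behind this model comparison can be reproduced from the AIC values and parameter counts quoted above. The sketch below is illustrative only: the model labels and function names are hypothetical, and the log-likelihoods are back-calculated from eq. (12) rather than taken from the fits themselves.

```python
def aic(log_likelihood, k):
    """Eq. (12): AIC = -2 log L + 2k."""
    return -2.0 * log_likelihood + 2.0 * k

def implied_log_likelihood(aic_value, k):
    """Invert eq. (12) to recover the maximized log-likelihood."""
    return (2.0 * k - aic_value) / 2.0

# (AIC, number of parameters) as quoted in the text for the Persian data.
models = {
    "independent SRMs (baseline)": (516.8, 9),
    "full LSRM": (522.2, 15),
    "LSRM, common rho": (523.2, 13),
    "LSRM, common rho, theta_23 only": (514.0, 8),
}

best = min(models, key=lambda name: models[name][0])
gain_over_baseline = models["independent SRMs (baseline)"][0] - models[best][0]
log_likelihoods = {name: implied_log_likelihood(a, k)
                   for name, (a, k) in models.items()}
```

On these figures the six extra parameters of the full model raise the log-likelihood by only about 3.3 units (from roughly −249.4 to −246.1), which does not offset their AIC penalty of 12, whereas the final model improves on the baseline AIC by 2.8 with one parameter fewer.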
[Figure 2: three panels (Region 1, Region 2, Region 3), each plotting conditional intensity (left axis, 0.0–0.4) and event magnitude (right axis, 6–7.5) against time, 1800–2000.]
Figure 2. Persia: fitted intensities; model (α, ν, ρ = ρ1, $\Theta = \Theta_0$) (dotted line), model (α, ν, ρ, Θ = I) (dashed line). Event magnitudes are indicated by the
solid vertical lines.
Table 3. Persia: LSRM with minimal interactions.
Region     α_i      ν_i     ρ_i     θ_i1   θ_i2   θ_i3
1 (NA)    −0.959   0.090   0.515    1      0      0
2 (ECK)   −4.197   0.073   0.515    0      1     −0.592
3 (Z)     −4.894   0.066   0.515    0      0      1
2.2 Model evaluation
Statistical investigation of the model relies on various tools. We can
use the AIC to distinguish between competing models. A residual
analysis will indicate whether the chosen model is systematically
inadequate, and the robustness of the model can be evaluated by
Monte Carlo simulation and refitting. The latter, together with the
Hessian from the numerical optimization of the log-likelihood, can
also provide a measure of significance of the estimated parameters.
Large standard errors of parameter estimates derived from the Hessian indicate that an associated effect may not be well founded. The
standard errors determined from the Hessian for the final Persian
model in Table 3 are shown in Table 4. The Hessian also provides estimates of the correlation between the parameters, which may warn
of possible systematic deficiencies associated with overfitting in the
model.
Table 4. Persia: estimated standard errors for best linked model.
Parameter         Estimate   Std error
a_1 = α_1          −0.96      0.51
a_2 = α_2          −4.20      0.82
a_3 = α_3          −4.89      1.04
b_1 = ν_1 ρ         0.046     0.021
b_2 = ν_2 ρ         0.038     0.014
b_3 = ν_3 ρ         0.034     0.012
c_ii = 1/ρ          1.94      0.12
c_23 = θ_23/ρ      −1.15      0.41
Although the Hessian matrix provides estimates of the standard
errors and correlations, whether or not the Hessian can be evaluated
at all seems to be critically dependent on certain elements of the
numerical fitting procedure. As a check on the results, and as an alternative, we can simulate the fitted model and refit the model to the
simulated catalogue(s), in order to obtain information concerning
the distributions of the parameters. We can also attempt to fit the
simulated data by alternative models, perhaps with fewer parameters, in order to establish the credibility of the model. Bebbington
& Harte (2001) found, at least in the two-region case, that the correlations for the parameters from the simulated data show much
the same pattern as observed from the Hessian. This provides, in
some sense, a validation for the estimated Hessian, and hence the
variances inferred from it. Furthermore, simulation of a less than optimum model produced quantitative features significantly different
from the historical data. The numerical estimates of the parameter
variances were shown to be usable, and more importantly, verifiable,
by means of Monte Carlo simulation and refitting.
Ogata (1988) provides a paradigm for assessment of the utility of
a stochastic model. The evaluation is on two levels, using AIC and
residual analysis. The latter is used to identify systematic deviation
of the data from the fitted model, which would indicate a significant
factor underlying the data which is not included in the model. If
we suppose that the (regional) point-process data $\{t_k^{(j)}\}$ (for each
region $j$) is generated by the intensity $\boldsymbol{\lambda}(t) = (\lambda_1(t), \lambda_2(t), \ldots)$, we
can define the compensator
$$\psi_j(t) = \int_0^t \lambda_j(s)\,ds. \qquad (15)$$
Then, provided that $\psi(\infty) = \infty$, Aalen & Hoem (1978) show that
the $\{u_k^{(j)}\}$, where $u_k^{(j)} = \psi_j(t_k^{(j)})$, are transformations to the time
locations of stationary Poisson processes of intensity one. Standard
tests for stationarity, independence and exponentially distributed interevent times can then be used to determine whether these residual
processes are, in fact, Poisson, or if the deviation from the model
is, in fact, more than just random noise. The residual processes for
the final model in Table 3 did not differ significantly from a Poisson
process of unit rate. Interestingly, neither did the residual processes
from the independent model in Table 1, which explains the similarity
of the intensities in Fig. 2.
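The residual transformation in eq. (15) can be sketched for a single region as follows. This is a minimal illustration under stated assumptions: the intensity is taken to have the log-linear form exp(a + b t − c S(t)), with S(t) the stress released by events strictly before t, so that the compensator has a closed form between events. The function names, and the use of a Kolmogorov-Smirnov distance as the check for exponential gaps, are choices made for this example rather than the procedure used in the paper.

```python
import math


def transformed_times(times, stress_drops, a, b, c):
    """Map event times t_k to u_k = psi(t_k) for an assumed intensity
    lambda(t) = exp(a + b*t - c*S(t)), with S(t) the stress released before t.
    """
    u, psi, prev_t, released = [], 0.0, 0.0, 0.0
    for t, drop in zip(times, stress_drops):
        level = math.exp(a - c * released)
        if b == 0.0:
            psi += level * (t - prev_t)
        else:
            # Closed-form integral of exp(b*s) over [prev_t, t]
            psi += level * (math.exp(b * t) - math.exp(b * prev_t)) / b
        u.append(psi)
        released += drop  # S(t) jumps only after the event at time t
        prev_t = t
    return u


def ks_distance_to_unit_exponential(u):
    """Kolmogorov-Smirnov distance between the gaps of u and Exp(1)."""
    gaps = sorted(x - y for x, y in zip(u, [0.0] + u[:-1]))
    n = len(gaps)
    dist = 0.0
    for i, g in enumerate(gaps):
        cdf = 1.0 - math.exp(-g)
        dist = max(dist, abs(cdf - i / n), abs(cdf - (i + 1) / n))
    return dist
```

With b = c = 0 the transformation reduces to multiplying each time by the constant rate exp(a), which is a convenient sanity check. The resulting distance would then be compared with tabulated critical values for the sample size at hand.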
A method of scoring earthquake forecasts provided by a given
model is to calculate the information gain (Vere-Jones 1998). Given
a point process model for earthquake occurrence, it is used to simulate (see Appendix A2) a long synthetic occurrence of events. The
complete simulated time period is then divided into bins of equal
length, and the synthetic catalogue is then used to estimate occurrence probabilities { p i } for each bin (see Appendix A3), where p i
is the estimated occurrence probability for the ith ‘bin’. A simple
binomial score is given by
$$B = \sum_i \left[ Y_i \log p_i + (1 - Y_i) \log(1 - p_i) \right], \qquad (16)$$
where Y i = 1 if at least one event occurred in the ith bin or 0 otherwise. This rewards high forecast probabilities for bins where an
event occurs, and low forecast probabilities where events do not
occur, and penalizes false alarms and missed events. The binomial
score B for the ‘fitted’ model can be compared with the (reference)
binomial score obtained by a ‘null’ Poisson process, $\bar{B}$, by calculating the difference $B - \bar{B}$. This is referred to as the information gain.
It can be shown (see also Kagan & Knopoff 1977) that this difference is simply the difference in the log-likelihood of the fitted model
to that of the null model (see Appendix A3 for further details).
Generally the log-likelihood difference is used in the context of
model goodness of fit. That is, when the log-likelihood is calculated
using the observed data, it is interpreted as a measure of the goodness of fit of the given model. However, since the log-likelihood
is the same as the information gain, it also has an interpretation of
quantifying the model predictability when it is calculated using simulated data from the given model. Hence, a greater information gain
(relative to the null model) describes how much more predictable the
fitted model is than the null model. This is useful when one wants to
compare the predictability of various competing models. However,
in order to make such comparisons, one obviously needs to use the
same timescale (e.g. days, years, etc.) for all compared models, and
to compare the information gain per unit time or number of events if
the processes are observed over time intervals of different lengths.
If a given process is simulated for a total time T, we can calculate
the mean information gain per unit time (see Vere-Jones 1998) as
$$\rho_T = \frac{B - \bar{B}}{T}. \qquad (17)$$
The analyses can be elaborated further by including magnitude
intervals and time intervals.
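The binomial score of eq. (16) and the mean information gain per unit time of eq. (17) can be sketched as below. This is an illustrative sketch, not the procedure of Appendix A3: the null forecast is assumed here to be a constant bin probability equal to the observed fraction of occupied bins, and the function names and example numbers are hypothetical.

```python
import math


def binomial_score(outcomes, probs):
    """Eq. (16): sum of Y_i*log(p_i) + (1 - Y_i)*log(1 - p_i) over bins."""
    return sum(
        math.log(p) if y else math.log(1.0 - p)
        for y, p in zip(outcomes, probs)
    )


def information_gain_per_unit_time(outcomes, model_probs, total_time):
    """Eq. (17): (B - B_null) / T, with a constant-probability null forecast."""
    # Assumed null model: every bin gets the observed frequency of occupied
    # bins (this must lie strictly between 0 and 1 for the logs to exist).
    null_p = sum(outcomes) / len(outcomes)
    b_model = binomial_score(outcomes, model_probs)
    b_null = binomial_score(outcomes, [null_p] * len(outcomes))
    return (b_model - b_null) / total_time


# Four bins of unit length; events occurred in the first and last bins.
gain = information_gain_per_unit_time(
    outcomes=[1, 0, 0, 1],
    model_probs=[0.8, 0.1, 0.2, 0.7],
    total_time=4.0,
)
```

A forecast that assigns the null probability to every bin yields a gain of zero, while a forecast that concentrates probability on the occupied bins, as in the example, yields a positive gain.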
3 SENSITIVITY
Considering the number of works reviewed above, a critical analysis
of the model itself is overdue. We will attempt to fill this void by
discussing the sensitivity of the model to perturbations in regionalization, and the input earthquake data. An illustration will follow
in the next section, examining the sensitivity of the model when
fitted to data from north China. We will also consider the question
of what power of the seismic moment the notional ‘stress’ variable
should be.
3.1 Determination of regions
A major focus of the linked stress release model is the identification
of appropriate regions. An objective method that might be used to
define regions is by the application of some clustering algorithm,
with boundaries drawn equidistant between neighbouring clusters.
This should recognize implicitly the geophysical structure of the
regions, subject to the completeness of the data. However, this would
add an unknown, but substantial, number of degrees of freedom
to the fitted model. Given the already large number of adjustable
parameters, relative to the amount of data, this is undesirable.
Alternatively one can attempt to use known tectonic features as
the basis of the regions, in order to investigate possible coupling between these features, which can thus serve as a null hypothesis. To
date (see also Papadimitriou et al. 2001), such regionalization has
been on the basis of geophysical structure, ideally corresponding to
major tectonic features such as seismic belts (Liu et al. 1999), plate
boundaries (Bebbington & Harte 2001), or a mixture of both (Lu
et al. 1999a). Published studies include plate boundaries (Japan,
Taiwan, Persia, New Zealand), and intraplate seismicity (north
China). We should note that, as most events occur along fault lines
and other tectonic features, these features must form the interior of
investigated regions, rather than performing the perhaps more natural role of region boundaries. Ideally the region boundaries should
pass through areas of low seismicity. In other words, we use ‘seismic’, rather than ‘geological’, regions.
Table 5. Size of historical data sets. Parameter numbers in parentheses indicate that the fitting did not reliably
converge.
Locality                 Regions   Parameters   Observations   Source
Taiwan                   2         8            43             Bebbington & Harte (2001)
North China              2         7, 8         65             Liu et al. (1999)
North China              2         8            66 (?)         Lu & Vere-Jones (2000)
New Zealand              2         8            65             Lu & Vere-Jones (2000)
Kanto                    2         8            66, 79         Imoto et al. (1999)
Middle America Trench    3, (4)    13, (18)     45             Bebbington et al. (1998)
Persia                   3         15           89             Section 2.1 above
North China              4         18           65             Section 4 below
Japan                    4         21, (24)     76             Lu et al. (1999)
California               5         23           180            Bebbington (2001), unpublished
Southern California      6         32, (33)     82             Bebbington (2001), unpublished
Another aspect that arises when there are more than two regions is geometry, and the possibility of interaction between nonadjacent regions. Along linear boundaries such as the Aleutians (Li
& Kisslinger 1985), the Middle America Trench (Ward 1991), or
the North Anatolian fault (Stein et al. 1997), it is expected that
stress propagates to all and only neighbouring segments. On the
other hand, in a complex plate collision such as Japan, there may
be neighbouring regions without interactions (Lu et al. 1999a). One
might even envisage interactions between non-adjacent regions.
The question of the number of regions is constrained by the fact
that they must include sufficient observations to allow the numerical
parameter-fitting procedure to converge. Within this constraint, it is
feasible to test whether the model fit can be improved statistically
by combining or dividing regions.
3.2 Catalogue issues
In using historical data, there may be problems with accuracy and
completeness of the catalogue. Pre-instrumental catalogues, where
the magnitude is estimated from the intensity of shaking, may well
have magnitude errors of 0.5 or greater, since errors of this size
are common in early instrumental catalogues (Field et al. 1999;
Kagan 2002). Secondly, although analysis has been limited to those
(portions of) historical catalogues considered complete, there may
be missing events.
Because it is implicit in the formulation that earthquakes lower
the regional stress, and hence reduce the probability of immediately
subsequent events, the model is one for main-sequence events only.
However, it is also those main events which carry the majority of
tectonic information. Smaller events usually occur in clusters, resulting primarily from perturbations of a near critical system, and
hence it is difficult to extract meaningful information from them.
Other aspects such as (quasi-)periodicity and aftershock sequences
may be susceptible to analysis using an extended version of the
stress-release idea (Imoto et al. 1999; Jaumé & Bebbington 2000;
Schoenberg & Bolt 2000; Borovkov & Bebbington 2003). As the
model attempts only to represent main-sequence events, aftershocks
must be carefully identified and removed from the data before numerical fitting can begin. This is usually done using some form of
space–time windowing (Gardner & Knopoff 1974), such as that used
by the M8 procedure (Kossobokov 1997). Lu & Vere-Jones (2001)
show, at least for synthetic data, that these smaller events play a
relatively minor role in determining the large-scale behaviour of the
system. Any such identified aftershocks can have their contribution
to the stress release added to that of the mainshock.
Because of the form of the intensity (9), near contemporaneous
events exert considerable influence on the parameters. Should these
events be in interacting regions, they may dominate the fitted model
interaction. The investigation of the Japanese data (Lu et al. 1999a)
provided two contrasting experiences. The Ansei twin events (1854
December 23 and 24), being in the same region, could be treated
as a single event (Seno 1979), without which the numerical fitting
procedure would not converge. On the other hand, the observation
of Kanaori et al. (1993) that events in the Chubu/Kinki triangle
region were followed coevally or slightly afterward by events on the
Median Tectonic line was rejected as an interaction by the model,
perhaps due to the inclusion of additional regions in the study. Lu
& Vere-Jones (2000) found that the closeness in time of the Buller
(1929, M = 7.8) and Napier (1931, M = 7.8) events, the largest
in their data set, dominated the fitting of a two-region model for
New Zealand earthquakes. The example from north China in the
following section will further illustrate this sensitivity to individual
events.
The growing number of studies using the linked stress release
model allow us to look at the amount of data required for the fitting
to converge. Table 5 shows the studies to date. The Southern California data did not allow for the desired number of interaction parameters, given the fault network involved. Sometimes the details of the
data (particularly in terms of coverage and near-contemporaneous
events) are more important than the amount.
3.3 Magnitude–stress relation
We remarked in the introduction that the stress drop is proportional
to some power of the released energy, leading to the relation (2).
Zheng & Vere-Jones (1991) examined the sensitivity of η with the
north China data in terms of the AIC of the resulting fit. For a
range of 0.5 ≤ η ≤ 1.5 the fit was little affected by the choice of
η, with η = 0.75 being the optimum. However, this is a data set
with a magnitude cut-off of 6.0. One can envisage problems if this
magnitude cut-off is not very high. On the other hand, Imoto (2001)
reported that η plays a crucial role in the fit to the Nankai earthquake
sequence, with smaller η being preferred. Schoenberg
t & Bolt (2000)
use as their ‘long-term correcting’ intensity term 0 ν d N (s), which
corresponds to a value η = 0. However, they do not decluster the
catalogue, or apply a magnitude cut-off, and so the absence of a
magnitude-weighted decrease in the intensity is compensated for by
the larger number of aftershocks for large events. We also observe
that using η = 1.5 allows the input parameters ρ to be determined
exogenously from geodetic measurements, but is likely to magnify
the sensitivity of the model to errors in magnitude.
Figure 3. Contour plot of the trend b in the SRM with varying magnitude cut-off and stress as power of energy ($\eta = M^{-1} \log_{10} S$). [Axes: eta, 0.5 to 1.5 (horizontal); Mc, 5 to 6.4 (vertical).]
In order to examine the relationship between η and the magnitude cut-off M c , we conducted the following experiment. Taking the
PDE catalogue for 30°N ≤ latitude ≤ 44°N and 112°W ≤ longitude
≤ 126°W, a region including California and parts of neighbouring
states, we removed the aftershocks via the M8 procedure, leaving a
catalogue with 202 mainshocks with M ≥ 5.0. We then proceeded
to vary M c from 5.0 to 6.4 in steps of 0.2. The preferred value of
η, by AIC, varied from 0.5 to 0.6. In no case, however, was the fit
significantly better than for η = 0.75, although it was significantly
better than with η = 1.5 for M c ≤ 5.6. As one would expect of the parameters in eq. (5), a decreases with increasing M c and c decreases
with increasing η. Oddly, in the California data, there appears to be
a discontinuity around M c = 6, resulting in a smaller than expected
value of the trend parameter b, as shown in Fig. 3. By examining
AIC/N (Imoto 2001), we can consider the optimum cut-off M c . We
find that the lower cut-offs are preferred, irrespective of η, when the
numerical fitting procedure converges. High η and low M c can lead
to instability.
A clue as to what is happening is provided by Jaumé & Bebbington
(2000) who incorporated the Kagan distribution (Vere-Jones et al.
2001, see also Kagan 1997a)
$$\Pr(X > x) = \left(1 + \frac{x}{L}\right)^{-\alpha} e^{-x/U} \qquad (18)$$
into the simulation of magnitudes from the stress release model.
When U is dependent on the stress level of the process and η =
1.5, they found that the stress release model produces accelerating
moment release behaviour (see Jaumé & Sykes 1999, and references
therein). This is the opposite to the behaviour that is observed above,
which corresponds to the process trying to minimize variation by
treating all events as being of similar size, which is facilitated by
providing more, more evenly scattered, events. This phenomenon
was recognized for η = 0.75 by Zheng & Vere-Jones (1994), who
noted that the output from the stress release model is clustered, with
quiescent periods following larger events, rather than being periodic.
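Sampling from the Kagan distribution can be sketched as follows, assuming eq. (18) has the form Pr(X > x) = (1 + x/L)^(−α) exp(−x/U). Under that assumption the survival function is the product of a Pareto-type and an exponential survival function, so a draw can be obtained as the minimum of two independent variables. This is an illustrative construction, not necessarily the one used by Jaumé & Bebbington (2000), and the function names and parameter values are hypothetical.

```python
import math
import random


def kagan_survival(x, alpha, L, U):
    """Pr(X > x) = (1 + x/L)**(-alpha) * exp(-x/U), the assumed eq. (18)."""
    return (1.0 + x / L) ** (-alpha) * math.exp(-x / U)


def sample_kagan(alpha, L, U, rng):
    """Draw X as the minimum of a Pareto-type and an exponential variable.

    The survival function of the minimum of two independent variables is the
    product of their survival functions, which reproduces the assumed form.
    """
    v = 1.0 - rng.random()  # uniform on (0, 1], avoids a zero base below
    pareto_part = L * (v ** (-1.0 / alpha) - 1.0)
    exponential_part = rng.expovariate(1.0 / U)  # mean U
    return min(pareto_part, exponential_part)


rng = random.Random(1)
draws = [sample_kagan(alpha=1.0, L=1.0, U=5.0, rng=rng) for _ in range(20000)]
empirical_tail = sum(d > 1.0 for d in draws) / len(draws)
theoretical_tail = kagan_survival(1.0, alpha=1.0, L=1.0, U=5.0)
```

Making U depend on the current stress level, as described in the text, would amount to passing a state-dependent `U` at each draw.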
Data limitations, numerical stability and a desire for consistency
thus incline us towards η = 0.75, particularly with medium- to long-term hazard forecasting in mind. Jaumé & Sykes (1999) find a similar phenomenon with the accelerating moment release model, noting
that the use of a restricted magnitude interval appears to favour the
use of Benioff strain, a state of affairs that usually also applies to
the stress release model.
4 EXAMPLE: NORTH CHINA
For details of the data, geology and tectonics we direct the reader
to Vere-Jones & Deng (1988), Zheng & Vere-Jones (1991, 1994)
and Zhuang & Ma (1998). Briefly, the data are considered complete
(Gu 1983a,b), where M is the instrumental magnitude estimated
from historical evidence of the intensity and extent of shaking. The
catalogue has been declustered, but the given mainshock magnitude
is an equivalent magnitude calculated by adding the stress drops for
the sequence and inverting eq. (2) (Vere-Jones & Deng 1988).
4.1 Regionalization and model evaluation
Geophysical considerations (Eguchi 1983; Li & Liu 1986) argue
for there being two or four distinct regions. Three classifications
of events into regions were considered: (A) that of Zheng & Vere-Jones (1994), (B) that of Zhuang & Ma (1998), which are based
on slightly different geophysical interpretations, and (C) via a clustering algorithm using a centroid criterion. The latter is included,
even though using the data in this way adds several degrees of freedom to the model, in order to see if there is any measurable effect.
The divisions are broadly similar, differing by between two and six
events (out of 65). This provides an opportunity to test the utility
of a clustering algorithm as a determinator of regions, and to examine the sensitivity of the models fitted, in both quantitative and
qualitative respects. Secondary objectives are to check the results
of Liu et al. (1998), and to illustrate the systematic fitting procedures elucidated above. The data are presented in Table B2 and
Fig. 4, identifying those events which differ in region under different
classifications.
Figure 4. North China earthquakes 1480–1996. The crosses mark events that are allocated to different regions in the different data sets. [Map: longitude 105 to 120 (horizontal), latitude 32 to 42 (vertical); Regions 1 to 4; symbol size scales with magnitude.]
First, for all classifications, all regions were best fitted by the
stress release model rather than the Poisson process or Poisson with
trend models. There was one exception, indicated by an asterisk
in Table 6, where the Poisson with trend was marginally, but not
significantly (ΔAIC = 0.4), better than the stress release model.
In general the Poisson with trend was outperformed by the Poisson
process, indicating that the data set seems to be stationary. Numbering the regions 1–4 from west to east, we then considered pairing
regions 1 and 2, and regions 3 and 4, to obtain two regions. Again
the stress release model was the best fit, a pattern repeated when
the data were considered as a single region. The details of the best
stress release models are presented in Table 6.
The first question we must ask is how many regions are justifiable. The AIC can provide the answer, by determining the best
combined (over the regions) AIC. In order to compare models
with fewer regions, we have to add the multinomial MLE term
$\sum_{i \in R} N_i \log\left(N_i / \sum_{j \in R} N_j\right)$ to the log-likelihood, and hence to
the AIC, where R is any combined region. In addition, we must add
4 to the AIC for the two additional parameters for the region in
the two-region model (one additional parameter in the four-region
model). This produces the AICs in Table 7, from which we see that
division into four regions is justified by the consequent improvement
in AIC, the regional parameters being sufficiently different.
We can now turn to the linked stress release model, with the
objective of bettering the AICs in the first row of Table 7. The fitted
AICs are in Table 8. The matrix $D = (d_{ij})$ is such that $d_{ij} = 0$ if $|i - j| \neq 1$,
and L and U are strictly lower and upper diagonal, respectively.
We see that none of the linked models improve on the AICs of
the aggregate of four independent stress release models. Note that
a common ρ is decidedly rejected for the independent model, but
accepted for the interacting models, indicating that variations in
activity are better explained by differing tectonic inputs than by
interactions. This has not been observed in other cases so far.
The evidence in Tables 7 and 8 seems to be that regionalization
B is significantly less in line with the observed data than either of
the others. This of course supposes that the model is appropriate.
Table 6. North China: SRM parameters and fits by region and classification.
A plus indicates combined regions. The asterisk indicates the SRM was not
the best fit. The AIC for the Poisson process is denoted AIC_P.
Reg.   Class.   N    α        ν       ρ       AIC      AIC_P
1      All      20   −4.728   0.033   0.487   168.5    172.1
2      A        12   −2.797   0.042   0.138   115.1    116.3
2      B        14   −2.666   0.029   0.121   129.9    131.1
2      C        11   −2.751   0.030   0.075   107.2*   108.7
3      A        21   −3.940   0.083   0.200   173.0    178.5
3      B        19   −4.051   0.070   0.189   163.5    165.5
3      C        19   −3.811   0.079   0.187   163.0    165.3
4      A&B      12   −5.199   0.053   0.350   112.0    116.3
4      C        15   −5.438   0.060   0.370   128.8    138.2
1+2    A        32   −3.154   0.025   0.626   241.7    244.1
1+2    B        34   −3.049   0.027   0.636   251.1    255.1
1+2    C        31   −3.121   0.026   0.612   235.5    238.5
3+4    A        33   −3.450   0.024   0.540   242.3    249.6
3+4    B        31   −3.539   0.022   0.540   233.6    238.5
3+4    C        34   −3.472   0.023   0.553   248.1    255.1
All    All      65   −2.462   0.010   1.176   397.7    401.6
Table 7. North China: composite AICs for differing number of regions. The AIC for the corresponding Poisson process model is given
in parentheses.
Number of regions   Classification A   Classification B   Classification C
4                   568.5 (583.2)      573.8 (585.0)      567.5 (584.3)
2                   573.4 (583.3)      576.1 (585.1)      574.6 (584.6)
1                   575.4 (579.3)      577.1 (581.0)      576.7 (580.6)
Table 8. North China: AICs for various linked models.
Θ         ρ     k    A       B       C
I         ρ1    9    578.5   583.1   583.7
I         ρ     12   568.5   573.8   567.5
I+D       ρ1    15   575.1   578.6   572.5
I+D       ρ     18   578.5   582.1   575.6
I+L+U     ρ1    21   576.1   580.0   583.7
I+L+U     ρ     24   581.0   583.6   584.0
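The adjustment used to compare regionalizations with different numbers of regions can be checked numerically. The sketch below is one reading of that adjustment, stated as an assumption: the multinomial term is added to the log-likelihood (so minus twice the term is added to the AIC), and 2 is added to the AIC for each extra parameter. The function names are hypothetical; the AICs and event counts are the two-region, classification C entries of Table 6.

```python
import math


def multinomial_term(counts):
    """Sum over subregions of N_i * log(N_i / sum_j N_j) for one combined region."""
    total = sum(counts)
    return sum(n * math.log(n / total) for n in counts)


def composite_aic(region_aics, subregion_counts, extra_parameters):
    """AIC of a coarser regionalization, made comparable with a finer one."""
    log_lik_adjustment = sum(multinomial_term(c) for c in subregion_counts)
    # Adding a term to the log-likelihood changes the AIC by -2 times that term.
    return sum(region_aics) - 2.0 * log_lik_adjustment + 2.0 * extra_parameters


# Two-region model, classification C: regions 1+2 and 3+4 (values from Table 6).
aic_two_regions_c = composite_aic(
    region_aics=[235.5, 248.1],
    subregion_counts=[[20, 11], [19, 15]],  # events in regions 1, 2 and 3, 4
    extra_parameters=2,
)
```

Under these assumptions the result agrees, to the precision of the tabulated inputs, with the two-region entry for classification C in Table 7.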
Table 9. North China: preferred interactions by regionalization. All
parameter sets with an AIC within 2 of the best are given.
Classification A        Classification B        Classification C
AIC     {i: θi ≠ 0}     AIC     {i: θi ≠ 0}     AIC     {i: θi ≠ 0}
570.8   146             573.1   145             567.3   146
571.0   143             574.2   134             568.4   126
571.4   145             574.6   456             568.7   16
571.7   16              574.7   45              568.9   1456
571.8   1246            574.8   1345            569.2   134
571.8   1456            574.9   1245            569.3   136
571.9   126             575.0   1456            569.3   1346
572.3   1345            575.1   346
572.5   1346
The model (α, ν, ρ = ρ1, Θ = I+L+U) for C seems to have an
AIC out of character with the remainder. This is another example of
the influence of particular event pairs referred to above. In moving
to regionalization C, the second of three events in 1976 is moved
from region 3 to region 4, while the first and last remain in region 2.
Thus in the regionalization C, a large weight (because of the short
time intervals) is given to the non-neighbouring transfers. There
were also two events in 1624, the first of which moves from region
4 to region 3 while the second stays in region 1 (see Table B2).
Conversely, the regionalization B moves one of two events in 1618,
but to the same region as the other.
Because our objective here is different, we will subtly alter the
procedure to find the best model, rather than follow the algorithm
outlined above. The most promising interactive model seems to be
(α, ν, ρ = ρ1, Θ = I+D) and so we will ask what pattern of
interactions are preferred by the three regionalizations. Setting
$$\Theta = I + D = \begin{pmatrix} 1 & \theta_1 & 0 & 0 \\ \theta_4 & 1 & \theta_2 & 0 \\ 0 & \theta_5 & 1 & \theta_3 \\ 0 & 0 & \theta_6 & 1 \end{pmatrix}, \qquad (19)$$
we will fit every possible ($2^6$) combination of the parameters θ1,
. . . , θ6. The best AIC values, and their corresponding non-zero
parameter set ({i: θi ≠ 0}) are given in Table 9. We see that θ 1 is the most
important interaction in regionalizations A and C, followed by θ 4
and θ 6 (θ 6 and θ 4 in C). However, in regionalization B, the most
important interaction is θ 4 , followed by θ 5 and θ 1 . This appears to
be related to the 1618 November 16 and 1626 June 28 events. In the
first case, a third successive event in region 2 requires extra transfer
of stress from the previous in region 1 via θ 4 . On the other hand,
these three successive events would affect region 1 too greatly if
θ 1 were important. In the second case, the new region 2 event can
be more easily explained by transfer from region 1 via θ 4 than as
a region 3 event influenced by transfer from region 2 or 4 via θ 5
or θ 3 .
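The exhaustive search over the 2^6 = 64 combinations of transfer parameters can be sketched as below. The fitting step itself is not shown: `log_likelihood` is a hypothetical callable standing in for the numerical maximization of the linked model with the listed transfers free and the others fixed at zero, and the toy function in the usage lines is invented purely to exercise the loop.

```python
from itertools import combinations


def enumerate_interaction_sets(n_params=6):
    """All subsets of the indices 1..n_params (2**n_params of them)."""
    indices = range(1, n_params + 1)
    return [
        subset
        for size in range(n_params + 1)
        for subset in combinations(indices, size)
    ]


def rank_by_aic(log_likelihood, n_base_params, n_params=6, window=2.0):
    """Score every subset and keep those with AIC within `window` of the best.

    `log_likelihood(subset)` is a hypothetical callable returning the maximized
    log-likelihood with only the transfers in `subset` free (others fixed at 0).
    """
    scored = []
    for subset in enumerate_interaction_sets(n_params):
        k = n_base_params + len(subset)  # parameters actually estimated
        scored.append((-2.0 * log_likelihood(subset) + 2.0 * k, subset))
    scored.sort()
    best = scored[0][0]
    return [(aic, s) for aic, s in scored if aic - best <= window]


def toy_log_likelihood(subset):
    """Invented stand-in: transfers 1, 4 and 6 each improve the fit."""
    return -280.0 + 3.0 * (1 in subset) + 2.0 * (4 in subset) + 1.5 * (6 in subset)


preferred = rank_by_aic(toy_log_likelihood, n_base_params=9)
```

The returned list has the same shape as one column pair of Table 9: an AIC value alongside the set of non-zero transfer indices, ordered from best to worst.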
Table 10. North China: best models by regionalization.
Reg.   ρ       θ12      θ21     θ32     θ43
A      0.204   −1.534   0.140   0       −0.633
B      0.349   −0.634   0.429   0.759   0
C      0.189   −1.624   0.242   0       −0.857
The fitted input and transfer parameters, for the models in the first
line of Table 9, are given in Table 10. Fig. 5 shows the corresponding
residual processes, none of which demonstrate any systematic deficiency in the model(s) as they are not significantly different from
Poisson processes of unit rate. We note that none of the models in
Table 9 improve significantly on the baseline AIC given in Table 7,
with the possible exception of the best model for regionalization B.
Hence we must conclude that there is little significant interaction
between the regions, and that the regionalization can affect the qualitative results from the model. The latter conclusion is illustrated by
Fig. 6 which shows the quite different fitted intensities for the best
linked models for regionalizations B and C.
Following Vere-Jones (1998), we simulated each of the best models listed in Table 10 in order to estimate their ‘theoretical’ information gains. Each model was simulated forward 100 times, each
for 10 000 yr. The information gains for the simulated data were
then calculated. The information gains per year for models A, B
and C were 0.04635, 0.04619 and 0.04955, respectively. Similarly
the information gains per event for models A, B and C were 0.375,
0.347 and 0.423, respectively. These measure the predictability of
the model, as encapsulated in the parameter values estimated using
the observed historical data.
It appears that the higher information gains are derived from
smaller fitted values of ρ (see Table 10), the tectonic input term, and
larger absolute values in the transfer matrix Θ. Generally, larger tectonic input means that less impetus needs to come from transfer. Conversely, large transfer values mean that prior events have a large effect on the likelihood of (especially immediately) subsequent events,
which is exactly the source of information gain.
The calculated information gains using the historical data sets
were 0.02856, 0.02851 and 0.03557 per year for models A, B and
C, respectively. Similarly, the values per event were 0.2272, 0.2267
and 0.2829, respectively. These, relative to the upper bounds above,
indicate the goodness of fit of the model to the actual historical data.
The results are consistent with the approximately equal model AICs.
While Vere-Jones (1998) found that the single-region stress release model for all of north China achieves an information gain of
0.40 per success (against a theoretical maximum of 0.48), in that
case there are only temporal interactions between events. In decomposing space into four regions, we seem to have already extracted
a large amount of the potential information. This is in line with
the results of Kagan & Jackson (2000) who find that a simple spatial clustering model outperforms the Poisson process in forecasting
earthquakes.
4.2 Sensitivity to catalogue errors
We will now investigate the sensitivity to magnitude error and magnitude cut-off/catalogue completeness. Since there appears to be
little relevant difference between the regionalizations, we will limit
our consideration to the two best models for regionalization A: the
unlinked model (α, ν, ρ, Θ = I), and the best linked model in
Table 10, with AICs of 568.5 and 570.8, respectively. Both models
have 12 parameters. First, we shall conduct a Monte Carlo simulation, adding an N(0, 0.25²) or N(0, 0.5²) error to the magnitude
and refitting the model parameters. The results of 1000 simulations are given in Table 11. We see that larger errors result in a
greater perturbation in the fitted parameters. The effect is greater for
the linked than the unlinked model, perhaps reflecting the fact that
the former is an inferior fit to the original data. On a qualitative level,
the negative (exciting) transfer θ 12 appears more definite than the
other transfers, in that the confidence interval confirms a non-zero
value. The remaining transfers are clearly magnitude-error sensitive.
On a quantitative level, we see that the model is clearly sensitive to
errors in reported magnitudes.
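The mechanics of the magnitude-error experiment can be sketched as follows. Refitting the model is replaced here by a much simpler summary, the total stress release of the catalogue, so the snippet only shows how perturbed catalogues are generated and how a 95 per cent interval is read off the simulations. The magnitude-stress relation S = 10^(η(M − M0)) with η = 0.75 and M0 = 6.0 is an assumed form, and the function names and example magnitudes are hypothetical.

```python
import random


def stress_release(magnitude, eta=0.75, reference=6.0):
    """Assumed magnitude-stress relation S = 10**(eta * (M - reference))."""
    return 10.0 ** (eta * (magnitude - reference))


def perturbed_catalogues(magnitudes, sigma, n_sim, seed=0):
    """Yield copies of the catalogue with N(0, sigma**2) magnitude errors."""
    rng = random.Random(seed)
    for _ in range(n_sim):
        yield [m + rng.gauss(0.0, sigma) for m in magnitudes]


def percentile_interval(values, level=0.95):
    """Central interval from sorted Monte Carlo values (simple order statistics)."""
    ordered = sorted(values)
    lo = int((1.0 - level) / 2.0 * (len(ordered) - 1))
    hi = int((1.0 + level) / 2.0 * (len(ordered) - 1))
    return ordered[lo], ordered[hi]


magnitudes = [6.0, 6.5, 7.0, 8.0]  # illustrative values, not the catalogue
unperturbed_total = sum(stress_release(m) for m in magnitudes)
totals = [
    sum(stress_release(m) for m in catalogue)
    for catalogue in perturbed_catalogues(magnitudes, sigma=0.5, n_sim=1000)
]
interval = percentile_interval(totals)
```

In the experiment described in the text, the summary computed for each perturbed catalogue would instead be the vector of refitted model parameters, with one interval per parameter.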
Figure 5. North China: residual processes for the best LSRM. Regionalizations A, B and C are indicated by circles, triangles and plus signs, respectively.
95 per cent confidence intervals for stationarity are at approximately ±3.6 (N = 12) to ±4.2 (N = 21) on the vertical axis. All the transformed interevent times
were indistinguishable from exponential mean one random variables, and no significant correlations were found. [Panels: Regions 1 to 4; axes: transformed time (horizontal), cumulative number of events (vertical).]
Figure 6. North China: fitted intensities for the best models (B and C) in Table 9. Solid vertical lines depict magnitudes of events in a given region for both
regionalizations, dashed vertical lines in regionalization B only, dotted in C only. The dotted intensity is the best model for regionalization B, and the dashed
the best model for C. [Panels: Regions 1 to 4; axes: time, 1500 to 2000 (horizontal), conditional intensity and magnitude (vertical).]
The question of magnitude cut-off is bound up with that of catalogue completeness and declustering. At the usual magnitude cutoffs of M c = 6.0 or 6.5, aftershocks can be fairly readily identified
in that different declustering algorithms are unlikely to result in
different output catalogues. In any case, the quantity used in the
C
2003 RAS, GJI, 154, 925–946
model is the equivalent magnitude calculated from the sum of the
stress drops in a sequence. This is required by the formulation of the
model, which as the intensity decreases after an event, does not otherwise allow for aftershocks. Here we will investigate the effects on
the fitted linked model of changing M c . Because a high-magnitude
938
M. Bebbington and D. Harte
Table 11. North China: Monte Carlo simulated magnitude errors. Shown are the original fitted parameter values, and 95 per cent confidence intervals
from the simulated catalogues. Approximately 16 per cent of the simulations for the unlinked +N (0, 0.52 ) model failed to have the fitting procedure
converge.
Unlinked model
Linked model
Magnitude perturbation
Fitted
N (0, 0.252 )
N (0, 0.52 )
Param.
Fitted
N (0, 0.252 )
N (0, 0.52 )
−4.728
−2.797
−3.940
−5.199
0.033
0.042
0.083
0.053
0.487
0.138
0.200
0.350
(−6.765, −3.848)
(−3.155, −2.502)
(−4.330, −3.338)
(−6.155, −4.673)
(0.006, 0.081)
(0.011, 0.104)
(0.034, 0.139)
(0.017, 0.170)
(0.388, 0.812)
(−0.005, 0.249)
(0.144, 0.336)
(0.189, 0.704)
(−6.564, −3.618)
(−3.721, −2.154)
(−4.401, −2.519)
(−6.332, −4.121)
(0.001, 0.079)
(0.004, 0.154)
(0.009, 0.143)
(0.007, 0.248)
(0.361, 1.991)
(−0.383, 0.478)
(0.122, 0.649)
(0.135, 1.451)
α1
α2
α3
α4
ν1
ν2
ν3
ν4
ρ
θ 12
θ 21
θ 43
−4.507
−2.889
−4.034
−4.807
0.017
0.048
0.088
0.061
0.204
−1.534
0.140
−0.633
(−7.610, −3.790)
(−4.251, −2.675)
(−4.398, −3.267)
(−6.107, −4.042)
(0.004, 0.076)
(0.012, 0.133)
(0.001, 0.123)
(0.016, 0.153)
(0.162, 0.597)
(−4.018, −0.071)
(−0.000, 0.802)
(−1.860, 1.003)
(−11.177, −3.512)
(−4.222, −2.367)
(−4.427, −2.714)
(−6.491, −3.681)
(0.001, 0.083)
(0.001, 0.164)
(0.000, 0.133)
(0.004, 0.185)
(0.135, 0.625)
(−12.454, −0.031)
(−0.265, 1.868)
(−4.802, 0.711)
Param.
α1
α2
α3
α4
ν1
ν2
ν3
ν4
ρ1
ρ2
ρ3
ρ4
Table 12. North China: effect of varying magnitude cut-off. Shown are the original parameter values used to simulate the catalogue, and the resulting fitted values. N is the number of events in the catalogue.

                    Magnitude cut-off M_c
                    6.0          6.2          6.5          6.7
Param.   Fitted     (N = 1126)   (N = 753)    (N = 632)    (N = 508)
α1       −4.507     −4.442       −4.444       −4.739       −4.857
α2       −2.889     −2.956       −3.340       −3.303       −3.267
α3       −4.034     −4.069       −4.571       −5.312       −5.784
α4       −4.807     −4.531       −5.070       −4.732       −4.710
ν1       0.017      0.017        0.014        0.013        0.012
ν2       0.048      0.042        0.041        0.038        0.032
ν3       0.088      0.089        0.090        0.081        0.078
ν4       0.061      0.059        0.057        0.053        0.044
ρ        0.204      0.204        0.194        0.188        0.176
θ12      −1.534     −1.538       −1.505       −1.551       −1.692
θ21      0.140      0.141        0.125        0.123        0.115
θ43      −0.633     −0.632       −0.675       −0.718       −0.808
cut-off is implicit in the model formulation, and thus we want to look at higher cut-offs. Because the data set is so small, we first simulate the model (see the appendix for details) to obtain a record of approximately 1000 events, and then check whether raising the cut-off retains a fitted model consistent with that simulated. The results are given in Table 12. We see that the basic character of the model is unaffected by the magnitude cut-off. The systematic decrease in {ν_i} and ρ is due to the decrease in the rate of events, and the decrease in the total stress release with fewer events, respectively. Differences in the fitted values of {θ_ij} are small, and do not affect the sign.
We have examined deleting each of the events from the historical catalogue and, as might be expected from the experiments with magnitude errors and region perturbation, the resulting fitted model is very different for a few very important events, and little different when other events are deleted. However, this is not a statistical experiment, and we have no way to randomly add events to the catalogue. Hence the same simulated catalogue was used to study the effects of catalogue completeness, by randomly deleting events to produce missing data. Events were deleted with probability 0.05 (M < 6.5), 0.02 (6.5 ≤ M < 7.0) and 0.01 (M ≥ 7.0), respectively. The results for 1000 realizations of the deletion and refitting process are given in Table 13. Again, the basic character of the model is unaffected. The decreases in {ν_i} and ρ follow from there being fewer events, and thus a smaller stress release.

Table 13. North China: effect of randomly deleting events (catalogue incompleteness). Shown are the original parameter values used to simulate the catalogue, and 95 per cent confidence intervals for the refitted parameters after deletion.

Param.   Fitted     Refitted
α1       −4.507     (−5.026, −3.424)
α2       −2.889     (−4.029, −2.199)
α3       −4.034     (−4.776, −3.292)
α4       −4.807     (−5.210, −3.406)
ν1       0.017      (0.008, 0.020)
ν2       0.048      (0.011, 0.053)
ν3       0.088      (0.011, 0.096)
ν4       0.061      (0.001, 0.065)
ρ        0.204      (0.196, 0.204)
θ12      −1.534     (−1.615, −1.430)
θ21      0.140      (0.121, 0.163)
θ43      −0.633     (−0.681, −0.482)
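The deletion step of this experiment can be sketched in a few lines. The sketch below is illustrative only: the catalogue layout (a list of (time, region, magnitude) tuples), the function names and the toy catalogue are assumptions made for the example, and only the three deletion probabilities are taken from the text.

```python
import random

def deletion_probability(magnitude):
    """Deletion probability by magnitude band, as quoted in the text."""
    if magnitude < 6.5:
        return 0.05
    if magnitude < 7.0:
        return 0.02
    return 0.01

def delete_events(catalogue, rng):
    """Return a copy of the catalogue with events removed at random.

    `catalogue` is a list of (time, region, magnitude) tuples; this layout
    is an assumption made for the example.
    """
    return [event for event in catalogue
            if rng.random() >= deletion_probability(event[2])]

# Hypothetical toy catalogue: 1000 events with made-up times and magnitudes.
rng = random.Random(1)
catalogue = [(float(i), rng.randrange(1, 5), 6.0 + rng.expovariate(2.0))
             for i in range(1000)]
thinned = delete_events(catalogue, rng)
```

Repeating `delete_events` with fresh random draws and refitting the model to each thinned catalogue would give one realization per repetition; the refitting step itself is not shown here.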
The question remains as to how sensitive the model is overall. The important parameters, from the viewpoint of hazard forecasting, are {ν_i} and {θ_ij}. The first gives the reduction in hazard following a local event and, in combination with the second, gives the enhancement or reduction in hazard following an event elsewhere. On this criterion, we find that the model can be very sensitive to slight perturbations in assigning events to regions (Table 10), and to magnitude errors in the catalogue. These are exacerbated by the small number of events in the data set, and the consequent domination of the fitted model by a few of them. In experiments with a larger simulated catalogue, the model does not appear to be as sensitive to magnitude cut-off or to random deletion of events.
4.3 Precision
On the presumption of correct data, simulation and refitting may also give some idea of the amount of data that might be required, or equivalently, the expected precision from a given amount of data. We will use the four-region, 12-parameter model for north China (Table 9, best model for regionalization C), and for comparison, the two-region, eight-parameter model for Taiwan (Bebbington & Harte 2001, see eq. A3). Simulating and refitting each model 1000 times for each of 25 (Taiwan only), 50, 100, 150 and 200 (north China only) events produced the results in Table 14. First, the {a_i} terms are effectively the initial stress level. This is very sensitive to the elapsed time to the first event, hence the large interquartile ranges. Fortunately, however, we are not usually interested in these parameters, although care should be exercised when using the model for forecasting. Most of our interest is concentrated on the interaction terms {c_ij, i ≠ j} and the tectonic input term c_ii (see the remark preceding eq. 10).

For Taiwan, Bebbington & Harte (2001) observed that the c_21 term (influence of Eurasian Plate earthquakes on Philippine Sea Plate earthquakes) is of doubtful significance, a pattern that is repeated here. The reverse interaction appears well founded even for N = 25, and so it is the {b_i} terms that need to be examined. We see that while b_1 is relatively consistent for small N, b_2 requires closer to N = 50 (see Bebbington & Harte 2001, for comments on the relationship between b_i and c_ij). Overall, a two-region, seven- or eight-parameter model requires of the order of 35–40 observations.

Similar considerations apply to the north China model, although {b_i} are much better behaved because of the restricted set of interactions {c_ij}. Here it is the {c_ij, i ≠ j} terms, in particular c_12 and c_21, that control matters. It appears that a four-region, 12-parameter model requires somewhere between 50 and 100 observations. The fact that we have only 65 observations explains the somewhat inconclusive models we fitted earlier.

Table 14. Parameter tightness by number of observations. N = 25 (Taiwan) and N = 50 (north China) did not converge in all cases.

Taiwan
                    Interquartile range
Parameter           N = 25    N = 50    N = 100   N = 150
a_1 = −1.447        1.451     1.068     0.827     0.712
a_2 = −2.271        1.812     1.081     0.728     0.606
b_1 = 0.545         0.461     0.289     0.180     0.136
b_2 = 0.134         0.212     0.107     0.065     0.051
c_11 = 0.678        0.153     0.103     0.079     0.061
c_12 = 0.250        0.056     0.023     0.014     0.010
c_21 = 0.211        0.689     0.421     0.249     0.202
c_22 = 0.323        0.177     0.086     0.041     0.328

North China
                    Interquartile range
Parameter           N = 50    N = 100   N = 150   N = 200
a_1 = −4.473        1.733     1.029     0.753     0.632
a_2 = −2.871        0.915     0.686     0.615     0.549
a_3 = −3.858        1.250     0.706     0.554     0.456
a_4 = −5.426        2.191     0.952     0.753     0.624
b_1 = 0.003         0.006     0.002     0.001     0.001
b_2 = 0.007         0.012     0.006     0.004     0.003
b_3 = 0.015         0.018     0.009     0.006     0.005
b_4 = 0.017         0.019     0.007     0.004     0.004
c_ii = 5.278        1.161     0.376     0.205     0.131
c_12 = −8.573       7.451     3.177     1.879     1.228
c_21 = 1.278        1.478     0.626     0.347     0.238
c_43 = −4.521       2.722     0.839     0.440     0.274
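The simulate-and-refit logic behind these interquartile ranges can be illustrated with a deliberately simpler stand-in. The sketch below replaces the linked stress release model with a homogeneous Poisson process, whose rate has a closed-form maximum-likelihood estimate, so that it stays self-contained; the rate of 0.5, the seed and all function names are assumptions made for the example, not values from the paper.

```python
import random
import statistics

def interquartile_range(values):
    """Difference between the upper and lower quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

def refitted_rates(true_rate, n_events, n_replicates, rng):
    """Simulate catalogues of n_events from a homogeneous Poisson process
    and refit the rate by maximum likelihood (n_events / elapsed time)."""
    estimates = []
    for _ in range(n_replicates):
        # The time of the n-th event of a Poisson process is Gamma distributed.
        elapsed = rng.gammavariate(n_events, 1.0 / true_rate)
        estimates.append(n_events / elapsed)
    return estimates

rng = random.Random(2003)
iqr_by_n = {n: interquartile_range(refitted_rates(0.5, n, 1000, rng))
            for n in (25, 50, 100, 150, 200)}
```

For the stand-in model the spread of the refitted rate shrinks as the number of events grows, which is the qualitative behaviour summarized in the interquartile ranges above; the actual magnitudes for the linked model depend on its parameter correlations and cannot be read off from this sketch.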
5 DISCUSSION
We have reviewed the definition and applications of the linked stress
release model. While it has enjoyed some success with Chinese,
Japanese and Persian data, subtle aspects of data from other regions
have made its widespread application difficult. In addition, there has
been no systematic procedure for applying the model, or for evaluating the results. This paper attempts to address these deficiencies,
and by investigating certain technical aspects of the model, shed
light on the sensitivity to data anomalies and quality.
The computational aspects of fitting the model are outlined, along
with certain ‘goodness-of-fit’ indicators that arise therein. These
tools of the Hessian, compensator, information gain and Monte
Carlo simulation provide valuable information concerning the significance of the results. They are illustrated at various points in the
paper using data from Persia and north China. In some cases, the
amount of data relative to the number of parameters is less important
than the ‘niceness’ of the data in facilitating numerical convergence.
Examination of previous studies and a small simulation exercise provide some feel for the amount of data desirable for models of various
complexity. It appears that the required number of events increases slightly faster than linearly with the number of fitted parameters. Models with more parameters have a more complex interaction matrix. Since the model may be overparametrized, denser matrices can contain a complex correlation structure between parameters that causes fitting problems, thus requiring more data to ensure convergence.
The question of whether the model works best with Benioff strain
(‘stress’) or seismic moment seems to have the same answer as
the accelerating moment release model: for some unknown reason,
stress seems to be the variable that works. The model appears, at
least for the small data sets used, to be quite sensitive to possible
errors in the observed magnitudes. This is not unexpected, as an
error of 0.5 in magnitude results in a factor of 5 error in Benioff
strain. The fact that the same magnitude error results in a factor of
30 error in seismic moment may contribute to the model working
better with Benioff strain as the underlying variable.
We provide a systematic procedure for applying the model, illustrated by data from Persia. This appears to show that apparent
alternation of activity between two of the regions may be related
to the activity in the Zagros region. The question of how to identify regions, and more importantly, the sensitivity of conclusions to
this regionalization, is examined with reference to north China. A
clustering algorithm, in other words having the data determine the
regions, is shown to provide a similar outcome to geophysical considerations, but the qualitative results are dependent on the regionalization, primarily through covalent events. Monte Carlo simulation
experiments with the north China model show that the model is very
sensitive to errors in the small catalogue, particularly in magnitude
and completeness. These effects decrease, naturally enough, with
larger simulated catalogues. There seems to be less dependence on
the details of the declustering algorithm, and the magnitude cut-off
used.
A defining characteristic of the linked stress release model is that
it incorporates the spatial dimension only through the regionalization. This does have an advantage in modelling spatial heterogeneity, for example dealing with faults. A more general spatio-temporal
model, such as a time–space marked point process, would have to
assume a spatially homogeneous spatial influence function. The result would then be very similar to the model of Kagan & Jackson
(2000). Moreover, the linked stress release model is more closely
akin to a non-parametric approach, in having a large parameter set
reduced to the statistically significant minimum by the data, rather
than building a constrained physical model and then determining
the parameter values. Rathbun (1996) has shown that maximum-likelihood estimators for a spatio-temporal point process are
consistent and asymptotically normal, and fitted a spatio-temporal
point process to California data. Rathbun (2002) has formulated a
form of spatio-temporal stress release model, but numerical instability has hindered its application.
As a result of the sensitivity of the model to details of the catalogue
and regionalization, with the small historical catalogues necessarily
used, the low values of the information gain are not unexpected. As
a forecasting tool, the lack of a true spatial dimension is a great
hindrance given the size of the regions used. Nevertheless, many
geophysical hypotheses are advanced at the level of regional interactions (see, for example, Ambraseys & Melville 1982; Thatcher
1984; Kanaori et al. 1993), and so are testable using the linked stress
release model. Models such as that of Papadimitriou et al. (2001)
can also be investigated in this manner.
Until more data become available, the best use of the linked stress
release model may be as a procedure for testing hypotheses on geophysical interactions, and as a diagnostic and validation tool for
other, more complicated, models. This purpose will be facilitated
by the examination of the statistical properties of the model contained in this paper.
ACKNOWLEDGMENTS
This work was supported by the Marsden Fund, administered by the
Royal Society of New Zealand. Valuable discussion was provided by
David Vere-Jones and Yehuda Ben-Zion. Reviews by Yan Kagan and
an anonymous referee greatly improved the content and organization
of the paper. The authors are grateful to the Institute for Mathematics
and its Applications, University of Minnesota, for its hospitality.
REFERENCES
Aalen, O.O. & Hoem, J.M., 1978. Random time changes for multivariate
counting processes, Scand. Actuarial J., 5, 81–101.
Akaike, H., 1977. On entropy maximization principle, in Applications of
Statistics, pp. 27–41, ed. Krishnaiah, P.R., North-Holland, Amsterdam.
Aki, K., 1989. Ideal probabilistic earthquake prediction, Tectonophysics,
169, 197–198.
Ambraseys, N.N. & Melville, C.P., 1982. A History of Persian Earthquakes,
Cambridge University Press, Cambridge, p. 219.
Bebbington, M., 1997. A hierarchical stress release model for synthetic
seismicity, J. geophys. Res., 102, 11 677–11 687.
Bebbington, M. & Harte, D., 2001. On the statistics of the linked stress
release process, J. Appl. Probab., 38A, 176–187.
Bebbington, M.S., Harte, D.S. & Vere-Jones, D., 1998. A linked stress release
model for spatial seismicity, EOS, Trans. Am. geophys. Un., 79, F643.
Benioff, H., 1951. Crustal strain characteristics derived from earthquake
sequences, Trans. AGU, 32, 508–514.
Ben-Zion, Y., 1996. Stress, slip and earthquakes in models of complex single-fault systems incorporating brittle and creep deformations, J. geophys. Res., 101, 5677–5706.
Borovkov, K. & Bebbington, M.S., 2003. A simple two-node stress transfer
model reproducing Omori’s law, Pure appl. Geophys., 160, 1429–1445.
Borovkov, K. & Vere-Jones, D., 2000. Explicit formulae for stationary distributions of stress release processes, J. Appl. Probab., 37, 315–321.
Daley, D.J. & Vere-Jones, D., 1988. An Introduction to the Theory of Point
Processes, Springer, Berlin, p. 702.
Dieterich, J., 1994. A constitutive law for rate of earthquake production and
its application to earthquake clustering, J. geophys. Res., 99, 2601–2618.
Eguchi, T., 1983. Tectonic stress fields in East Eurasia, Phys. Earth planet.
Inter., 33, 318–327.
Eneva, M. & Ben-Zion, Y., 1997a. Techniques and parameters to analyse
seismicity patterns associated with large earthquakes, J. geophys. Res.,
102, 17 785–17 795.
Eneva, M. & Ben-Zion, Y., 1997b. Application of pattern recognition techniques to earthquake catalogs generated by model of segmented fault
systems in three-dimensional elastic solids, J. geophys. Res., 102, 24 513–
24 528.
Field, E.H., Jackson, D.D. & Dolan, J.F., 1999. A mutually consistent seismic
hazard model for southern California, Bull. seism. Soc. Am., 89, 559–578.
Gardner, J.K. & Knopoff, L., 1974. Is the sequence of earthquakes in southern
California, with aftershocks removed, Poissonian?, Bull. seism. Soc. Am.,
64, 1363–1367.
Gu, G., ed., 1983a. Chinese Earthquake Catalogue, Part I: 1831BC-1969AD,
Scientific Press, Beijing.
Gu, G., ed., 1983b. Chinese Earthquake Catalogue, Part II: 1970–1979AD,
Seismological Press, Beijing.
Harris, R.A., 1998. Introduction to special section: stress triggers, stress
shadows, and implications for seismic hazard, J. geophys. Res., 103,
24 347–24 358.
Harris, R.A. & Simpson, R.W., 1996. In the shadow of 1857—the effect
of the great Ft. Tejon earthquake on subsequent earthquakes in southern
California, Geophys. Res. Lett., 23, 229–232.
Harte, D.S., 1999. Documentation for the Statistical Seismology Library. Research Report 98/10, revised edition, School of Mathematical and Computing Sciences, Victoria University of Wellington, New Zealand.
Harte, D.S., 2001. Multifractals: Theory and Applications, Chapman and
Hall/CRC, Boca Raton, p. 248.
Hill, D.P. et al., 1993. Seismicity remotely triggered by the magnitude
7.3 Landers, California, earthquake, Science, 260, 1617–1623.
Imoto, M., 2001. Application of the stress release model to the Nankai
earthquake sequence, southwest Japan, Tectonophysics, 338, 287–295.
Imoto, M., Maeda, K. & Yoshida, A., 1999. Use of statistical models to
analyze periodic seismicity observed for clusters in the Kanto region,
central Japan, Pure appl. Geophys., 155, 609–624.
Jaumé, S. & Bebbington, M.S., 2000. Accelerating seismic moment release
from modified stress release models, EOS, Trans. Am. geophys. Un., 48,
F582.
Jaumé, S.C. & Sykes, L.R., 1999. Evolving towards a critical point: a review
of accelerating seismic moment/energy release prior to large and great
earthquakes, Pure appl. Geophys., 155, 279–305.
Kagan, Y.Y., 1991. Seismic moment distribution, Geophys. J. Int., 106, 123–
134.
Kagan, Y.Y., 1997a. Seismic-moment frequency relationship for shallow
earthquakes; regional comparisons, J. geophys. Res., 102, 2835–2852.
Kagan, Y.Y., 1997b. Are earthquakes predictable?, Geophys. J. Int., 131, 505–525.
Kagan, Y.Y., 2002. Seismic moment distribution revisited: I. Statistical results, Geophys. J. Int., 148, 521–542.
Kagan, Y.Y. & Jackson, D.D., 1991. Long-term earthquake clustering, Geophys. J. Int., 104, 117–133.
Kagan, Y.Y. & Jackson, D.D., 2000. Probabilistic forecasting of earthquakes,
Geophys. J. Int., 143, 438–453.
Kagan, Y.Y. & Knopoff, L., 1977. Earthquake risk prediction as a stochastic
process, Phys. Earth planet. Inter., 14, 97–108.
Kanamori, H., 1972. Relations among tectonic stress, great earthquakes and
earthquake swarm, Tectonophysics, 14, 1–12.
Kanamori, H. & Anderson, D.L., 1975. Theoretical basis of some empirical
relations in seismology, Bull. seism. Soc. Am., 65, 1073–1095.
Kanaori, Y., Kawakami, S. & Yairi, K., 1993. Space–time correlations between inland earthquakes in central Japan and great offshore earthquakes
along the Nankai trough: implication for destructive earthquake prediction, Eng. Geol., 33, 289–303.
Kanaori, Y., Kawakami, S. & Yairi, K., 1994. Seismotectonics of the Median
Tectonic Line in southwest Japan: implications for coupling among major
fault systems, Pure appl. Geophys., 142, 589–607.
King, G.C.P., Stein, R.S. & Lin, J., 1994. Static stress changes and the triggering of earthquakes, Bull. seism. Soc. Am., 84, 935–953.
Kiremidjian, A. & Anagnos, T., 1984. Stochastic slip-predictable model for
earthquake occurrences, Bull. seism. Soc. Am., 74, 739–755.
Knopoff, L., 1971. A stochastic model for the occurrence of main sequence
events, Rev. Geophys. Space Phys., 9, 175–188.
Knopoff, L., 1996. A selective phenomenology of the seismicity of Southern
California, Proc. Natl. Acad. Sci. USA, 93, 3756–3763.
Kossobokov, V.G., 1997. User manual for M8, in Algorithms for Earthquake Statistics and Prediction, pp. 167–222, eds Healy, J.H., Keilis-Borok, V.I. & Lee, W.H.K., IASPEI, Menlo Park.
Kullback, S., 1997. Information Theory and Statistics, p. 399, Dover, New
York.
Li, C. & Kisslinger, C., 1985. Stress transfer and non-linear stress accumulation at subduction type plate boundaries—application to the Aleutians,
Pure appl. Geophys., 122, 813–830.
Li, F.Q. & Liu, G.X., 1986. Stress state in the upper crust of the China
mainland, J. Phys. Earth, 34, S71–S80.
Liu, J., Vere-Jones, D., Ma, L., Shi, Y. & Zhuang, J.C., 1998. The principle of coupled stress release model and its application, Acta Seismologica Sinica, 11, 273–281.
Liu, J., Chen, Y., Shi, Y. & Vere-Jones, D., 1999. Coupled stress release model
for time dependent seismicity, Pure appl. Geophys., 155, 649–667.
Lu, C. & Vere-Jones, D., 2000. Application of linked stress release model
to historical earthquake data: comparison between two kinds of tectonic
seismicity, Pure appl. Geophys., 157, 2351–2364.
Lu, C. & Vere-Jones, D., 2001. Statistical analysis of synthetic earthquake
catalogs generated by models with various levels of fault zone disorder,
J. geophys. Res., 106, 11 115–11 125.
Lu, C., Harte, D. & Bebbington, M., 1999a. A linked stress release model for
historical Japanese earthquakes: coupling among major seismic regions,
Earth Planets Space, 51, 907–916.
Lu, C., Vere-Jones, D. & Takayasu, H., 1999b. Avalanche behaviour and
statistical properties in a microcrack coalescence process, Phys. Rev. Lett.,
82, 347–350.
Lu, C., Vere-Jones, D., Takayasu, H., Tretyakov, A.Y. & Takayasu, M., 1999c.
Spatio-temporal seismicity in an elastic block lattice model, Fractals, 7,
301–311.
Lutz, K.A. & Kiremidjian, A.S., 1995. A stochastic model for spatially and
temporally dependent earthquakes, Bull. seism. Soc. Am., 85, 1177–1189.
Main, I.G., 1999. Applicability of time-to-failure analysis to accelerated
strain before earthquakes and volcanic eruptions, Geophys. J. Int., 139,
F1–F6.
Nalbant, S.S., Hubert, A. & King, G.C.P., 1998. Stress coupling between
earthquakes in northwest Turkey and the north Aegean sea, J. geophys.
Res., 103, 24 469–24 486.
Ogata, Y., 1981. On Lewis’s simulation method for point processes, IEEE
Trans. Inf. Theory, IT-27, 23–31.
Ogata, Y., 1988. Statistical models for earthquake occurrences and residual
analysis for point processes, J. Amer. Statist. Assoc., 83, 9–27.
Ogata, Y. & Vere-Jones, D., 1984. Inference for earthquake models: a self-correcting model, Stoch. Proc. Appl., 17, 337–347.
Papadimitriou, E.E., Papazachos, C.B. & Tsapanos, T.M., 2001. Test and application of the time- and magnitude-predictable model to the intermediate
and deep focus earthquakes in the subduction zones of the circum-Pacific
belt, Tectonophysics, 330, 45–68.
Press, W.H., Flannery, B.P., Teukolsky, S.A. & Vetterling, W.T., 1986. Numerical Recipes, p. 818, Cambridge University Press, Cambridge.
Pollitz, F.F. & Sacks, I.S., 1995. Consequences of stress changes following
the 1891 Nobi earthquake, Japan, Bull. seism. Soc. Am., 85, 796–807.
Pollitz, F.F. & Sacks, I.S., 1997. The 1995 Kobe, Japan, earthquake: a long-delayed aftershock of the offshore 1944 Tonankai and 1946 Nankaido earthquakes, Bull. seism. Soc. Am., 87, 1–10.
Rathbun, S.L., 1996. Asymptotic properties of the maximum likelihood estimator for spatio-temporal point processes, J. Statist. Plan. Infer., 51,
55–74.
Rathbun, S.L., 2002. A marked spatio-temporal point process model for California earthquakes, in Seminar Notes at the IMA Workshop on Point Process Modeling and Seismological Applications of Statistics, Minneapolis, Minnesota, 10–14 June 2002 (http://www.ima.umn.edul//talks/workshops/6-1014.2002/rathbun/Rathbun.pdf).
Reid, H.F., 1910. The mechanism of the earthquake, in The California Earthquake of April 18, 1906, Report of the State Earthquake Investigation Commission, Vol. 2, pp. 16–28, Carnegie Institute of Washington, Washington, DC.
Rényi, A., 1959. On the dimension and entropy of probability distributions, Acta Mathematica, 10, 193–215.
Rundle, J.B., 1988a. A physical model for earthquakes 1. Fluctuations and
interactions, J. geophys. Res., 93, 6237–6254.
Rundle, J.B., 1988b. A physical model for earthquakes 2. Application to
southern California, J. geophys. Res., 93, 6255–6274.
Schoenberg, F. & Bolt, B., 2000. Short-term exciting, long-term correcting
models for earthquake catalogs, Bull. seism. Soc. Am., 90, 849–858.
Seno, T., 1979. Pattern of intraplate seismicity in Southwest Japan before
and after great interplate earthquakes, Tectonophysics, 57, 267–283.
Shi, Y., Liu, J., Vere-Jones, D., Zhuang, J. & Ma, L., 1998. Application of
mechanical and statistical models to the study of seismicity of synthetic
earthquakes and the prediction of natural ones, Acta Seismologica Sinica,
11, 421–430.
Shimazaki, K., 1976. Intra-plate seismicity and inter-plate earthquakes: historical activity in southwest Japan, Tectonophysics, 33, 33–42.
Shimazaki, K. & Nakata, T., 1980. Time-predictable model for large earthquakes, Geophys. Res. Lett., 7, 279–282.
Sornette, D. & Sornette, A., 1999. General theory of the modified Gutenberg–Richter law for large seismic moments, Bull. seism. Soc. Am., 89, 1121–1130.
Stark, P.B., 1997. Earthquake prediction: the null hypothesis, Geophys. J.
Int., 131, 495–499.
Stein, R.S., Barka, A.A. & Dieterich, J.H., 1997. Progressive failure on the
North Anatolian fault since 1939 by earthquake stress triggering, Geophys.
J. Int., 128, 594–604.
Thatcher, W., 1984. The earthquake deformation cycle at the Nankai Trough,
J. geophys. Res., 89, 3087–3101.
Toda, S., Stein, R.S., Reasenberg, P.A., Dieterich, J.H. & Yoshida, A., 1998.
Stress transferred by the 1995 M W = 6.9 Kobe, Japan, shock: effect on
aftershocks and future earthquake probabilities, J. geophys. Res., 103,
24 543–24 565.
Utsu, T., 1999. Representation and analysis of the earthquake size distribution: a historical review and some new approaches, Pure appl. Geophys.,
155, 509–535.
Vere-Jones, D., 1978. Earthquake prediction—a statistician’s view, J. Phys.
Earth, 26, 129–146.
Vere-Jones, D., 1988. On the variance properties of stress-release models,
Austral. J. Statist., 30A, 123–135.
Vere-Jones, D., 1995. Forecasting earthquakes and earthquake risk, Int. J.
Forecasting, 11, 503–538.
Vere-Jones, D., 1998. Probabilities and information gain for earthquake forecasting, Comput. Seismol., 30, 248–263.
Vere-Jones, D. & Deng, Y.L., 1988. A point process analysis of historical
earthquakes from North China, Earthquake Res. China, 2, 165–181.
Vere-Jones, D. & Ogata, Y., 1984. On the moments of a self-correcting
process, J. Appl. Prob., 21, 335–342.
Vere-Jones, D., Robinson, R. & Yang, W., 2001. Remarks on the accelerated
moment release model: problems of model formulation, simulation and
estimation, Geophys. J. Int., 144, 517–531.
Walrand, J., 1988. An Introduction to Queueing Networks, p. 384, Prentice-Hall, Englewood Cliffs, NJ.
Wang, A., Vere-Jones, D. & Zheng, X., 1991. Simulation and estimation
procedures for stress release models, in Stochastic Processes and Their
Applications, Lecture Notes in Economics and Mathematical Systems, Vol.
370, pp. 11–27, eds Beckmann, M.J., Gopalan, M.N. & Subramanian, R.,
Springer, Berlin.
Ward, S.N., 1991. A synthetic seismicity model for the Middle America
Trench, J. geophys. Res., 96, 21 433–21 442.
Ward, S.N., 1996. A synthetic seismicity model for southern California:
cycles, probabilities, hazard, J. geophys. Res., 101, 22 393–22 418.
Working Group on California Earthquake Probabilities, 1995. Seismic hazards in southern California; probable earthquakes, 1994 to 2024, Bull.
seism. Soc. Am., 85, 379–439.
Zheng, X., 1991. Ergodic theorems for stress release processes, Stoch. Proc.
Appl., 38, 239–258.
Zheng, X. & Vere-Jones, D., 1991. Applications of stress release models to
earthquakes from North China, Pure appl. Geophys., 135, 559–576.
Zheng, X. & Vere-Jones, D., 1994. Further applications of the stochastic
stress release model to historical earthquake data, Tectonophysics, 229,
101–121.
Zhuang, J. & Ma, L., 1998. The stress release model and results from
modelling features of some seismic regions in China, Acta Seismologica Sinica, 11, 59–70.
APPENDIX A: TECHNICAL NOTES
A1 Numerical likelihood maximization
Assume that the process has occurred for an infinite time into the past and that events have occurred at t_k, where k = ..., −2, −1, 0, 1, 2, ..., n; and t_n is the time of the last observed event. It is usually the case, though, that the process has been observed over only part of this period, say in the interval [T_0, T_2], where t_0 < T_0 < t_1 < ... < t_n < T_2. This raises a potential problem with the conditional intensity function, which is conditional on the history of the process, i.e. λ(t) = λ(t|H_t). Since we have no recorded history before T_0, the calculated amount of stress released prior to T_0 is S(T_0) = 0. This is not a real problem in the case of the stress release model because the parameter α in eq. (9) describes the intensity at t = 0. In a more general situation where the model does not make this compensation, one possible strategy is to maximize the log-likelihood function over a smaller interval, say [T_1, T_2], i.e.
$$\log L(T_1, T_2) = \sum_{k:\, T_1 \le t_k \le T_2} \log \lambda(t_k \mid \mathcal{H}_{t_k}) - \int_{T_1}^{T_2} \lambda(t \mid \mathcal{H}_t)\, dt, \tag{A1}$$
where t_0 < T_0 < t_1 < ... < t_m < T_1 < t_{m+1} < ... < t_n < T_2. That is, the first m observed events are used as part of the history of the process (i.e. in calculating λ(t|H_t)) but only the last n − m events enter explicitly into the log-likelihood equation. This then gives the fitted process an initial settling-in period. In all models fitted in this paper, T_0 = T_1, i.e. m = 0.
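As an illustration of how (A1) might be evaluated, the sketch below treats a single region and assumes an intensity of the form λ(t) = exp{α + ν[ρ(t − T_0) − S(t)]}, where S(t) is the stress released by events observed before t. This form, the function names and the argument layout are assumptions made for the example (eq. 9 is not reproduced in this appendix). Between events S(t) is constant, so the integral in (A1) is available in closed form on each segment.

```python
import math

def _segment_integral(a, b, alpha, nu, rho, released, T0):
    """Integral of exp(alpha + nu*(rho*(t - T0) - released)) over [a, b]."""
    if b <= a:
        return 0.0
    level = math.exp(alpha - nu * released)
    slope = nu * rho
    if slope == 0.0:
        return level * (b - a)
    return level * (math.exp(slope * (b - T0)) - math.exp(slope * (a - T0))) / slope

def log_likelihood(times, stresses, alpha, nu, rho, T0, T1, T2):
    """Evaluate (A1) for a single region under the assumed intensity.

    times    : sorted event times with T0 < t_1 < ... < t_n < T2
    stresses : stress released by each event (same length as times)
    Events before T1 contribute to the history S(t) but not to the sum.
    """
    log_sum = 0.0
    integral = 0.0
    released = 0.0  # S(t): stress released by events observed before t
    left = T0
    for t_k, s_k in zip(times, stresses):
        integral += _segment_integral(max(left, T1), min(t_k, T2),
                                      alpha, nu, rho, released, T0)
        if T1 <= t_k <= T2:
            # log-intensity just before the event, i.e. excluding its own release
            log_sum += alpha + nu * (rho * (t_k - T0) - released)
        released += s_k
        left = t_k
    integral += _segment_integral(max(left, T1), T2, alpha, nu, rho, released, T0)
    return log_sum - integral
```

Setting T1 = T0 gives the m = 0 case; choosing T1 > T0 leaves the earlier events in the history while excluding them from the sum and the integral, which is the settling-in period described above.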
Note that the values of T_0, T_1 and T_2 do not coincide with event times t_k. The length of time between, for example, T_0 and t_1 and also between t_n and T_2 is part of the sample information. Hence perturbing these values will produce different parameter estimates, and is equivalent to perturbing one's sample in any other statistically sampled situation. The values of T_0, T_1 and T_2 should therefore be determined a priori and not with reference to the event times. In all models fitted in this paper, T_0 and T_2 are set to the beginning and end of the first and last calendar years for which there are data, including null data, respectively. The parameters {α_j} then subsume the unknown time from t_1 to the previous event.
Sometimes individual data sets contain little information concerning certain model parameters. For example, parameters that are
dependent on the characteristics of the seismic network, rather than
the geophysical properties of the region under study, may be better
determined using the collective information of many studies or specific knowledge of the capabilities of the seismic network. In these
situations, a possibility is to maximize the posterior log-likelihood
$$\log Q(T_1, T_2) = \log L(T_1, T_2) + \sum_j \log f_j(\theta_j), \tag{A2}$$
where f_j(θ_j) is a prior density for the parameter θ_j.
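A minimal sketch of (A2) follows. It assumes that the log-likelihood value has already been computed elsewhere, and uses a normal prior purely as an example; the parameter names, the choice of prior and the convention that parameters without an entry have a flat prior are all assumptions made for the illustration.

```python
import math

def normal_log_density(x, mean, sd):
    """Log of the normal density, used here as an example prior."""
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - 0.5 * ((x - mean) / sd) ** 2

def log_posterior(log_likelihood_value, params, priors):
    """(A2): log L plus the sum of log prior densities.

    priors maps a parameter name to a function returning log f_j(theta_j);
    parameters without an entry are treated as having a flat prior.
    """
    return log_likelihood_value + sum(
        log_prior(params[name]) for name, log_prior in priors.items())

# Hypothetical usage: an informative prior on one parameter only.
example_priors = {"nu": lambda x: normal_log_density(x, 0.05, 0.02)}
example_value = log_posterior(-10.0, {"nu": 0.05, "rho": 0.2}, example_priors)
```

Maximizing this quantity instead of the log-likelihood pulls poorly determined parameters towards the prior while leaving well-determined ones essentially unchanged.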
Fitting the model by maximizing the log-likelihood presents many
computational difficulties. Some algorithms are good at determining
the rough location of the maximum, but converge very slowly in
close proximity to the maximum. Other algorithms that are based
on hill climbing in the steepest direction appear to work when the
initial starting location is sufficiently close to the solution, but can
become hopelessly lost if the initial values are too far away.
The relative scales in the parameter space can also cause considerable problems. For example, one parameter may have a possible
domain that spans a number of orders of magnitude more than a
second parameter. Thus, a step taken by the optimizer that is scaled
in a manner to be appropriate for the first parameter may completely
overshoot the maximum because of the finer scale in the second parameter. Most optimizers have an argument so that relative scales
in the parameter space can be specified. A good knowledge of the function being optimized, in this respect, and an understanding of the manner in which the optimizer scales the parameter space are critical in order to calculate the maximum-likelihood parameter estimates without relying too much on trial and error. Furthermore,
the calculation of numerical derivatives for the determination of the
steepest ascent is fraught with many problems, particularly machine
precision, and this is further exacerbated when different parameters
are on very different scales. In the stress-release model, the parameters have widely differing scales, a problem further magnified by
using, for example, the seismic moment (energy) as the variable
rather than the stress (cf. eq. 2).
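The effect of parameter scaling can be seen on a toy problem. In the sketch below the objective, the scale factors and the very simple gradient-ascent routine are all hypothetical, chosen only to show the idea of letting the optimizer work in rescaled variables z = x / scale; it is not the optimizer used for the fits in this paper.

```python
def rescale(objective, scales):
    """Wrap objective(x) so that an optimizer can work with z = x / scale."""
    def scaled_objective(z):
        return objective([z_i * s_i for z_i, s_i in zip(z, scales)])
    return scaled_objective

def gradient_ascent(objective, start, step=0.25, iterations=200, h=1e-6):
    """Plain gradient ascent with central-difference derivatives."""
    z = list(start)
    for _ in range(iterations):
        grad = []
        for i in range(len(z)):
            up = z[:i] + [z[i] + h] + z[i + 1:]
            down = z[:i] + [z[i] - h] + z[i + 1:]
            grad.append((objective(up) - objective(down)) / (2.0 * h))
        z = [z_i + step * g_i for z_i, g_i in zip(z, grad)]
    return z

# Hypothetical objective whose two parameters differ in scale by a factor of 1000.
def toy_log_likelihood(x):
    return -((x[0] - 3.0) ** 2) - ((x[1] - 0.002) / 0.001) ** 2

scales = [1.0, 0.001]
z_hat = gradient_ascent(rescale(toy_log_likelihood, scales), start=[0.0, 0.0])
x_hat = [z_i * s_i for z_i, s_i in zip(z_hat, scales)]
```

In the rescaled variables both directions have the same curvature, so a single step length suits both. In the original variables the curvature in the second parameter is larger by a factor of 10^6, and the same fixed step would overshoot the maximum in that direction.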
Many optimizers use a quadratic approximation to determine the
step length and direction, and therefore internally calculate the Hessian (Davidon–Fletcher–Powell) of the log-likelihood. This estimate of the Hessian, if satisfactory, is very useful because its inverse
provides an estimate of the covariance matrix of the fitted parameters (see, for example, Press et al. 1986, pp. 510–515), and hence
rudimentary confidence intervals. However, there are a number of
potential problems. The estimate of the Hessian typically starts with
the identity matrix, which is modified at each iteration. Thus, if the
process converges quickly, the calculated Hessian may be a poor
estimate. Furthermore, the calculation of the Hessian by way of the
second derivatives is fraught with all of the same difficulties as when
one calculates the first derivatives as described above, but probably
more so!
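As a minimal sketch of how a Hessian yields rudimentary standard errors, the following assumes a simple two-parameter point process with log-linear intensity exp(a + bt) on (0, T), which is a hypothetical stand-in and not the linked stress release model. The event times, the parameter value (which in practice would be the maximum-likelihood estimate) and the finite-difference steps are all illustrative assumptions.

```python
import math

def loglik(theta, times, t_end):
    """Log-likelihood for intensity exp(a + b*t) observed on (0, t_end)."""
    a, b = theta
    return sum(a + b * t for t in times) - math.exp(a) * math.expm1(b * t_end) / b

def numerical_hessian(f, x, steps):
    """Second derivatives by central differences, one step per parameter."""
    n = len(x)
    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            def shifted(si, sj):
                y = list(x)
                y[i] += si * steps[i]
                y[j] += sj * steps[j]
                return f(y)
            second = shifted(1, 1) - shifted(1, -1) - shifted(-1, 1) + shifted(-1, -1)
            hess[i][j] = hess[j][i] = second / (4.0 * steps[i] * steps[j])
    return hess

def covariance_2x2(hess):
    """Inverse of the negative Hessian of a two-parameter log-likelihood."""
    (p, q), (_, r) = hess
    det = p * r - q * q
    return [[-r / det, q / det], [q / det, -p / det]]

times = [0.7, 2.9, 3.4, 6.1, 8.8]   # illustrative event times
t_end = 10.0
theta = [-1.0, 0.05]                # illustrative value, assumed to be the estimate

hess = numerical_hessian(lambda th: loglik(th, times, t_end), theta, [1e-4, 1e-5])
cov = covariance_2x2(hess)
std_err = [math.sqrt(cov[0][0]), math.sqrt(cov[1][1])]
corr = cov[0][1] / (std_err[0] * std_err[1])
```

The off-diagonal element gives the correlation between the two estimates; a value close to plus or minus one signals the kind of parameter trade-off discussed next.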
The formulation (10) indicates correlation between the parameters is very likely. We will illustrate this, and the shape of the
likelihood surface, using the linked stress release model (a, b, C)
fitted to data from Taiwan (Bebbington & Harte 2001), where
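\[
a = \begin{pmatrix} -1.45 \\ -2.27 \end{pmatrix}, \quad
b = \begin{pmatrix} 0.55 \\ 0.13 \end{pmatrix}, \quad
C = \begin{pmatrix} 0.68 & 0.25 \\ 0.21 & 0.32 \end{pmatrix}.
\tag{A3}
\]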
We see that this indicates inhibitory behaviour, events in one region
lowering the intensity in the other region.
Examining Fig. A1 we see that the likelihood surface is well
behaved for the transfer terms c12 and c21 , indicating no tendency
for compensating effects to allow arbitrary amounts of interaction.
Bebbington & Harte (2001) also indicate some correlation between
b2 and c21 , the input terms for region 2. Fig. A2 clearly shows
this effect. This correlation, together with the ‘flatness’ of the likelihood surface
evident in both figures, is often a principal cause of instability in
the fitting procedure.
Figure A1. Likelihood contours for transfer terms (Taiwan data).
Figure A2. Likelihood contours for region 2 inputs (Taiwan data).
A2 Simulation
There are two aspects to simulating the process: event time and event
magnitude. Generally, determining the time to the next occurrence
will follow the thinning method (see Ogata 1981, or Wang et al. 1991
for an account) where points in a dominating (with a higher intensity) Poisson process are simulated and then successively accepted
or rejected with suitable probability. Apart from the slight complication of having to restart due to an event in another region in the linked
case, matters are quite straightforward. Assigning a magnitude to
an event is more complicated. The distribution of the stress released
by an event is usually assumed to be independent of the stress level
itself. While there may be a weak dependence, the resulting improvement in fit does not justify its inclusion in the model (Zheng
& Vere-Jones 1991). Although the Gutenberg–Richter power-law
decay holds fairly consistently for small to medium magnitudes, it
is precisely at the large magnitudes we are interested in where we
see deviation. It may be possible to use a tapered Pareto distribution
(Vere-Jones et al. 2001; see also Sornette & Sornette 1999; Utsu
1999) or a left-truncated Gamma distribution with negative shape
parameter (Kagan 1991), but this requires estimating further parameters. Instead it makes sense to eliminate the magnitude distribution
as a source of error, and to use the empirical magnitude distributions
from the original catalogue. Since there often seem to be significant
differences in the magnitude distribution by region, particularly between intraplate and interplate events (see, for example, Kanaori
et al. 1993), this must also be allowed for. Lu & Vere-Jones (2001)
demonstrate that the magnitude distribution is probably the most
critical element in producing qualitative and quantitative agreement
with historical data. Use of the empirical magnitude distribution is
particularly apt when the objective is to isolate the factors involved
in the fitting of the model, and thereby to evaluate their significance
(Bebbington & Harte 2001).
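The two steps can be sketched for a single region as follows. This is a minimal illustration under stated assumptions, not the procedure used for the results in this paper: the intensity is taken as exp(a + b[t − cS(t)]), the stress released by an event of magnitude M is taken as 10^{0.75(M − M0)}, and the parameter values and the short magnitude list are hypothetical. The linked case would additionally restart the search whenever another region produces an event.

```python
import math
import random

def simulate_simple_srm(a, b, c, t_end, catalogue_mags, m0=6.0, window=1.0, seed=1):
    """Thinning simulation of a one-region stress release process on [0, t_end)."""
    rng = random.Random(seed)
    t, released, events = 0.0, 0.0, []
    while t < t_end:
        # Between events the intensity increases with t (b > 0), so its value
        # at the end of a look-ahead window dominates it over that window.
        horizon = min(t + window, t_end)
        bound = math.exp(a + b * (horizon - c * released))
        candidate = t + rng.expovariate(bound)
        if candidate >= horizon:
            t = horizon          # no candidate point in this window
            continue
        t = candidate
        intensity = math.exp(a + b * (t - c * released))
        if rng.random() * bound < intensity:
            # Accepted: draw the magnitude from the empirical distribution.
            mag = rng.choice(catalogue_mags)
            released += 10.0 ** (0.75 * (mag - m0))
            events.append((t, mag))
    return events

# Hypothetical magnitudes standing in for an observed catalogue.
observed_mags = [6.0, 6.2, 6.5, 7.0, 6.1]
events = simulate_simple_srm(-2.0, 0.5, 1.0, 50.0, observed_mags)
```

Resampling the observed magnitudes avoids fitting a parametric magnitude distribution; a separate list per region would allow for regional differences.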
A3 Calculation of the information gain
Information theory arises in connection with the transmission of
information, in particular, the length of a binary representation of
that information. Say a set E has N elements. If log2 N ∈ Z+ ,
each element can be labelled by a binary number having log2 N
digits. As such, Hartley defined log2 N as the necessary information
to characterize E (see, for example, Kullback 1997). We can use
this theory to describe the information generated concerning the
probability distribution of an observed point process.
To start this discussion, assume that E = E_1 + E_2 + ... + E_b,
where E_1, ..., E_b are pairwise disjoint finite sets. Experiments
are performed that consist of independently and randomly allocating elements to the b subsets (E_j, j = 1, ..., b) according to the
probabilities p_ij, where i denotes the experiment number. Let Y_ij
be a random variable that is one if, in the ith experiment, set E_j is
allocated at least one element, and zero otherwise. We denote the
particular experimental outcome as y_ij. Then the information generated concerning the probability distribution P_i = (p_i1, ..., p_ib)
by the ith experiment is
\[
-\sum_{j=1}^{b} y_{ij} \log_2 p_{ij}.
\]
The expected amount of information generated by an experiment,
concerning the probability distribution P_i = (p_i1, ..., p_ib), is
\[
-\sum_{j=1}^{b} p_{ij} \log_2 p_{ij}.
\]
This is known as Shannon’s formula (see, for example, Kullback
1997).
If the joint probabilities of a set of Y_ij can be written as a chain
of probabilities of each Y_ij conditioned by those going before it in
time (i.e. smaller i), then the total amount of information generated
by a sequence of n experiments is
\[
-\sum_{i=1}^{n} \sum_{j=1}^{b} y_{ij} \log_2 p_{ij}.
\]
Then Shannon’s formula, being the expected value of the above
estimate, generalizes to
\[
-\sum_{i=1}^{n} \sum_{j=1}^{b} p_{ij} \log_2 p_{ij}.
\]
Note that we simply use natural logarithms in our calculations and
discussions below.
We use this in the point process context as follows. Assume that
the observed time period (T_1, T_2) is divided into n bins each of
width δ. Each bin is an ‘experiment’ in the above terminology. Furthermore, each experiment has only two possible outcomes (i.e. b
= 2), hence we will drop the index j since the two probabilities in
the ith experiment can be written as p_i and 1 − p_i.
Denote the bin boundaries as τ_0, τ_1, ..., τ_n, where τ_0 = T_1 and
τ_n = T_2. Let ∅_[τ_i, t) denote the outcome that no events occur in the
interval [τ_i, t). Let Y_i be one if one or more events occur in the ith
bin, [τ_{i−1}, τ_i), and zero otherwise, then
\[
p_i = \Pr\{Y_i = 1\}
\approx 1 - \exp\left[ -\int_{\tau_{i-1}}^{\tau_i}
\lambda\left(t \mid \mathcal{H}_{\tau_{i-1}} \cap \emptyset_{[\tau_{i-1}, t)}, \theta\right) dt \right],
\tag{A4}
\]
where θ is a vector of parameters peculiar to the conditional intensity function λ. Hence the information generated concerning the
probability distribution of the point process during the observation
period (T_1, T_2) is
\[
-\sum_{i=1}^{n} \left[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \right].
\]
Apart from the sign, this is simply the log-likelihood function of the binomial distribution; the log-likelihood is therefore referred to as the binomial score and is denoted by
B.
The relationship between the binomial score and the point process
log-likelihood can be derived in much the same way that one derives
the Poisson distribution as the limit of an infinite number of Bernoulli trials.
Assume that the bin width δ is sufficiently small so that the number
of events per bin is no greater than one. Denote the bin that contains
the kth event as I_k. That is, in what follows, the summation over k
is only over those bins that contain an event, the time of which is t_k;
but the summation over i is over all bins. Then the binomial score
can be rewritten as
\[
\begin{aligned}
B &= \sum_{k=1}^{N} \log \frac{p_k}{1 - p_k} + \sum_{i=1}^{n} \log(1 - p_i) \\
&\approx \sum_{k=1}^{N} \log\left[ \exp\left( \int_{I_k} \lambda(t \mid \mathcal{H}_t)\,dt \right) - 1 \right]
 - \int_{T_1}^{T_2} \lambda(t \mid \mathcal{H}_t)\,dt \\
&\approx \sum_{k=1}^{N} \log\left[ \int_{I_k} \lambda(t \mid \mathcal{H}_t)\,dt \right]
 - \int_{T_1}^{T_2} \lambda(t \mid \mathcal{H}_t)\,dt \\
&\approx \sum_{k=1}^{N} \log\left[ \delta \lambda(t_k \mid \mathcal{H}_{t_k}) \right]
 - \int_{T_1}^{T_2} \lambda(t \mid \mathcal{H}_t)\,dt \\
&= N \log \delta + \sum_{k=1}^{N} \log \lambda(t_k \mid \mathcal{H}_{t_k})
 - \int_{T_1}^{T_2} \lambda(t \mid \mathcal{H}_t)\,dt.
\end{aligned}
\tag{A5}
\]
It can be shown that as δ becomes very small (i.e. n very large),
\[
B = N \log \delta + \sum_{k=1}^{N} \log \lambda(t_k \mid \mathcal{H}_{t_k})
 - \int_{T_1}^{T_2} \lambda(t \mid \mathcal{H}_t)\,dt + \epsilon(\delta),
\tag{A6}
\]
where ε(δ) → 0 as δ → 0. Rényi (1959) would describe this information increase, as δ → 0, as the point process distribution having
a ‘dimension’ of N (see Harte 2001).
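The limiting relationship can be checked numerically in the simplest case. The sketch below assumes a constant intensity (a homogeneous Poisson model), for which the bin probabilities are available in closed form; the event times and rate are hypothetical. It computes the binomial score for successively finer bins and compares B − N log δ with the point process log-likelihood.

```python
import math

def binomial_score_constant(rate, t1, t2, event_times, n_bins):
    """Binomial score for a constant-intensity model with n_bins equal bins."""
    delta = (t2 - t1) / n_bins
    p = -math.expm1(-rate * delta)          # probability of at least one event per bin
    occupied = {int((t - t1) / delta) for t in event_times}
    n_occ = len(occupied)
    score = n_occ * math.log(p) + (n_bins - n_occ) * math.log1p(-p)
    return score, delta

def poisson_loglik(rate, t1, t2, n_events):
    """Point process log-likelihood for a constant intensity."""
    return n_events * math.log(rate) - rate * (t2 - t1)

event_times = [0.7, 2.9, 3.4, 6.1, 8.8]     # hypothetical event times
rate, t1, t2 = 0.5, 0.0, 10.0
target = poisson_loglik(rate, t1, t2, len(event_times))

gaps = []
for n_bins in (10**2, 10**4, 10**6):
    score, delta = binomial_score_constant(rate, t1, t2, event_times, n_bins)
    # Remainder term: what is left after removing N log(delta) and the log-likelihood.
    gaps.append(abs(score - len(event_times) * math.log(delta) - target))
```

The entries of `gaps` shrink roughly in proportion to δ, consistent with a remainder term that vanishes as δ → 0.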
Now denote the constant intensity of a ‘null’ model as λ̄, and the
corresponding binomial score as B̄. Then the information gain is
B − B̄. Since the term N log δ cancels out, B − B̄ is simply the
log-likelihood difference. The expected information gain for a point
process is bounded by
\[
\mathrm{E}\left\{ \lambda(t \mid \mathcal{H}_t) \log \frac{\lambda(t \mid \mathcal{H}_t)}{\bar{\lambda}}
 - \left[ \lambda(t \mid \mathcal{H}_t) - \bar{\lambda} \right] \right\},
\]
where the ‘random’ part within the expectation is the history of the
process up to time t, i.e. H_t.
The information gain therefore describes the ‘predictability’ of
the process relative to the null process. If a process generates a higher
information gain, then it should be inherently more predictable than
a process that generates little information. Typically, one would express the information gain as information gain per unit time, i.e.
\[
\frac{B - \bar{B}}{T_2 - T_1},
\]
or per event, i.e. (B − B̄)/N. Note that all models will be dependent
on the particular time units used, and hence for comparative purposes
these units should be the same.
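A minimal sketch of the calculation follows, assuming a simple one-region stress release intensity exp(a + b[t − cS(t)]), for which the integral between events has a closed form. The event times, stress drops, parameter values and null rate are hypothetical, and the function names are illustrative only.

```python
import math

def simple_srm_loglik(a, b, c, times, stresses, t1, t2):
    """Log-likelihood for intensity exp(a + b*(t - c*S(t))), where S(t) is the
    stress released by events strictly before time t."""
    loglik, released, start = 0.0, 0.0, t1
    for t, s in zip(times, stresses):
        level = a - b * c * released
        loglik += level + b * t            # log intensity just before the event
        loglik -= math.exp(level) * (math.expm1(b * t) - math.expm1(b * start)) / b
        released += s
        start = t
    level = a - b * c * released
    loglik -= math.exp(level) * (math.expm1(b * t2) - math.expm1(b * start)) / b
    return loglik

def information_gain(loglik_model, n_events, t1, t2, null_rate):
    """Log-likelihood difference from a constant-rate null model,
    in total, per unit time and per event."""
    loglik_null = n_events * math.log(null_rate) - null_rate * (t2 - t1)
    gain = loglik_model - loglik_null
    return gain, gain / (t2 - t1), gain / n_events

times = [0.7, 2.9, 3.4, 6.1, 8.8]          # hypothetical event times
stresses = [1.0, 2.4, 1.2, 5.6, 1.4]       # hypothetical stress drops
t1, t2 = 0.0, 10.0

model_ll = simple_srm_loglik(-2.0, 0.5, 1.0, times, stresses, t1, t2)
gain, gain_per_time, gain_per_event = information_gain(model_ll, len(times), t1, t2, 0.5)

# Sanity check: with b close to zero and no stress feedback the model is
# effectively the null model, so the gain should vanish.
flat_ll = simple_srm_loglik(math.log(0.5), 1e-9, 0.0, times, stresses, t1, t2)
flat_gain, _, _ = information_gain(flat_ll, len(times), t1, t2, 0.5)
```

Because the N log δ term cancels, no binning is needed; only the two log-likelihoods enter the calculation.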
Since the information gain is simply the difference in the log-likelihood between a null and ‘fitted’ model, it also has the
usual goodness-of-fit interpretations when applied to observed historical data. If the distribution of the information gain is unknown,
then many realizations can be simulated of the same length as that of
the observed series. For each simulated series, the information gain
is calculated using the fitted model after re-estimating the parameter values. Thus we form an empirical probability distribution of
the information gains. One then compares the information gain calculated using the observed data with the empirical distribution, and
rejects the hypothesis that the data are sampled from such a process
if the information gain lies in the tails of the empirical distribution.
The null model for the linked stress release process can be defined
in several natural ways. Assuming it to be a multidimensional Poisson process, that is, an independent Poisson process for each region,
it remains to determine the rates of these Poisson processes. We will
follow Vere-Jones (1998) in calculating the rate as the stress input
rate divided by the average stress release, λ̄_i = υ_i / E{J_i}, where J_i
is a random variable for the stress drop distribution in the ith region.
Since stress is transferred between the regions, the stress ‘input’
rates {υ_i} are found as the solutions to the simultaneous equations
\[
\upsilon_i = \rho_i - \sum_{j \neq i} \theta_{ij} \upsilon_j,
\tag{A7}
\]
for all i. These are known as flow conservation equations in a Jackson network (Walrand 1988), and function similarly here as a
consistency condition. For example, the values for regionalization
A in the north China model are υ_1 = 0.426, υ_2 = 0.144, υ_3 = 0.204
and υ_4 = 0.333. Using the empirical jump distribution to estimate
E{J_i}, these produce λ̄_1 = 0.0377, λ̄_2 = 0.0189, λ̄_3 = 0.0378 and
λ̄_4 = 0.0282 (per year). The other regionalizations produce similar, but slightly different, values because of the different estimated
parameters ρ and θ.
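The flow conservation equations are linear in the input rates and can be solved directly. The sketch below is illustrative only: the two-region values of ρ and θ and the mean stress drops are hypothetical, and the small Gauss–Jordan solver is a convenience assumed here rather than part of the original procedure. Only the four υ and λ̄ values at the end are taken from the text above.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def stress_input_rates(rho, theta):
    """Solve v_i + sum_{j != i} theta[i][j] * v_j = rho_i for all i (eq. A7)."""
    n = len(rho)
    system = [[1.0 if i == j else theta[i][j] for j in range(n)] for i in range(n)]
    return solve_linear(system, rho)

def null_rates(input_rates, mean_stress_drop):
    """Poisson null rate per region: input rate over mean stress release."""
    return [v / m for v, m in zip(input_rates, mean_stress_drop)]

# Hypothetical two-region parameters (not fitted values).
rho = [0.5, 0.3]
theta = [[1.0, 0.2], [0.4, 1.0]]
v = stress_input_rates(rho, theta)
rates = null_rates(v, [10.0, 8.0])   # hypothetical mean stress drops

# Mean stress drops implied by the north China values quoted in the text.
paper_v = [0.426, 0.144, 0.204, 0.333]
paper_rate = [0.0377, 0.0189, 0.0378, 0.0282]
implied_mean_drop = [vi / li for vi, li in zip(paper_v, paper_rate)]
```

Substituting the solution back into (A7) provides a direct check that the consistency condition holds.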
APPENDIX B: HISTORICAL DATA SETS
Table B1. Persian earthquake data 1780–1994. Reprinted from Zheng, X. and Vere-Jones, D., Further applications of the stochastic stress release
model to historical earthquake data, Tectonophysics, 229, 101–121, Copyright (1994), with permission from Elsevier.
Date        Lat.   Long.  Mag.  Reg.
1780.01.08  38.20  46.00  7.70  A
1780.-.-                  6.50  E
1786.10.-   38.30  45.60  6.30  A
1808.06.26  35.30  54.50  6.60  N
1809.-.-    36.30  52.50  6.50  N
1810.-.-    38.00  57.20  6.50  K
1824.06.25  29.80  52.40  6.40  Z
1825.-.-    36.10  52.60  6.70  N
1830.03.27  35.70  52.40  7.10  N
1833.-.-    37.30  58.10  6.20  E
1834.-.-    39.70  43.70  6.00  A
1838.-.-    29.60  59.90  7.00  E
1840.07.02  39.50  43.90  7.40  A
1844.05.12  33.60  51.40  6.40  Z
1844.05.13  37.40  48.00  6.90  N
1851.04.02  40.00  47.30  6.20  N
1851.06.-   36.80  58.40  6.90  E
1853.05.05  29.60  52.50  6.20  Z
1861.05.24  39.40  47.50  6.00  A
1862.12.19  39.30  47.50  6.00  A
1862.12.21  29.50  52.50  6.20  Z
1863.12.30  38.20  48.60  6.10  N
1864.01.17  30.60  57.00  6.00  C
1864.12.07  33.30  45.90  6.40  Z
1865.06.-   29.60  53.10  6.00  Z
1868.03.18  39.60  47.60  6.00  A
1868.08.01  34.90  52.50  6.40  N
1871.12.23  37.40  58.40  7.20  K
1872.06.03  34.70  47.70  6.10  Z
1875.05.-   31.20  56.30  6.00  C
1879.03.22  37.80  47.90  6.70  N
1883.05.03  37.90  47.20  6.20  N
1890.03.25  28.80  53.50  6.40  Z
1890.07.11  36.60  54.60  7.20  N
1893.11.17  37.00  58.40  7.10  K
1895.01.17  37.10  58.40  6.80  E
1896.01.04  37.80  48.40  6.70  N
1897.01.10  26.90  56.00  6.40  Z
1902.03.09  27.08  56.34  6.40  Z
1903.03.22  33.16  59.71  6.20  E
1904.11.09  36.94  59.77  6.40  E
1905.01.09  37.00  48.68  6.20  N
1905.06.19  29.89  59.98  6.90  E
1908.09.28  38.00  44.00  6.00  A
1909.01.23  33.41  49.13  7.40  Z
1911.04.18  31.23  57.03  6.20  C
1923.09.17  37.63  57.21  6.30  K
1923.09.22  29.51  56.63  6.70  C
1927.07.22  34.90  52.90  6.30  N
1929.05.01  37.73  57.81  7.30  K
1929.07.15  32.08  49.48  6.00  Z
1930.05.06  38.24  44.60  7.20  A
1930.08.23  27.88  55.02  6.10  Z
1931.04.27  39.48  46.09  6.40  A
1933.10.05  34.42  57.07  6.00  C
1933.11.28  32.01  55.94  6.20  C
1934.02.04  30.54  51.64  6.30  Z
1934.06.13  27.63  62.64  6.60  M
1935.04.11  36.36  53.32  6.30  N
1936.06.30  33.68  60.05  6.00  E
1940.05.04  35.76  58.53  6.40  E
1941.02.06  33.41  58.87  6.10  E
1945.11.27  25.02  63.47  8.00  M
1947.08.05  25.25  63.20  7.00  M
1947.09.23  33.67  58.67  6.80  E
1948.07.05  29.88  57.53  6.00  C
1948.10.05  37.88  58.55  7.20  E
1949.04.24  27.28  56.46  6.30  Z
1953.02.12  35.39  54.88  6.50  N
1956.10.31  27.27  54.55  6.30  Z
1957.07.02  36.07  52.47  6.80  N
1957.12.13  34.58  47.82  6.86  Z
1961.06.11  27.78  54.51  6.50  Z
1962.09.01  35.71  49.81  7.20  N
1964.12.22  28.12  56.80  6.10  Z
1968.08.31  34.02  58.96  7.41  E
1969.11.07  27.42  60.40  6.40  M
1970.07.30  37.67  55.89  6.60  N
1972.04.10  28.38  52.98  6.90  Z
1975.03.07  27.47  56.44  6.10  Z
1976.11.07  33.82  59.19  6.40  E
1976.11.24  39.12  43.92  7.30  A
1977.03.21  27.59  56.45  6.90  Z
1977.04.06  31.90  50.76  6.10  Z
1978.09.16  33.40  57.12  7.30  C
1978.12.04  37.67  48.90  6.00  N
1979.12.14  32.14  49.65  6.10  Z
1979.01.10  26.52  61.01  6.00  M
1979.01.16  33.80  59.50  7.22  E
1980.05.04  38.05  48.99  6.20  N
1981.07.28  30.01  57.79  7.10  E
1989.03.05  29.95  51.68  6.20  Z
1990.06.20  36.96  49.41  7.70  N
1990.11.06  28.23  55.43  6.70  Z
Table B2. Historical large earthquakes for north China, 1480–1996, reprinted from Zheng, X. and Vere-Jones, D., Further applications of the stochastic
stress release model to historical earthquake data, Tectonophysics, 229, 101–121, Copyright (1994), with permission from Elsevier, with additions. The
region (Reg.) is for the regionalization (A) of Zheng & Vere-Jones (1994), with alternative regionalizations given in parentheses.
Date        Lat.   Long.   Mag.  Reg.
1484.01.29  40.40  116.10  6.70  3
1487.08.10  34.30  108.90  6.20  2
1501.01.19  34.80  110.10  7.00  2
1502.10.17  35.70  115.30  6.50  3
1536.10.22  39.60  116.80  6.00  3
1548.09.13  38.00  121.00  7.00  4
1556.01.23  34.50  109.70  8.00  2
1561.07.25  37.50  106.20  7.20  1
1568.05.15  39.00  119.00  6.00  4
1568.04.25  34.40  109.00  6.70  2
1573.01.10  34.40  104.10  6.70  1
1587.04.10  35.20  113.80  6.00  3
1597.10.06  38.50  120.00  7.00  4
1604.10.25  34.20  105.00  6.00  1
1614.10.23  37.20  112.50  6.50  2
1618.05.20  37.00  111.90  6.50  2
1618.11.16  39.80  114.50  6.50  3 (B:2)
1622.03.18  35.50  116.00  6.00  3
1622.10.25  36.50  106.30  7.00  1
1624.02.10  32.40  119.50  6.00  4
1624.04.17  39.80  118.80  6.20  3 (C:4)
1624.07.04  35.40  105.90  6.00  1
1626.06.28  39.40  114.20  7.00  3 (B:2)
1627.02.15  37.50  105.50  6.00  1
1634.01.-   34.10  105.30  6.00  1
1642.06.30  35.10  111.10  6.00  2
1654.07.21  34.30  105.50  8.00  1
1658.02.03  39.40  115.70  6.00  3
1665.04.16  39.90  116.60  6.50  3
1668.07.25  35.30  118.60  8.60  4
1679.09.02  40.00  117.00  8.00  3
1683.11.22  38.70  112.70  7.00  2
1695.05.18  36.00  111.50  8.00  2
1704.09.28  34.90  106.80  6.00  1
1709.10.14  37.40  105.30  7.50  1
1718.06.29  35.00  105.20  7.50  1
1720.07.12  40.40  115.50  6.70  3
1730.09.30  40.00  116.20  6.50  3
1739.01.03  38.80  106.50  8.00  1
1815.10.23  34.80  111.20  6.70  2
1820.08.03  34.10  113.90  6.00  3
1829.11.19  36.60  118.50  6.00  4
1830.06.12  36.40  114.20  7.50  3
1831.09.28  32.80  116.80  6.20  4
1852.05.26  37.50  105.20  6.00  1
1861.07.19  39.70  121.70  6.00  4
1879.07.01  33.20  104.70  8.00  1
1882.12.02  38.10  115.50  6.00  3
1885.01.14  34.50  105.70  6.00  1
1888.06.13  38.50  119.00  7.50  4
1888.11.02  37.10  104.20  6.20  1
1920.12.06  36.70  104.90  8.50  1
1922.09.29  39.20  120.50  6.50  4
1937.08.01  35.40  115.10  7.00  3
1945.09.23  39.50  119.00  6.20  3 (C:4)
1966.03.22  37.50  115.10  7.20  3
1967.03.27  38.50  116.50  6.30  3
1969.07.18  38.20  119.40  7.40  4
1975.02.04  40.70  122.80  7.30  4
1976.04.06  40.20  111.10  6.20  2
1976.07.28  39.40  118.00  7.80  3 (C:4)
1976.09.23  39.90  106.40  6.20  1
1979.08.25  41.20  108.10  6.00  1
1989.10.18  40.00  113.70  6.00  2 (C:3)
1996.05.03  40.80  109.60  6.50  1