Geophys. J. Int. (2003) 154, 925–946

The linked stress release model for spatio-temporal seismicity: formulations, procedures and applications

Mark Bebbington1 and David Harte2
1 IIS&T, Massey University, Private Bag 11222, Palmerston North, New Zealand. E-mail: [email protected]
2 Statistics Research Associates, PO Box 12649, Wellington, New Zealand

Accepted 2003 April 23. Received 2003 March 11; in original form 2002 May 7

SUMMARY
The linked stress release model is based on the build-up of stress through elastic rebound and its dissipation in the form of earthquakes. In addition, stress can be transferred between large-scale geological or seismic features. The model can be statistically fitted to both historical and synthetic seismicity catalogues and, through simulation, can be used to create probabilistic forecasts of earthquake risk. We review the genesis of the model, provide some observations on forecasting using the model, and follow with a comprehensive review of applications to date. A systematic procedure for identification of the best model is illustrated by data from the Persian region. We then consider the evaluation of fitted models, using residual point processes and information gains. Implications of the use of Benioff strain rather than seismic moment are discussed. The sensitivity of the model to regionalization, magnitude errors, catalogue incompleteness, catalogue size and declustering/magnitude cut-off is then considered in detail with reference to data from north China. The latter data are also used to illustrate the model evaluation techniques introduced earlier. Some technical material on numerical fitting, simulation and calculation of the information gain is given in an appendix.

Key words: earthquake prediction, regionalization, seismic-event rates, seismic modelling, statistical methods.
1 INTRODUCTION

Elastic rebound theory (Reid 1910) postulates that elastic stress in a seismically active region accumulates due to movement of tectonic plates, and is released when the stress exceeds the strength of the medium. Fixing, for example, strength or residual stress produces time- and slip-predictable models (Shimazaki & Nakata 1980; Kiremidjian & Anagnos 1984). However, many features of earthquake generation are not captured in these basic models. This results in a degree of randomness, which must be factored into forecasts through the medium of stochastic models. Developing the stochastic (Markov) model for the occurrence of main-sequence earthquakes suggested by Knopoff (1971), Vere-Jones (1978) proposed the stress release model, a stochastic version of the elastic rebound theory, incorporating a deterministic build-up of stress within a region and its stochastic release through earthquakes.

Earthquake interaction by means of stress triggering and stress shadows is now an accepted phenomenon (see Harris 1998, and references therein). Knopoff (1996) states that ‘modelling of fractures on the major units of the fault system must take into account coupling between the members of a 2-D web of faults through interactive stress transfer’. Apart from aftershocks and the like, large events are noticeably often followed by large events quite distant from the first (Shimazaki 1976; Li & Kisslinger 1985; Hill et al. 1993; King et al. 1994; Pollitz & Sacks 1995, 1997; Toda et al. 1998). On the other hand, very large events can delay subsequent events on the same or other faults (Harris & Simpson 1996). Nalbant et al. (1998) also find evidence of faults in stress shadows being reloaded by motion on adjacent faults shortly before slipping. The omission of this interaction between regions may be responsible for the underestimation of activity by the time- and magnitude-predictable model of Papadimitriou et al. (2001).
Synthetic models, for example Ward (1996) and Bebbington (1997), have also emphasized interactions between spatially distributed faults. This motivates an extension of the stress release model to include interactions between regions, whatever they may be, by means of stress transfer and reduction. In this paper we will attempt to examine the questions of formulation of the model, regionalization of the data, and interpretation and significance of the results in a reasonably systematic fashion.

1.1 Formulation

In the univariate stress release model (SRM) as formulated by Vere-Jones (1978), the key variable, or state, is the stress level in a region, which controls the probability of an earthquake occurring (Vere-Jones & Deng 1988). This stress level X(t) increases deterministically between earthquakes and is reduced stochastically as a result of earthquakes. The current value X(t) can be represented in the form

$$X(t) = X(0) + \rho t - S(t), \tag{1}$$

where X(0) is the initial value, ρ is a constant loading rate from external tectonic forces and S(t) is the accumulated stress release from earthquakes within the region over the period (0, t), i.e. $S(t) = \sum_{t_i < t} S_i$, where $t_i$ and $S_i$ are the origin time and the stress release associated with the ith earthquake. The result is a piecewise linear Markov process.

In order to fit the model to data we need to estimate the value of stress released during an earthquake. Hence we need a relation between the observed quantity of magnitude, and our underlying variable of ‘stress’. Kanamori & Anderson (1975) show that the magnitude M is proportional to the logarithm of the seismic energy E released during an earthquake according to the relation $M = \tfrac{2}{3}\log_{10} E + \text{constant}$. If we suppose that the ‘stress drop’ during an earthquake is proportional to some power 2η/3 of the released energy, i.e. $S \propto E^{2\eta/3}$, then we have the formula

$$S = 10^{\eta(M - M_0)}, \tag{2}$$

where $M_0$ is the normalized magnitude. The classical stress, or Benioff strain (Benioff 1951), representation has η = 0.75, the value used in previous work. A value of η = 1.5, on the other hand, results in our ‘stress’ corresponding to seismic moment (see also Main 1999, for a discussion of proxy measures of strain).

The probability intensity of an earthquake occurrence is controlled by a hazard function Ψ(x), with the interpretation that the probability of an event occurring in the time interval (t, t + Δ) is approximately Ψ[X(t)]Δ for small Δ. There is a wide range of possibilities for the function Ψ. Obviously, it must be non-decreasing. A constant Ψ independent of x would result in a random (Poisson) model of occurrences. Using

$$\Psi(x) = \begin{cases} 0, & x \le x_c \\ \infty, & x > x_c \end{cases} \tag{3}$$

produces the time-predictable model (Shimazaki & Nakata 1980), supposing a fixed crustal strength $x_c$. An effective compromise (Zheng & Vere-Jones 1991, 1994) between these extremes of behaviour is the form Ψ(x) = exp(µ + νx). It also represents the behaviour that might be expected from a region with a locally heterogeneous strength. One might also consider that local stress is inhomogeneous (Shimazaki 1976). The stochastic nature of the process with an exponential hazard function is compatible with these sources of ‘noise’. We can interpret the constant µ (or rather the parameter α that replaces it, see below) as effectively a parameter to be fitted for the unknown initial value of stress, while the constant ν is an amalgam of the strength and heterogeneity of the crust in the region. This can also be understood as a form of sensitivity to risk.

Statistical analysis is made feasible by treating the data in historical earthquake catalogues as a point process in time–stress space with the conditional intensity function

$$\lambda(t) = \Psi[X(t)] = \exp\{\mu + \nu[X(0) + \rho t - S(t)]\}. \tag{4}$$

Obviously, X(0) can be absorbed into the other parameters, and so we obtain the form λ(t) = exp{α + ν[ρt − S(t)]}, where α = µ + νX(0). Estimates of the parameters can then be found by maximizing the log-likelihood function (see, for example, Daley & Vere-Jones 1988, Section 13.1). The form (4) includes the special cases: λ(t) = exp[α] (the Poisson process) and λ(t) = exp[α + βt] (the Poisson process with trend). The alternative parametrization used in the numerical optimization routines (Harte 1999) is

$$\lambda(t) = \exp\{a + b[t - cS(t)]\}, \tag{5}$$

which has implications when considering numerical and statistical fitting issues (Bebbington & Harte 2001). Inference and stochastic process properties have been investigated by Ogata & Vere-Jones (1984), Vere-Jones & Ogata (1984), Vere-Jones (1988), Zheng (1991) and Borovkov & Vere-Jones (2000). The model is similar in concept to the elastic rebound model of Knopoff (1971), the major practical advantage being the ability, through point process methods, to fit the model to data.

Obviously, stress transfer and interaction cannot be considered in the simple stress release model, and the earlier analyses concentrated on dealing with regional differences. In particular, Zheng & Vere-Jones (1994) found that large geographical regions give better fits to the stress release model when broken down into subunits, and further noted some hints of clustering relating to some form of action at a distance, i.e. stress transfer and interaction. Although Zheng & Vere-Jones (1991) investigated various multivariate extensions to the model, the natural and elegant extension that we shall next outline was not considered until Shi et al. (1998) and Liu et al. (1998).
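The relations (2) and (4) can be illustrated with a minimal sketch. The function names, the default normalizing magnitude `m0 = 6.0` and the parameter values in the usage lines are assumptions made for illustration only; they are not taken from the fitting routines of Harte (1999).

```python
import math

def stress_release(magnitude, eta=0.75, m0=6.0):
    """Eq. (2): S = 10**(eta * (M - M0)); m0 is an illustrative choice."""
    return 10.0 ** (eta * (magnitude - m0))

def accumulated_stress(t, event_times, magnitudes, eta=0.75, m0=6.0):
    """S(t): total stress released by events occurring strictly before t."""
    return sum(stress_release(m, eta, m0)
               for ti, m in zip(event_times, magnitudes) if ti < t)

def srm_intensity(t, event_times, magnitudes, alpha, nu, rho,
                  eta=0.75, m0=6.0):
    """Conditional intensity lambda(t) = exp{alpha + nu * [rho*t - S(t)]}."""
    s = accumulated_stress(t, event_times, magnitudes, eta, m0)
    return math.exp(alpha + nu * (rho * t - s))

# Hypothetical catalogue: one M 7.0 event at t = 10 (time units arbitrary).
times, mags = [10.0], [7.0]
before = srm_intensity(10.0, times, mags, alpha=-1.0, nu=0.1, rho=0.5)
after = srm_intensity(10.001, times, mags, alpha=-1.0, nu=0.1, rho=0.5)
```

In this sketch the intensity rises exponentially while stress accumulates and drops immediately after the event, so `after` is smaller than `before`.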
The evolution of stress $X_i(t)$ in the ith region can be rewritten as

$$X_i(t) = X_i(0) + \rho_i t - \sum_j \theta_{ij} S^{(j)}(t), \tag{6}$$

where $S^{(j)}(t)$ is the accumulated stress release in region j over the period (0, t), and the coefficient $\theta_{ij}$ measures the fixed proportion of stress drop initiated in region j which is transferred to region i. Here, $\theta_{ij}$ may be positive or negative, resulting in damping or excitation, respectively. It is convenient, in dealing with a declustered catalogue (i.e. with aftershocks removed), to set $\theta_{ii} = 1$ for all i. The new version we shall call a linked (or coupled) stress release model (LSRM). If $\theta_{ij} = 0$ for all $i \ne j$, the model is reduced to an independent aggregation of simple forms as in eq. (1). Liu et al. (1999) use a slightly different form of eq. (6), replacing $S^{(j)}(t)$ by

$$S^{(j)}_{i-}(t) = S^{(j)}\!\left(\max_k \left\{ t^{(i)}_k : t^{(i)}_k < t \right\}\right), \tag{7}$$

where $\{t^{(i)}_k\}$ are the occurrence times in region i. Thus, events in other regions j have no transfer effect on region i until the occurrence of a subsequent event in region i. In a somewhat similar fashion, Imoto et al. (1999) introduce a time delay into the transfer by replacing $S^{(j)}(t)$ by

$$S^{(j)}_D(t) = S^{(j)}(t - t_d), \tag{8}$$

in order to produce periodic-type behaviour.

We shall assume each region to have an exponential risk function, with differing parameters indicating different tectonic properties by region. In other words, the strength (earthquake triggering condition) and tectonic loading rate can differ in each seismic region. Thus we obtain a point process conditional intensity function

$$\lambda_i(t) = \Psi[X_i(t)] = \exp\Big\{\alpha_i + \nu_i\Big[\rho_i t - \sum_j \theta_{ij} S^{(j)}(t)\Big]\Big\}, \tag{9}$$

for each region i, where $\alpha_i$ ($= \mu_i + \nu_i X_i(0)$), $\nu_i$, $\rho_i$ and $\theta_{ij}$ are the parameters to be fitted. We choose to parametrize the intensity in this form because it is more amenable to physical intuition. The seemingly excess parameters are in response to fixing $\theta_{ii} = 1$ for all i.
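As a minimal sketch of eq. (9), the following evaluates the regional intensities for a given transfer matrix. The two-region parameter values are hypothetical and chosen only to show the sign convention: a negative off-diagonal θ raises the intensity in the receiving region (excitation), while a positive one lowers it (damping).

```python
import math

def lsrm_intensities(t, alpha, nu, rho, theta, stress):
    """Eq. (9): lambda_i(t) = exp{alpha_i + nu_i*[rho_i*t - sum_j theta_ij*S_j]}.

    stress[j] is the accumulated stress release S^(j)(t) in region j.
    """
    n = len(alpha)
    intensities = []
    for i in range(n):
        transfer = sum(theta[i][j] * stress[j] for j in range(n))
        intensities.append(math.exp(alpha[i] + nu[i] * (rho[i] * t - transfer)))
    return intensities

# Hypothetical two-region example (values are illustrative only).
alpha, nu, rho = [-1.0, -1.0], [0.1, 0.1], [0.5, 0.5]
stress = [2.0, 4.0]                       # S^(1)(t), S^(2)(t) at t = 10
identity = [[1.0, 0.0], [0.0, 1.0]]       # independent regions
excite = [[1.0, -0.5], [0.0, 1.0]]        # region 2 excites region 1
damp = [[1.0, 0.5], [0.0, 1.0]]           # region 2 damps region 1

independent = lsrm_intensities(10.0, alpha, nu, rho, identity, stress)
excited = lsrm_intensities(10.0, alpha, nu, rho, excite, stress)
damped = lsrm_intensities(10.0, alpha, nu, rho, damp, stress)
```

Only the first region's intensity changes between the three cases, because the second row of each matrix leaves region 2 uncoupled.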
A simpler parametrization (Liu et al. 1998) can be recovered by setting $b_i = \nu_i \rho_i$, $c_{ij} = \theta_{ij}/\rho_i$, yielding

$$\lambda_i(t) = \exp\Big\{a_i + b_i\Big[t - \sum_j c_{ij} S^{(j)}(t)\Big]\Big\}, \tag{10}$$

for each region i. Estimates of the parameters are found by numerically maximizing the log-likelihood

$$\log L = \sum_{i=1}^{N} \log \lambda(t_i) - \int_{T_1}^{T_2} \lambda(t)\, dt, \tag{11}$$

where the interval $(T_1, T_2)$ contains events at times $t_i$ (i = 1, 2, ..., N): $T_1 < t_1 < t_2 < \ldots < t_N < T_2$. See Appendix A1 for more details. We note that the linked model provides the possibility of creeping faults, with $\rho_i < 0$, which are reloaded by transfer from other faults rather than tectonically.

Emphasis has shifted away from an exact prediction of earthquakes to estimation of probabilities of future events (see, for example, Aki 1989; Vere-Jones 1995, 1998; Kagan 1997b). Results of this type are usually indicated by the term forecasting, and can be obtained from the (linked) stress release model by means of repeated forward simulation (see Appendix A2) and averaging. Lutz & Kiremidjian (1995) use this technique with a generalized semi-Markovian process model for the northern San Andreas. Given a history of events (generally occurrence time, magnitude and region), one estimates the time-varying intensity from the fitted model and the history by means of eqs (4) or (9), and generates the time to the next event. After assigning a magnitude to this event, it is added to the history, and the next interval found. Note that we do not refit the model parameters. After many repetitions, we will have an estimate of the probability of occurrence in a time, magnitude and space (i.e. region) window. This can be compared with ‘null hypothesis’ forecasts (see Stark 1997) obtained in a similar way from the Poisson, characteristic (Working Group on California Earthquake Probabilities 1995) or clustering (Kagan & Jackson 1991) models.
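For the simple model, the log-likelihood (11) can be evaluated in closed form because S(t) is constant between events, so the intensity is a pure exponential in t on each inter-event segment. The sketch below assumes the (α, ν, ρ) parametrization, events sorted in time within (T1, T2), and stress accumulated from T1; the function names are illustrative and this is not the numerical scheme of Appendix A1.

```python
import math

def segment_integral(u, v, const, b):
    """Integral of exp(const + b*s) over (u, v)."""
    if abs(b) < 1e-12:                    # Poisson limit: constant intensity
        return math.exp(const) * (v - u)
    return math.exp(const) * (math.exp(b * v) - math.exp(b * u)) / b

def srm_log_likelihood(event_times, stresses, t1, t2, alpha, nu, rho):
    """Eq. (11) for lambda(t) = exp{alpha + nu*[rho*t - S(t)]}.

    stresses[i] is the stress release S_i of the event at event_times[i].
    """
    b = nu * rho
    log_sum, integral = 0.0, 0.0
    s, left = 0.0, t1
    for ti, si in zip(event_times, stresses):
        # Intensity just before the event, i.e. excluding its own release.
        log_sum += alpha + nu * (rho * ti - s)
        integral += segment_integral(left, ti, alpha - nu * s, b)
        s += si
        left = ti
    integral += segment_integral(left, t2, alpha - nu * s, b)
    return log_sum - integral

# Hypothetical data: three events with unit stress release on (0, 4).
ll = srm_log_likelihood([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], 0.0, 4.0,
                        alpha=-1.0, nu=0.2, rho=0.5)
```

Setting `nu = 0` recovers the Poisson log-likelihood N·α − exp(α)(T2 − T1), which provides a simple check on the implementation.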
Note that our high-magnitude cut-off and deliberate discarding of aftershocks will minimize the difference between these alternatives. The result then reflects the coupling and rebound behaviour around which the linked stress release model is built (see Lu et al. 1999a, for an example).

1.2 Review of applications

The stress release model (4) has been applied to historical data from the Kamakura region of Japan (Vere-Jones 1978), north China (Vere-Jones & Deng 1988; Zheng & Vere-Jones 1991, 1994; Zhuang & Ma 1998), southwest China and Taiwan (Zhuang & Ma 1998), central Japan and Persia (Zheng & Vere-Jones 1994) and southwest Japan (Imoto 2001). The repeated investigations of north China were driven by incremental improvements in the quality of data, and by subtle changes to the identification of regions, which we shall examine later. The emphasis in many of these investigations was to identify statistically distinct regions, in the sense that the best-fitting models, eq. (4), estimate different (as justified by AIC) parameters for the regions. It was concluded that divisions into four regions (north China, Japan), and two regions (Taiwan) were justified. Persia proved more difficult, resulting in a division into three regions based on an aggregation of smaller regions proposed by Ambraseys & Melville (1982), but with considerable reservations concerning the fit, particularly to earlier data. Several of the Japanese regions also displayed a poor fit to the stress release model when compared with the simpler Poisson with trend model. The conclusion was reached that, provided the data are reasonably complete, the stress release model is superior statistically to the Poisson and Poisson with trend models. Zhuang & Ma (1998) also simulated the model forward to forecast a quiescent period of 30–50 yr for the easternmost part of north China.
Zheng & Vere-Jones (1991) fitted different magnitude ranges by independent stress release processes, investigating the possibility of making the stress drop (i.e. the magnitude) dependent on the level of stress, with inconclusive results. Consequently, subsequent authors have assumed that the magnitude can be assigned independently (see, however, Jaumé & Bebbington 2000). Imoto (2001) examined the Nankai sequence of eight to ten earthquakes, and showed that the stress release model produced a superior fit to renewal process models. By formalizing a scale based on average AIC reduction, sensitivity of the model to the value of η in the stress–magnitude relation (2), missing events and magnitude perturbation were investigated. Liu et al. (1999) revisited the north China data using the linked model (6) with the modification (7). Based on a division into two regions (the Ordos Plateau and Hebei plain/Tanlu seismic belts), they concluded that there was statistically significant interaction between them, of an inhibitory nature. Unfortunately, in maximizing eq. (11), the summation term was calculated using the stress modification (7) while the integral term appears to have been calculated using the unmodified relationship (6). Lu & Vere-Jones (2000) found that the linked model, eq. (9), provides almost no improvement here over a collection of independent stress release models, attributing this to the intraplate collision seismic zone being of less complexity than a plate boundary subduction zone. We shall revisit this issue in greater detail later, in our investigation of regionalization effects. Lu et al. (1999a) revisited the Japanese data, but were forced to completely revise the regions in the light of geophysical information. The regions were again found to be heterogeneous, and not necessarily best fitted by the stress release model. 
It was found that the Median Tectonic Line region was independent of the other three regions, while events in the Fossa Magna/Sagami Trough had an excitatory effect on the Chubu/Kinki Triangle and Nankai Trough regions, as did the Chubu/Kinki Triangle region on the Fossa Magna/Sagami Trough. These results were shown to be consistent with previous models (see, for example, Kanamori 1972; Shimazaki 1976; Kanaori et al. 1993, 1994; Pollitz & Sacks 1997), and some additional quantification of the results was provided. Imoto et al. (1999), using eq. (8), showed that periodic seismicity in the Kanto region of central Japan could best be explained by a linked stress release model, with a delay in the effect of the Kasumigaura cluster on the Kinugawa cluster.

Bebbington & Harte (2001) used the Taiwan data of Zhuang & Ma (1998) to examine the statistical behaviour of the linked stress release model itself. The two-region model (divided by the Eurasian–Philippine Sea Plate boundary), with significant interactions (both inhibitory) between the regions, provided an excellent testbed. They outlined a number of complementary procedures for examining the robustness of the model, and testing the significance of predicted interactions. The study concluded that although the damping effect of the Philippine Sea Plate events on Eurasian Plate events was significant, the converse effect was less certain.

The linked stress release model has also served as a basis for a stochastic model of aftershocks (Borovkov & Bebbington 2003). Having one ‘region’ generate mainshocks, and a second generating aftershocks, reproduced the result of Dieterich (1994) for the aftershock decay rate, with the additional property that, as with the stress release model generally, the parameters can be fitted statistically to the data. Lu & Vere-Jones (2001) fit the simple stress release model (4) to synthetic catalogues generated by the model of Ben-Zion (1996).
Four levels of disorder in distribution of fault zone strength are considered: uniform properties (U), a Parkfield-type asperity (A), fractal brittle properties (F) and multisize-scale heterogeneities (M). As measured by AIC (relative to the number of events) and simulation, the degree of regularity or predictability (Vere-Jones 1998) in the fitted stress release models follows the order U, F, A, M, in agreement with the results of the pattern-recognition techniques used by Eneva & Ben-Zion (1997a,b). The fit was poorest for the asperity model (A), which possesses a feature (a fixed region of high strength) not catered for by the simple stress release model. It was suggested that a two-region linked model might be appropriate. Lu et al. (1999b,c) fit the linked stress release model to data obtained from a synthetic model of crack coalescence and an elastic block lattice, in order to demonstrate the existence of ‘long-range interactions’ in the medium.

The linked stress release model can also be used as a validation tool for complex synthetic earthquake models such as those of Rundle (1988a,b), Ben-Zion (1996), Bebbington (1997) and Shi et al. (1998), since the output from the synthetic model should be statistically similar to the historical data. The linked stress release model provides a measure of this similarity through the estimated parameters, an idea used by Bebbington et al. (1998) to validate the synthetic seismicity model of Bebbington (1997) for the Middle America Trench. A possible further step in investigating synthetic models is to compare the fitted stress with the internal state of the synthetic model (Shi et al. 1998; Liu et al. 1999).

2 FITTING

Although the linked stress release model has been applied widely, there has been little attention paid to certain technical and systemic issues concerning its use.
Bebbington & Harte (2001) made a beginning, by investigating the sensitivity and robustness of the fitting procedure by means of Monte Carlo simulation, numerical analysis and residual point processes. Below we summarize this work and the idea of information gain (Vere-Jones 1998), and consider some additional issues affecting the implementation of, and conclusions drawn from, the linked stress release model.

We will usually be considering various possible models, with differing (numbers of) parameters, for a given set of data. The choice of the best model in the sense of justified parameters will be based on the Akaike information criterion (AIC), which is defined as

$$\mathrm{AIC} = -2 \log \hat{L} + 2k, \tag{12}$$

where $\log \hat{L}$ is the maximized log-likelihood for a given model and k is the number of parameters to be fitted in the model (Akaike 1977). This represents a rough way of compensating for the effect of adding parameters, and is a useful heuristic measure of the relative effectiveness of different models, in avoiding overfitting. If we systematically test the various families of interactions, the best model is that with the smallest AIC value. A difference of 1.5–2 in AIC is usually considered significant. However, we should note that the AIC values obtained here should be used with some caution, since the amount of historical earthquake data is not very large and the distribution of the log-likelihood is probably not chi-squared (Ogata & Vere-Jones 1984; Wang et al. 1991). Given the small size of the data sets suitable for the model, the application of asymptotic results is problematic.

2.1 Model identification

We shall follow a convention of denoting models by their possible (unrestricted) parameter set. The full, unrestricted model is denoted by (α, ν, ρ, Θ: $\theta_{ii} = 1$). For example, the model with uniform tectonic input across regions (which is the ‘symmetrical model’ of Liu et al.
1999) is (α, ν, $\rho = \rho\mathbf{1}$, Θ: $\theta_{ii} = 1$), and the aggregation of independent stress release models for each region is (α, ν, ρ, Θ = I). For a model with J regions, there are potentially J(J + 2) parameters, and hence $2^{J(J+2)}$ possible models. A systematic method of reduction to the ‘best’ model is obviously desirable.

The first step is to consider the regions individually, using the standard stress release model (4). A ‘baseline’ model (and associated AIC) is taken as the aggregate of the best individual models. The question then becomes whether any regional interactions can improve the likelihood sufficiently to offset the cost of the additional parameters on the AIC. If so, we consider those parameters (and the interaction they represent) justified by the data.

The next step is to consider the question of tectonic input. Evidence in this regard can be obtained by comparing the models (α, ν, ρ, Θ = I) and (α, ν, $\rho = \rho\mathbf{1}$, Θ = I). Only if the former has a superior AIC should differing rates of tectonic input be accepted. Even in this case there may be groups of regions that can profitably be assigned equal inputs.

Once the number of input parameters in the linked model is identified, it remains only to consider the matrix of regional transfers, Θ. Here the procedure becomes more of an art, leavened by geophysical intuition. One question, in the case where not all regions are neighbours, is to test whether there are long-range interactions. Another is to test whether individual regions (or groups of regions) are independent of the remainder. Having thus eliminated certain broad categories of model, it is then possible to fit the remainder. This should provide a number of models with close to the smallest AIC, which have similar interactions (see, for example, Lu et al. 1999a).

For an example of the model selection algorithm outlined above, let us consider Ambraseys & Melville (1982), who provide extensive data for the Persian region.
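The parameter count and the AIC of eq. (12) are simple enough to sketch directly. The helper names below are illustrative assumptions; the count J(J + 2) is taken as α, ν and ρ for each region plus the J(J − 1) off-diagonal transfer coefficients, with the diagonal fixed at one.

```python
def n_parameters(n_regions):
    """alpha_i, nu_i, rho_i per region plus off-diagonal theta_ij: J(J + 2)."""
    j = n_regions
    return 3 * j + j * (j - 1)

def n_candidate_models(n_regions):
    """Each parameter is either freed or restricted: 2**(J(J + 2)) models."""
    return 2 ** n_parameters(n_regions)

def aic(log_likelihood, k):
    """Eq. (12): AIC = -2 log L + 2k, with log L the maximized log-likelihood."""
    return -2.0 * log_likelihood + 2 * k

# Three regions, as in the Persian example that follows.
k_full = n_parameters(3)
n_models = n_candidate_models(3)
```

Even for three regions the candidate set is far too large to fit exhaustively, which is the practical motivation for the stepwise reduction described above.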
They suggest that the complicated tectonics of the region can be broken down into seven seismic zones: northern (N), eastern (E), Zagros (Z), part of Azerbaijan (A), Kapet Dagh (K), the central desert (C) and Makran-Baluchistan (M). Unfortunately the data are likely to be incomplete for M < 6.5 until the Qajar period (1794–1924) (Ambraseys & Melville 1982), and so we follow Zheng & Vere-Jones (1994) in limiting ourselves to events of M ≥ 6.0 after 1780, as listed in Table B1, and displayed in Fig. 1. The data should be a complete record of occurrences for the time period and magnitude interval, but the location data are less accurate, particularly outside of region Z. However, our regions are sufficiently broad to be able to ignore this. The question of magnitude accuracy in the linked stress release model will be examined later.

[Figure 1: map of longitude (45–60) against latitude (26–40) with zones A, N, K, C, Z and E labelled; events in Regions 1, 2 and 3 are plotted with symbol size scaling with magnitude M.]
Figure 1. Persian earthquakes, 1780–1994. The region boundaries are those of Zheng & Vere-Jones (1994).

As a result of the amount of data available, we need to combine regions to reduce their number. Combining regions E, C and K, and regions N and A, and discarding region M because there are no events prior to 1934, we are left with three regions. These regions are mutually adjacent. Zheng & Vere-Jones (1994) found that, for 1780–1980, a combination of stress release models for each of the three regions (ECK, NA and Z) fit the data reasonably well. Repeating the exercise for the period 1780–1994, we obtain the results in Table 1. This ‘baseline’ model (α, ν, ρ, Θ = I) has an AIC of 516.8 with nine parameters. If instead we fit the full model (α, ν, ρ, Θ: $\theta_{ii} = 1$), we obtain the results in Table 2. The full model has an AIC of 522.2 with 15 parameters, far worse than the baseline model.

Table 1. Persia: aggregate of independent SRMs. ΔAIC = AIC − AIC_P is the improvement in AIC from the Poisson process.

Region      α_i       ν_i      ρ_i      ΔAIC
1 (NA)     −1.041    0.090    0.522     −1.4
2 (ECK)    −4.333    0.055    0.730    −10.2
3 (Z)      −5.223    0.090    0.454    −10.2

Table 2. Persia: full (unconstrained) LSRM.

Region      α_i       ν_i      ρ_i      θ_i1      θ_i2      θ_i3
1 (NA)     −1.152    0.088    0.537     1        −0.174     0.394
2 (ECK)    −4.205    0.088    0.285    −0.221     1        −0.879
3 (Z)      −6.082    0.199    0.052    −0.294    −0.283     1

At this point we investigate the use of a common tectonic input, with which the nature of the region is not incompatible. The model (α, ν, $\rho = \rho\mathbf{1}$, Θ) produces an AIC of 523.2 with 13 parameters. The AICs of the two linked models indicate that we need to remove at least three or four further parameters in order to match the AIC of our baseline model in Table 1. Note that the model (α, ν, $\rho = \rho\mathbf{1}$, Θ = I) has an AIC of 518.9, higher than that of the baseline model. Checking the Hessian (see Appendix A1) of the matrix C = ($c_{ij}$) in eq. (5), we find the coefficients of variation $\hat\sigma_{ij}/|\hat c_{ij}|$ in the two linked models, the full model and the model with common $\rho_i$, given as a pair of 3 × 3 matrices in eq. (13), where $\hat\sigma_{ij}$ is the estimated standard deviation of $c_{ij}$ (a marker in the matrices indicates that the value could not be calculated).

[Eq. (13): two 3 × 3 matrices of $\hat\sigma_{ij}/|\hat c_{ij}|$, labelled (full model) and (common $\rho_i$). Entry positions are not recoverable; the surviving values, in extraction order, are 0.25, 2.86, 0.04, 1.37, 0.05, 0.98, 1.85, 0.32, 1.77, 2.75, 11.9, 0.32, 0.86, 1.66, 0.32, 2.27.]

Thus, statistical considerations argue for retaining $\theta_{23}$, $\theta_{32}$ and possibly $\theta_{31}$ among the transfer terms. Fitting all of the $2^4 = 16$ remaining possible models produced a best model (α, ν, $\rho = \rho\mathbf{1}$, Θ = Θ*), where

$$\Theta^{*} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & \theta_{23} \\ 0 & 0 & 1 \end{pmatrix} \tag{14}$$

and the fitted parameter values are given in Table 3. The AIC is 514.0 with eight parameters, which is sufficiently better than the baseline model. We note that the interactions are limited to events in region Z exciting those in region ECK.
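The AIC comparison above can be laid out explicitly. The sketch below uses only the AIC values and parameter counts quoted in the text; the model labels are informal descriptions, and the implied maximized log-likelihood is simply eq. (12) rearranged as log L = k − AIC/2.

```python
# (label, AIC, number of parameters) as quoted in the text.
models = [
    ("independent SRMs (baseline)", 516.8, 9),
    ("full linked model", 522.2, 15),
    ("linked model, common rho", 523.2, 13),
    ("common rho, theta_23 only", 514.0, 8),
]

def implied_log_likelihood(aic_value, k):
    """Rearranged eq. (12): log L = k - AIC / 2."""
    return k - aic_value / 2.0

# Rank by AIC (smaller is better) and compare the winner with the baseline.
ranked = sorted(models, key=lambda m: m[1])
best_name, best_aic, best_k = ranked[0]
delta_vs_baseline = best_aic - models[0][1]

for name, aic_value, k in ranked:
    print(f"{name:32s} AIC={aic_value:6.1f}  k={k:2d}  "
          f"logL={implied_log_likelihood(aic_value, k):8.1f}")
```

The full model attains the highest implied log-likelihood, as it must, but its extra parameters cost more in AIC than they gain; the selected model improves on the baseline by more than the 1.5–2 difference quoted earlier as significant.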
This means there is an apparent alternation, depending on the activity level in region Z, in activity between regions ECK and NA. This accords well with the observation of Ambraseys & Melville (1982) that there is an apparent alternation of activity between regions E and N. Fig. 2 illustrates the effect, which is less dramatic than the damping effect across the Taiwan Plate boundary found by Bebbington & Harte (2001).

[Figure 2: three panels (Region 1, Region 2, Region 3), each showing conditional intensity (0.0–0.4) and magnitude (6–7.5) against time (1800–2000).]
Figure 2. Persia: fitted intensities; model (α, ν, $\rho = \rho\mathbf{1}$, Θ = Θ*) (dotted line), model (α, ν, ρ, Θ = I) (dashed line). Event magnitudes are indicated by the solid vertical lines.

Table 3. Persia: LSRM with minimal interactions.

Region      α_i       ν_i      ρ_i      θ_i1     θ_i2     θ_i3
1 (NA)     −0.959    0.090    0.515     1        0        0
2 (ECK)    −4.197    0.073    0.515     0        1       −0.592
3 (Z)      −4.894    0.066    0.515     0        0        1

2.2 Model evaluation

Statistical investigation of the model relies on various tools. We can use the AIC to distinguish between competing models. A residual analysis will indicate whether the chosen model is systematically inadequate, and the robustness of the model can be evaluated by Monte Carlo simulation and refitting. The latter, together with the Hessian from the numerical optimization of the log-likelihood, can also provide a measure of significance of the estimated parameters. Large standard errors of parameter estimates derived from the Hessian indicate that an associated effect may not be well founded. The standard errors determined from the Hessian for the final Persian model in Table 3 are shown in Table 4. The Hessian also provides estimates of the correlation between the parameters, which may warn of possible systematic deficiencies associated with overfitting in the model.

Table 4. Persia: estimated standard errors for best linked model.

Parameter            Estimate    Std error
a_1 = α_1             −0.96       0.51
a_2 = α_2             −4.20       0.82
a_3 = α_3             −4.89       1.04
b_1 = ν_1 ρ            0.046      0.021
b_2 = ν_2 ρ            0.038      0.014
b_3 = ν_3 ρ            0.034      0.012
c_ii = 1/ρ             1.94       0.12
c_23 = θ_23/ρ         −1.15       0.41

Although the Hessian matrix provides estimates of the standard errors and correlations, whether or not the Hessian can be evaluated at all seems to be critically dependent on certain elements of the numerical fitting procedure. As a check on the results, and as an alternative, we can simulate the fitted model and refit the model to the simulated catalogue(s), in order to obtain information concerning the distributions of the parameters. We can also attempt to fit the simulated data by alternative models, perhaps with fewer parameters, in order to establish the credibility of the model.

Bebbington & Harte (2001) found, at least in the two-region case, that the correlations for the parameters from the simulated data show much the same pattern as observed from the Hessian. This provides, in some sense, a validation for the estimated Hessian, and hence the variances inferred from it. Furthermore, simulation of a less than optimum model produced quantitative features significantly different from the historical data. The numerical estimates of the parameter variances were shown to be usable, and more importantly, verifiable, by means of Monte Carlo simulation and refitting.

Ogata (1988) provides a paradigm for assessment of the utility of a stochastic model. The evaluation is on two levels, using AIC and residual analysis.
The latter is used to identify systematic deviation of the data from the fitted model, which would indicate a significant factor underlying the data which is not included in the model. If we suppose that the (regional) point-process data $\{t_k^{(j)}\}$ (for each region $j$) are generated by the intensity $\boldsymbol{\lambda}(t) = (\lambda_1(t), \lambda_2(t), \ldots)$, we can define the compensator

$$\psi_j(t) = \int_0^t \lambda_j(s)\,ds. \qquad (15)$$

Then, provided that $\psi(\infty) = \infty$, Aalen & Hoem (1978) show that the $\{u_k^{(j)}\}$, where $u_k^{(j)} = \psi_j(t_k^{(j)})$, are transformations to the time locations of stationary Poisson processes of intensity one. Standard tests for stationarity, independence and exponentially distributed interevent times can then be used to determine whether these residual processes are, in fact, Poisson, or if the deviation from the model is, in fact, more than just random noise. The residual processes for the final model in Table 3 did not differ significantly from a Poisson process of unit rate. Interestingly, neither did the residual processes from the independent model in Table 1, which explains the similarity of the intensities in Fig. 2.

A method of scoring earthquake forecasts provided by a given model is to calculate the information gain (Vere-Jones 1998). Given a point process model for earthquake occurrence, it is used to simulate (see Appendix A2) a long synthetic catalogue of events. The complete simulated time period is then divided into bins of equal length, and the synthetic catalogue is used to estimate the occurrence probability $p_i$ for the $i$th bin (see Appendix A3). A simple binomial score is given by

$$B = \sum_i \left[ Y_i \log p_i + (1 - Y_i) \log(1 - p_i) \right], \qquad (16)$$

where $Y_i = 1$ if at least one event occurred in the $i$th bin, and 0 otherwise. This rewards high forecast probabilities for bins where an event occurs, and low forecast probabilities where events do not occur, and penalizes false alarms and missed events.
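As an illustration of the binomial score in eq. (16), the following minimal Python sketch evaluates the score for two sets of forecast probabilities against the same observed bins. The function name `binomial_score` and all numerical values are hypothetical and chosen only for illustration; they are not taken from any catalogue or fitted model.

```python
import math


def binomial_score(probs, outcomes):
    """Binomial score of eq. (16): sum over bins of
    Y*log(p) + (1 - Y)*log(1 - p)."""
    if len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must have the same length")
    score = 0.0
    for p, y in zip(probs, outcomes):
        if not 0.0 < p < 1.0:
            raise ValueError("forecast probabilities must lie strictly in (0, 1)")
        # A bin with an event contributes log(p), an empty bin log(1 - p).
        score += math.log(p) if y else math.log(1.0 - p)
    return score


# Hypothetical forecasts for four bins (illustrative values only).
observed = [1, 0, 1, 0]            # Y_i: 1 if at least one event in bin i
model_probs = [0.8, 0.1, 0.6, 0.2]  # forecast that tracks the outcomes
flat_probs = [0.4, 0.4, 0.4, 0.4]   # constant-probability forecast

b_model = binomial_score(model_probs, observed)
b_flat = binomial_score(flat_probs, observed)
print(b_model, b_flat)
```

The forecast that assigns high probability to the occupied bins and low probability to the empty ones obtains the larger (less negative) score, which is the behaviour described in the text.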
The binomial score $B$ for the ‘fitted’ model can be compared with the (reference) binomial score $\bar{B}$ obtained by a ‘null’ Poisson process, by calculating the difference $B - \bar{B}$. This is referred to as the information gain. It can be shown (see also Kagan & Knopoff 1977) that this difference is simply the difference in the log-likelihood of the fitted model to that of the null model (see Appendix A3 for further details).

Generally the log-likelihood difference is used in the context of model goodness of fit. That is, when the log-likelihood is calculated using the observed data, it is interpreted as a measure of the goodness of fit of the given model. However, since the log-likelihood difference is the same as the information gain, it can also be interpreted as quantifying the model predictability when it is calculated using simulated data from the given model. Hence, the information gain (relative to the null model) describes how much more predictable the fitted model is than the null model. This is useful when one wants to compare the predictability of various competing models. However, in order to make such comparisons, one obviously needs to use the same timescale (e.g. days, years, etc.) for all compared models, and to compare the information gain per unit time or number of events if the processes are observed over time intervals of different lengths. If a given process is simulated for a total time $T$, we can calculate the mean information gain per unit time (see Vere-Jones 1998) as

$$\rho_T = \frac{B - \bar{B}}{T}. \qquad (17)$$

The analyses can be elaborated further by including magnitude intervals and time intervals.

3 SENSITIVITY

Considering the number of works reviewed above, a critical analysis of the model itself is overdue. We will attempt to fill this void by discussing the sensitivity of the model to perturbations in regionalization, and the input earthquake data.
An illustration will follow in the next section, examining the sensitivity of the model when fitted to data from north China. We will also consider the question of what power of the seismic moment the notional ‘stress’ variable should be.

3.1 Determination of regions

A major focus of the linked stress release model is the identification of appropriate regions. An objective method that might be used to define regions is by the application of some clustering algorithm, with boundaries drawn equidistant between neighbouring clusters. This should recognize implicitly the geophysical structure of the regions, subject to the completeness of the data. However, this would add an unknown, but substantial, number of degrees of freedom to the fitted model. Given the already large number of adjustable parameters, relative to the amount of data, this is undesirable. Alternatively one can attempt to use known tectonic features as the basis of the regions, in order to investigate possible coupling between these features, which can thus serve as a null hypothesis.

To date (see also Papadimitriou et al. 2001), such regionalization has been on the basis of geophysical structure, ideally corresponding to major tectonic features such as seismic belts (Liu et al. 1999), plate boundaries (Bebbington & Harte 2001), or a mixture of both (Lu et al. 1999a). Published studies include plate boundaries (Japan, Taiwan, Persia, New Zealand), and intraplate seismicity (north China). We should note that, as most events occur along fault lines and other tectonic features, these features must form the interior of investigated regions, rather than performing the perhaps more natural role of region boundaries. Ideally the region boundaries should pass through areas of low seismicity. In other words, we use ‘seismic’, rather than ‘geological’, regions.

Table 5. Size of historical data sets. Parameter numbers in parentheses indicate that the fitting did not reliably converge.
Locality                 Regions   Parameters   Observations   Source
Taiwan                   2         8            43             Bebbington & Harte (2001)
North China              2         7, 8         65             Liu et al. (1999)
North China              2         8            66 (?)         Lu & Vere-Jones (2000)
New Zealand              2         8            65             Lu & Vere-Jones (2000)
Kanto                    2         8            66, 79         Imoto et al. (1999)
Middle America Trench    3, (4)    13, (18)     45             Bebbington et al. (1998)
Persia                   3         15           89             Section 2.1 above
North China              4         18           65             Section 4 below
Japan                    4         21, (24)     76             Lu et al. (1999)
California               5         23           180            Bebbington (2001) (unpublished)
Southern California      6         32, (33)     82             Bebbington (2001), unpublished

Another aspect that arises when there are more than two regions is geometry, and the possibility of interaction between non-adjacent regions. Along linear boundaries such as the Aleutians (Li & Kisslinger 1985), the Middle America Trench (Ward 1991), or the North Anatolian fault (Stein et al. 1997), it is expected that stress propagates to all and only neighbouring segments. On the other hand, in a complex plate collision such as Japan, there may be neighbouring regions without interactions (Lu et al. 1999a). One might even envisage interactions between non-adjacent regions.

The question of the number of regions is constrained by the fact that they must include sufficient observations to allow the numerical parameter-fitting procedure to converge. Within this constraint, it is feasible to test whether the model fit can be improved statistically by combining or dividing regions.

3.2 Catalogue issues

In using historical data, there may be problems with accuracy and completeness of the catalogue. Pre-instrumental catalogues, where the magnitude is estimated from the intensity of shaking, may well have magnitude errors of 0.5 or greater, since errors of this size are common in early instrumental catalogues (Field et al. 1999; Kagan 2002).
Secondly, although analysis has been limited to those (portions of) historical catalogues considered complete, there may be missing events. Because it is implicit in the formulation that earthquakes lower the regional stress, and hence reduce the probability of immediately subsequent events, the model is one for main-sequence events only. However, it is also those main events which carry the majority of tectonic information. Smaller events usually occur in clusters, resulting primarily from perturbations of a near critical system, and hence it is difficult to extract meaningful information from them. Other aspects such as (quasi-)periodicity and aftershock sequences may be susceptible to analysis using an extended version of the stress-release idea (Imoto et al. 1999; Jaumé & Bebbington 2000; Schoenberg & Bolt 2000; Borovkov & Bebbington 2003). As the model attempts only to represent main-sequence events, aftershocks must be carefully identified and removed from the data before numerical fitting can begin. This is usually done using some form of space–time windowing (Gardner & Knopoff 1974), such as that used by the M8 procedure (Kossobokov 1997). Lu & Vere-Jones (2001) show, at least for synthetic data, that these smaller events play a relatively minor role in determining the large-scale behaviour of the system. Any such identified aftershocks can have their contribution to the stress release added to that of the mainshock. Because of the form of the intensity (9), near contemporaneous events exert considerable influence on the parameters. Should these events be in interacting regions, they may dominate the fitted model interaction. The investigation of the Japanese data (Lu et al. 1999a) provided two contrasting experiences. The Ansei twin events (1854 December 23 and 24), being in the same region, could be treated as a single event (Seno 1979), without which the numerical fitting procedure would not converge. On the other hand, the observation of Kanaori et al. 
(1993) that events in the Chubu/Kinki triangle region were followed coevally or slightly afterward by events on the Median Tectonic line was rejected as an interaction by the model, perhaps due to the inclusion of additional regions in the study. Lu & Vere-Jones (2000) found that the closeness in time of the Buller (1929, M = 7.8) and Napier (1931, M = 7.8) events, the largest in their data set, dominated the fitting of a two-region model for New Zealand earthquakes. The example from north China in the following section will further illustrate this sensitivity to individual events.

The growing number of studies using the linked stress release model allows us to look at the amount of data required for the fitting to converge. Table 5 shows the studies to date. The Southern California data did not allow for the desired number of interaction parameters, given the fault network involved. Sometimes the details of the data (particularly in terms of coverage and near-contemporaneous events) are more important than the amount.

3.3 Magnitude–stress relation

We remarked in the introduction that the stress drop is proportional to some power of the released energy, leading to the relation (2). Zheng & Vere-Jones (1991) examined the sensitivity to η with the north China data in terms of the AIC of the resulting fit. For a range of 0.5 ≤ η ≤ 1.5 the fit was little affected by the choice of η, with η = 0.75 being the optimum. However, this is a data set with a magnitude cut-off of 6.0. One can envisage problems if this magnitude cut-off is not very high. On the other hand, Imoto (2001) reported that η plays a crucial role in the fit to the Nankai earthquake sequence, with smaller η being preferred. Schoenberg & Bolt (2000) use as their ‘long-term correcting’ intensity term $\int_0^t \nu\, dN(s)$, which corresponds to a value η = 0.
However, they do not decluster the catalogue, or apply a magnitude cut-off, and so the absence of a magnitude-weighted decrease in the intensity is compensated for by the larger number of aftershocks for large events. We also observe that using η = 1.5 allows the input parameters ρ to be determined exogenously from geodetic measurements, but is likely to magnify the sensitivity of the model to errors in magnitude.

[Figure 3: contour plot; horizontal axis: eta (0.5 to 1.5); vertical axis: Mc (5 to 6.4); contour levels from 0.0002 to 0.0012.]
Figure 3. Contour plot of the trend $b$ in the SRM with varying magnitude cut-off and stress as power of energy ($\eta = M^{-1} \log_{10} S$).

In order to examine the relationship between η and the magnitude cut-off $M_c$, we conducted the following experiment. Taking the PDE catalogue for 30°N ≤ latitude ≤ 44°N and 112°W ≤ longitude ≤ 126°W, a region including California and parts of neighbouring states, we removed the aftershocks via the M8 procedure, leaving a catalogue with 202 mainshocks with M ≥ 5.0. We then proceeded to vary $M_c$ from 5.0 to 6.4 in steps of 0.2. The preferred value of η, by AIC, varied from 0.5 to 0.6. In no case, however, was the fit significantly better than for η = 0.75, although it was significantly better than with η = 1.5 for $M_c$ ≤ 5.6. As one would expect of the parameters in eq. (5), $a$ decreases with increasing $M_c$ and $c$ decreases with increasing η. Oddly, in the California data, there appears to be a discontinuity around $M_c$ = 6, resulting in a smaller than expected value of the trend parameter $b$, as shown in Fig. 3.

By examining AIC/N (Imoto 2001), we can consider the optimum cut-off $M_c$. We find that the lower cut-offs are preferred, irrespective of η, when the numerical fitting procedure converges. High η and low $M_c$ can lead to instability.
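The remark that η = 1.5 is likely to magnify the sensitivity to magnitude errors can be made concrete with a short calculation. The sketch below assumes only the scaling given in the caption of Fig. 3, $\log_{10} S = \eta M$, so that a magnitude error δ multiplies the stress drop by $10^{\eta\delta}$. The function name `stress_error_factor` is hypothetical, and the error of 0.5 magnitude units is the figure quoted in Section 3.2 for pre-instrumental catalogues.

```python
def stress_error_factor(eta, magnitude_error):
    """Multiplicative change in the stress drop S caused by a magnitude
    error, assuming log10(S) = eta * M (the scaling in the Fig. 3 caption)."""
    return 10.0 ** (eta * magnitude_error)


# Effect of a 0.5-unit magnitude error for the values of eta discussed above.
for eta in (0.5, 0.75, 1.5):
    factor = stress_error_factor(eta, 0.5)
    print(f"eta = {eta}: stress drop changes by a factor of {factor:.2f}")
```

Under this assumed scaling the same magnitude error changes the stress drop by a factor of about 1.8 for η = 0.5, about 2.4 for η = 0.75 and about 5.6 for η = 1.5.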
A clue as to what is happening is provided by Jaumé & Bebbington (2000) who incorporated the Kagan distribution (Vere-Jones et al. 2001, see also Kagan 1997a)

$$\Pr(X > x) = \left(1 + \frac{x}{L}\right)^{-\alpha} e^{-x/U} \qquad (18)$$

into the simulation of magnitudes from the stress release model. When $U$ is dependent on the stress level of the process and η = 1.5, they found that the stress release model produces accelerating moment release behaviour (see Jaumé & Sykes 1999, and references therein). This is the opposite to the behaviour that is observed above, which corresponds to the process trying to minimize variation by treating all events as being of similar size, which is facilitated by providing more, more evenly scattered, events. This phenomenon was recognized for η = 0.75 by Zheng & Vere-Jones (1994), who noted that the output from the stress release model is clustered, with quiescent periods following larger events, rather than being periodic.

Data limitations, numerical stability and a desire for consistency thus incline us towards η = 0.75, particularly with medium- to long-term hazard forecasting in mind. Jaumé & Sykes (1999) find a similar phenomenon with the accelerating moment release model, noting that the use of a restricted magnitude interval appears to favour the use of Benioff strain, a state of affairs that usually also applies to the stress release model.

4 EXAMPLE: NORTH CHINA

For details of the data, geology and tectonics we direct the reader to Vere-Jones & Deng (1988), Zheng & Vere-Jones (1991, 1994) and Zhuang & Ma (1998). Briefly, the data are considered complete (Gu 1983a,b), where M is the instrumental magnitude estimated from historical evidence of the intensity and extent of shaking. The catalogue has been declustered, but the given mainshock magnitude is an equivalent magnitude calculated by adding the stress drops for the sequence and inverting eq. (2) (Vere-Jones & Deng 1988).
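The equivalent magnitude described above can be sketched as follows. This is a minimal illustration which assumes that eq. (2) has the form $S = 10^{\eta (M - M_0)}$ for some reference magnitude $M_0$, in which case $M_0$ cancels when the summed stress drop is inverted. The function name `equivalent_magnitude` and the example sequence are hypothetical, and η = 0.75 is used as the default only because it is the value preferred in Section 3.3.

```python
import math


def equivalent_magnitude(magnitudes, eta=0.75):
    """Magnitude of a single event releasing the same total stress as the
    whole sequence, assuming S = 10**(eta * (M - M0)); M0 cancels out."""
    if not magnitudes:
        raise ValueError("at least one event is required")
    # Sum the stress drops of the mainshock and its identified aftershocks.
    total_stress = sum(10.0 ** (eta * m) for m in magnitudes)
    # Invert the assumed magnitude-stress relation.
    return math.log10(total_stress) / eta


# Hypothetical sequence: a mainshock with two aftershocks.
sequence = [7.0, 6.2, 6.0]
print(equivalent_magnitude(sequence))
```

Because the stress drop grows exponentially with magnitude, the equivalent magnitude exceeds the mainshock magnitude only modestly unless the aftershocks are of comparable size.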
4.1 Regionalization and model evaluation

Geophysical considerations (Eguchi 1983; Li & Liu 1986) argue for there being two or four distinct regions. Three classifications of events into regions were considered: (A) that of Zheng & Vere-Jones (1994), (B) that of Zhuang & Ma (1998), which are based on slightly different geophysical interpretations, and (C) via a clustering algorithm using a centroid criterion. The latter is included, even though using the data in this way adds several degrees of freedom to the model, in order to see if there is any measurable effect. The divisions are broadly similar, differing by between two and six events (out of 65). This provides an opportunity to test the utility of a clustering algorithm as a determinator of regions, and to examine the sensitivity of the models fitted, in both quantitative and qualitative respects. Secondary objectives are to check the results of Liu et al. (1998), and to illustrate the systematic fitting procedures elucidated above. The data are presented in Table B2 and Fig. 4, identifying those events which differ in region under different classifications.

[Figure 4: map; horizontal axis: Longitude (105 to 120); vertical axis: Latitude (32 to 42); legend: Region 1, Region 2, Region 3, Region 4; symbol size scales with magnitude.]
Figure 4. North China earthquakes 1480–1996. The crosses mark events that are allocated to different regions in the different data sets.

First, for all classifications, all regions were best fitted by the stress release model rather than the Poisson process or Poisson with trend models. There was one exception, indicated by an asterisk in Table 6, where the Poisson with trend was marginally, but not significantly (ΔAIC = 0.4), better than the stress release model. In general the Poisson with trend was outperformed by the Poisson process, indicating that the data set seems to be stationary. Numbering the regions 1–4 from west to east, we then considered pairing regions 1 and 2, and regions 3 and 4, to obtain two regions. Again the stress release model was the best fit, a pattern repeated when the data were considered as a single region. The details of the best stress release models are presented in Table 6.

Table 6. North China: SRM parameters and fits by region and classification. A plus indicates combined regions. The asterisk indicates the SRM was not the best fit. The AIC for the Poisson process is denoted AIC_P.
Reg.   Class.   N     α        ν       ρ       AIC      AIC_P
1      All      20   −4.728   0.033   0.487   168.5    172.1
2      A        12   −2.797   0.042   0.138   115.1    116.3
2      B        14   −2.666   0.029   0.121   129.9    131.1
2      C        11   −2.751   0.030   0.075   107.2*   108.7
3      A        21   −3.940   0.083   0.200   173.0    178.5
3      B        19   −4.051   0.070   0.189   163.5    165.5
3      C        19   −3.811   0.079   0.187   163.0    165.3
4      A&B      12   −5.199   0.053   0.350   112.0    116.3
4      C        15   −5.438   0.060   0.370   128.8    138.2
1+2    A        32   −3.154   0.025   0.626   241.7    244.1
1+2    B        34   −3.049   0.027   0.636   251.1    255.1
1+2    C        31   −3.121   0.026   0.612   235.5    238.5
3+4    A        33   −3.450   0.024   0.540   242.3    249.6
3+4    B        31   −3.539   0.022   0.540   233.6    238.5
3+4    C        34   −3.472   0.023   0.553   248.1    255.1
All    All      65   −2.462   0.010   1.176   397.7    401.6

The first question we must ask is how many regions are justifiable. The AIC can provide the answer, by determining the best combined (over the regions) AIC. In order to compare models with fewer regions, we have to add the multinomial MLE term $\sum_{i \in R} N_i \log\left(N_i / \sum_{j \in R} N_j\right)$ to the log-likelihood, and hence to the AIC, where $R$ is any combined region. In addition, we must add 4 to the AIC for the two additional parameters for the region in the two-region model (one additional parameter in the four-region model). This produces the AICs in Table 7, from which we see that division into four regions is justified by the consequent improvement in AIC, the regional parameters being sufficiently different.

Table 7. North China: composite AICs for differing number of regions. The AIC for the corresponding Poisson process model is given in parentheses.
Number of regions   Classification A   Classification B   Classification C
4                   568.5 (583.2)      573.8 (585.0)      567.5 (584.3)
2                   573.4 (583.3)      576.1 (585.1)      574.6 (584.6)
1                   575.4 (579.3)      577.1 (581.0)      576.7 (580.6)

We can now turn to the linked stress release model, with the objective of bettering the AICs in the first row of Table 7. The fitted AICs are in Table 8. The matrix $D = (d_{ij})$ is such that $d_{ij} = 0$ if $|i - j| \neq 1$, and $L$ and $U$ are strictly lower and upper triangular, respectively.

Table 8. North China: AICs for various linked models.
Θ         ρ     k     A       B       C
I         ρ1    9     578.5   583.1   583.7
I         ρ     12    568.5   573.8   567.5
I+D       ρ1    15    575.1   578.6   572.5
I+D       ρ     18    578.5   582.1   575.6
I+L+U     ρ1    21    576.1   580.0   583.7
I+L+U     ρ     24    581.0   583.6   584.0

We see that none of the linked models improve on the AICs of the aggregate of four independent stress release models. Note that a common ρ is decidedly rejected for the independent model, but accepted for the interacting models, indicating that variations in activity are better explained by differing tectonic inputs than by interactions. This has not been observed in other cases so far. The evidence in Tables 7 and 8 seems to be that regionalization B is significantly less in line with the observed data than either of the others. This of course supposes that the model is appropriate.

Table 9. North China: preferred interactions by regionalization. All parameter sets with an AIC within 2 of the best are given.
Classification A        Classification B        Classification C
AIC     i: θ_i ≠ 0      AIC     i: θ_i ≠ 0      AIC     i: θ_i ≠ 0
570.8   146             573.1   145             567.3   146
571.0   143             574.2   134             568.4   126
571.4   145             574.6   456             568.7   16
571.7   16              574.7   45              568.9   1456
571.8   1246            574.8   1345            569.2   134
571.8   1456            574.9   1245            569.3   136
571.9   126             575.0   1456            569.3   1346
572.3   1345            575.1   346
572.5   1346

The model (α, ν, ρ = ρ1, Θ = I+L+U) for C seems to have an AIC out of character with the remainder. This is another example of the influence of particular event pairs referred to above. In moving to regionalization C, the second of three events in 1976 is moved from region 3 to region 4, while the first and last remain in region 2. Thus in the regionalization C, a large weight (because of the short time intervals) is given to the non-neighbouring transfers. There were also two events in 1624, the first of which moves from region 4 to region 3 while the second stays in region 1 (see Table B2). Conversely, the regionalization B moves one of two events in 1618, but to the same region as the other.

Because our objective here is different, we will subtly alter the procedure to find the best model, rather than follow the algorithm outlined above. The most promising interactive model seems to be (α, ν, ρ = ρ1, Θ = I+D) and so we will ask what pattern of interactions is preferred by the three regionalizations. Setting

$$\Theta = I + D = \begin{pmatrix} 1 & \theta_1 & 0 & 0 \\ \theta_4 & 1 & \theta_2 & 0 \\ 0 & \theta_5 & 1 & \theta_3 \\ 0 & 0 & \theta_6 & 1 \end{pmatrix}, \qquad (19)$$

we will fit every possible ($2^6$) combination of the parameters $\theta_1, \ldots, \theta_6$. The best AIC values, and their corresponding non-zero parameter sets ($i$: $\theta_i \neq 0$), are given in Table 9. We see that $\theta_1$ is the most important interaction in regionalizations A and C, followed by $\theta_4$ and $\theta_6$ ($\theta_6$ and $\theta_4$ in C). However, in regionalization B, the most important interaction is $\theta_4$, followed by $\theta_5$ and $\theta_1$. This appears to be related to the 1618 November 16 and 1626 June 28 events. In the first case, a third successive event in region 2 requires extra transfer of stress from the previous event in region 1 via $\theta_4$. On the other hand, these three successive events would affect region 1 too greatly if $\theta_1$ were important. In the second case, the new region 2 event can be more easily explained by transfer from region 1 via $\theta_4$ than as a region 3 event influenced by transfer from region 2 or 4 via $\theta_5$ or $\theta_3$.

The fitted input and transfer parameters, for the models in the first line of Table 9, are given in Table 10. Fig. 5 shows the corresponding residual processes, none of which demonstrate any systematic deficiency in the model(s) as they are not significantly different from Poisson processes of unit rate.

Table 10. North China: best models by regionalization.
Reg.   ρ       θ_12     θ_21    θ_32    θ_43
A      0.204   −1.534   0.140   0       −0.633
B      0.349   −0.634   0.429   0.759   0
C      0.189   −1.624   0.242   0       −0.857

We note that none of the models in Table 9 improve significantly on the baseline AIC given in Table 7, with the possible exception of the best model for regionalization B. Hence we must conclude that there is little significant interaction between the regions, and that the regionalization can affect the qualitative results from the model. The latter conclusion is illustrated by Fig. 6 which shows the quite different fitted intensities for the best linked models for regionalizations B and C.

Following Vere-Jones (1998), we simulated each of the best models listed in Table 10 in order to estimate their ‘theoretical’ information gains. Each model was simulated forward 100 times, each for 10 000 yr. The information gains for the simulated data were then calculated. The information gains per year for models A, B and C were 0.04635, 0.04619 and 0.04955, respectively. Similarly the information gains per event for models A, B and C were 0.375, 0.347 and 0.423, respectively.
These measure the predictability of the model, as encapsulated in the parameter values estimated using the observed historical data. It appears that the higher information gains are derived from smaller fitted values of ρ (see Table 10), the tectonic input term, and larger absolute values in the transfer matrix Θ. Generally, larger tectonic input means that less impetus needs to come from transfer. Conversely, large transfer values mean that prior events have a large effect on the likelihood of (especially immediately) subsequent events, which is exactly the source of information gain.

The calculated information gains using the historical data sets were 0.02856, 0.02851 and 0.03557 per year for models A, B and C, respectively. Similarly, the values per event were 0.2272, 0.2267 and 0.2829, respectively. These, relative to the upper bounds above, indicate the goodness of fit of the model to the actual historical data. The results are consistent with the approximately equal model AICs. While Vere-Jones (1998) found that the single-region stress release model for all of north China achieves an information gain of 0.40 per success (against a theoretical maximum of 0.48), in that case there are only temporal interactions between events. In decomposing space into four regions, we seem to have already extracted a large amount of the potential information. This is in line with the results of Kagan & Jackson (2000) who find that a simple spatial clustering model outperforms the Poisson process in forecasting earthquakes.

4.2 Sensitivity to catalogue errors

We will now investigate the sensitivity to magnitude error and magnitude cut-off/catalogue completeness. Since there appears to be little relevant difference between the regionalizations, we will limit our consideration to the two best models for regionalization A: the unlinked model (α, ν, ρ, Θ = I), and the best linked model in Table 10, with AICs of 568.5 and 570.8, respectively. Both models have 12 parameters.
First, we shall conduct a Monte Carlo simulation, adding an $N(0, 0.25^2)$ or $N(0, 0.5^2)$ error to the magnitude and refitting the model parameters. The results of 1000 simulations are given in Table 11. We see that larger errors result in a greater perturbation in the fitted parameters. The effect is greater for the linked than the unlinked model, perhaps reflecting the fact that the former is an inferior fit to the original data. On a qualitative level, the negative (exciting) transfer $\theta_{12}$ appears more definite than the other transfers, in that the confidence interval confirms a non-zero value. The remaining transfers are clearly magnitude-error sensitive. On a quantitative level, we see that the model is clearly sensitive to errors in reported magnitudes.

[Figure 5: four panels (Region 1, Region 2, Region 3, Region 4); horizontal axis: Transformed Time; vertical axis: Cumulative Number of Events.]
Figure 5. North China: residual processes for the best LSRM. Regionalizations A, B and C are indicated by circles, triangles and plus signs, respectively. 95 per cent confidence intervals for stationarity are at approximately ±3.6 (N = 12) to ±4.2 (N = 21) on the vertical axis. All the transformed interevent times were indistinguishable from exponential mean one random variables, and no significant correlations were found.
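The magnitude-perturbation experiment summarized in Table 11 can be sketched in outline as follows. This is not the fitting code used for the paper: the refitting step is passed in as a user-supplied function `fit`, and the stand-in `toy_fit` (which simply returns the mean magnitude) and the four-event catalogue are hypothetical placeholders that only make the sketch runnable. The function name `monte_carlo_intervals` is likewise an assumption.

```python
import random
import statistics


def monte_carlo_intervals(magnitudes, sigma, fit, n_sim=1000, seed=0):
    """Add N(0, sigma^2) errors to every magnitude, refit, and return the
    2.5 and 97.5 per cent points of each refitted parameter."""
    rng = random.Random(seed)
    refits = []
    for _ in range(n_sim):
        noisy = [m + rng.gauss(0.0, sigma) for m in magnitudes]
        refits.append(fit(noisy))
    intervals = []
    for values in zip(*refits):  # one tuple of refitted values per parameter
        cuts = statistics.quantiles(values, n=40)  # cut points every 2.5 per cent
        intervals.append((cuts[0], cuts[-1]))
    return intervals


def toy_fit(mags):
    """Hypothetical stand-in for the numerical LSRM fit: one 'parameter'."""
    return (statistics.mean(mags),)


catalogue = [6.0, 6.5, 7.0, 7.5]  # hypothetical magnitudes
narrow = monte_carlo_intervals(catalogue, 0.25, toy_fit, n_sim=500)
wide = monte_carlo_intervals(catalogue, 0.5, toy_fit, n_sim=500)
print(narrow, wide)
```

In a real application `fit` would be the numerical maximization of the log-likelihood, and simulations for which that procedure fails to converge would have to be counted separately, as is done in the caption of Table 11.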
[Figure 6: four panels (Region 1, Region 2, Region 3, Region 4); horizontal axis: Time (1500 to 2000); left axis: Conditional Intensity; right axis: Magnitude.]
Figure 6. North China: fitted intensities for the best models (B and C) in Table 9. Solid vertical lines depict magnitudes of events in a given region for both regionalizations, dashed vertical lines in regionalization B only, dotted in C only. The dotted intensity is the best model for regionalization B, and the dashed the best model for C.

Table 11. North China: Monte Carlo simulated magnitude errors. Shown are the original fitted parameter values, and 95 per cent confidence intervals from the simulated catalogues. Approximately 16 per cent of the simulations for the unlinked + $N(0, 0.5^2)$ model failed to have the fitting procedure converge.

Unlinked model
Param.   Fitted    N(0, 0.25²)           N(0, 0.5²)
α1       −4.728    (−6.765, −3.848)      (−6.564, −3.618)
α2       −2.797    (−3.155, −2.502)      (−3.721, −2.154)
α3       −3.940    (−4.330, −3.338)      (−4.401, −2.519)
α4       −5.199    (−6.155, −4.673)      (−6.332, −4.121)
ν1        0.033    (0.006, 0.081)        (0.001, 0.079)
ν2        0.042    (0.011, 0.104)        (0.004, 0.154)
ν3        0.083    (0.034, 0.139)        (0.009, 0.143)
ν4        0.053    (0.017, 0.170)        (0.007, 0.248)
ρ1        0.487    (0.388, 0.812)        (0.361, 1.991)
ρ2        0.138    (−0.005, 0.249)       (−0.383, 0.478)
ρ3        0.200    (0.144, 0.336)        (0.122, 0.649)
ρ4        0.350    (0.189, 0.704)        (0.135, 1.451)

Linked model
Param.   Fitted    N(0, 0.25²)           N(0, 0.5²)
α1       −4.507    (−7.610, −3.790)      (−11.177, −3.512)
α2       −2.889    (−4.251, −2.675)      (−4.222, −2.367)
α3       −4.034    (−4.398, −3.267)      (−4.427, −2.714)
α4       −4.807    (−6.107, −4.042)      (−6.491, −3.681)
ν1        0.017    (0.004, 0.076)        (0.001, 0.083)
ν2        0.048    (0.012, 0.133)        (0.001, 0.164)
ν3        0.088    (0.001, 0.123)        (0.000, 0.133)
ν4        0.061    (0.016, 0.153)        (0.004, 0.185)
ρ         0.204    (0.162, 0.597)        (0.135, 0.625)
θ12      −1.534    (−4.018, −0.071)      (−12.454, −0.031)
θ21       0.140    (−0.000, 0.802)       (−0.265, 1.868)
θ43      −0.633    (−1.860, 1.003)       (−4.802, 0.711)

The question of magnitude cut-off is bound up with that of catalogue completeness and declustering. At the usual magnitude cut-offs of $M_c$ = 6.0 or 6.5, aftershocks can be fairly readily identified in that different declustering algorithms are unlikely to result in different output catalogues. In any case, the quantity used in the model is the equivalent magnitude calculated from the sum of the stress drops in a sequence. This is required by the formulation of the model, which, as the intensity decreases after an event, does not otherwise allow for aftershocks.

Here we will investigate the effects on the fitted linked model of changing $M_c$. Because a high-magnitude cut-off is implicit in the model formulation, so that we want to look at higher cut-offs, and because the data set is so small, we will first simulate the model (see the appendix for details) to obtain a record of approximately 1000 events, and then check whether raising the cut-off retains a fitted model consistent with that simulated. The results are given in Table 12. We see that the basic character of the model is unaffected by the magnitude cut-off. The systematic decrease in $\{\nu_i\}$ and ρ is due to the decrease in the rate of events, and the decrease in the total stress release with fewer events, respectively. Differences in the fitted values of $\{\theta_{ij}\}$ are small, and do not affect the sign.

Table 12. North China: effect of varying magnitude cut-off. Shown are the original parameter values used to simulate the catalogue, and the resulting fitted values. N is the number of events in the catalogue.
                    Magnitude cut-off M_c
Param.   Fitted    6.0          6.2         6.5         6.7
                   (N = 1126)   (N = 753)   (N = 632)   (N = 508)
α1       −4.507    −4.442       −4.444      −4.739      −4.857
α2       −2.889    −2.956       −3.340      −3.303      −3.267
α3       −4.034    −4.069       −4.571      −5.312      −5.784
α4       −4.807    −4.531       −5.070      −4.732      −4.710
ν1        0.017     0.017        0.014       0.013       0.012
ν2        0.048     0.042        0.041       0.038       0.032
ν3        0.088     0.089        0.090       0.081       0.078
ν4        0.061     0.059        0.057       0.053       0.044
ρ         0.204     0.204        0.194       0.188       0.176
θ12      −1.534    −1.538       −1.505      −1.551      −1.692
θ21       0.140     0.141        0.125       0.123       0.115
θ43      −0.633    −0.632       −0.675      −0.718      −0.808

We have examined deleting each of the events from the historical catalogue, and as might be expected from the experiments with magnitude errors, and region perturbation, the resulting fitted model is very different for a few very important events, and little different when other events are deleted. However, this is not a statistical experiment, and we have no way to randomly add events to the catalogue. Hence the same simulated catalogue was used to study the effects of catalogue completeness, by randomly deleting events to produce missing data. Events were deleted with probability 0.05 (M < 6.5), 0.02 (6.5 ≤ M < 7.0) and 0.01 (M ≥ 7.0), respectively. The results for 1000 realizations of the deletion and refitting process are given in Table 13. Again, the basic character of the model is unaffected. The decreases in $\{\nu_i\}$ and ρ follow from there being fewer events, and thus a smaller stress release.

Table 13. North China: effect of randomly deleting events (catalogue incompleteness). Shown are the original parameter values used to simulate the catalogue, and 95 per cent confidence intervals for the refitted parameters after deletion.
Param.   Fitted    Refitted
α1       −4.507    (−5.026, −3.424)
α2       −2.889    (−4.029, −2.199)
α3       −4.034    (−4.776, −3.292)
α4       −4.807    (−5.210, −3.406)
ν1        0.017    (0.008, 0.020)
ν2        0.048    (0.011, 0.053)
ν3        0.088    (0.011, 0.096)
ν4        0.061    (0.001, 0.065)
ρ         0.204    (0.196, 0.204)
θ12      −1.534    (−1.615, −1.430)
θ21       0.140    (0.121, 0.163)
θ43      −0.633    (−0.681, −0.482)

The question remains as to how sensitive the model is overall. The important parameters, from the view of hazard forecasting, are $\{\nu_i\}$ and $\{\theta_{ij}\}$. The first gives the reduction in hazard following a local event, and in combination with the second, gives the enhancement or reduction in hazard following an event elsewhere. On this criterion, we find that the model can be very sensitive to slight perturbations in assigning events to regions (Table 10), and to magnitude errors in the catalogue. These are exacerbated by the small number of events in the data set, and the consequent domination of the fitted model by a few of them. In experiments with a larger simulated catalogue, the model does not appear to be as sensitive to magnitude cut-off or to random deletion of events.

4.3 Precision

On the presumption of correct data, simulation and refitting may also offer a possible idea of the amount of data that might be required, or equivalently, the expected precision from a given amount of data. We will use the four-region, 12-parameter model for north China (Table 9, best model for regionalization C), and for comparison, the two-region, eight-parameter model for Taiwan (Bebbington & Harte 2001, see eq. A3). Simulating and refitting each model 1000 times for each of 25 (Taiwan only), 50, 100, 150 and 200 (north China only) events produced the results in Table 14.

Table 14. Parameter tightness by number of observations. N = 25 (Taiwan) and N = 50 (north China) did not converge in all cases.

Taiwan
                   Interquartile range
Parameter          N = 25   N = 50   N = 100   N = 150
a1 = −1.447        1.451    1.068    0.827     0.712
a2 = −2.271        1.812    1.081    0.728     0.606
b1 = 0.545         0.461    0.289    0.180     0.136
b2 = 0.134         0.212    0.107    0.065     0.051
c11 = 0.678        0.153    0.103    0.079     0.061
c12 = 0.250        0.056    0.023    0.014     0.010
c21 = 0.211        0.689    0.421    0.249     0.202
c22 = 0.323        0.177    0.086    0.041     0.328

North China
                   Interquartile range
Parameter          N = 50   N = 100   N = 150   N = 200
a1 = −4.473        1.733    1.029     0.753     0.632
a2 = −2.871        0.915    0.686     0.615     0.549
a3 = −3.858        1.250    0.706     0.554     0.456
a4 = −5.426        2.191    0.952     0.753     0.624
b1 = 0.003         0.006    0.002     0.001     0.001
b2 = 0.007         0.012    0.006     0.004     0.003
b3 = 0.015         0.018    0.009     0.006     0.005
b4 = 0.017         0.019    0.007     0.004     0.004
cii = 5.278        1.161    0.376     0.205     0.131
c12 = −8.573       7.451    3.177     1.879     1.228
c21 = 1.278        1.478    0.626     0.347     0.238
c43 = −4.521       2.722    0.839     0.440     0.274

First, the $\{a_i\}$ terms are effectively the initial stress level. This is very sensitive to the elapsed time to the first event, hence the large interquartile ranges. Fortunately, however, we are not usually interested in these parameters, although care should be exercised when using the model for forecasting. Most of our interest is concentrated on the interaction terms $\{c_{ij}, i \neq j\}$ and the tectonic input term $c_{ii}$ (see the remark preceding eq. 10). For Taiwan, Bebbington & Harte (2001) observed that the $c_{21}$ term (influence of Eurasian Plate earthquakes on Philippine Sea Plate earthquakes) is of doubtful significance, a pattern that is repeated here.
The reverse interaction appears well founded even for N = 25, and so it is the {bi} terms that need to be examined. We see that while b1 is relatively consistent for small N, b2 requires closer to N = 50 (see Bebbington & Harte 2001, for comments on the relationship between bi and cij). Overall, a two-region, seven- or eight-parameter model requires of the order of 35–40 observations.

Similar considerations apply to the north China model, although {bi} are much better behaved because of the restricted set of interactions {cij}. Here it is the {cij, i ≠ j} terms, in particular c12 and c21, that control matters. It appears that a four-region 12-parameter model requires somewhere between 50 and 100 observations. The fact that we have only 65 observations explains the somewhat inconclusive models we fitted earlier.

5 DISCUSSION

We have reviewed the definition and applications of the linked stress release model. While it has enjoyed some success with Chinese, Japanese and Persian data, subtle aspects of data from other regions have made its widespread application difficult. In addition, there has been no systematic procedure for applying the model, or for evaluating the results. This paper attempts to address these deficiencies, and by investigating certain technical aspects of the model, shed light on the sensitivity to data anomalies and quality.

The computational aspects of fitting the model are outlined, along with certain ‘goodness-of-fit’ indicators that arise therein. These tools of the Hessian, compensator, information gain and Monte Carlo simulation provide valuable information concerning the significance of the results. They are illustrated at various points in the paper using data from Persia and north China. In some cases, the amount of data relative to the number of parameters is less important than the ‘niceness’ of the data in facilitating numerical convergence.
Examination of previous studies and a small simulation exercise provide some feel for the amount of data desirable for models of various complexity. It appears that the required number of events increases slightly faster than linearly with the number of fitted parameters. Models with more parameters have a more complex interaction matrix. Since the model may be overparametrized, denser matrices can contain a complex correlation structure between parameters that causes fitting problems, thus requiring more data to ensure convergence.

The question of whether the model works best with Benioff strain (‘stress’) or seismic moment seems to have the same answer as the accelerating moment release model: for some unknown reason, stress seems to be the variable that works. The model appears, at least for the small data sets used, to be quite sensitive to possible errors in the observed magnitudes. This is not unexpected, as an error of 0.5 in magnitude results in a factor of 5 error in Benioff strain. The fact that the same magnitude error results in a factor of 30 error in seismic moment may contribute to the model working better with Benioff strain as the underlying variable.

We provide a systematic procedure for applying the model, illustrated by data from Persia. This appears to show that apparent alternation of activity between two of the regions may be related to the activity in the Zagros region. The question of how to identify regions, and more importantly, the sensitivity of conclusions to this regionalization, is examined with reference to north China. A clustering algorithm, in other words having the data determine the regions, is shown to provide a similar outcome to geophysical considerations, but the qualitative results are dependent on the regionalization, primarily through covalent events.
Monte Carlo simulation experiments with the north China model show that the model is very sensitive to errors in the small catalogue, particularly in magnitude and completeness. These effects decrease, naturally enough, with larger simulated catalogues. There seems to be less dependence on the details of the declustering algorithm and the magnitude cut-off used.

A defining characteristic of the linked stress release model is that it incorporates the spatial dimension only through the regionalization. This does have an advantage in modelling spatial heterogeneity, for example dealing with faults. A more general spatio-temporal model, such as a time–space marked point process, would have to assume a spatially homogeneous influence function. The result would then be very similar to the model of Kagan & Jackson (2000). Moreover, the linked stress release model is more closely akin to a non-parametric approach, in having a large parameter set reduced to the statistically significant minimum by the data, rather than building a constrained physical model and then determining the parameter values. Rathbun (1996) has shown that maximum-likelihood estimators for a spatio-temporal point process are consistent and asymptotically normal, and fitted a spatio-temporal point process to California data. Rathbun (2002) has formulated a form of spatio-temporal stress release model, but numerical instability has hindered its application.

As a result of the sensitivity of the model to details of the catalogue and regionalization, with the small historical catalogues necessarily used, the low values of the information gain are not unexpected. As a forecasting tool, the lack of a true spatial dimension is a great hindrance given the size of the regions used. Nevertheless, many geophysical hypotheses are advanced at the level of regional interactions (see, for example, Ambraseys & Melville 1982; Thatcher 1984; Kanaori et al.
1993), and so are testable using the linked stress release model. Models such as that of Papadimitriou et al. (2001) can also be investigated in this manner. Until more data become available, the best use of the linked stress release model may be as a procedure for testing hypotheses on geophysical interactions, and as a diagnostic and validation tool for other, more complicated, models. This purpose will be facilitated by the examination of the statistical properties of the model contained in this paper.

ACKNOWLEDGMENTS

This work was supported by the Marsden Fund, administered by the Royal Society of New Zealand. Valuable discussion was provided by David Vere-Jones and Yehuda Ben-Zion. Reviews by Yan Kagan and an anonymous referee greatly improved the content and organization of the paper. The authors are grateful to the Institute for Mathematics and its Applications, University of Minnesota, for its hospitality.

REFERENCES

Aalen, O.O. & Hoem, J.M., 1978. Random time changes for multivariate counting processes, Scand. Actuarial J., 5, 81–101.
Akaike, H., 1977. On entropy maximization principle, in Applications of Statistics, pp. 27–41, ed. Krishnaiah, P.R., North-Holland, Amsterdam.
Aki, K., 1989. Ideal probabilistic earthquake prediction, Tectonophysics, 169, 197–198.
Ambraseys, N.N. & Melville, C.P., 1982. A History of Persian Earthquakes, Cambridge University Press, Cambridge, p. 219.
Bebbington, M., 1997. A hierarchical stress release model for synthetic seismicity, J. geophys. Res., 102, 11 677–11 687.
Bebbington, M. & Harte, D., 2001. On the statistics of the linked stress release process, J. Appl. Probab., 38A, 176–187.
Bebbington, M.S., Harte, D.S. & Vere-Jones, D., 1998. A linked stress release model for spatial seismicity, EOS, Trans. Am. geophys. Un., 79, F643.
Benioff, H., 1951. Crustal strain characteristics derived from earthquake sequences, Trans. AGU, 32, 508–514.
Ben-Zion, Y., 1996. Stress, slip and earthquakes in models of complex single-fault systems incorporating brittle and creep deformations, J. geophys. Res., 101, 5677–5706.
Borovkov, K. & Bebbington, M.S., 2003. A simple two-node stress transfer model reproducing Omori’s law, Pure appl. Geophys., 160, 1429–1445.
Borovkov, K. & Vere-Jones, D., 2000. Explicit formulae for stationary distributions of stress release processes, J. Appl. Probab., 37, 315–321.
Daley, D.J. & Vere-Jones, D., 1988. An Introduction to the Theory of Point Processes, Springer, Berlin, p. 702.
Dieterich, J., 1994. A constitutive law for rate of earthquake production and its application to earthquake clustering, J. geophys. Res., 99, 2601–2618.
Eguchi, T., 1983. Tectonic stress fields in East Eurasia, Phys. Earth planet. Inter., 33, 318–327.
Eneva, M. & Ben-Zion, Y., 1997a. Techniques and parameters to analyse seismicity patterns associated with large earthquakes, J. geophys. Res., 102, 17 785–17 795.
Eneva, M. & Ben-Zion, Y., 1997b. Application of pattern recognition techniques to earthquake catalogs generated by model of segmented fault systems in three-dimensional elastic solids, J. geophys. Res., 102, 24 513–24 528.
Field, E.H., Jackson, D.D. & Dolan, J.F., 1999. A mutually consistent seismic hazard model for southern California, Bull. seism. Soc. Am., 89, 559–578.
Gardner, J.K. & Knopoff, L., 1974. Is the sequence of earthquakes in southern California, with aftershocks removed, Poissonian?, Bull. seism. Soc. Am., 64, 1363–1367.
Gu, G., ed., 1983a. Chinese Earthquake Catalogue, Part I: 1831BC-1969AD, Scientific Press, Beijing.
Gu, G., ed., 1983b. Chinese Earthquake Catalogue, Part II: 1970–1979AD, Seismological Press, Beijing.
Harris, R.A., 1998. Introduction to special section: stress triggers, stress shadows, and implications for seismic hazard, J. geophys. Res., 103, 24 347–24 358.
Harris, R.A. & Simpson, R.W., 1996. In the shadow of 1857 - the effect of the great Ft. Tejon earthquake on subsequent earthquakes in southern California, Geophys. Res. Lett., 23, 229–232.
Harte, D.S., 1999. Documentation for the Statistical Seismology Library. Research Report 98/10, revised edition, School of Mathematical and Computing Sciences, Victoria University of Wellington, New Zealand.
Harte, D.S., 2001. Multifractals: Theory and Applications, Chapman and Hall/CRC, Boca Raton, p. 248.
Hill, D.P. et al., 1993. Seismicity remotely triggered by the magnitude 7.3 Landers, California, earthquake, Science, 260, 1617–1623.
Imoto, M., 2001. Application of the stress release model to the Nankai earthquake sequence, southwest Japan, Tectonophysics, 338, 287–295.
Imoto, M., Maeda, K. & Yoshida, A., 1999. Use of statistical models to analyze periodic seismicity observed for clusters in the Kanto region, central Japan, Pure appl. Geophys., 155, 609–624.
Jaumé, S. & Bebbington, M.S., 2000. Accelerating seismic moment release from modified stress release models, EOS, Trans. Am. geophys. Un., 48, F582.
Jaumé, S.C. & Sykes, L.R., 1999. Evolving towards a critical point: a review of accelerating seismic moment/energy release prior to large and great earthquakes, Pure appl. Geophys., 155, 279–305.
Kagan, Y.Y., 1991. Seismic moment distribution, Geophys. J. Int., 106, 123–134.
Kagan, Y.Y., 1997a. Seismic-moment frequency relationship for shallow earthquakes; regional comparisons, J. geophys. Res., 102, 2835–2852.
Kagan, Y.Y., 1997b. Are earthquakes predictable?, Geophys. J. Int., 131, 505–525.
Kagan, Y.Y., 2002. Seismic moment distribution revisited: I. Statistical results, Geophys. J. Int., 148, 521–542.
Kagan, Y.Y. & Jackson, D.D., 1991. Long-term earthquake clustering, Geophys. J. Int., 104, 117–133.
Kagan, Y.Y. & Jackson, D.D., 2000. Probabilistic forecasting of earthquakes, Geophys. J. Int., 143, 438–453.
Kagan, Y.Y. & Knopoff, L., 1977. Earthquake risk prediction as a stochastic process, Phys. Earth planet. Inter., 14, 97–108.
Kanamori, H., 1972. Relations among tectonic stress, great earthquakes and earthquake swarm, Tectonophysics, 14, 1–12.
Kanamori, H. & Anderson, D.L., 1975. Theoretical basis of some empirical relations in seismology, Bull. seism. Soc. Am., 65, 1073–1095.
Kanaori, Y., Kawakami, S. & Yairi, K., 1993. Space–time correlations between inland earthquakes in central Japan and great offshore earthquakes along the Nankai trough: implication for destructive earthquake prediction, Eng. Geol., 33, 289–303.
Kanaori, Y., Kawakami, S. & Yairi, K., 1994. Seismotectonics of the Median Tectonic Line in southwest Japan: implications for coupling among major fault systems, Pure appl. Geophys., 142, 589–607.
King, G.C.P., Stein, R.S. & Lin, J., 1994. Static stress changes and the triggering of earthquakes, Bull. seism. Soc. Am., 84, 935–953.
Kiremidjian, A. & Anagnos, T., 1984. Stochastic slip-predictable model for earthquake occurrences, Bull. seism. Soc. Am., 74, 739–755.
Knopoff, L., 1971. A stochastic model for the occurrence of main sequence events, Rev. Geophys. Space Phys., 9, 175–188.
Knopoff, L., 1996. A selective phenomenology of the seismicity of Southern California, Proc. Natl. Acad. Sci. USA, 93, 3756–3763.
Kossobokov, V.G., 1997. User manual for M8, in Algorithms for Earthquake Statistics and Prediction, pp. 167–222, eds Healy, J.H., Keilis-Borok, V.I. & Lee, W.H.K., IASPEI, Menlo Park.
Kullback, S., 1997. Information Theory and Statistics, p. 399, Dover, New York.
Li, C. & Kisslinger, C., 1985. Stress transfer and non-linear stress accumulation at subduction type plate boundaries - application to the Aleutians, Pure appl. Geophys., 122, 813–830.
Li, F.Q. & Liu, G.X., 1986. Stress state in the upper crust of the China mainland, J. Phys. Earth, 34, S71–S80.
Liu, J., Vere-Jones, D., Ma, L., Shi, Y. & Zhuang, J.C., 1998. The principal of coupled stress release model and its application, Acta Seismologica Sinica, 11, 273–281.
Liu, J., Chen, Y., Shi, Y. & Vere-Jones, D., 1999. Coupled stress release model for time dependent seismicity, Pure appl. Geophys., 155, 649–667.
Lu, C. & Vere-Jones, D., 2000. Application of linked stress release model to historical earthquake data: comparison between two kinds of tectonic seismicity, Pure appl. Geophys., 157, 2351–2364.
Lu, C. & Vere-Jones, D., 2001. Statistical analysis of synthetic earthquake catalogs generated by models with various levels of fault zone disorder, J. geophys. Res., 106, 11 115–11 125.
Lu, C., Harte, D. & Bebbington, M., 1999a. A linked stress release model for historical Japanese earthquakes: coupling among major seismic regions, Earth Planets Space, 51, 907–916.
Lu, C., Vere-Jones, D. & Takayasu, H., 1999b. Avalanche behaviour and statistical properties in a microcrack coalescence process, Phys. Rev. Lett., 82, 347–350.
Lu, C., Vere-Jones, D., Takayasu, H., Tretyakov, A.Y. & Takayasu, M., 1999c. Spatio-temporal seismicity in an elastic block lattice model, Fractals, 7, 301–311.
Lutz, K.A. & Kiremidjian, A.S., 1995. A stochastic model for spatially and temporally dependent earthquakes, Bull. seism. Soc. Am., 85, 1177–1189.
Main, I.G., 1999. Applicability of time-to-failure analysis to accelerated strain before earthquakes and volcanic eruptions, Geophys. J. Int., 139, F1–F6.
Nalbant, S.S., Hubert, A. & King, G.C.P., 1998. Stress coupling between earthquakes in northwest Turkey and the north Aegean sea, J. geophys. Res., 103, 24 469–24 486.
Ogata, Y., 1981. On Lewis’s simulation method for point processes, IEEE Trans. Inf. Theory, IT-27, 23–31.
Ogata, Y., 1988. Statistical models for earthquake occurrences and residual analysis for point processes, J. Amer. Statist. Assoc., 83, 9–27.
Ogata, Y. & Vere-Jones, D., 1984. Inference for earthquake models: a self-correcting model, Stoch. Proc. Appl., 17, 337–347.
Papadimitriou, E.E., Papazachos, C.B. & Tsapanos, T.M., 2001. Test and application of the time- and magnitude-predictable model to the intermediate and deep focus earthquakes in the subduction zones of the circum-Pacific belt, Tectonophysics, 330, 45–68.
Press, W.H., Flannery, B.P., Teukolsky, S.A. & Vetterling, W.T., 1986. Numerical Recipes, p. 818, Cambridge University Press, Cambridge.
Pollitz, F.F. & Sacks, I.S., 1995. Consequences of stress changes following the 1891 Nobi earthquake, Japan, Bull. seism. Soc. Am., 85, 796–807.
Pollitz, F.F. & Sacks, I.S., 1997. The 1995 Kobe, Japan, earthquake: a long-delayed aftershock of the offshore 1944 Tonankai and 1946 Nankaido earthquakes, Bull. seism. Soc. Am., 87, 1–10.
Rathbun, S.L., 1996. Asymptotic properties of the maximum likelihood estimator for spatio-temporal point processes, J. Statist. Plan. Infer., 51, 55–74.
Rathbun, S.L., 2002. A marked spatio-temporal point process model for California earthquakes, in Seminar Notes at the IMA Workshop on Point Process Modeling and Seismological Applications of Statistics, Minneapolis, Minnesota, 10–14 June 2002 (http://www.ima.umn.edul//talks/workshops/6-1014.2002/rathbun/Rathbun.pdf).
Reid, H.F., 1910. The mechanism of the earthquake, in The California Earthquake of April 18, 1906, Report of the State Earthquake Investigation Commission, Vol. 2, pp. 16–28, Carnegie Institute of Washington, Washington, DC.
Rényi, A., 1959. On the dimension and entropy of probability distributions, Acta Mathematica, 10, 193–215.
Rundle, J.B., 1988a. A physical model for earthquakes 1. Fluctuations and interactions, J. geophys. Res., 93, 6237–6254.
Rundle, J.B., 1988b. A physical model for earthquakes 2. Application to southern California, J. geophys. Res., 93, 6255–6274.
Schoenberg, F. & Bolt, B., 2000. Short-term exciting, long-term correcting models for earthquake catalogs, Bull. seism. Soc. Am., 90, 849–858.
Seno, T., 1979. Pattern of intraplate seismicity in Southwest Japan before and after great interplate earthquakes, Tectonophysics, 57, 267–283.
Shi, Y., Liu, J., Vere-Jones, D., Zhuang, J. & Ma, L., 1998. Application of mechanical and statistical models to the study of seismicity of synthetic earthquakes and the prediction of natural ones, Acta Seismologica Sinica, 11, 421–430.
Shimazaki, K., 1976. Intra-plate seismicity and inter-plate earthquakes: historical activity in southwest Japan, Tectonophysics, 33, 33–42.
Shimazaki, K. & Nakata, T., 1980. Time-predictable model for large earthquakes, Geophys. Res. Lett., 7, 279–282.
Sornette, D. & Sornette, A., 1999. General theory of the modified Gutenberg–Richter law for large seismic moments, Bull. seism. Soc. Am., 89, 1121–1030.
Stark, P.B., 1997. Earthquake prediction: the null hypothesis, Geophys. J. Int., 131, 495–499.
Stein, R.S., Barka, A.A. & Dieterich, J.H., 1997. Progressive failure on the North Anatolian fault since 1939 by earthquake stress triggering, Geophys. J. Int., 128, 594–604.
Thatcher, W., 1984. The earthquake deformation cycle at the Nankai Trough, J. geophys. Res., 89, 3087–3101.
Toda, S., Stein, R.S., Reasenberg, P.A., Dieterich, J.H. & Yoshida, A., 1998. Stress transferred by the 1995 MW = 6.9 Kobe, Japan, shock: effect on aftershocks and future earthquake probabilities, J. geophys. Res., 103, 24 543–24 565.
Utsu, T., 1999. Representation and analysis of the earthquake size distribution: a historical review and some new approaches, Pure appl. Geophys., 155, 509–535.
Vere-Jones, D., 1978. Earthquake prediction - a statistician’s view, J. Phys. Earth, 26, 129–146.
Vere-Jones, D., 1988. On the variance properties of stress-release models, Austral. J. Statist., 30A, 123–135.
Vere-Jones, D., 1995. Forecasting earthquakes and earthquake risk, Int. J. Forecasting, 11, 503–538.
Vere-Jones, D., 1998. Probabilities and information gain for earthquake forecasting, Comput. Seismol., 30, 248–263.
Vere-Jones, D. & Deng, Y.L., 1988. A point process analysis of historical earthquakes from North China, Earthquake Res. China, 2, 165–181.
Vere-Jones, D. & Ogata, Y., 1984. On the moments of a self-correcting process, J. Appl. Prob., 21, 335–342.
Vere-Jones, D., Robinson, R. & Yang, W., 2001. Remarks on the accelerated moment release model: problems of model formulation, simulation and estimation, Geophys. J. Int., 144, 517–531.
Walrand, J., 1988. An Introduction to Queueing Networks, p. 384, Prentice-Hall, Englewood Cliffs, NJ.
Wang, A., Vere-Jones, D. & Zheng, X., 1991. Simulation and estimation procedures for stress release models, in Stochastic Processes and Their Applications, Lecture Notes in Economics and Mathematical Systems, Vol. 370, pp. 11–27, eds Beckmann, M.J., Gopalan, M.N. & Subramanian, R., Springer, Berlin.
Ward, S.N., 1991. A synthetic seismicity model for the Middle America Trench, J. geophys. Res., 96, 21 433–21 442.
Ward, S.N., 1996. A synthetic seismicity model for southern California: cycles, probabilities, hazard, J. geophys. Res., 101, 22 393–22 418.
Working Group on California Earthquake Probabilities, 1995. Seismic hazards in southern California; probable earthquakes, 1994 to 2024, Bull. seism. Soc. Am., 85, 379–439.
Zheng, X., 1991. Ergodic theorems for stress release processes, Stoch. Proc. Appl., 38, 239–258.
Zheng, X. & Vere-Jones, D., 1991. Applications of stress release models to earthquakes from North China, Pure appl. Geophys., 135, 559–576.
Zheng, X. & Vere-Jones, D., 1994. Further applications of the stochastic stress release model to historical earthquake data, Tectonophysics, 229, 101–121.
Zhuang, J. & Ma, L., 1998. The stress release model and results from modelling features of some seismic regions in China, Acta Seismologica Sinica, 11, 59–70.
APPENDIX A: TECHNICAL NOTES

A1 Numerical likelihood maximization

Assume that the process has occurred for an infinite time into the past and that events have occurred at $t_k$, where $k = \dots, -2, -1, 0, 1, 2, \dots, n$, and $t_n$ is the time of the last observed event. It is usually the case, though, that the process has been observed over only part of this period, say in the interval $[T_0, T_2]$, where $t_0 < T_0 < t_1 < \dots < t_n < T_2$. This raises a potential problem with the conditional intensity function, which is conditional on the history of the process, i.e. $\lambda(t) = \lambda(t|\mathcal{H}_t)$. Since we have no recorded history before $T_0$, the calculated amount of stress released prior to $T_0$ is $S(T_0) = 0$. This is not a real problem in the case of the stress release model because the parameter $\alpha$ in eq. (9) describes the intensity at $t = 0$. In a more general situation where the model does not make this compensation, one possible strategy is to maximize the log-likelihood function over a smaller interval, say $[T_1, T_2]$, i.e.

$$\log L(T_1, T_2) = \sum_{k:\,T_1 \le t_k \le T_2} \log \lambda(t_k|\mathcal{H}_{t_k}) - \int_{T_1}^{T_2} \lambda(t|\mathcal{H}_t)\,dt, \tag{A1}$$

where $t_0 < T_0 < t_1 < \dots < t_m < T_1 < t_{m+1} < \dots < t_n < T_2$. That is, the first $m$ observed events are used as part of the history of the process (i.e. in calculating $\lambda(t|\mathcal{H}_t)$) but only the last $n - m$ events enter explicitly into the log-likelihood equation. This then gives the fitted process an initial settling-in period. In all models fitted in this paper, $T_0 = T_1$, i.e. $m = 0$.

Note that the values of $T_0$, $T_1$ and $T_2$ do not coincide with event times $t_k$. The length of time between, for example, $T_0$ and $t_1$, and also between $t_n$ and $T_2$, is part of the sample information. Hence perturbing these values will produce different parameter estimates, and is equivalent to perturbing one's sample in any other statistically sampled situation. Hence the values of $T_0$, $T_1$ and $T_2$ should be determined a priori and not with reference to the event times.
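As an illustration of eq. (A1), the following minimal sketch evaluates the log-likelihood for a single-region stress release model. It assumes the conditional intensity has the form $\lambda(t) = \exp\{\alpha + \nu[\rho t - S(t)]\}$, with $S(t)$ the stress released before time $t$; eq. (9) is not reproduced here, so this form, the function name `stress_release_loglik` and its arguments are assumptions of the sketch. Because $S(t)$ is constant between events, the integral term has a closed form on each inter-event interval. The sketch takes $T_0 = T_1$ (i.e. $m = 0$) and expects all event times to lie inside the observation window.

```python
import math


def stress_release_loglik(alpha, nu, rho, times, stresses, t_start, t_end):
    """Log-likelihood (A1) for an assumed intensity exp(alpha + nu*(rho*t - S(t))).

    times    -- sorted event times inside (t_start, t_end)
    stresses -- stress released by each event (same length as times)
    """
    rate = nu * rho

    def integrated_intensity(a, b, released):
        # Closed-form integral of the intensity over [a, b] with S(t) = released.
        base = math.exp(alpha - nu * released)
        if abs(rate) < 1e-12:
            return base * (b - a)
        return base * (math.exp(rate * b) - math.exp(rate * a)) / rate

    loglik = 0.0
    released = 0.0  # S(T_0) = 0: no recorded history before the window
    previous = t_start
    for t, s in zip(times, stresses):
        loglik -= integrated_intensity(previous, t, released)
        # The intensity at an event uses only stress released before that event.
        loglik += alpha + nu * (rho * t - released)
        released += s
        previous = t
    loglik -= integrated_intensity(previous, t_end, released)
    return loglik
```

With $\nu = 0$ the intensity is constant and the function reduces to the Poisson log-likelihood $n\alpha - e^{\alpha}(T_2 - T_1)$, which provides a quick check of the implementation.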
In all models fitted in this paper, $T_0$ and $T_2$ are set to the beginning and end of the first and last calendar years for which there are data, including null data, respectively. The parameters $\{\alpha_j\}$ then subsume the unknown time from $t_1$ to the previous event.

Sometimes individual data sets contain little information concerning certain model parameters. For example, parameters that are dependent on the characteristics of the seismic network, rather than the geophysical properties of the region under study, may be better determined using the collective information of many studies or specific knowledge of the capabilities of the seismic network. In these situations, a possibility is to maximize the posterior log-likelihood

$$\log Q(T_1, T_2) = \log L(T_1, T_2) + \sum_j \log f_j(\theta_j), \tag{A2}$$

where $f_j(\theta_j)$ is a prior density for the parameter $\theta_j$.

Fitting the model by maximizing the log-likelihood presents many computational difficulties. Some algorithms are good at determining the rough location of the maximum, but converge very slowly in close proximity to the maximum. Other algorithms that are based on hill climbing in the steepest direction appear to work when the initial starting location is sufficiently close to the solution, but can become hopelessly lost if the initial values are too far away. The relative scales in the parameter space can also cause considerable problems. For example, one parameter may have a possible domain that spans a number of orders of magnitude more than a second parameter. Thus, a step taken by the optimizer that is scaled in a manner to be appropriate for the first parameter may completely overshoot the maximum because of the finer scale in the second parameter. Most optimizers have an argument so that relative scales in the parameter space can be specified.
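The effect of relative scales can be seen in a small sketch. The toy surface below is hypothetical (it is not the stress release likelihood), and the helper names `rescale` and `central_gradient` are ours. The first parameter varies over units and the second over thousandths, so the raw gradient components differ by about three orders of magnitude; expressing the objective in unit-scale coordinates makes them comparable, which is in effect what a scale argument to an optimizer achieves.

```python
def rescale(objective, scales):
    """Express the objective in coordinates u, where x_i = u_i * scales_i."""
    def scaled_objective(u):
        return objective([ui * si for ui, si in zip(u, scales)])
    return scaled_objective


def central_gradient(func, point, step=1e-4):
    """Central-difference gradient with the same step in every coordinate."""
    grad = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += step
        down[i] -= step
        grad.append((func(up) - func(down)) / (2.0 * step))
    return grad


def toy_loglik(x):
    # Hypothetical surface with its maximum at (3, 0.002).
    return -((x[0] - 3.0) ** 2 + ((x[1] - 0.002) / 0.001) ** 2)


# Gradient in the raw parameters: components of very different size.
raw_grad = central_gradient(toy_loglik, [0.0, 0.0])

# Gradient after rescaling by the assumed typical sizes (1, 0.001).
scaled_grad = central_gradient(rescale(toy_loglik, [1.0, 0.001]), [0.0, 0.0])
```

In the raw coordinates a steepest-ascent step is dominated by the second parameter; after rescaling, both components contribute at a similar level.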
A good knowledge of the function being optimized, in this respect, and an understanding of the manner in which the optimizer scales the parameter space are critical in order to calculate the maximum-likelihood parameter estimates without relying too much on trial and error. Furthermore, the calculation of numerical derivatives for the determination of the steepest ascent is fraught with many problems, particularly machine precision, and this is further exacerbated when different parameters are on very different scales. In the stress release model, the parameters have widely differing scales, a problem further magnified by using, for example, the seismic moment (energy) as the variable rather than the stress (cf. eq. 2).

Many optimizers use a quadratic approximation to determine the step length and direction, and therefore internally calculate the Hessian (Davidon–Fletcher–Powell) of the log-likelihood. This estimate of the Hessian, if satisfactory, is very useful because its inverse provides an estimate of the covariance matrix of the fitted parameters (see, for example, Press et al. 1986, pp. 510–515), and hence rudimentary confidence intervals. However, there are a number of potential problems. The estimate of the Hessian typically starts with the identity matrix, which is modified at each iteration. Thus, if the process converges quickly, the calculated Hessian may be a poor estimate. Furthermore, the calculation of the Hessian by way of the second derivatives is fraught with all of the same difficulties as when one calculates the first derivatives as described above, but probably more so!

The formulation (10) indicates correlation between the parameters is very likely. We will illustrate this, and the shape of the likelihood surface, using the linked stress release model $(a, b, C)$ fitted to data from Taiwan (Bebbington & Harte 2001), where

$$a = \begin{pmatrix} -1.45 \\ -2.27 \end{pmatrix}, \quad b = \begin{pmatrix} 0.55 \\ 0.13 \end{pmatrix}, \quad C = \begin{pmatrix} 0.68 & 0.25 \\ 0.21 & 0.32 \end{pmatrix}. \tag{A3}$$

We see that this indicates inhibitory behaviour, events in one region lowering the intensity in the other region. Examining Fig. A1 we see that the likelihood surface is well behaved for the transfer terms $c_{12}$ and $c_{21}$, indicating no tendency for compensating effects to allow arbitrary amounts of interaction. Bebbington & Harte (2001) also indicate some correlation between $b_2$ and $c_{21}$, the input terms for region 2. Fig. A2 clearly shows this effect. This factor, plus the ‘flatness’ of the likelihood surface evident in both figures, is often a principal cause of instability in the fitting procedure.

Figure A1. Likelihood contours for transfer terms (Taiwan data).

Figure A2. Likelihood contours for region 2 inputs (Taiwan data).

A2 Simulation

There are two aspects to simulating the process: event time and event magnitude. Generally, determining the time to the next occurrence will follow the thinning method (see Ogata 1981, or Wang et al. 1991 for an account), where points in a dominating Poisson process (one with a higher intensity) are simulated and then successively accepted or rejected with the appropriate probability. Apart from the slight complication of having to restart due to an event in another region in the linked case, matters are quite straightforward.

Assigning a magnitude to an event is more complicated. The distribution of the stress released by an event is usually assumed to be independent of the stress level itself. While there may be a weak dependence, the resulting improvement in fit does not justify its inclusion in the model (Zheng & Vere-Jones 1991). Although the Gutenberg–Richter power-law decay holds fairly consistently for small to medium magnitudes, it is precisely at the large magnitudes of interest that we see deviation.
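The thinning step described earlier in this section can be sketched for a single region as follows. This is a minimal illustration under stated assumptions, not the procedure used for the fitted models: the intensity is assumed to be $\exp\{\alpha + \nu[\rho t - S(t)]\}$ with $\nu\rho \ge 0$, so that it is non-decreasing between events and its value at the end of a short window dominates it over that window; magnitudes are resampled from a user-supplied list; and each event is assumed to release $10^{0.75(M - M_0)}$ units of stress. The function name, the window length and the reference magnitude `m0` are hypothetical.

```python
import math
import random


def simulate_stress_release(alpha, nu, rho, magnitudes, horizon,
                            m0=6.0, window=1.0, seed=0):
    """Simulate event times and magnitudes on (0, horizon] by thinning."""
    rng = random.Random(seed)
    t = 0.0
    released = 0.0  # accumulated stress release S(t)
    times, mags = [], []

    def intensity(s):
        # Assumed conditional intensity; uses the current value of `released`.
        return math.exp(alpha + nu * (rho * s - released))

    while t < horizon:
        end = min(t + window, horizon)
        # With nu * rho >= 0 the intensity is largest at the window's end.
        bound = intensity(end)
        candidate = t + rng.expovariate(bound)
        if candidate > end:
            # No candidate point in this window: move on and recompute the bound.
            t = end
            continue
        t = candidate
        # Accept the candidate with probability intensity(t) / bound.
        if rng.random() * bound <= intensity(t):
            m = rng.choice(magnitudes)
            # Assumed Benioff-strain style stress drop for magnitude m.
            released += 10.0 ** (0.75 * (m - m0))
            times.append(t)
            mags.append(m)
    return times, mags
```

In the linked case the same loop would be run over all regions jointly, with the bound recomputed whenever an event in any region changes the stress levels.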
It may be possible to use a tapered Pareto distribution (Vere-Jones et al. 2001; see also Sornette & Sornette 1999; Utsu 1999) or a left-truncated Gamma distribution with negative shape parameter (Kagan 1991), but this requires estimating further parameters. Instead it makes sense to eliminate the magnitude distribution as a source of error, and to use the empirical magnitude distributions from the original catalogue. Since there often seem to be significant differences in the magnitude distribution by region, particularly between intraplate and interplate events (see, for example, Kanaori et al. 1993), this must also be allowed for. Lu & Vere-Jones (2001) demonstrate that the magnitude distribution is probably the most critical element in producing qualitative and quantitative agreement with historical data. Use of the empirical magnitude distribution is particularly apt when the objective is to isolate the factors involved in the fitting of the model, and thereby to evaluate their significance (Bebbington & Harte 2001).

A3 Calculation of the information gain

Information theory arises in connection with the transmission of information, in particular, the length of a binary representation of that information. Say a set $E$ has $N$ elements. If $\log_2 N \in \mathbb{Z}^+$, each element can be labelled by a binary number having $\log_2 N$ digits. As such, Hartley defined $\log_2 N$ as the necessary information to characterize $E$ (see, for example, Kullback 1997). We can use this theory to describe the information generated concerning the probability distribution of an observed point process.

To start this discussion, assume that $E = E_1 + E_2 + \dots + E_b$, where $E_1, \dots, E_b$ are pairwise disjoint finite sets. Experiments are performed that consist of independently and randomly allocating elements to the $b$ subsets ($E_j$, $j = 1, \dots, b$) according to the probabilities $p_{ij}$, where $i$ denotes the experiment number.
Let $Y_{ij}$ be a random variable that is one if, in the $i$th experiment, set $E_j$ is allocated at least one element, and zero otherwise. We denote the particular experimental outcome as $y_{ij}$. Then the information generated concerning the probability distribution $P_i = (p_{i1}, \dots, p_{ib})$ by the $i$th experiment is

$$
-\sum_{j=1}^{b} y_{ij} \log_2 p_{ij}.
$$

The expected amount of information generated by an experiment, concerning the probability distribution $P_i = (p_{i1}, \dots, p_{ib})$, is

$$
-\sum_{j=1}^{b} p_{ij} \log_2 p_{ij}.
$$

This is known as Shannon's formula (see, for example, Kullback 1997). If the joint probabilities of a set of $Y_{ij}$ can be written as a chain of probabilities of each $Y_{ij}$ conditioned by those going before it in time (i.e. smaller $i$), then the total amount of information generated by a sequence of $n$ experiments is

$$
-\sum_{i=1}^{n} \sum_{j=1}^{b} y_{ij} \log_2 p_{ij}.
$$

Then Shannon's formula, being the expected value of the above estimate, generalizes to

$$
-\sum_{i=1}^{n} \sum_{j=1}^{b} p_{ij} \log_2 p_{ij}.
$$

Note that we simply use natural logarithms in our calculations and discussions below.

We use this in the point process context as follows. Assume that the observed time period $(T_1, T_2)$ is divided into $n$ bins each of width $\delta$. Each bin is an 'experiment' in the above terminology. Furthermore, each experiment has only two possible outcomes (i.e. $b = 2$), hence we will drop the index $j$ since the two probabilities in the $i$th experiment can be written as $p_i$ and $1 - p_i$. Denote the bin boundaries as $\tau_0, \tau_1, \dots, \tau_n$, where $\tau_0 = T_1$ and $\tau_n = T_2$. Let $\emptyset_{[\tau_i, t)}$ denote the outcome that no events occur in the interval $[\tau_i, t)$. Let $Y_i$ be one if one or more events occur in the $i$th bin, $[\tau_{i-1}, \tau_i)$, and zero otherwise, then

$$
p_i = \Pr\{Y_i = 1\} \approx 1 - \exp\left\{ -\int_{\tau_{i-1}}^{\tau_i} \lambda\left(t \mid \mathcal{H}_{\tau_{i-1}} \cap \emptyset_{[\tau_{i-1}, t)}, \theta\right) dt \right\},
\tag{A4}
$$

where $\theta$ is a vector of parameters peculiar to the conditional intensity function $\lambda$. Hence the information generated concerning the probability distribution of the point process during the observation period $(T_1, T_2)$ is

$$
-\sum_{i=1}^{n} \left[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \right].
$$

Apart from the sign, this is simply the log-likelihood function of the binomial distribution. The binomial log-likelihood itself (i.e. the sum without the leading minus sign) is referred to as the binomial score and is denoted by $B$.

The relationship between the binomial score and the point process log-likelihood can be derived in much the same way that one derives the Poisson distribution as the limit of a large number of Bernoulli trials. Assume that the bin width $\delta$ is sufficiently small so that the number of events per bin is no greater than one. Denote the bin that contains the $k$th event as $I_k$. That is, in what follows, the summation over $k$ is only over those bins that contain an event, the time of which is $t_k$; but the summation over $i$ is over all bins. Then the binomial score can be rewritten as

$$
\begin{aligned}
B &= \sum_{k=1}^{N} \log \frac{p_k}{1 - p_k} + \sum_{i=1}^{n} \log(1 - p_i) \\
  &\approx \sum_{k=1}^{N} \log\left[ \exp\left\{ \int_{I_k} \lambda(t \mid \mathcal{H}_t)\, dt \right\} - 1 \right] - \int_{T_1}^{T_2} \lambda(t \mid \mathcal{H}_t)\, dt \\
  &\approx \sum_{k=1}^{N} \log \int_{I_k} \lambda(t \mid \mathcal{H}_t)\, dt - \int_{T_1}^{T_2} \lambda(t \mid \mathcal{H}_t)\, dt \\
  &\approx \sum_{k=1}^{N} \log\left[ \delta \lambda(t_k \mid \mathcal{H}_{t_k}) \right] - \int_{T_1}^{T_2} \lambda(t \mid \mathcal{H}_t)\, dt \\
  &= N \log \delta + \sum_{k=1}^{N} \log \lambda(t_k \mid \mathcal{H}_{t_k}) - \int_{T_1}^{T_2} \lambda(t \mid \mathcal{H}_t)\, dt.
\end{aligned}
\tag{A5}
$$

It can be shown that as $\delta$ becomes very small (i.e. $n$ very large),

$$
B = N \log \delta + \sum_{k=1}^{N} \log \lambda(t_k \mid \mathcal{H}_{t_k}) - \int_{T_1}^{T_2} \lambda(t \mid \mathcal{H}_t)\, dt + \epsilon(\delta),
\tag{A6}
$$

where the remainder term $\epsilon(\delta) \to 0$ as $\delta \to 0$. Rényi (1959) would describe this information increase, as $\delta \to 0$, as the point process distribution having a 'dimension' of $N$ (see Harte 2001).

Now denote the constant intensity of a 'null' model as $\bar{\lambda}$, and the corresponding binomial score as $\bar{B}$. Then the information gain is $B - \bar{B}$. Since the term $N \log \delta$ cancels out, $B - \bar{B}$ is simply the log-likelihood difference. The expected information gain for a point process is bounded by

$$
\mathbb{E}\left\{ \lambda(t \mid \mathcal{H}_t) \log \frac{\lambda(t \mid \mathcal{H}_t)}{\bar{\lambda}} - \left[ \lambda(t \mid \mathcal{H}_t) - \bar{\lambda} \right] \right\},
$$

where the 'random' part within the expectation is the history of the process up to time $t$, i.e. $\mathcal{H}_t$. The information gain therefore describes the 'predictability' of the process relative to the null process. If a process generates a higher information gain, then it should be inherently more predictable than a process that generates little information. Typically, one would express the information gain as information gain per unit time, i.e.

$$
\frac{B - \bar{B}}{T_2 - T_1},
$$

or per event, i.e. $(B - \bar{B})/N$. Note that all models will be dependent on the particular time units used, and hence for comparative purposes these units should be the same.

Since the information gain is simply the difference in the log-likelihood between a null and a 'fitted' model, it also has the usual goodness-of-fit interpretations when applied to observed historical data. If the distribution of the information gain is unknown, then many realizations can be simulated of the same length as that of the observed series. For each simulated series, the information gain is calculated using the fitted model after re-estimating the parameter values. Thus we form an empirical probability distribution of the information gains. One then compares the information gain calculated using the observed data with the empirical distribution, and rejects the hypothesis that the data are sampled from such a process if the information gain lies in the tails of the empirical distribution.

The null model for the linked stress release process can be defined in several natural ways. Assuming it to be a multidimensional Poisson process, that is, an independent Poisson process for each region, it remains to determine the rates of these Poisson processes. We will follow Vere-Jones (1998) in calculating the rate as the stress input rate divided by the average stress release, $\bar{\lambda}_i = \upsilon_i / \mathbb{E}\{J_i\}$, where $J_i$ is a random variable for the stress drop distribution in the $i$th region. Since stress is transferred between the regions, the stress 'input' rates $\{\upsilon_i\}$ are found as the solutions to the simultaneous equations

$$
\upsilon_i = \rho_i - \sum_{j \neq i} \theta_{ij} \upsilon_j,
\tag{A7}
$$

for all $i$.
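Because (A7) is linear in the $\upsilon_i$, the null rates can be computed directly. The Python sketch below is one way this might be done, under stated assumptions: the function names (`solve_linear`, `null_rates`, `poisson_loglik`) and the two-region input values are hypothetical and are not taken from any fitted model, and the system is assumed to be non-singular.

```python
import math


def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting.

    Assumes A is square and non-singular.
    """
    n = len(rhs)
    M = [list(row) + [r] for row, r in zip(A, rhs)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):                      # back substitution
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def null_rates(rho, theta, mean_drop):
    """Return (upsilon, lambda_bar) for the regional Poisson null model."""
    n = len(rho)
    # (A7) rearranged: upsilon_i + sum_{j != i} theta_ij * upsilon_j = rho_i
    A = [[1.0 if i == j else theta[i][j] for j in range(n)] for i in range(n)]
    upsilon = solve_linear(A, rho)
    # lambda_bar_i = upsilon_i / E{J_i}
    lambda_bar = [u / m for u, m in zip(upsilon, mean_drop)]
    return upsilon, lambda_bar


def poisson_loglik(n_events, rate, duration):
    """Log-likelihood of n_events from a constant-rate Poisson process."""
    return n_events * math.log(rate) - rate * duration


# Hypothetical two-region inputs, chosen only for illustration.
rho = [1.0, 0.5]
theta = [[1.0, 0.2], [0.1, 1.0]]
mean_drop = [10.0, 20.0]
upsilon, lambda_bar = null_rates(rho, theta, mean_drop)
print(upsilon, lambda_bar)
```

Summing `poisson_loglik` over regions gives the null log-likelihood; subtracting it from the fitted log-likelihood and dividing by $N$ or by $T_2 - T_1$ would then give the information gain per event or per unit time.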
Equations (A7) are known as flow conservation equations in a Jackson network (Walrand 1988), and function similarly here as a consistency condition. For example, the values for regionalization A in the north China model are $\upsilon_1 = 0.426$, $\upsilon_2 = 0.144$, $\upsilon_3 = 0.204$ and $\upsilon_4 = 0.333$. Using the empirical jump distribution to estimate $\mathbb{E}\{J_i\}$, these produce $\bar{\lambda}_1 = 0.0377$, $\bar{\lambda}_2 = 0.0189$, $\bar{\lambda}_3 = 0.0378$ and $\bar{\lambda}_4 = 0.0282$ (per year). The other regionalizations produce similar, but slightly different, values because of the different estimated parameters $\rho_i$ and $\theta_{ij}$.

APPENDIX B: HISTORICAL DATA SETS

Table B1. Persian earthquake data 1780–1994. Reprinted from Zheng, X. and Vere-Jones, D., Further applications of the stochastic stress release model to historical earthquake data, Tectonophysics, 229, 101–121, Copyright (1994), with permission from Elsevier.

Date        Lat.   Long.  Mag.  Reg.
1780.01.08  38.20  46.00  7.70  A
1780.-.-                  6.50  E
1786.10.-   38.30  45.60  6.30  A
1808.06.26  35.30  54.50  6.60  N
1809.-.-    36.30  52.50  6.50  N
1810.-.-    38.00  57.20  6.50  K
1824.06.25  29.80  52.40  6.40  Z
1825.-.-    36.10  52.60  6.70  N
1830.03.27  35.70  52.40  7.10  N
1833.-.-    37.30  58.10  6.20  E
1834.-.-    39.70  43.70  6.00  A
1838.-.-    29.60  59.90  7.00  E
1840.07.02  39.50  43.90  7.40  A
1844.05.12  33.60  51.40  6.40  Z
1844.05.13  37.40  48.00  6.90  N
1851.04.02  40.00  47.30  6.20  N
1851.06.-   36.80  58.40  6.90  E
1853.05.05  29.60  52.50  6.20  Z
1861.05.24  39.40  47.50  6.00  A
1862.12.19  39.30  47.50  6.00  A
1862.12.21  29.50  52.50  6.20  Z
1863.12.30  38.20  48.60  6.10  N
1864.01.17  30.60  57.00  6.00  C
1864.12.07  33.30  45.90  6.40  Z
1865.06.-   29.60  53.10  6.00  Z
1868.03.18  39.60  47.60  6.00  A
1868.08.01  34.90  52.50  6.40  N
1871.12.23  37.40  58.40  7.20  K
1872.06.03  34.70  47.70  6.10  Z
1875.05.-   31.20  56.30  6.00  C
1879.03.22  37.80  47.90  6.70  N
1883.05.03  37.90  47.20  6.20  N
1890.03.25  28.80  53.50  6.40  Z
1890.07.11  36.60  54.60  7.20  N
1893.11.17  37.00  58.40  7.10  K
1895.01.17  37.10  58.40  6.80  E
1896.01.04  37.80  48.40  6.70  N
1897.01.10  26.90  56.00  6.40  Z
1902.03.09  27.08  56.34  6.40  Z
1903.03.22  33.16  59.71  6.20  E
1904.11.09  36.94  59.77  6.40  E
1905.01.09  37.00  48.68  6.20  N
1905.06.19  29.89  59.98  6.90  E
1908.09.28  38.00  44.00  6.00  A
1909.01.23  33.41  49.13  7.40  Z
1911.04.18  31.23  57.03  6.20  C
1923.09.17  37.63  57.21  6.30  K
1923.09.22  29.51  56.63  6.70  C
1927.07.22  34.90  52.90  6.30  N
1929.05.01  37.73  57.81  7.30  K
1929.07.15  32.08  49.48  6.00  Z
1930.05.06  38.24  44.60  7.20  A
1930.08.23  27.88  55.02  6.10  Z
1931.04.27  39.48  46.09  6.40  A
1933.10.05  34.42  57.07  6.00  C
1933.11.28  32.01  55.94  6.20  C
1934.02.04  30.54  51.64  6.30  Z
1934.06.13  27.63  62.64  6.60  M
1935.04.11  36.36  53.32  6.30  N
1936.06.30  33.68  60.05  6.00  E
1940.05.04  35.76  58.53  6.40  E
1941.02.06  33.41  58.87  6.10  E
1945.11.27  25.02  63.47  8.00  M
1947.08.05  25.25  63.20  7.00  M
1947.09.23  33.67  58.67  6.80  E
1948.07.05  29.88  57.53  6.00  C
1948.10.05  37.88  58.55  7.20  E
1949.04.24  27.28  56.46  6.30  Z
1953.02.12  35.39  54.88  6.50  N
1956.10.31  27.27  54.55  6.30  Z
1957.07.02  36.07  52.47  6.80  N
1957.12.13  34.58  47.82  6.86  Z
1961.06.11  27.78  54.51  6.50  Z
1962.09.01  35.71  49.81  7.20  N
1964.12.22  28.12  56.80  6.10  Z
1968.08.31  34.02  58.96  7.41  E
1969.11.07  27.42  60.40  6.40  M
1970.07.30  37.67  55.89  6.60  N
1972.04.10  28.38  52.98  6.90  Z
1975.03.07  27.47  56.44  6.10  Z
1976.11.07  33.82  59.19  6.40  E
1976.11.24  39.12  43.92  7.30  A
1977.03.21  27.59  56.45  6.90  Z
1977.04.06  31.90  50.76  6.10  Z
1978.09.16  33.40  57.12  7.30  C
1978.12.04  37.67  48.90  6.00  N
1979.12.14  32.14  49.65  6.10  Z
1979.01.10  26.52  61.01  6.00  M
1979.01.16  33.80  59.50  7.22  E
1980.05.04  38.05  48.99  6.20  N
1981.07.28  30.01  57.79  7.10  E
1989.03.05  29.95  51.68  6.20  Z
1990.06.20  36.96  49.41  7.70  N
1990.11.06  28.23  55.43  6.70  Z

Table B2. Historical large earthquakes for north China, 1480–1996, reprinted from Zheng, X. and Vere-Jones, D., Further applications of the stochastic stress release model to historical earthquake data, Tectonophysics, 229, 101–121, Copyright (1994), with permission from Elsevier, with additions. The region (Reg.) is for the regionalization (A) of Zheng & Vere-Jones (1994), with alternative regionalizations given in parentheses.

Date        Lat.   Long.   Mag.  Reg.
1484.01.29  40.40  116.10  6.70  3
1487.08.10  34.30  108.90  6.20  2
1501.01.19  34.80  110.10  7.00  2
1502.10.17  35.70  115.30  6.50  3
1536.10.22  39.60  116.80  6.00  3
1548.09.13  38.00  121.00  7.00  4
1556.01.23  34.50  109.70  8.00  2
1561.07.25  37.50  106.20  7.20  1
1568.05.15  39.00  119.00  6.00  4
1568.04.25  34.40  109.00  6.70  2
1573.01.10  34.40  104.10  6.70  1
1587.04.10  35.20  113.80  6.00  3
1597.10.06  38.50  120.00  7.00  4
1604.10.25  34.20  105.00  6.00  1
1614.10.23  37.20  112.50  6.50  2
1618.05.20  37.00  111.90  6.50  2
1618.11.16  39.80  114.50  6.50  3 (B:2)
1622.03.18  35.50  116.00  6.00  3
1622.10.25  36.50  106.30  7.00  1
1624.02.10  32.40  119.50  6.00  4
1624.04.17  39.80  118.80  6.20  3 (C:4)
1624.07.04  35.40  105.90  6.00  1
1626.06.28  39.40  114.20  7.00  3 (B:2)
1627.02.15  37.50  105.50  6.00  1
1634.01.-   34.10  105.30  6.00  1
1642.06.30  35.10  111.10  6.00  2
1654.07.21  34.30  105.50  8.00  1
1658.02.03  39.40  115.70  6.00  3
1665.04.16  39.90  116.60  6.50  3
1668.07.25  35.30  118.60  8.60  4
1679.09.02  40.00  117.00  8.00  3
1683.11.22  38.70  112.70  7.00  2
1695.05.18  36.00  111.50  8.00  2
1704.09.28  34.90  106.80  6.00  1
1709.10.14  37.40  105.30  7.50  1
1718.06.29  35.00  105.20  7.50  1
1720.07.12  40.40  115.50  6.70  3
1730.09.30  40.00  116.20  6.50  3
1739.01.03  38.80  106.50  8.00  1
1815.10.23  34.80  111.20  6.70  2
1820.08.03  34.10  113.90  6.00  3
1829.11.19  36.60  118.50  6.00  4
1830.06.12  36.40  114.20  7.50  3
1831.09.28  32.80  116.80  6.20  4
1852.05.26  37.50  105.20  6.00  1
1861.07.19  39.70  121.70  6.00  4
1879.07.01  33.20  104.70  8.00  1
1882.12.02  38.10  115.50  6.00  3
1885.01.14  34.50  105.70  6.00  1
1888.06.13  38.50  119.00  7.50  4
1888.11.02  37.10  104.20  6.20  1
1920.12.06  36.70  104.90  8.50  1
1922.09.29  39.20  120.50  6.50  4
1937.08.01  35.40  115.10  7.00  3
1945.09.23  39.50  119.00  6.20  3 (C:4)
1966.03.22  37.50  115.10  7.20  3
1967.03.27  38.50  116.50  6.30  3
1969.07.18  38.20  119.40  7.40  4
1975.02.04  40.70  122.80  7.30  4
1976.04.06  40.20  111.10  6.20  2
1976.07.28  39.40  118.00  7.80  3 (C:4)
1976.09.23  39.90  106.40  6.20  1
1979.08.25  41.20  108.10  6.00  1
1989.10.18  40.00  113.70  6.00  2 (C:3)
1996.05.03  40.80  109.60  6.50  1