DRAFT: 20 June 1999 - Platform for the Research into

Workshop: SEISMICITY MODELLING IN SEISMIC HAZARD MAPPING, Poljce, Slovenia, May 2224,2000
(Invited Lecture)
Statistical Estimation of Maximum Regional Earthquake
Magnitude mmax
by Andrzej Kijko
Council for Geoscience, Geological Survey of South Africa
Private Bag X112, Pretoria 0001, South Africa, e-mail: [email protected]
Abstract
This paper provides a generic equation for the evaluation of the maximum regional
earthquake magnitude mmax. The equation is capable of generating solutions in different
forms, depending on the assumptions of the model and/or the available information
about past seismicity. It includes the cases (i) when earthquake magnitudes are
distributed according to the truncated frequency-magnitude Gutenberg-Richter relation,
(ii) when the empirical magnitude distribution deviates moderately from the GutenbergRichter model, and (iii) when no specific model of the magnitude distribution is
assumed. As an example of application, the techniques described are applied in the
assessment of mmax for Southern California. The three estimates of mmax obtained in this
work are: 8.32  0.43, 8.31  0.43 and 8.34  0.45. Since the third procedure applied is
non-parametric and does not require specification of the functional form of the
magnitude distribution, its estimate of the maximum magnitude for Southern California
(viz. 8.34  0.45) is considered more reliable than the other two, which are based on the
Gutenberg-Richter model.
1. INTRODUCTION
The purpose of this paper is to provide several techniques for the assessment of the
maximum regional earthquake magnitude mmax. To avoid confusion about the
terminology, it is to be agreed that mmax, being the magnitude of the largest possible
earthquake, is calculated as the upper limit of magnitude for a given region. Also,
synonymous with the largest possible earthquake magnitude, is the maximum regional
magnitude.
The value of maximum magnitude so defined is the same as that used by
earthquake engineers (EERI Committee, 1984). This terminology assumes a sharp cutoff magnitude at a maximum magnitude mmax, so that, by definition, no earthquakes are
to be expected with magnitude exceeding mmax. Cognizance should be taken of the fact
that an alternative, “soft” cut-off maximum earthquake magnitude is also in use (Kagan,
1991), but in this paper only a model having a sharp cut-off of maximum magnitude is
considered.
At present there is no generally accepted method for estimating the value of mmax.
The current methods for the evaluation fall into two main categories: deterministic and
probabilistic.
1
The deterministic procedure most often applied is based on the empirical
relationships between magnitude and various tectonic and fault parameters. There are
several research efforts devoted to the investigation of such relationships. The
relationships are different for different seismic areas and different types of faults
(Anderson et al., 1996, and the references therein). As an alternative to the above
technique, researchers often try to relate the value of mmax with the strain rate or the rate
of seismic-moment release (Papastamatiou, 1980; Anderson and Luco, 1983). Another
procedure for the estimation of mmax was developed by Jin and Aki (1988), where a
remarkably linear relationship was established between the logarithm of coda Q0 and the
largest observed magnitude for earthquakes in China. The authors postulate that if the
largest earthquake magnitude observed during the last 400 years is the maximum
possible magnitude mmax, the established relation will give a spatial mapping of mmax.
Also, a very interesting alternative procedure (based on computer simulation of the
largest earthquakes) was describes by Ward (1997).
The value of mmax can also be estimated purely on the basis of the seismological
history of the area, viz. by using seismic event catalogs and an appropriate statistical
estimation procedure. The most often used probabilistic procedure for maximum
regional magnitude is based on the extrapolation of the classical, log-linear, frequencymagnitude Gutenberg-Richter relation. Among earthquake engineers, the best known is
the extrapolation procedure as applied recently e.g. by Frohlich, 1998, and the
“probabilistic” extrapolation procedure applied by Nuttli (1981), in which the
frequency-magnitude curve is truncated at the specified value of annual probability of
exceedance (e.g. 0.001). Another technique falling into this category is based on
formalism of the extreme values of random variables. The statistical theory of extreme
values was known and well developed in the forties already, and was applied in
seismology as early as 1945 (e.g. Nordquist, 1945). The appropriate statistical tools
required for the estimation of the end-point of distribution functions were developed
later (e.g. Robson and Whitlock, 1964; Hall, 1982) but used in estimating maximum
regional magnitude from the eighties only (Dargahi-Noubary, 1983; Kijko and
Sellevoll, 1989, 1992; Pisarenko et al., 1996).
The purpose of this paper is to provide a procedure (equation) for the evaluation
of mmax which has the potential to be free from subjective assumptions and which is
only driven by seismic data. The equation is generic and is capable of generating
solutions in different forms, depending on the assumptions about the model and/or on
the information available about past seismicity.
2. MAXIMUM REGIONAL MAGNITUDE mmax: GENERIC EQUATION
Suppose that in the area of concern, within a specified time interval T, all n of the main
earthquakes that occurred with a magnitude greater than or equal to mmin are recorded.
Let us assume that the value of the magnitude mmin is known and is denoted as the
threshold of completeness. We assume further that the magnitudes are independent,
identically distributed, random values with cumulative distribution function (CDF),
FM(m). The parameter mmax is the upper limit of the range of magnitudes and is thus
termed the (unknown) maximum regional magnitude, and is to be estimated. Let us
assume that all n recorded magnitudes are ordered in ascending order, i.e. m1  m2  …
 mn. We observe that mn, which is the largest observed magnitude (denoted also as
obs
mmax
) and falls within interval [mmin, mmax], has a CDF
2
FM n (m)  FM (m)n .
(1)
After integrating by parts, the expected value of Mn, E(Mn), is
E (M n ) 
mmax
 m dF
Mn
(m)  mmax 
mmin
mmax
F
Mn
(2)
(m) dm.
mmin
Hence
mmax  E ( M n ) 
mmax
 F
M
(m) n dm .
(3)
mmin
This expression, after replacement of the expected value of the largest observed
obs
magnitude, E(Mn), by the largest magnitude already observed, mmax
, provides the
equation
mmax  m
obs
max

mmax
 F
M
(m) ndm,
(4)
mmin
in which the desired mmax appears on both sides. However, from this equation an
estimated value of mmax (denoted as m̂max ) can easily be obtained by iteration. The first
approximation of m̂max can be obtained from equation (4) by replacing the unknown
obs
upper limit of integration, mmax, by the maximum observed magnitude, mmax
. The next
approximation is obtained by replacing the upper limit of integration by its previous
solution. Some authors simply call the method the iterative method and it was found
that in most cases the convergence is very fast. An extensive analysis and formal
conditions of convergence of the above iterative procedure are discussed elsewhere.
Cooke (1979) was probably the first to obtain the above estimator of the upper bound of
a random variable. If applied to the assessment of the maximum regional magnitude,
equation (4) says that the maximum regional magnitude mmax is equal to the largest
obs
max
magnitude already observed, m
, increased by an amount  
mmax
 F
M
(m) n dm . It
mmin
should be noted that in his original paper Cooke (1979) suggested an equation in which
obs
the upper limit of integration is equal to mmax
and not mmax. Clearly, for large n, the two
equations are equivalent and provide the same solutions.
Estimator (4) is, by its nature, very general and has several interesting properties.
For example, it is valid for each CDF, and does not require the fulfillment of any
additional conditions. It may also be used when the exact number of earthquakes, n, is
not known. In this case, the number of earthquakes, n, can be replaced by T. Such a
replacement is equivalent to the assumption that the number of earthquakes occurring in
unit time conforms to a Poisson distribution with parameter , with T the span of the
seismic catalog. It is also important to note that, since the value of the integral Δ is
never negative, equation (4) provides a value of m̂max which is never less than the
largest magnitude already observed. Of course, the drawback of the formula is that it
requires integration. For some of the magnitude distribution functions the analytical
3
expression for the integral does not exist or, if it does, requires awkward calculations.
This is, however, not a real hindrance, since numerical integration with today’s highspeed computer platforms is both very fast and accurate. Equation (4) will be called the
generic equation for the estimation of mmax.
In the following section we will demonstrate how equation (4) can be used in the
assessment of the maximum regional magnitude mmax in the different circumstances that
a seismologist might face in real life. The cases considered include the following:
(i)
(ii)
(iii)
the earthquake magnitudes are distributed according to the doubly
truncated Gutenberg-Richter relation,
the empirical magnitude distribution deviates moderately from the
Gutenberg-Richter model,
no specific model of the magnitude distribution is assumed, and only a
few of the largest magnitudes are known.
3. SOME SPECIAL CASES
3.1. CASE I: Application of the Generic Equation to the Gutenberg-Richter
Magnitude Distribution. (Formula for mmax for seismologists accepting the
Gutenberg-Richter frequency-magnitude distribution unconditionally)
For the frequency-magnitude Gutenberg-Richter relation, the respective CDF of
magnitudes, which are bounded from above by mmax, is (Page, 1968)
FM (m)  C 1  exp[  (m  mmin )],
(5)
where C is a normalizing coefficient equal to {1 – exp[- (mmax - mmin)]}-1,  = bln(10),
and b is the b-parameter of the Gutenberg-Richter relation. Following the equation (4),
the estimator of mmax requires the calculation of the integral  =
m max
C
n
 1  exp   m  m  dm,
n
min
an integral which does not have a simple solution. It
m min
can be shown that an approximate, and straightforward estimator of mmax can be
obtained through an application of Cramér’s approximation, which says that for large n,
the value of FM (m) n is approximately equal to exp{n[1  FM (m)]} . Simple
calculations show that after replacement of FM (m) n by its Cramér approximate value,
integral  takes the form [ E1 (n2 )  E1 (n1 )]  exp( n2 ) /   mmin exp( n) , where n1 = n /{1exp[-  (mmax - mmin)], n2  n1 exp[   (mmax  mmin )], and E1 () denotes an exponential
integral function. Hence, following equation (4), for the Gutenberg-Richter frequencymagnitude distribution, the estimator of mmax is obtained as a solution of the equation
obs
mmax  mmax

E1 (n2 )  E1 (n1 )
 mmin exp( n).
 exp( n2 )
(6)
Equation (6), which provides a value for the maximum regional magnitude mmax , was
introduced in Kijko and Sellevoll (1989). The value of mmax obtained from the solution
of equation (6) will be termed the Kijko-Sellevoll estimator of mmax, or, in short, K-S.
4
The calculation of the variance of the estimated maximum magnitude, Var( m̂max ), is the
same as for Cases II and III, and is shown in Section 3.3.
3.2. CASE II: Application of the Generic Equation to the Gutenberg-Richter
Magnitude Distribution in Case of Uncertainty in the b-value. (Formula for mmax
for seismologists having limited faith in the Gutenberg-Richter frequencymagnitude distribution)
A significant shortcoming of the K-S formula for mmax estimation comes from the
implicit assumptions that (i) seismic activity remains constant in time, (ii) the proper
functional form of the magnitude distribution is specified, and (iii) the parameters of the
assumed distribution functions are without error. As many studies of seismic activity
suggest, however, the seismic process can be composed of temporal trends, cycles,
short-term oscillations and pure random fluctuations. A list of some well-documented
cases of temporal variation of seismic activity of areas from all over the world is given
in Kijko and Graham (1998).
When the variation of seismic activity is a random process, the Bayesian
formalism, in which the model parameters are treated as random variables, provides the
most efficient tool to account for the uncertainties considered above. In this section, a
Bayesian-based equation for the assessment of the maximum regional magnitude will be
derived in which the uncertainty of the Gutenberg-Richter parameter b is taken into
account. Consideration of such a uncertainty in the b-value renders the three implicit
assumptions as redundant.
Following the assumption that the variation of the -value in the GutenbergRichter-based CDF (5) may be represented by a Gamma distribution with parameters p
and q, for mmin  m  mmax, the Bayesian CDF of magnitudes takes the form:


FM (m)  C 1   p /  p  m  mmin  ,
q
(7)
where C is a normalizing coefficient, p   /(  ) 2 and q  (  /   ) 2 . The symbol 
denotes the known, mean value of the parameter ,   is the known standard deviation
of  describing its uncertainty, and C is equal to {1  [ p /( p  mmax  mmin )] q }1 . It
should be noted that application of formula (7) requires that the standard deviation of
parameter  includes two components: aleatory and epistemic uncertainties.
Knowledge of the Bayesian, Gutenberg-Richter distribution (7), makes it
possible to construct the Bayesian version of the estimator of mmax. Following the
generic equation (4), the estimation of mmax requires calculation of the integral  equal
 1   p / p  m  m   dm,
mmax
to C
n
q n
min
which, after application of Cramér’s formula,
mmin
can be expressed as

 1 / q  2 exp[ nr q /(1  r q )
 (1 / q,  r q )   (1 / q,  ),

(8)
where r  p /( p  mmax  mmin ) , c1  exp[ n(1  C)] ,   nC , and  (,) is the
Incomplete Gamma Function. Thus, the estimator of mmax, when the uncertainty of the
5
Gutenberg-Richter parameter b is taken into account, is calculated as a solution of
equation
obs
mˆ max  mmax

 1/ q2 exp[ nr q /(1  r q )

 (1 / q, r q )   (1 / q,  ).

(9)
The value of mmax obtained from equation (9) will be denoted as the KijkoSellevoll-Bayes estimator of mmax, or, in short, K-S-B. An extensive comparison of
performances of K-S and K-S-B estimators is given in Kijko and Graham (1998).
3.3. Case III: Estimation of mmax when no Specific Model of the Magnitude
Distribution is Assumed. (Formula for mmax for seismologists who only believe in
what they see)
The procedures derived in the previous sections are parametric and are applicable when
the empirical, log-frequency-magnitude graph for the seismic series exhibits apparent
linearity, starting from a certain mmin value. However, many studies of seismicity show
that, in some cases, (i) the empirical distributions of earthquake magnitudes are of bi- or
multi-modal character, (ii) the log-frequency-magnitude relation has a strong non-linear
component or (iii) the presence of "characteristic" earthquakes is evident.
In order to use the generic equation (4) in such cases, the analytical, parametric
models of the frequency-magnitude distributions should be replaced by a nonparametric counterpart. The non-parametric estimation of a probability density function
(PDF) is an approach that deals with the direct summation of the kernel functions using
sample data. Given the sample data mi, i = 1 ,..., n, and the kernel function K(), the
kernel estimator fˆM m of an actual, and unknown PDF fM(m), is
1 n  m  mi 
fˆM m  
,
 K
nh i 1  h 
(10)
where h is a positive smoothing factor. The kernel function K() is a PDF, symmetric
about zero. The specific choice of it is not so important for the performance of the
method; many unimodal distribution functions ensure similar efficiencies. In this work
the Gaussian kernel function, K(x) = 1/ 2 exp(x2/2), was used. However, the choice
of the smoothing factor h is crucial because it affects the trade-off between random and
systematic errors. Several procedures exist for the estimation of the value of this
parameter, none of them being distinctly better for all varieties of real data. In this work
the least-squares cross-validation (Stone, 1984) was used. The details of the procedure
are given by Kijko et al. (1999). Fortunately, in the application of the non-parametric
estimation procedure, the integration of the CDF is not strongly affected by the accuracy
of h. The tests show that the final estimates of hazard obtained when the optimal value
of h is used do not differ much from those achieved in the case of a reasonable guess of
h.
Following the functional form of a selected kernel K(x) and the fact that the data
comes from a interval mmin , mmax , the estimator of CDF of seismic event magnitude is
n
  m  mi 
 m  mi 
FˆM (m)  C   
    min
,
h 
h


i 1  
6
(11)
1

 n   (m  mi ) 

 m  mi  
    min
where C     max
  and  x  denotes the standard
h
h

 


 i1  

Gaussian cumulative distribution function. Making use of the non-parametric, Gaussian
based estimation of the CDF as given by equation (11), the approximate value of the
n
  m  mi 
 m  mi 
integral for  is C    
    min
 dm. Therefore, the equation for
h 
h


mmin i 1  
mmax based on the non-parametric Gaussian estimation of PDF takes the form
mmax n
n
mmax
mmax  m
obs
max

C 

n
n
  m  mi 
 mmin  mi 
 dm.

  h    
h



i 1  
n
(12)
mmin
The value of mmax obtained from equation (12) will be denoted as the non-parametric,
Gaussian-based estimator or, in short, N-P-G.
The N-P-G estimator of mmax is very useful. Its strongest point is that it does not
require specification of the functional form of the magnitude distribution FM(m). By its
nature, therefore, it is capable of dealing with cases of complex empirical distributions,
e.g. distributions which are in extreme violation of log-linearity, and/or are multimodal,
and/or incorporate "characteristic" earthquakes. The drawback of estimator (12) is that,
formally, it requires knowledge of all events with magnitude above a specified level of
completeness mmin. In practice, though, this can be reduced to a knowledge of a few (say
10) of the largest events. Such a reduction is possible because the contribution of the
weak events to the estimated value of mmax decreases very rapidly as magnitude
decreases and for large n, the few largest observations carry most of the information
about its endpoint.
Elementary computations show (Kijko and Graham, 1998) that the approximate
variance of any of the above estimators [i.e. (6), (9) and (12)] is given by
Var (mˆ max )   M2   2 ,
(13)
where  M denotes the standard error in the determination of the maximum observed
obs
magnitude mmax
, and  represents respective corrections.
4. Example of Application: mmax for Southern California
All information about the seismicity of Southern California during the last 150 years
was taken from Appendix A of a paper by Field et al. (1999). In order to be consistent
with the assumption of the independence of seismic events, all aftershocks were
removed. This reduced catalog also has different levels of completeness for various time
intervals. Application of the maximum likelihood procedure to this catalog, following
procedure used by Kijko and Sellevoll, (1992), yields the values: ˆ ( mmin  5.0) =
2.14  0.17 , bˆ  0.79  0.06 , mˆ K S  8.32  0.43 and mˆ K S  B = 8.31  0.43, where mˆ K  S
max
max
K S B
max
max
and mˆ
denote the K-S estimator (6) and K-S-B estimator (9) respectively.
Application of the remaining procedure to find estimates of mmax gives:
7
N  P G
N  P G
mˆ max
 8.34  0.45 , where mˆ max
denotes the non-parametric, Gaussian-based
estimator (12).
One should not be surprised that the K-S and K-S-B estimators of mmax differ
from its N-P-G counterpart. Obviously, the differences follow from the fact that the
empirical distribution of magnitude does not follow the standard, log-linear GutenbergRichter relation. The first two procedures (K-S and K-S-B) are based on the GutenbergRichter relation, while the N-P-G procedure is model-free and therefore is able to take
characteristic earthquakes into account, which are clearly seen in the analyzed data set
(Figure 1).
LOG10(Cumulative number of seismic events)
NON-PARAMETRIC FIT TO OBSERVATIONS (mmax = 8.34)
10
10
10
10
observed number of events
non-parametric fit to observations
1
0
-1
-2
6
6.5
7
7.5
8
Magnitude
Figure 1. Observed cumulative number of earthquakes and its non-parametric fit for the data from
southern California (after Field et al., 1999).
5. Remarks and Conclusions
This paper is devoted to the problem of determining the maximum regional earthquake
magnitude mmax. A generic equation for mmax evaluation was developed. The equation is
very flexible and is capable of generating solutions in different forms, depending on the
assumptions about the model and/or on the available information about past seismicity.
Three special cases of the generic formula are discussed:
-
when earthquake magnitudes are distributed according to the truncated
Gutenberg-Richter relation,
when the empirical magnitude distribution deviates appreciably from the
Gutenberg-Richter model,
when no specific model of magnitude distribution is assumed.
8
As an example of application of the developed formalism, the three solutions
(abbreviated as K-S, K-S-B, and N-P-G) have been applied to the assessment of the
maximum magnitude for Southern California. The three estimates of mmax are: 8.32 
0.43, 8.31  0.43 and 8.34  0.45. The values of m̂max obtained from the K-S and K-S-B
procedure, are insignificantly lower than the value obtained from the application of the
non-parametric procedure N-P-G. This difference can be attributed to the fact that the
K-S and K-S-B estimators are based on the Gutenberg-Richter model of the frequencymagnitude relation, which is not necessarily correct for Southern California. Since the
N-P-G procedure is non-parametric and does not require specification of the functional
form of the magnitude distribution, its estimate of the maximum magnitude for
Southern California (equal to 8.34  0.45), may be considered more reliable than the
model-based estimators K-S and K-S-B.
References
Anderson, J.G., and J.E. Luco, (1983). Consequences of slip rate constrains on
earthquake occurrence relation, Bull. Seism. Soc. Am. 73, 471-496.
Anderson, J.G., S.G., Wesnousky, and M.W. Stirling, (1996). Earthquake size as a
function of slip rate, Bull. Seism. Soc. Am. 86, 683-690.
Campbell, K.W. (1982). Bayesian analysis of extreme earthquake occurrences. Part I.
Probabilistic hazard model, Bull. Seism. Soc. Am. 72, 1689-1705.
Cooke, P. (1979). Statistical inference for bounds of random variables, Biometrika, 66,
367-374.
Dargahi-Noubary, G.R. (1983). A procedure for estimation of the upper bound for
earthquake magnitudes, Phys. Earth Planet. Interiors, 33, 91-93.
EERI Committee on Seismic Risk, (H.C. Shah, Chairman), (1984). Glossary of terms
for probabilistic seismic risk and hazard analysis, Earthquake Spectra, 1, 33-36.
Field, D.H., D.D. Jackson, and J.F. Dolan (1999). A mutually consistent seismic-hazard
source model for Southern California, Bull. Seism. Soc. Am. 89, 559-578.
Frohlich, C. (1998). Does maximum earthquake size depend on focal depth? Bull.
Seism. Soc. Am. 88, 329-336.
Hall, P. (1982). On estimating the endpoint of a distribution, Ann. Statist. 10, 556-568.
Jin, A., and K. Aki (1988). Spatial and temporal correlation between coda Q and
seismicity in China. Bull. Seism. Soc. Am. 78, 741-769.
Kagan, Y.Y. (1991). Seismic moment distribution, Geophys. J. Int., 106, 123-134.
Kijko A., and G. Graham (1998). "Parametric-Historic" procedure for probabilistic
seismic hazard analysis. Part I: Assessment of maximum regional magnitude mmax,
Pure App. Geophys, 152, 413-442.
Kijko, A., S. Lasocki, and G. Graham (1999). Nonparametric seismic hazard analysis in
mines. Submitted for publication in the Bull. Seism. Soc. Am.
Kijko, A. and M.A. Sellevoll (1989). Estimation of earthquake hazard parameters from
incomplete data files, Part I, Utilization of extreme and complete catalogues with
different threshold magnitudes, Bull. Seism. Soc. Am. 79, 645-654.
Kijko, A. and M.A. Sellevoll (1992). Estimation of earthquake hazard parameters from
incomplete data files, Part II, Incorporation of magnitude heterogeneity, Bull. Seism.
Soc. Am. 82, 120-134.
Nordquist, J.M. (1945), Theory of largest values applied to earthquake magnitudes,
Trans. Am. Geophys. Union 26, 29-31.
9
Nuttli, O.W. (1981). On the problem of maximum magnitude of earthquakes. USGS
Open Report, 13p.
Page, R. (1968). Aftershocks and microaftershocks, Bull. Seism. Soc. Am. 58, 11311168.
Papastamatiou, D. (1980). Incorporation of crustal deformation to seismic hazard
analysis, Bull. Seism. Soc. Am. 70, 1321-1335.
Pisarenko, V.F., A.A. Lyubushin, V.B. Lysenko, and T.V. Golubieva (1996). Statistical
Estimation of seismic hazard parameters: maximum possible magnitude and related
parameters, Bull. Seism. Soc. Am. 86, 691-700.
Robson, D.S. and J.H. Whitlock (1964). Estimation of a truncation point, Biometrika,
51, 33-39.
Ward, S.N. (1997). More on Mmax, Bull. Seism. Soc. Am. 87, 1199-1208.
FILE NAME: Ljubliana2
VERSION: 18 May, 2000
PRINTED: 28 July, 2017
10