Bulletin of the Seismological Society of America, Vol. 83, No. 1, pp. 7-24, February 1993

STATISTICS OF CHARACTERISTIC EARTHQUAKES

by Y. Y. Kagan

ABSTRACT

Statistical methods are used to test the characteristic earthquake hypothesis. Several distributions of earthquake size (seismic moment-frequency relations) are described. Based on the results of other researchers as well as my own tests, the evidence for the characteristic earthquake hypothesis can be explained either as statistical bias or as a statistical artifact. Since other distributions of earthquake size provide a simpler explanation for the available information, the hypothesis cannot be regarded as proven.

INTRODUCTION

The size distribution of earthquakes has important implications both for practical applications in engineering seismology (earthquake hazard analysis) and for the construction of a consistent physical theory of earthquake occurrence. Since the 1940s it has been recognized (see more in Kagan, 1991) that the magnitude distribution, when observed over broad areas, follows the Gutenberg-Richter (G-R) relation. This relation can be transformed into a power-law distribution for the scalar seismic moment. (In this note, as a measure of earthquake size, I consistently use the scalar seismic moment of an earthquake, which I denote by the symbol M. Occasionally, for illustration purposes, I also use an empirical magnitude of earthquakes, denoted by m.)

Considerations of the finiteness of the total seismic moment or deformational energy available for earthquake generation, and of the finite size of tectonic plates, require that the power-law relation be modified at the high end of the moment scale. This upper limit is usually treated by the introduction of an additional parameter, the maximum magnitude or maximum moment (McGarr, 1976; Anderson, 1979; Molnar, 1979; Anderson and Luco, 1983; Kagan, 1991). The behavior of the size distribution in the neighborhood of the maximum moment is not well known, owing to an insufficient number of great earthquakes in available catalogs. Although evidence exists that the moment-frequency relation bends down at the large-size end of the distribution, at least for the worldwide data (see the discussion in the section on Earthquake Size Distributions), the data on the size distribution in particular regions are scarce. However, it is very important, especially for engineering purposes, to evaluate the maximum credible size of an earthquake in a region and the probability of its occurrence. Thus, the problem can be formulated as follows: is it possible to estimate the maximum size of an earthquake in a region, and is its occurrence rate higher or lower than the extrapolation of the G-R relation for small and moderate shocks?

Recently a more complex model of earthquake size distribution has been proposed: the characteristic earthquake distribution (Singh et al., 1983; Wesnousky et al., 1983; Schwartz and Coppersmith, 1984; Davison and Scholz, 1985; see also Scholz, 1990). Generally, the characteristic distribution presupposes the existence of two separate and independent earthquake populations: (1) that of regular earthquakes and (2) that of characteristic earthquakes. Regular earthquakes, which might include foreshocks and aftershocks of characteristic events, follow the G-R relation characterized by three parameters: a,
β, and M_a, where (1) a = the number of earthquakes per unit time with seismic moment higher than a cutoff level, (2) β = the slope of the moment-frequency relation in the log-log plot, and (3) M_a = the upper moment limit for regular earthquakes. The characteristic events correspond to the largest earthquakes in a region. To characterize the occurrence of these events, we need at least two additional parameters: the size and the rate of occurrence of the characteristic earthquakes, M_max and a_max.

From the point of view of general statistical and continuum-mechanical considerations, there should be no serious objection to the idea of "characteristic" earthquakes, if the hypothesis signifies that the distribution of earthquakes in a region that has a particular geometry and stress pattern may differ from the standard G-R relation. Earthquake occurrence depends on stress distributions and fault geometry; therefore, the power-law moment-frequency relation is only a first approximation. However, questions remain as to the degree of difference between the moment distribution for a large region and the distribution for a small area centered on an earthquake fault segment, and even as to the form of such a postulated difference. Proponents of the "characteristic earthquake" hypothesis assert that this difference is substantial (see Wesnousky et al., 1983, their Fig. 1) and takes the particular form of a sharp probability density peak at the high-magnitude end of the size distribution; i.e., the frequency of these earthquakes is significantly higher than an extrapolation based on the G-R law would predict. This model is proposed on the basis of geologic evidence for the similarity of large earthquakes on fault segments, as well as on some considerations of the geometrical structure of earthquake faults, their barriers, and the rupture process on these complex fault systems. The determination of characteristic earthquakes for an extended fault system involves subdividing the system into individual faults and their segments. This procedure of segment selection is based on geological, seismic, and other information, and it is of a qualitative nature, since no general algorithm exists for this selection. Each of the segments is assumed to be ruptured completely by nearly identical (characteristic) earthquakes.

Recently the characteristic earthquake hypothesis has been used to calculate seismic risk (see, for example, Youngs and Coppersmith, 1985; Wesnousky, 1986; Nishenko, 1991) and to model earthquake rupture propagation (Brown et al., 1991; Carlson, 1991). Because of the importance of this hypothesis, it deserves rigorous testing. Kagan and Jackson (1991b) tested whether the seismic gap hypothesis predicts the time of occurrence of large earthquakes in the circum-Pacific seismic zones better than a random guess (the Poisson distribution) and found that it does not. The negative results of the gap hypothesis verification demonstrate that plausible and even seemingly compelling physical and geological arguments do not guarantee that a hypothesis is correct. Thus, the arguments proposed for the justification of the characteristic hypothesis are not sufficient for its acceptance as a valid scientific model; the hypothesis needs to be tested and validated. This paper addresses only the size distribution of earthquakes; their temporal relations are discussed only where they are relevant to testing the characteristic earthquake hypothesis.
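To make the two-population structure just described concrete, the following minimal sketch evaluates the annual rate of events exceeding a given moment under the general characteristic model: a G-R (Pareto) population up to an upper limit plus a separate characteristic spike. All parameter values are illustrative assumptions of mine, not estimates from this paper.

```python
# Sketch of the general characteristic model: regular (G-R) events described
# by a, beta, and an upper limit M_a, plus characteristic events of size
# M_max occurring at rate a_max.  Parameter values are assumed, for
# illustration only.

def annual_rate(M, a=1.0, beta=0.67, M_c=1e18, M_a=1e20,
                a_max=0.01, M_max=1e21):
    """Annual rate of events with moment >= M (Nm)."""
    rate = 0.0
    if M <= M_a:                      # regular (G-R) population
        rate += a * (max(M, M_c) / M_c) ** (-beta)
    if M <= M_max:                    # characteristic population
        rate += a_max
    return rate

for M in (1e18, 1e19, 1e20, 1e21):
    print(f"M >= {M:.0e} Nm: {annual_rate(M):.4f} events/yr")
```

Between M_a and M_max the rate is held at a_max, which produces the density "bump" at the high end of the distribution that proponents of the hypothesis describe.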
Such testing encounters several difficulties:

1. The major justification for the characteristic earthquake hypothesis comes from general geological, physical, and mechanical considerations, which are of a qualitative nature and subject to various, sometimes contradictory, interpretations.

2. Since the above considerations are non-unique, the quantitative verification of the hypothesis is based almost exclusively on the statistical analysis of earthquake size distributions in various seismogenic regions (see references above).

3. Unfortunately, the seismic record in most seismic zones is too short to estimate the distribution of event sizes directly from the data. Many additional assumptions about large earthquakes (for example, the stationarity of earthquake occurrence and the relation between various magnitude scales) are needed before testing is possible. These assumptions make verification of results inconclusive.

4. Moreover, the characteristic hypothesis is not specified to the point where it can be subjected to formal testing; some basic parameters of the model are known only in broad qualitative terms.

The above considerations make it almost impossible to verify or refute the characteristic hypothesis on the basis of current evidence. The best outcome we can hope for is to show that the quantitative proofs proposed to validate it are not sufficient; simpler models can explain the available data as successfully as the characteristic hypothesis. In this paper, I do not challenge the physical, geological, and geometrical arguments that are used to support the characteristic hypothesis, although in the Discussion section I provide some reasons that cast doubt on the possibility of objective segmentation of earthquake faults, and therefore on a unique definition of characteristic earthquakes. The main point of this paper, however, is to review the quantitative statistical arguments used to verify the characteristic hypothesis. These arguments constitute the backbone of the validation of the hypothesis. In particular, the results of this paper indicate that earlier published validation attempts contain serious defects, at least from a statistical point of view.

Howell (1985) observes that certain statistical evidence used to prove the characteristic earthquake hypothesis may be the result of statistical fluctuations and thus of selective reporting of "successful" samples. Another possible biased selection involves, for example, taking a relatively small area around a large known earthquake and comparing the seismicity of weak and strong earthquakes in the area. Such a selection uses earthquake catalog information to choose the sample, which then, in general, follows statistical laws that differ from those of a sample chosen by an unbiased procedure, just as the maximum member of a population has a different distribution than a randomly selected member. Thus, the catalog is used twice: first to select a sample and then to test it. As an example of such selection, suppose a fault system of length L is enclosed in a rectangle L × W. If we reduce the width W so that the largest earthquake is kept inside the rectangle, the final sample (as W → 0) will contain a single earthquake.

Anderson and Luco (1983) show that many geologic and paleoseismic studies compare earthquake size distributions belonging to different distribution classes: those of earthquake ruptures at a specific site and those of events occurring in a specific area.
It is obvious, for example, that a large earthquake has a much higher chance of rupturing the surface; thus its statistics follow a different distribution than the statistics of events registered by a seismographic network (Anderson and Luco, 1983). Moreover, since these data are usually collected by different methods, their systematic errors may differ. For example, large surface ruptures are easier to discover in paleoseismological surveys.

Comparison of seismicity levels obtained from local instrumental catalogs and from catalogs of strong earthquakes of longer time duration is a potentially error-prone procedure. Local catalog data are usually collected over quiet periods of earthquake activity; even if a strong earthquake occurs in an area covered by a local network, the threshold level of detection and reporting usually rises abruptly immediately following the main event, thus introducing a negative bias into the seismicity levels of small earthquakes. Hence, incomplete reporting of these events causes significant bias in seismicity levels. Moreover, there is substantial evidence that seismicity undergoes both short- and long-term variations (Kagan and Jackson, 1991a, and references therein). Whereas short-term changes of seismicity, due mostly to aftershock sequences, can reasonably be taken into account, no such models yet exist for the long-term variations. These variations may span decades, centuries, or even millennia (Kagan and Jackson, 1991a).

Another source of bias is the saturation of all magnitude scales. This saturation is explained by the finite frequency response of a seismographic network (Kagan, 1991). The saturation causes a loss of important information that cannot be fully restored by any transformation rule. For example, the evaluation of recent seismicity levels by Singh et al. (1983) and Davison and Scholz (1985) is based on body-wave magnitude (m_b) data. m_b saturates at magnitudes as low as 6.0; moreover, in order to compare m_b with strong-earthquake data and fault-slip data, m_b is first converted into m_s; m_s is, in turn, converted into the seismic moment. Clearly, significant systematic errors may accumulate during such conversions. Since m_b data are usually limited to earthquakes of lower magnitudes, their extrapolation to the domain of maximum events depends strongly on the accepted b values of the G-R law: even an insignificant modification of the b value might bring these data into agreement with earthquake data in the upper magnitude range. In the publications cited above, the standard errors in the b-value estimates have not been evaluated; these errors are significant if the total number of events in a sample (N) is small: the coefficient of variation of the b value is 1/√N (Aki, 1965). For example, if we consider that the discrepancies in the rates of occurrence of strong and weak earthquakes reported in Singh et al. (1983) are due to saturation, b-value uncertainties, and possible long-term fluctuations of seismicity, then only one region out of four (Oaxaca) exhibits a significant difference between the extrapolated seismicity for m_s events and the registered "maximum" earthquakes. Such a discrepancy may readily be attributed to random fluctuations of the data (Howell, 1985).

Singh et al. (1983) often compare actually observed earthquake numbers with the numbers expected from extrapolation of the G-R law. If the ratio of these numbers is high, it is considered evidence in favor of the characteristic earthquake hypothesis. However, if the expected number is small, relatively high ratio values might be the result of random fluctuations. For example, for a Poisson variable with an expected value of 0.5, two or more events are observed in 9% of cases (Gnedenko, 1962, pp. 431-434). If the expected number is 1.0, there is still a 1.9% probability that the actual number of events equals or exceeds 4. Similarly, one should not draw significant conclusions from observing two or three events in a certain magnitude interval versus zero or one event in another interval (Singh et al., 1983); such variations can again be explained by random fluctuations.
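The Poisson tail probabilities quoted above are easy to verify numerically; the following is a quick check (a sketch using scipy; Gnedenko's tables give the same values).

```python
# Verify the Poisson tail probabilities quoted in the text.
from scipy.stats import poisson

# P(X >= 2) when the expected number of events is 0.5:
print(poisson.sf(1, mu=0.5))   # 0.0902..., i.e., about 9%

# P(X >= 4) when the expected number of events is 1.0:
print(poisson.sf(3, mu=1.0))   # 0.0190..., i.e., about 1.9%
```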
A single work among those cited escapes criticism: Davison and Scholz (1985) investigate a large portion of the circum-Pacific belt (more than 3000 km), hence there is little possibility that selection bias or statistical fluctuations explain their results. Except for the use of the m_b data, the study analyzes a relatively homogeneous m_s catalog to infer the suitability of the characteristic earthquake hypothesis. Davison and Scholz (1985) first normalize the m_s data to the recurrence times of characteristic earthquakes; then they extrapolate the normalized seismicity levels to the magnitudes of the maximum events. For five out of nine zones where great earthquakes occurred during this century, Davison and Scholz (1985) find that the extrapolation of the frequency-moment data is smaller by a factor of 10 than the moment of earthquakes that actually originated on these segments. However, these tests depend critically on the estimate of the return time for the maximum magnitude events. Davison and Scholz (1985) evaluate the recurrence time by dividing the earthquake moment by the moment accumulation rate. This procedure assumes that most elastic deformation is effected by characteristic events. However, we need to test whether another model, for example a regular G-R relation, can explain these data with the same efficiency. To accomplish this test, we first need to describe statistical distributions of earthquake size. Although most of the distributions in this paper have been published earlier (see Anderson, 1979; Molnar, 1979; Anderson and Luco, 1983; Kagan, 1991), consistent use of the seismic moment allows us to simplify the formulas significantly. Thus this paper has two objectives: (1) to compile formulas for distributions of the seismic moment and related quantities and (2) to review the available evidence and statistical tests for the characteristic earthquake hypothesis.

EARTHQUAKE SIZE DISTRIBUTIONS

The distribution density φ of the seismic moment is usually assumed to follow a power-law or Pareto distribution, which is an appropriate transformation of the G-R relation (see more in Kagan, 1991):

    φ(M) = β M_c^β M^(-1-β)   for M_c ≤ M < ∞,   (1)

where M_c is a lower threshold seismic moment. This lower limit is introduced for two reasons: (1) any seismographic network has a detection threshold, and (2) otherwise the distribution would diverge as M → 0 (there would be an infinite number of events with M → 0). The distribution (1) requires only one degree of freedom for its characterization. In the traditional form of the G-R relation another parameter, a or α (Kagan, 1991), is added; this parameter is the rate of occurrence of earthquakes with M > M_c.
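For readers who wish to experiment with relation (1), a minimal sketch follows. The default values β = 2/3 and M_c = 10^18 Nm are assumptions chosen for illustration; the correspondence β = 2b/3 to the G-R b value is the standard conversion for moment magnitude (cf. Kagan, 1991), stated here as an assumption rather than taken from this text.

```python
# Sketch of the Pareto moment density (1) and its complementary function.
def pareto_density(M, beta=2/3, M_c=1e18):
    """Equation (1): phi(M) = beta * M_c**beta * M**(-1 - beta) for M >= M_c."""
    return beta * M_c**beta * M**(-1 - beta) if M >= M_c else 0.0

def pareto_ccdf(M, beta=2/3, M_c=1e18):
    """Fraction of events with moment >= M: (M_c / M)**beta."""
    return (M_c / M) ** beta if M >= M_c else 1.0

print(pareto_ccdf(1e20))   # ~0.046: about 5% of events exceed 100*M_c when beta = 2/3
```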
In later considerations, in addition to the density function, I often use the cumulative distribution function F(M) and the complementary function Φ(M) = 1 − F(M). Statistical analysis of magnitude and seismic moment distributions yields values of β in (1) between 1/2 and 1 for small and medium-size earthquakes (Kagan, 1991). Simple considerations of the finiteness of the seismic moment or deformational energy available for earthquake generation require that the power-law relation be modified at the large-size end of the moment scale. At a minimum, the distribution tail must decay at least as strongly as M^(-1-β) with β > 1. This problem is generally solved by introducing an additional parameter, called a "maximum moment" (M_x), into the distribution. This new parameterization of the modified G-R relation has several forms (Anderson and Luco, 1983; Kagan, 1991). In their simplest modifications, they require at least three parameters to characterize earthquake occurrence.

I consider four possible distributions of earthquake size that all satisfy the requirement of finiteness of the moment flux: (a) the characteristic earthquake distribution; (b) the maximum moment distribution; (c) the truncated Pareto distribution; and (d) the gamma distribution. Below I refer to cases (b), (c), and (d) as alternative models.

In this paper the term "characteristic earthquake distribution" refers to a distribution for which the major part of the total seismic moment flux is carried by characteristic earthquakes, so that we essentially ignore the contribution of regular earthquakes to the total budget of seismic moment rate. The seismic deformation is thus characterized by two parameters, M_xc and a_xc (the moment and rate of characteristic earthquakes), which are either both evaluated independently using the available geological/seismological data or, if only one of them can be evaluated, the other can be computed from the relation

    Ṁ = a_xc M_xc,   (2)

where Ṁ is the moment flux.

Here I understand the "maximum moment distribution" as a power-law distribution with a truncated complementary cumulative function (Molnar, 1979):

    Φ(M) = (M_c/M)^β   for M_c ≤ M < M_xm,   and   Φ(M) = 0   for M_xm ≤ M,   (3)

where M_xm is the maximum moment for this model. For the cumulative function, we obtain

    F(M) = 1 − (M_c/M)^β   for M_c ≤ M < M_xm,   and   F(M) = 1   for M_xm ≤ M.   (4)

For the maximum moment distribution density,

    φ(M) = β M_c^β M^(-1-β) + δ(M − M_xm) (M_c/M_xm)^β   for M_c ≤ M ≤ M_xm,   and   φ(M) = 0   for M_xm < M,   (5)

where δ is the Dirac delta function.

The first two distributions can be regarded as limiting cases of the general characteristic hypothesis. In the former distribution (a), the regular earthquakes do not play a significant role in the total balance of fault slip rate and can therefore be ignored. In the latter distribution (b), characteristic earthquakes are considered as an extrapolation of the regular-events distribution, or as a density spike at the high-moment end of the distribution. The size distributions referred to in most publications on characteristic earthquakes can easily be obtained as a mixture of (a) and (b).
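A short sketch of the complementary functions for the two limiting models just described; the values of M_xc, M_xm, β, and M_c are illustrative assumptions.

```python
# Complementary functions for models (a) and (b).
def Phi_characteristic(M, M_xc=1e21):
    """Model (a): essentially all moment is carried by characteristic events
    of size M_xc; the complementary function is a step at M_xc."""
    return 1.0 if M <= M_xc else 0.0

def Phi_max_moment(M, beta=2/3, M_c=1e18, M_xm=1e21):
    """Model (b), equation (3): Pareto complementary function truncated at M_xm."""
    if M < M_c:
        return 1.0
    return (M_c / M) ** beta if M < M_xm else 0.0
```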
Another model truncates not the complementary cumulative distribution but the density function: for the Pareto distribution truncated on both ends (c), the distribution density is

    φ(M) = β M_xp^β M_c^β M^(-1-β) / (M_xp^β − M_c^β)   for M_c ≤ M ≤ M_xp,   (6)

where M_xp is the maximum moment for the Pareto model, which may differ from M_xm of (3). For the cumulative function, we obtain

    F(M) = (M_xp/M)^β (M^β − M_c^β) / (M_xp^β − M_c^β)   for M_c ≤ M ≤ M_xp,   (7)

and the complementary distribution function has the form

    Φ(M) = (M_c/M)^β (M_xp^β − M^β) / (M_xp^β − M_c^β)   for M_c ≤ M ≤ M_xp.   (8)

For the gamma distribution (d),

    φ(M) = C^(-1) β M_c^β M^(-1-β) exp(−M/M_xg)   for M_c ≤ M < ∞,   (9a)

where M_xg is the maximum moment parameter for the gamma distribution. The parameter M_xg has a different meaning from the parameters M_xp and M_xm: whereas the latter two represent a "hard" limit, the former is a "soft" limit; hence in the gamma distribution some earthquakes may have a moment M_xg < M. The normalizing coefficient C can be expressed through the incomplete gamma function (Bateman and Erdelyi, 1953):

    C = β (M_c/M_xg)^β Γ(−β, M_c/M_xg),   (9b)

where Γ is the gamma function. For M_c << M_xg, C ≈ 1. The cumulative function is

    F(M) = C^(-1) β (M_c/M_xg)^β [γ(−β, M/M_xg) − γ(−β, M_c/M_xg)]   for M_c ≤ M < ∞,   (10)

where γ is the incomplete gamma function (Bateman and Erdelyi, 1953). The complementary distribution function has the form

    Φ(M) = 1 − F(M)   for M_c ≤ M < ∞.   (11)

To illustrate the fit of the gamma distribution to actual data, Figure 1a displays a cumulative histogram of the scalar seismic moment of shallow earthquakes (depth interval 0 to 70 km) in the Harvard catalog (Dziewonski et al., 1992, and references therein). The available catalog covers the period from 1 January 1977 to 31 December 1991. To ensure data uniformity, events with M ≥ 10^17.7 Nm (m_w ≥ 5.8) are used. The distribution of the worldwide data for the largest earthquakes differs from the G-R law, but this difference has the opposite sign to what the characteristic hypothesis predicts: the number of great events is smaller than the extrapolation of the G-R curve. Thus, our null hypothesis for the size distribution of the largest events should be the gamma or truncated Pareto distribution (see below).

The maximum likelihood procedure (Kagan, 1991) allows us to retrieve the values of the distribution parameters. The values of β and M_xg are listed in Table 1 for the maximum of the likelihood function and for four extreme points of an approximate ellipse corresponding to the 95% confidence area (see Fig. 2 in Kagan, 1991). The five curves in Figure 1a are calculated using the above values; they form a 95% envelope of possible approximations of the moment-frequency relation by the gamma distribution. The curves in Figure 1a demonstrate that the worldwide earthquake size data are reasonably well approximated by the gamma distribution.

For comparison, Figure 1b displays the curves for the truncated Pareto distribution (8). (We have not tried distributions (a) and (b), since they are clearly inappropriate for approximating the worldwide data.) To obtain the values of the parameters, which are also listed in Table 1, I again apply the likelihood method.
From (6) it is clear that in this case the likelihood function reaches its maximum at the M value of the largest earthquake in the catalog (M_max = 3.57 × 10^21 Nm); thus the maximum likelihood estimate and the lower bound of the 95% confidence area coincide. Even a visual inspection shows that the gamma distribution gives a better approximation of the experimental curve. The disadvantages of the hard limit used in the truncated Pareto distribution become more obvious if we compare the Harvard catalog with the history of seismic moment release during the 20th century (McGarr, 1976; Kanamori, 1983). The largest earthquake (Chilean of 1960) had a moment of 2 × 10^23 Nm; thus we would need to modify the M_xp parameter of the Pareto distribution significantly, whereas the gamma distribution approximates these data with only a slight change of parameters: since earthquakes with M > M_xg are allowed in the gamma distribution, the Chilean earthquake can be accommodated by using the "upper" curve of the 95% envelope (see Table 1).

Figure 2 displays the complementary functions of distributions (3), (8), and (11) for β = 2/3 (Davison and Scholz, 1985). The value M_xm = 10^23 Nm is set to correspond approximately to the largest observed earthquake in the Aleutian Islands and Alaska (Kanamori, 1983). The other M_x's are adjusted so that the distributions yield the same seismic moment release as (b) (see also equation 15 below). The characteristic model (a) would be represented in the plot by a step function at the abscissa value 10^23 Nm.

FIG. 1. Log of scalar seismic moment versus cumulative frequency for the 1977 to 1991 Harvard catalog. The curves show the numbers of shallow earthquakes with moment larger than or equal to M. The total number of events with M ≥ 10^17.7 Nm is 2358. Dashed and dotted curves correspond to the approximation by the gamma (1a) and Pareto (1b) distributions (maximum likelihood shown by a dashed line; 95% envelope shown by dotted lines).

TABLE 1
PARAMETER VALUES (PARETO AND GAMMA DISTRIBUTIONS) FOR THE HARVARD CATALOG 1977-1991

Likelihood function, ±95% limits | β value | M_x (10^21 Nm) | Ṁ (10^21 Nm/yr)

Gamma distribution:
  max       | 0.667 |  2.51 | 2.49
  left      | 0.627 |  1.58 | 2.46
  down      | 0.650 |  1.00 | 1.92
  right     | 0.706 |  4.60 | 2.57
  upper     | 0.683 | 22.0  | 4.65

Pareto distribution:
  max, down | 0.684 |  3.57 | 2.91
  left      | 0.645 |  3.57 | 3.36
  right     | 0.715 |  3.57 | 2.56
  upper     | 0.687 | 11.0  | 4.08
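The behavior of the truncated-Pareto likelihood described above can be illustrated with synthetic data. In the sketch below, β and M_c are assumed values and the catalog is simulated, not real; it shows that the log-likelihood based on density (6) decreases monotonically once M_xp exceeds the largest observation, so the maximum likelihood estimate of M_xp is the sample maximum.

```python
# Why the likelihood of the truncated Pareto density (6) peaks at the largest
# observed moment: the normalization factor M_xp**beta / (M_xp**beta - M_c**beta)
# shrinks as M_xp grows, so log L falls for any M_xp above max(sample).
import numpy as np

rng = np.random.default_rng(0)
beta, M_c = 2/3, 1e18
sample = M_c * (1.0 + rng.pareto(beta, size=2000))   # Pareto(beta) sample, cutoff M_c

def loglik(M_xp, M):
    """Log-likelihood of the truncated Pareto density (6)."""
    if M_xp < M.max():
        return -np.inf                  # truncation point below an observation
    n = len(M)
    return (n * np.log(beta * M_xp**beta * M_c**beta / (M_xp**beta - M_c**beta))
            - (1 + beta) * np.log(M).sum())

for M_xp in (sample.max(), 2 * sample.max(), 10 * sample.max()):
    print(f"M_xp = {M_xp:.2e}: log L = {loglik(M_xp, sample):.1f}")
```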
FIG. 2. Complementary distribution functions for several models of earthquake size distribution: (1) Pareto distribution with unlimited maximum seismic moment (dotted line); (2) maximum moment model (solid line); (3) truncated Pareto distribution (dash-dot line); (4) gamma distribution (dashed line). The seismic moment cutoff is taken at 10^18 Nm and M_xm = 10^23 Nm; the other maximum moment quantities are adjusted to yield the same moment rate, given the number of earthquakes with M > 10^18: M_xp = 3.38 × 10^23 and M_xg = 4.74 × 10^23 (see equation 15).

For the truncated Pareto distribution, the return times of earthquakes with M ≈ M_xp approach infinity, since the probability of occurrence of such events, obtained by integration of (6), is close to zero. The gamma distribution displays similar behavior (Fig. 2). Using this figure, one can calculate the number of expected earthquakes with moment greater than or equal to M by multiplying the value of Φ(M) by the total number of events with M ≥ M_c. The difference between the alternative distributions is small over most of the moment range; only at M > 5 × 10^21 does the difference exceed 10%. To distinguish between these models we need information on great earthquakes, which are extremely rare according to (c) and (d). Such events might correspond to the simultaneous breaking of several segments of the subduction belt.

Using models (a) through (d), one can calculate the seismic moment release rate (moment flux), taking the seismic activity level a_0 for earthquakes with moment M_0 and greater. To avoid awkward algebra, I assume M_x >> M_0. In most cases, M_0 can be chosen to correspond to M_c in (1). For the characteristic distribution the moment flux is given by (2). The total seismic moment rate according to the maximum moment distribution is (Molnar, 1979; Anderson and Luco, 1983)

    Ṁ = a_0 M_xm^(1-β) M_0^β / (1 − β);   (12)

for the Pareto distribution we calculate (McGarr, 1976; Anderson, 1979; Anderson and Luco, 1983)

    Ṁ = a_0 M_xp^(1-β) M_0^β β / (1 − β);   (13)

and for the gamma distribution

    Ṁ = a_0 M_xg^(1-β) M_0^β Γ(2 − β) β / (1 − β).   (14)

Table 1 displays the moment rate per year for the Harvard catalog. The estimates are relatively stable, even though the parameter values display significant variations. The rate value obtained by direct summation of the moment is 2.33 × 10^21 Nm/yr; i.e., this value is close to the values corresponding to the maxima of the likelihood functions. Parenthetically, I note that direct summation of the Kanamori (1983) data for 1921 to 1976 yields 8.53 × 10^21 Nm/yr, whereas McGarr (1976) found an Ṁ value of 3.4 × 10^22 Nm/yr on the basis of plate motion calculations. Further discussion of this problem is outside the scope of this article.

By equating expressions (12) to (14) for the total moment rate, we can find the ratios between the maximum moments of these distributions:

    M_xp = M_xg [Γ(2 − β)]^(1/(1-β)) = M_xm β^(-1/(1-β)).   (15)

For values of β equal to 2/3 or 1/2, the gamma function is Γ(2 − 2/3) = 0.893 or Γ(2 − 1/2) = 0.886, respectively.
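A numerical check of equation (15) for the values used in Figure 2 (β = 2/3, M_xm = 10^23 Nm); this sketch reproduces the quoted M_xp and M_xg.

```python
# Maximum-moment ratios of equation (15).
from math import gamma

beta = 2/3
M_xm = 1e23
M_xp = M_xm * beta ** (-1 / (1 - beta))              # 3.38e23 Nm
M_xg = M_xp / gamma(2 - beta) ** (1 / (1 - beta))    # 4.74e23 Nm
print(f"M_xp = {M_xp:.3g} Nm, M_xg = {M_xg:.3g} Nm")
print(f"Gamma(4/3) = {gamma(4/3):.3f}, Gamma(3/2) = {gamma(3/2):.3f}")
```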
Using formulas (5), (6), and (9), I calculate the distribution of the total seismic moment that is released by earthquakes with moment up to M. In this case, the distributions converge for M → 0; thus we do not need the truncation on the left-hand side of the distributions, and we can define them on the interval 0 < M < ∞. For the characteristic distribution (a), the distribution is a step function at M_xc; for the other models, the total moment release distribution can be calculated as

    F̃(M) = ∫₀^M M′ φ(M′) dM′ / ∫₀^∞ M′ φ(M′) dM′.   (16)

In particular, for the maximum moment distribution (b) the cumulative function is

    F̃(M) = β (M/M_xm)^(1-β)   for 0 ≤ M ≤ M_xm,   and   F̃ = 1   for M_xm < M;   (17)

for the Pareto distribution

    F̃(M) = (M/M_xp)^(1-β)   for 0 ≤ M ≤ M_xp,   and   F̃ = 1   for M_xp < M;   (18)

and for the gamma distribution

    F̃(M) = γ(1 − β, M/M_xg) / Γ(1 − β)   for 0 ≤ M < ∞.   (19)

Figure 3 displays these distributions for β = 2/3. The plots can be compared with similar results by Anderson and Luco (1983, their Table 1).

FIG. 3. Distribution of total seismic moment released by earthquakes with seismic moment M. The line types and values of parameters correspond to those of Figure 2.

According to all of the distributions, the major part of the seismic moment release is carried by great earthquakes (i.e., events with M_x/10 < M < M_x): for case (b) it is 69.1%, for case (c) 53.6%, and for case (d) 39.7%. In the last case (the gamma distribution), earthquakes with M > M_xg contribute an additional 9.6% to the total. The above values are obtained for β = 2/3. The maximum moment distribution (b) implies that 1/3 of the seismic moment release is due to maximum moment earthquakes. The alternative distributions differ little in their total moment release until the earthquake moment approaches the maximum size. This may signify that it would be difficult to infer the proper form of the distribution on the basis of the relatively short histories of earthquake occurrence available for most of the Earth's regions.
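The moment-release fractions quoted above follow directly from (17) to (19); a sketch of the check for β = 2/3, using scipy for the regularized incomplete gamma function:

```python
# Share of total moment released by events with M_x/10 < M < M_x.
from scipy.special import gammainc   # regularized lower incomplete gamma

beta = 2/3
# (b) maximum moment model, eq. (17): F(M) = beta*(M/M_xm)**(1-beta), jump to 1 at M_xm
print(1 - beta * 0.1 ** (1 - beta))                        # 0.691
# (c) truncated Pareto, eq. (18): F(M) = (M/M_xp)**(1-beta)
print(1 - 0.1 ** (1 - beta))                               # 0.536
# (d) gamma model, eq. (19): F(M) = gammainc(1-beta, M/M_xg)
print(gammainc(1 - beta, 1.0) - gammainc(1 - beta, 0.1))   # 0.397
# share carried by events with M > M_xg in the gamma model:
print(1 - gammainc(1 - beta, 1.0))                         # 0.096
```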
RESULTS

Next we examine whether the discrepancy between the rates of event occurrence in the zones selected by Davison and Scholz (1985) and the rates of the great events that originate in the same zones might have an explanation other than the characteristic distribution (a). In the case of a characteristic distribution, the moment release is concentrated in a sharp, delta-function-like spike, whereas the other models envision such a release in a more continuous manner. Davison and Scholz (1985) find that the strongest earthquakes have an occurrence rate a factor of 5 to 30 higher than the rate expected for those events by extrapolation from weak events using the G-R law.

Let us suppose that in reality earthquakes obey any of the alternative distributions (b to d), but we process the largest events through the same procedure as applied by Davison and Scholz (1985): we normalize the occurrence rate of medium-size earthquakes according to the recurrence time of characteristic events calculated using (2) and extrapolate the obtained rate to great events. In particular, let us assume that all earthquakes with moment larger than M_f are taken to be characteristic events. Then the recurrence time for an event with moment M_f is

    T = M_f / (Ṁ L_f),   (20)

where L_f is the rupture length. Ṁ and a in this section are normalized per unit of time as well as per unit of fault length. The expected total number of earthquakes with M_f < M during this time is

    N_f = T L_f a_f = (1 − β) (M_f/M_xm)^(1-β),   (21)

where in the right-hand part of (21) we calculate N_f for model (b).

Let us call the inverse of N_f the discrepancy ratio,

    ρ = 1/N_f.   (22)

This ratio shows by how much the procedure employed by Davison and Scholz (1985) undercounts the expected number of earthquakes. For example, even if an earthquake has a moment equal to the maximum (M_xm) and β = 2/3, then ρ = 3; i.e., this procedure reduces the expected number of regular events by a factor of 3, which is a consequence of the distribution of the total moment displayed in Figure 3: only one third of the moment is released by earthquakes with M_xm (cf. Anderson and Luco, 1983).

One needs to remember that the "true" value of M_x is unknown, so the discrepancy obtained by using (2) is an average over possible values of M and M_x. Since we do not know the value of M_xm, for model (b) I assume that any earthquake with M_f < M is a characteristic event. These earthquakes occur with probabilities (5); hence we calculate the average value of ρ as

    ρ̄ = ∫ ρ φ(M) dM = [β/(1 − β)] (M_xm/M_f)^(1-β) + (M_f/M_xm)^β.   (23)

For example, assuming again that β = 2/3 and M_f = 10^21 Nm, which corresponds to earthquakes with m_s = 8.0, we obtain ρ̄ = 9.3. This means that the average discrepancy for model (b) is of one order of magnitude; i.e., it is about the same as observed by Davison and Scholz (1985). Due to random fluctuations, this ratio may be larger or smaller for particular areas. Similar expressions can be obtained for the other models; for example, for the Pareto distribution (c),

    ρ̄ = [β²/(1 − β)] M_f^(β-1) (M_xp − M_f) / (M_xp^β − M_f^β).   (24)

The values of ρ̄ are only slightly different from those of (23): for the same example, ρ̄ = 9.5. Even if we take M_f = 10^22 Nm (m_s ≈ 8.7), the ratio is still large: 4.6.

Combined data from the Aleutian Arc (Fig. 11 in Davison and Scholz, 1985) follow one of the "noncharacteristic" alternative distributions. From a statistical point of view this is to be expected, since the data set is extensive and its selection is largely unbiased, except for the 1964 Alaskan earthquake, which was one of the two largest worldwide earthquakes in this century. Hence we should expect the point corresponding to this event to be an outlier in the moment-frequency plot.

Davison and Scholz (1985) identify four out of nine zones where no strong earthquakes have occurred in this century as potential seismic gaps. They normalize the seismicity levels measured in these zones to the return times of maximum moment earthquakes. These maximum events are assumed to rupture the whole extent of the appropriate segment. The arguments above are applicable to those segments of the Aleutian Arc: if the distribution of earthquake size follows model (c) or (d), we will obtain similar differences between the probabilities of the strongest events and the extrapolation of seismicity using the characteristic hypothesis (2). Parenthetically, I mention that during the 6 years that have elapsed since the publication of the Davison and Scholz (1985) paper, strong earthquakes have not followed the seismic gap hypothesis: according to the Harvard catalog of seismic moment tensor solutions (Dziewonski et al., 1992), two of the six events with M > 10^19.5 Nm in the Aleutian Arc occurred in previously identified zones (M = 10^21 in the 1957 zone, M = 10^19.7 in the 1938 zone), three events occurred near, but outside, the Yakataga gap, and only one (M = 10^19.5) occurred in the Kommandorski gap.
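Equations (23) and (24) are easy to evaluate; the sketch below reproduces the quoted values 9.3, 9.5, and 4.6 (with M_xm = 10^23 Nm and M_xp = 3.38 × 10^23 Nm, as in Figure 2):

```python
# Average discrepancy ratios for models (b) and (c).
beta = 2/3
M_xm, M_xp = 1e23, 3.38e23

def rho_max_moment(M_f):
    """Equation (23), maximum moment model (b)."""
    return (beta / (1 - beta)) * (M_xm / M_f) ** (1 - beta) + (M_f / M_xm) ** beta

def rho_pareto(M_f):
    """Equation (24), truncated Pareto model (c)."""
    return (beta**2 / (1 - beta)) * M_f ** (beta - 1) * (M_xp - M_f) / (M_xp**beta - M_f**beta)

print(rho_max_moment(1e21))   # ~9.3
print(rho_pareto(1e21))       # ~9.5
print(rho_pareto(1e22))       # ~4.6
```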
If we subscribe to the position that the size distribution of earthquakes in the Aleutian Arc follows the truncated Pareto or gamma distribution, then we calculate from Figure 11 in Davison and Scholz (1985) that about eight events with moment larger than 3.5 × 10^21 Nm occurred in the 85 years prior to 1985. The probability of eight earthquakes falling into five fault zones out of a total of nine is 0.34; thus the observed earthquake pattern can easily be explained by the simpler assumption of a Pareto or gamma distribution.
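The quoted probability of 0.34 can be checked, at least in order of magnitude, by simulation. The sketch below makes two assumptions of mine, not of the paper: it treats the nine zones as equally likely to receive an event, and it reads the statement as the chance that eight events occupy exactly five of the nine zones; with unequal zone lengths the exact value would differ.

```python
# Monte Carlo estimate of the zone-occupancy probability under the
# equal-probability simplification described in the lead-in.
import numpy as np

rng = np.random.default_rng(1)
trials = 100_000
hits = sum(len(set(rng.integers(0, 9, size=8))) == 5 for _ in range(trials))
print(hits / trials)   # ~0.37 with equal zone probabilities, the same order
                       # as the 0.34 quoted above
```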
DISCUSSION

The existence of several competing earthquake size distributions suggests that we do not yet know the exact seismic moment-frequency relation. In general, by introducing delta functions or sharp corners into the distributions, we impose rather strong constraints on the size of an earthquake. These constraints are not warranted by the available evidence, since they contradict the known behavior of dissipative physical systems. Therefore, we should introduce some degree of smoothness into the distributions, which would require additional parameter(s); distributions such as the "characteristic earthquake," the "maximum moment," or the truncated Pareto distribution would then lose much of the appeal owed to their simplicity. Only the gamma distribution satisfies both conditions: simplicity and a satisfactory approximation of the available data.

Although the alternative distributions are similar over most of the moment range (see Fig. 2) where one has empirical evidence, and all of them can be used to approximate moment-frequency histograms, we should be concerned with the simplicity and logical consistency of the models. For example, the characteristic earthquake model (a) cannot by itself explain the size distribution of all earthquakes, making it necessary to invoke a general characteristic model, as discussed in the Introduction. The latter model, in turn, requires five to six parameters for its characterization, and the analysis above shows that there is no direct quantitative evidence to justify its acceptance. Earthquake occurrence models with three degrees of freedom approximate the available data as satisfactorily as a general characteristic distribution. We may invoke Ockham's razor, choosing the simplest explanation and the most economical conceptual formulation of a model. I interpret the Ockham rule as follows: select, among the models consistent with the data and other available information, the model that has the minimum number of degrees of freedom; we should choose the model with maximum entropy, i.e., the model that, under the known constraints, maximizes the uncertainty of our knowledge. In other words, unless there is compelling evidence to the contrary, we should accept the simplest distribution of earthquake size.

In conclusion, let us examine whether the characteristic hypothesis can be effectively tested. Unless it is formulated as a formal quantitative algorithm that specifies the segmentation of a fault zone and the values of the distribution parameters expected for earthquakes occurring on these segments, it will be difficult to validate this hypothesis statistically. At present, fault segment selection is based on qualitative criteria and hence is largely subjective. Thatcher (1990) demonstrates that even very large earthquakes do not break the same segment of subduction belts; thus subdivision of a fault into segments does not assure that earthquakes will obey these boundaries. The Working Group (1988, 1990) provided two segmentation variants of the San Francisco peninsula part of the San Andreas fault, the Hayward fault, and the Rodgers Creek fault. Although largely the same scientists served on both panels (8 out of 12), the results of the fault segmentation are significantly different (cf. Fig. 5 of the 1988 report and Figs. 5 and 6 of the 1990 report). As another example, Nishenko (1991) proposes a subdivision of the circum-Pacific earthquake zone into segments that is significantly different from that of the earlier work by McCann et al. (1979).

I believe that this disagreement is not accidental: if, as accumulating evidence suggests (Kagan, 1992), the fault pattern is scale-invariant (fractal), there are no separate scales for "individual" faults or their segments. Moreover, fault geometry analysis (Kagan, 1992, p. 10) indicates that earthquakes do not occur on a single (possibly wrinkled or even fractal) surface, but on a fractal structure of many closely correlated faults. This indicates that we cannot meaningfully define an individual fault surface. Thus, the objective selection of fault segments is as impossible as devising a computer algorithm that would subdivide a mountain range into individual mountains or a cloudy sky into individual clouds. Suppose we try to subdivide fault patterns such as those shown in Figures 10 to 12 of Tchalenko (1970) without knowing the scale of the plot. In principle each segment can be subdivided further into subsegments, and so on; each fault trace (surface) could again be subdivided into many quasi-parallel traces. Therefore any segmentation rule would be arbitrary, and the results obtained through such subdivision would depend strongly on the algorithm used.

Although it is difficult to formalize the segmentation procedure, the geological insight of such studies may still yield important information on the size distribution of very large earthquakes. We cannot test this possibility statistically through retrospective forecasting, since current segmentation models are descriptive and hence the number of degrees of freedom in the models is comparable to the number of data. However, it is possible to test an actual forecast of characteristic earthquakes (such as that made by Nishenko, 1991), where such considerations do not apply. For the prediction to be testable, however, it should be issued for a large number of fault segments, so that after several years there will be a sufficient data set from which to draw statistically valid conclusions. Moreover, a meaningful forecasting method must be testable against random occurrence. Suppose an earthquake satisfying the characteristic-event criteria occurs in a certain zone: this does not automatically validate the hypothesis; such an earthquake could occur purely by chance. Thus we need to compare the forecast data set with that expected from a standard worldwide relation such as that displayed in Figure 1. In other words, the characteristic earthquake hypothesis, like any meaningful scientific model, must be falsifiable: accumulated evidence should either validate or reject the hypothesis. Thus, to be verifiable, not only must the characteristic events be described completely in the forecast, but the distribution of regular earthquakes also needs to be specified; i.e., all five or six parameters of the general characteristic distribution must be specified.
At a preliminary stage, the general characteristic distribution might be defined as a worldwide generic model; thus, only one or two parameters would need to be adjusted for a particular fault segment. Only when it can be shown that the forecast distribution approximates the experimental data significantly better than relations (b) to (d) can we conclude that the hypothesis has been verified. Before such validation is accomplished, practical evaluations of seismic risk should present a range of results to display the possible alternatives. These calculations should be performed using various models of the size distribution, with an indication of the possible variations of the parameter estimates.

ACKNOWLEDGMENTS

I appreciate support from the National Science Foundation through Cooperative Agreement EAR-8920136 and USGS Cooperative Agreement 14-08-0001-A0899 to the Southern California Earthquake Center (SCEC). I am grateful for useful discussions held with D. Vere-Jones of Wellington University, D. D. Jackson of UCLA, and P. Bak of Brookhaven National Laboratory. I thank Associate Editor S. G. Wesnousky, reviewer P. A. Reasenberg (USGS), and an anonymous referee for their valuable remarks, which significantly improved this manuscript. Publication 5, SCEC. Publication 3808, Institute of Geophysics and Planetary Physics, University of California, Los Angeles.

REFERENCES

Aki, K. (1965). Maximum likelihood estimate of b in the formula log N = a - bM and its confidence limits, Bull. Earthquake Res. Inst. Tokyo Univ. 43, 237-239.
Anderson, J. G. (1979). Estimating the seismicity from geological structure for seismic risk studies, Bull. Seism. Soc. Am. 69, 135-158.
Anderson, J. G. and J. E. Luco (1983). Consequences of slip rate constraints on earthquake occurrence relation, Bull. Seism. Soc. Am. 73, 471-496.
Bateman, H. and A. Erdelyi (1953). Higher Transcendental Functions, McGraw-Hill, New York.
Brown, S. R., C. H. Scholz, and J. B. Rundle (1991). A simplified spring-block model of earthquakes, Geophys. Res. Lett. 18, 214-218.
Carlson, J. M. (1991). Time intervals between characteristic earthquakes and correlations with smaller events: an analysis based on a mechanical model of a fault, J. Geophys. Res. 96, 4255-4267.
Davison, F. C. and C. H. Scholz (1985). Frequency-moment distribution of earthquakes in the Aleutian Arc: a test of the characteristic earthquake model, Bull. Seism. Soc. Am. 75, 1349-1361.
Dziewonski, A. M., G. Ekstrom, M. P. Salganik, and G. Zwart (1992). Centroid-moment tensor solutions for January-March 1991, Phys. Earth Planet. Interiors 70, 7-15.
Gnedenko, B. V. (1962). The Theory of Probability, Chelsea, New York, 459 pp.
Howell, B. F. (1985). On the effect of too small a data base on earthquake frequency diagrams, Bull. Seism. Soc. Am. 75, 1205-1207.
Kagan, Y. Y. (1991). Seismic moment distribution, Geophys. J. Int. 106, 123-134.
Kagan, Y. Y. (1992). Seismicity: turbulence of solids, Nonlinear Sci. Today 2, 1-13.
Kagan, Y. Y. and D. D. Jackson (1991a). Long-term earthquake clustering, Geophys. J. Int. 104, 117-133.
Kagan, Y. Y. and D. D. Jackson (1991b). Seismic gap hypothesis: ten years after, J. Geophys. Res. 96, 21,419-21,431.
Kanamori, H. (1983). Global seismicity, in Earthquakes: Observation, Theory and Interpretation, Proc. Int. School Phys. "Enrico Fermi," Course LXXXV, H. Kanamori and E. Boschi (Editors), North Holland, Amsterdam, 596-608.
McCann, W. R., S. P. Nishenko, L. R. Sykes, and J. Krause (1979). Seismic gaps and plate tectonics: seismic potential for major boundaries, Pageoph 117, 1082-1147.
McGarr, A. (1976). Upper limit to earthquake size, Nature 262, 378-379.
Molnar, P. (1979). Earthquake recurrence intervals and plate tectonics, Bull. Seism. Soc. Am. 69, 115-133.
Nishenko, S. P. (1991). Circum-Pacific seismic potential: 1989-1999, Pageoph 135, 169-269.
Scholz, C. H. (1990). The Mechanics of Earthquakes and Faulting, Cambridge University Press, Cambridge.
Schwartz, D. P. and K. J. Coppersmith (1984). Fault behavior and characteristic earthquakes: examples from the Wasatch and San Andreas fault zones, J. Geophys. Res. 89, 5681-5698.
Singh, S. K., M. Rodriguez, and L. Esteva (1983). Statistics of small earthquakes and frequency of occurrence of large earthquakes along the Mexican subduction zone, Bull. Seism. Soc. Am. 73, 1779-1796.
Tchalenko, J. S. (1970). Similarities between shear zones of different magnitudes, Geol. Soc. Am. Bull. 81, 1625-1639.
Thatcher, W. (1990). Order and diversity in the modes of circum-Pacific earthquake recurrence, J. Geophys. Res. 95, 2609-2623.
Wesnousky, S. G. (1986). Earthquakes, Quaternary faults, and seismic hazard in California, J. Geophys. Res. 91, 12,587-12,631.
Wesnousky, S. G., C. H. Scholz, K. Shimazaki, and T. Matsuda (1983). Earthquake frequency distribution and the mechanics of faulting, J. Geophys. Res. 88, 9331-9340.
Working Group on California Earthquake Probabilities (1988). Probabilities of large earthquakes occurring in California on the San Andreas fault, U.S. Geol. Surv. Open-File Rept. 88-398, 62 pp.
Working Group on California Earthquake Probabilities (1990). Probabilities of large earthquakes in the San Francisco Bay Region, California, USGS Circular 1053, 51 pp.
Youngs, R. R. and K. J. Coppersmith (1985). Implications of fault slip rates and earthquake recurrence models to probabilistic seismic hazard estimates, Bull. Seism. Soc. Am. 75, 939-964.

Note Added in Proof: A more complete discussion of the truncated Pareto distribution and of the evaluation of the maximum possible earthquake magnitude can be found in the paper by V. F. Pisarenko (1991), "Statistical evaluation of maximum possible earthquakes," Phys. Solid Earth 27, 757-763 (English translation).

INSTITUTE OF GEOPHYSICS AND PLANETARY PHYSICS
UNIVERSITY OF CALIFORNIA
LOS ANGELES, CALIFORNIA 90024-1567

Manuscript received 10 December 1991