Geophys. J. Int. (1991) 104, 289-306

Single-link cluster analysis, synthetic earthquake catalogues, and aftershock identification

Scott D. Davis¹* and Cliff Frohlich²
¹ Department of Geological Sciences, University of Texas at Austin, Austin, TX 78713, USA
² Institute for Geophysics, University of Texas at Austin, 8701 MoPac Blvd, Austin, TX 78759-8345, USA

Accepted 1990 August 7. Received 1990 August 6; in original form 1989 July 12

SUMMARY
This paper investigates several aspects of synthetic catalogue generation and aftershock identification schemes. First, we introduce a method for generating synthetic catalogues of earthquakes. This method produces a catalogue which has the geographic appearance of an actual catalogue when the hypocentres are plotted in map view, but allows us to vary the spatial and temporal relationships between pairs of close events. Second, we discuss six statistics to measure certain characteristics of synthetic and actual catalogues. These include four new statistics S_0, B_0, S_ST and B_ST, which evaluate the distributions of link lengths between events in space and space-time as computed by single-link cluster analysis (SLC). Third, we develop a new scheme for identifying aftershocks in which a group of events forms an aftershock sequence if each event is within a space-time distance D of at least one other event in the group. We define the space-time separation of events by d_ST = √(d² + C²t²), where d is the spatial separation of events, t is the time separation, and C = 1 km day⁻¹. Our experience with several synthetic catalogues suggests that an appropriate trial value for D is S_ST, the median link length using SLC with the metric d_ST; for the catalogues examined here S_ST ranges from 9.4 to 25.2 km.
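The space-time metric above is simple to evaluate. The following sketch (our own illustration, not the authors' code) computes d_ST for a pair of events, with d in km, t in days, and C = 1 km day⁻¹ as in the summary.

```python
import math

C = 1.0  # km per day, the value suggested in the text

def d_st(d_km: float, t_days: float, c: float = C) -> float:
    """Space-time separation d_ST = sqrt(d^2 + C^2 t^2)."""
    return math.sqrt(d_km ** 2 + (c * t_days) ** 2)

# Two events 3 km and 4 days apart are d_ST = 5 km "apart" in space-time.
print(d_st(3.0, 4.0))  # -> 5.0
```

Choosing a different C simply rescales how strongly temporal separation is weighted relative to spatial separation.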
Fourth, we generate synthetic catalogues resembling both teleseismic and local network catalogues to evaluate the validity and reliability of this aftershock identification scheme, as well as other schemes proposed by Gardner & Knopoff (1974), Shlien & Toksoz (1974), Knopoff, Kagan & Knopoff (1982), and Reasenberg (1985). Using a simple scoring method, we find that the SLC method compares favourably with other aftershock identification algorithms described in the literature.

Key words: aftershocks, aftershock identification, catalogue simulation, earthquake catalogues, earthquake statistics.

INTRODUCTION

Although seismic activity is considered to be stochastic in nature (Vere-Jones 1970), there are significant deviations from a Poisson process. In particular, a major deviation from a Poisson process is the existence of earthquake aftershocks. Algorithms for the identification of clustered earthquakes such as aftershocks include various space-time windows (Knopoff & Gardner 1972; Gardner & Knopoff 1974; Kagan & Knopoff 1976; Knopoff et al. 1982; Tinti & Mulargia 1985), likelihood functions (Prozorov & Dziewonski 1982; Bottari & Neri 1983), ratio tests (Frohlich & Davis 1985), Shlien & Toksoz's (1974) s-statistic, other cluster-link schemes (Reasenberg 1985), and single-link cluster analysis (this paper; Davis & Frohlich 1990; Frohlich & Davis 1990). This paper is not a comprehensive review of different aftershock identification schemes. Instead, we aim to show how one can use synthetic earthquake sequences with known clustering properties to test the validity of such schemes. In the first section, we discuss the various assumptions needed to generate a synthetic earthquake catalogue with realistic properties.
We recommend several parameters for generating synthetic catalogues which have not been proposed previously, but which allow us to obtain catalogues which are considerably more 'realistic' than those generated previously by simpler schemes, and which produce well-defined sequences of 'afterevents' (simulated aftershocks). As our goal is to create synthetic catalogues which will serve as models for actual earthquake catalogues, it is important to have statistics for quantifying the spatial and temporal clustering properties of earthquake catalogues. In Section 2 we use six different statistics to compare several synthetic and actual earthquake catalogues. In the third section we provide an example of the use of these synthetic catalogues. In particular, we look at the reliability and validity (Reasenberg & Matthews 1988) of aftershock identification schemes proposed by Gardner & Knopoff (1974), Shlien & Toksoz (1974), Knopoff et al. (1982), and Reasenberg (1985), as well as a new method proposed in a previous study (Frohlich & Davis 1990). By creating synthetic catalogues with properties similar to those of four actual catalogues (two teleseismic and two local network catalogues), we evaluate the success rate of the aftershock identification schemes on these catalogues. We are currently unaware of any previous comparisons of reliability and validity for different aftershock identification schemes.

One finding of this study is that single-link cluster analysis (SLC) has many useful properties for evaluating earthquake catalogues. Several of the most useful 'new' statistics we used for describing earthquake catalogues fall naturally out of the SLC method. SLC also offers a way of identifying earthquake aftershock sequences and earthquake swarms which compares favourably to other methods.
A surprising result of this analysis is that even under optimum conditions, aftershock identification algorithms often failed to identify a significant fraction of synthetically generated 'afterevents'. We suggest that part of the difficulty is that the seismologist's concept of aftershock differs in an important way from the 'afterevents' generated by a stochastic process. Specifically, the seismologist considers events to be related if they are close enough together in space and time that the physical processes which generate the earthquakes cannot be independent. In contrast, for a stochastic process several 'close' events may be independent, while a significant fraction of 'afterevents' may be separated from their parent events by large distances and time intervals.

PARAMETERS FOR GENERATING SYNTHETIC CATALOGUES

In this section we develop a parametric model of earthquake behaviour for generating synthetic earthquake catalogues (Figs 1-4). There are two distinct classes of events in the model: primary events (simulated main shocks) and afterevents (simulated aftershocks). Some primary events are isolated; others are followed by one or more afterevents. One must consider several aspects of earthquake behaviour to generate a synthetic catalogue. These include: (1) catalogue completeness and magnitude ranges, (2) a recurrence relation for primary event magnitudes, (3) an afterevent function that specifies numbers and magnitudes of afterevents, (4) primary event origin times, (5) afterevent origin times, (6) a primary event hypocentre distribution, (7) an afterevent hypocentre distribution, and (8) location errors. We discuss each of these aspects of catalogue generation in some detail below. For consistency, when we deal specifically with afterevent magnitudes we use lower-case letters (m), and upper-case letters (M) for primary event magnitudes. We shall also use upper-case letters when the event in question may be either a primary event or an afterevent.
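The eight ingredients listed above can be gathered into a single parameter set. The sketch below is a hypothetical container (field names and default values are ours, loosely following the paper's Table 1, not part of the original) that a generator built along these lines might consume.

```python
from dataclasses import dataclass

@dataclass
class CatalogueParams:
    """Input parameters for one synthetic catalogue (cf. Table 1)."""
    b: float = 1.0        # Gutenberg-Richter b-value for primary events
    b_a: float = 1.0      # b-value of individual afterevent sequences
    q: float = 0.3        # swarming parameter
    c: float = 0.25       # Omori law constant (days)
    t_c: float = 100.0    # Omori cut-off time (days)
    T: float = 22.2       # catalogue time span (years)
    n_total: int = 1500   # expected total number of events
    x_err: float = 20.0   # relative location error (km)
    m_min: float = 4.8    # lower magnitude cut-off
    m_max: float = 8.1    # upper magnitude cut-off

params = CatalogueParams()
```

Each numbered aspect below then reads its inputs from one place, which makes it easy to vary a single parameter (such as q) while holding the others fixed.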
It is important to note that this procedure is not based on any physical model of earthquake occurrence, but rather is merely a set of simple algorithms for creating a catalogue which has many of the statistical properties of actual earthquake catalogues. For each of the topics listed below, other choices could be made. For example, one could choose to incorporate forevents (simulated foreshocks) or seismic quiescence into the model; however, we have not done so for simplicity. Other researchers have taken different approaches. Burridge & Knopoff (1967) use a simple deterministic model based on the physics of a 1-D array of blocks and springs to generate simple catalogues with several of the properties of real catalogues. Mikumo & Miyatake (1978, 1979, 1983) and Cao & Aki (1985) employ more detailed models to study the effects of fault heterogeneity and friction laws on earthquake behaviour. In a series of papers (Kagan & Knopoff 1976, 1977, 1978, 1980) Kagan & Knopoff utilize synthetic catalogues generated by a stochastic model which employs a branching process. In a related paper, Kagan & Knopoff (1981) assume a self-similar seismic moment release process, and define the times and magnitudes of earthquakes by the periods in which moment release exceeds a cut-off threshold. These models have many appealing features, and in several ways are more 'realistic' than the model presented here. However, in some ways their very realism makes them impractical for our purposes. For example, in models which employ a branching process, aftershocks may generate second-order aftershocks, which in turn may generate third-order aftershocks, and so on. Although such a scheme is intuitively appealing, it precludes the use of a simple scoring method which tests to see if an afterevent is linked to its parent event.
1 Catalogue completeness and magnitude range (M_min and M_max)

It is easier to compare observed earthquake distributions to theoretical distributions when the observed catalogue is complete; that is, no earthquakes are missed. Habermann (1982) notes that detection rates are approximately constant for earthquakes in the ISC catalogue with magnitudes greater than about 4.7. Generally, global catalogues are complete for earthquakes with magnitudes of about 5.5 and greater. Catalogues of earthquakes determined from local networks may be complete, or at least uniform, to lower magnitudes. For simplicity, we assumed that our model catalogues are approximately complete above some lower cut-off magnitude M_min. We also assumed an upper cut-off magnitude M_max, as this facilitated calculations for several reasons. First, without a cut-off magnitude most reasonable models predict more earthquakes in the largest magnitude ranges than are observed (e.g. Båth 1978a). Second, if b_a ≥ b, where b_a is the b-value of events in afterevent sequences and b is the value for the primary events (see Sections 2 and 3 below), then the number of afterevents in the catalogue becomes infinite without an upper cut-off magnitude. Third, because major rupture zones are limited to sizes of about 1000 km or less, there are physical constraints that prevent earthquakes of magnitudes larger than about 9.5. Finally, even if very large events are possible, a catalogue with a finite number of events may not contain the largest possible earthquakes, and the largest clusters will be missing from the population of observed events.

Both teleseismic and local catalogues are known to contain systematic errors in the detection of earthquakes as well as in the determination of magnitudes (Habermann 1982, 1983, 1987; Habermann & Craig 1988). Such errors can influence the generation of synthetic catalogues.
For example, errors in magnitude determination will affect estimates of the afterevent function, while changes in detection rate will alter the temporal characteristics of catalogues such as the Poisson index of dispersion. In addition, we have not addressed the possibility of systematic errors in magnitude determination (Habermann 1987) in these catalogues. A detailed treatment of how incompleteness and non-uniformity affect the statistics we use, as well as how incompleteness and non-uniformity can be detected or avoided, is beyond the scope of this paper.

2 Recurrence relation for primary event magnitudes (b)

To construct a synthetic catalogue, we need to specify the distribution of earthquake magnitudes. Let N₀(M) be the primary event magnitude distribution such that the number of primary events in the magnitude range (M, M + dM) is N₀(M) dM. The total number η₀(M) of primary events of magnitude M and greater is given by

η₀(M) = ∫_M^{M_max} N₀(x) dx.   (1)

Likewise, N_A(m) and N_C(M) denote the levels of occurrence for afterevents and for events in the catalogue as a whole, respectively, while η_A(m) and η_C(M) denote the number of afterevents or catalogue events of magnitude M (or m) and greater. To generate synthetic earthquakes, we assumed a Gutenberg-Richter relation (Gutenberg & Richter 1954):

N₀(M) = A · 10^(−bM).   (2)

Typical b-values reported in the literature range from 0.8 to 1.2, with a value of approximately 1 being typical (Båth 1978b, 1981). For simplicity, we assumed a b-value of 1.0 for primary events in the catalogue. Analytical solutions for N_A(m), N_C(M), η_A(m), and η_C(M) are given in Appendix A.

3 Afterevent function (q, b_a)

We defined an 'afterevent function' n_A(M, m) as the mean number of afterevents of magnitude m produced by a primary event of magnitude M. One appealing set of afterevent functions is that in which earthquakes are self-similar with respect to magnitude; that is, earthquakes behave in a similar fashion regardless of scale (e.g. Kagan & Knopoff 1981; Mandelbrot 1982). These models take the form

n_A(M, m) = f(M − m).   (3)

One possibility is to assume the function behaves exponentially such that

n_A(M, m) = q · 10^(b_a(M − m)),  m ≤ M,
          = 0,                    m > M.   (4)

Here, b_a is a fixed exponential decay rate which is equal to the b-value of individual afterevent sequences in the catalogue (as opposed to the b-value of events in the catalogue as a whole), while q is a 'swarming parameter' which is equal to the probability that a primary event will produce an afterevent of approximately its own magnitude. This swarming parameter is different from the 'positive influence parameter' of Prozorov & Dziewonski (1982) as it develops from a theoretical earthquake population rather than from observed events, and it is normalized in a different manner. Nevertheless, the concept is similar in that higher values of q produce larger and more frequent earthquake swarms and aftershock sequences, and may vary for different tectonic regimes (Mogi 1967; Chen & Knopoff 1987). We have generated synthetic catalogues with many of the statistical properties of actual earthquake catalogues; typical swarming parameters for these catalogues range from 0.2 (magnitude m_b = 4.8 and greater earthquakes in Mexico and Central America) to 0.5 (magnitude m_b = 3.0 and greater earthquakes in the central Vanuatu island arc).

We note that the afterevent function n_A(M, m) chosen for this study (equation 4) is completely arbitrary. We used this form because it is mathematically simple, and because it produced catalogues which appear to be very similar to actual earthquake catalogues. However, we are not aware of any physical or observational basis for this function. Prozorov & Dziewonski (1982) define an 'intensity function' I_A(M) which represents the mean number of afterevents for a primary event of magnitude M, where the number of afterevents follows a Poisson distribution. If the catalogue is complete, we can derive the intensity function from

I_A(M) = ∫_{M_min}^{M} n_A(M, x) dx.   (5)

For the afterevent function given by equation (4),

I_A(M) = q [10^(b_a(M − M_min)) − 1] / (b_a ln 10).   (6)

Thus, the mean number of afterevents per primary event is directly proportional to the swarming parameter q. In our applications, we used a value of b_a = 1.0 for several reasons. First, the statistical characteristics of the synthetic catalogue were simpler when b_a = b, where b is the Gutenberg-Richter b-value for primary events (see Section 2 and Appendix A). Second, we were able to match actual catalogues adequately with a value of b_a = 1. Third, although there may be some variation, a value of b_a = 1 is representative of many aftershock sequences (Utsu 1961; Reasenberg & Jones 1989).

4 Primary event origin times (λ₀, T)

The simplest assumption to make for the generation of primary event origin times is that they occur as a Poisson process, i.e., they are independent and have interevent times which are exponentially distributed (Cox & Lewis 1966). Such a process has a mean rate λ₀, and the expected number of primary events in a catalogue which covers a time span T is n₀ = λ₀T. In practice, we set the primary event rate λ₀ so that the expected total number of events produced (primary events and afterevents) was equal to the observed number of events n_total in the actual catalogue we were trying to simulate. Using the afterevent function described above, with b_a = b and M_max − M_min ≳ 2/b, one can show (Davis 1989) that the appropriate value of λ₀ is very closely approximated by

λ₀ = n_total / { T [1 + q (M_max − M_min − 1/(b ln 10))] }.   (7)

Although the assumption of a Poisson process with a steady rate is somewhat simplistic, it is adequate for most purposes. If necessary, one could devise more complicated models to account for changes in detection rate (Habermann 1982), clustering of large main shocks (Kagan & Jackson 1990) or gradual accumulation of strain energy in the time intervals between large earthquakes (e.g.
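The truncated Gutenberg-Richter magnitude distribution and the afterevent function of equation (4) are both easy to simulate. The sketch below (our own illustration, not the authors' code) draws primary magnitudes by inverting the truncated cumulative distribution, and evaluates the mean afterevent count obtained by integrating equation (4) over afterevent magnitude from M_min to M.

```python
import math
import random

def sample_primary_magnitude(b, m_min, m_max, rng=random):
    """Draw M from a Gutenberg-Richter law truncated to [m_min, m_max],
    by inverting the CDF F(M) = (1 - 10^(-b(M - m_min))) / (1 - 10^(-b(m_max - m_min)))."""
    u = rng.random()
    span = 1.0 - 10.0 ** (-b * (m_max - m_min))
    return m_min - math.log10(1.0 - u * span) / b

def mean_afterevents(M, q, b_a, m_min):
    """Integral of n_A(M, m) = q * 10^(b_a (M - m)) over m in [m_min, M]:
    q (10^(b_a (M - m_min)) - 1) / (b_a ln 10)."""
    return q * (10.0 ** (b_a * (M - m_min)) - 1.0) / (b_a * math.log(10.0))

random.seed(0)
mags = [sample_primary_magnitude(1.0, 4.8, 8.1) for _ in range(10000)]
```

With b = 1, roughly 90 per cent of sampled magnitudes fall within one magnitude unit of the lower cut-off, as the Gutenberg-Richter relation requires; the mean afterevent count is zero at M = m_min and directly proportional to q throughout.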
Utsu 1972).

5 Afterevent origin times (c, t_c)

For the models we considered, the probability that a given number of afterevents will occur follows a Poisson distribution, since afterevents of a given primary event are generated independently. The expected fraction of primary events of magnitude M with L afterevents is P[I_A(M), L], where P[u, L] is the Poisson probability of mean u:

P[u, L] = u^L e^(−u) / L!.   (8)

To generate afterevent times we used a trigger model with an inverse time (Omori's law) decay. Trigger models (Vere-Jones 1970) form a class of processes in which the probability of an afterevent occurring in a time interval (t, t + dt), given a trigger (primary event) at time t₀, is

I_A(M) Φ(t − t₀) dt.   (9)

Here, I_A(M) is the intensity function, and Φ(t) is a decay function which is proportional to the number of afterevents per unit time produced by the trigger, normalized over time such that

∫₀^∞ Φ(t) dt = 1.   (10)

Trigger models employed in the study of earthquake aftershocks have generally taken the form of an exponential decay (Vere-Jones & Davies 1966; Burridge & Knopoff 1967; Hawkes & Adamopoulous 1973) or inverse time decay (Omori's law). Most authors have found the latter to provide a better fit. Omori's law takes the form

Φ(t) ∝ 1/(t + c).   (11)

However, Omori's law is difficult to analyse statistically, as the number of expected afterevents from a single primary event increases without bound. Utsu (1961) presented a modified Omori's law:

Φ(t) ∝ 1/(t + c)^p.   (12)

Previous investigators have reported values of p ranging from 0.85 (Mayer-Rosa et al. 1976) to 1.3 (Utsu 1961). While the modified Omori's law has been investigated extensively (e.g. Ogata 1983; Reasenberg & Jones 1989), one disadvantage of the law is that many of the afterevents occur long after the primary event.
For example, if one chooses the values of c = 0.25 days and p = 1.25 (Vere-Jones & Davies 1966), then more than 12 per cent of the afterevents will occur 1000 days or more after the primary event. Such events are likely to be 'lost' in the background seismicity after these time intervals, and thus many seismologists would not consider them to be aftershocks. Consequently, we instead used a form of Omori's law with a time cut-off t_c such that

Φ(t) ∝ 1/(t + c),  0 ≤ t ≤ t_c,
     = 0,          t > t_c and t < 0.   (13)

For the purposes of this study, we set c = 0.25 days (Vere-Jones & Davies 1966) and t_c = 100 days. One can also employ trigger models to create foreshocks by allowing the decay function to be non-zero in the time range t < 0 (prior to the primary event). However, for the sake of simplicity we ignored the possibility of foreshocks in this study. An alternative to the trigger model approach is one which incorporates self-exciting or mutually exciting processes, where afterevents are not independent but may give rise to afterevents of their own (Hawkes & Adamopoulous 1973; Adamopoulous 1975; Oakes 1975). However, we chose not to employ these models because the studies of Vere-Jones (1970) and Hawkes & Adamopoulous (1973) found that simple, independent afterevent models described observations as well as, or better than, other branching process models.

6 Primary event hypocentre distribution

The simplest model for the generation of primary event hypocentres is to assume the hypocentres occur randomly and uniformly within some simple geometric zone, e.g., a rectangular, circular, or spherical region. However, such models do not accurately describe such features as isolated events off the subduction zone or clusters of events; nor do they account for gross changes in spatial clustering along the subduction zone (e.g., Fig. 1, top).
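Returning briefly to the afterevent timing of Section 5: the truncated Omori decay of equation (13) has the closed-form normalizing integral ln(1 + t_c/c), so afterevent times can be drawn by inverse-transform sampling. The sketch below is our own illustration, using the values c = 0.25 days and t_c = 100 days adopted in the text.

```python
import random

def sample_omori_time(c=0.25, t_c=100.0, rng=random):
    """Draw t from phi(t) proportional to 1/(t + c) on 0 <= t <= t_c.

    The CDF is F(t) = ln(1 + t/c) / ln(1 + t_c/c); inverting it gives
    t = c ((1 + t_c/c)^u - 1) for uniform u in [0, 1).
    """
    u = rng.random()
    return c * ((1.0 + t_c / c) ** u - 1.0)

random.seed(1)
times = [sample_omori_time() for _ in range(5000)]
```

As expected for an inverse-time decay, a large share of the sampled afterevents occur within the first day, and none occur after the cut-off t_c.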
As our goal was to generate synthetic catalogues which simulate the properties of real catalogues, we used these real catalogues to determine the 3-D probability density function for primary event hypocentres. To do this, we divided the region of interest into cubical sub-regions with sides of size X_box. We then computed the fraction of actual earthquakes occurring within each sub-region, and used this value as the probability that a primary event would occur within that sub-region. Although this method required slightly more computer time than randomly assigning primary events in some pre-defined volume, it avoids the subjective choice of the size of such a volume, and it produced realistic looking hypocentre distributions which preserved the characteristics of spatial clustering and isolated events (Fig. 1, bottom). Essentially, this method fixed the relationship between more distant pairs of synthetic events to be identical to that of the actual catalogue, while allowing us to experiment with different afterevent functions which affected the space-time relationship of close pairs of events. An alternative approach would have been to reorder catalogue events, but this would not have allowed us to generate afterevents and thereby determine which events are primary events and which are afterevents. For simplicity, we set the size of the sub-regions X_box equal to X_err, our estimate for the relative location errors in that catalogue (see Section 8 below). This value made a logical choice for the size of the sub-region, as it is unlikely we could detect any spatial clustering occurring on a smaller scale.

Figure 1. (a) Epicentres of earthquakes in catalogue MEXCAM48. This catalogue contains 1552 earthquakes of magnitude 4.8 and greater in the Mexico-Central America region from January 1964 to February 1986. (b) Distribution of 1534 epicentres for a synthetic catalogue based on MEXCAM48 (see Table 1 for input parameters).

7 Afterevent hypocentre distribution

To generate afterevent locations, we assumed that afterevents occurred with equal probability density over the rupture area of the primary event. For the majority of primary events (those with M ≤ 8.0) we assumed a circular rupture determined by the relation between rupture area and magnitude given by Kanamori (1977). For the few larger earthquakes (M > 8.0) we assumed a rectangular rupture zone of length L, with a total area constrained by Kanamori's (1977) area-magnitude relation. Here, we obtained L by assuming a ratio of maximum fault slip to fault length of 10⁻⁴, and using a relation between magnitude, fault length, and maximum slip (King & Knopoff 1968). Although we constrained aftershocks to occur within the rupture zone of the main shock, there is evidence that some off-fault aftershocks are triggered by the strain release of large main shocks (Das & Scholz 1981; Stein & Lisowski 1983; Strehlau 1986). It would be possible to create more detailed synthetics which account for such aftershocks, but for simplicity we have not done so.

8 Location errors (X_err)

Finally, to simulate actual earthquake catalogues, we accounted for relative location errors. In spite of the fact that real catalogues are plagued by such errors as outliers, for simplicity we assigned random Gaussian errors with a standard deviation of X_err km to our simulated hypocentres. We used a relative location error of X_err = 20 km for events in teleseismic catalogues, and a value X_err = 5 km for events in local network catalogues.

Some examples

We generated synthetic catalogues with many of the statistical features of four actual earthquake catalogues. Two of these, MEXCAM48 and ALASKA49, consist of earthquakes located by the International Seismological Centre (ISC) from January 1964 to February 1986. MEXCAM48 consisted of 1552 m_b ≥ 4.8 earthquakes in Mexico and Central America (Fig. 1), and ALASKA49 consisted of 1839 m_b ≥ 4.9 earthquakes along the Alaska-Aleutian trench (Fig. 2). These catalogues are almost certainly incomplete, but should be approximately uniform.

Figure 2. (a) Epicentres of earthquakes in catalogue ALASKA49. This catalogue contains 1839 earthquakes of magnitude 4.9 and greater in Alaska and the Aleutians from January 1964 to February 1986. (b) Distribution of 1786 epicentres for a synthetic catalogue based on ALASKA49 (see Table 1 for input parameters).

We also studied two catalogues from local networks to see how these techniques worked when applied to earthquakes in different magnitude ranges. VAN30 consisted of 2395 earthquakes along the Vanuatu (New Hebrides) trench of m_b ≥ 3.0 located by a local seismograph array (Chatelain, Cardwell & Isacks 1983; Marthelot et al. 1985; Chatelain et al. 1986) from January 1979 to December 1980 (Fig. 3). ADAKS20 consisted of 2043 shallow (depth ≤ 70 km) earthquakes of m_b ≥ 2.0 located by the Adak network (Engdahl 1977; Frohlich et al. 1982) between 50.75°N and 52.2°N latitude and between 174.5°W and 179.0°W longitude from January 1978 to December 1986 (Fig. 4).

Figure 3. (a) Epicentres of earthquakes in catalogue VAN30. This catalogue contains 2395 earthquakes of magnitude 3.0 and greater in the central Vanuatu region from January 1979 to December 1980. (b) Distribution of 2388 epicentres for a synthetic catalogue based on VAN30 (see Table 1 for input parameters).

Figure 4. (a) Epicentres of earthquakes in catalogue ADAKS20. This catalogue contains 2043 shallow (depth ≤ 70 km) earthquakes of magnitude 2.0 and greater in the Adak Island region from January 1978 to December 1986. (b) Distribution of 2116 epicentres for a synthetic catalogue based on ADAKS20 (see Table 1 for input parameters).

Based on the magnitude distribution of events in these catalogues, we chose cut-off magnitudes above which the catalogues should be approximately complete. However, the preponderance of larger earthquakes near the edges of the Vanuatu seismic zone (Fig. 3) indicates that VAN30 may not be complete in these areas. To simulate the statistical characteristics of MEXCAM48, we set the four input parameters b, b_a, c, and t_c to 1.0, 1.0, 0.25 days and 100 days, respectively, as explained previously (Table 1). We set T = 22.2 yr, the time span of the available ISC catalogue (January 1964-February 1986). We arbitrarily estimated relative location errors in the catalogue to be on the order of 20 km. We used a lower magnitude cut-off of M_min = 4.8; for lower values the catalogue was significantly incomplete (Habermann 1982), while for much higher values of M_min there were not enough data to make the analysis meaningful. As the largest magnitude earthquake reported by the ISC in this region during the time of interest was M_s = 8.1, we used this value for M_max. We tried several values for the swarming parameter q; in each case we chose an appropriate value of λ₀ as described previously (equation 7). To generate synthetics for ALASKA49, VAN30, and ADAKS20, we determined appropriate input parameters in a similar manner (Table 1).
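The afterevent-location and location-error steps (Sections 7 and 8) can be combined in a few lines: place each afterevent uniformly over a circular rupture, then add Gaussian location error. The sketch below is illustrative only; in particular, rupture_radius_km uses a generic log-area magnitude scaling as a stand-in, not the Kanamori (1977) area-magnitude relation the paper actually uses.

```python
import math
import random

def rupture_radius_km(M):
    """Placeholder circular-rupture radius from a generic log-area
    scaling (log10 A[km^2] = M - 4); NOT the Kanamori (1977) relation."""
    area = 10.0 ** (M - 4.0)  # rupture area in km^2
    return math.sqrt(area / math.pi)

def afterevent_location(x0, y0, M, x_err=20.0, rng=random):
    """Uniform position on the primary event's rupture disc, plus
    Gaussian location error of standard deviation x_err km per axis."""
    r = rupture_radius_km(M) * math.sqrt(rng.random())  # sqrt -> uniform areal density
    theta = 2.0 * math.pi * rng.random()
    x = x0 + r * math.cos(theta) + rng.gauss(0.0, x_err)
    y = y0 + r * math.sin(theta) + rng.gauss(0.0, x_err)
    return x, y

random.seed(2)
pts = [afterevent_location(0.0, 0.0, 6.0, x_err=0.0) for _ in range(2000)]
```

The sqrt on the random radius is what makes the density uniform over the disc rather than concentrated at the centre; setting x_err to 20 km (teleseismic) or 5 km (local network) reproduces the error levels used in the text.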
STATISTICS FOR EVALUATING SYNTHETIC AND ACTUAL CATALOGUES

Assuming our synthetic catalogue provides a model of actual earthquake behaviour, how can we determine the best value for q? One way is to find which values of q produce synthetic catalogues with statistical properties similar to those of an actual catalogue we wish to simulate.

Table 1. Input parameters for synthetic catalogues. The last four columns list the parameters used to model the four actual catalogues chosen for this study. M48 is MEXCAM48, A49 is ALASKA49, V30 is VAN30, and A20 is ADAKS20. The values of M_max were chosen on the basis of the largest magnitude earthquake in the catalogue of interest.

Parameter  Name                                     Values                                      M48    A49    V30    A20
b          b-value for main shocks                  set to b = 1.0 (see text)                   1      1      1      1
b_a        b-value for aftershock sequences         set to b_a = 1.0 (see text)                 1      1      1      1
c          Omori law constant (days)                set to c = 0.25 days (see text)             0.25   0.25   0.25   0.25
t_c        Omori cut-off time (days)                set to t_c = 100 days (see text)            100    100    100    100
X_box      size of sub-regions for determining pdf
           of synthetic main shock locations (km)   set to X_box = X_err (see text)             20     20     5      5
q          swarming parameter                       defines extent to which events 'cluster'
                                                    in space-time                               0.2a   0.3a   0.5a   0.25a
T          time span of catalogue (yr)              varies according to catalogue of interest   22.2   22.2   2.0    8.0
N          expected number of events                varies according to catalogue of interest   1552   1839   2395   2043
X_err      catalogue location errors (km)           varies according to catalogue of interest   20     20     5      5
M_min      lower magnitude cut-off                  varies according to catalogue of interest   4.8    4.9    3.0    2.0
M_max      upper magnitude cut-off                  varies according to catalogue of interest   8.1    9.0    6.0    5.5

a Several values were tried; number shown is best-fitting value.
Various statistical tests have been used to characterize deviations from random behaviour, including the Poisson index of dispersion (Vere-Jones & Davies 1966; Shlien & Toksoz 1970; McNally 1977), the Kolmogorov-Smirnov test (Shlien & Toksoz 1970; Reasenberg & Matthews 1988), the Anderson-Darling test (Anderson & Darling 1952; Frohlich 1987; Reasenberg & Matthews 1988), the E parameter of Shlien & Toksoz (Shlien & Toksoz 1970; Bottari & Neri 1983; De Natale et al. 1985), second- and higher order moments (Kagan & Knopoff 1976, 1978; Reasenberg 1985), power spectra (Utsu 1972), variance-time curves (Vere-Jones 1970; Rice 1975), event pair-analysis (Eneva & Pavlis 1988; Eneva & Hamburger 1989), and hazard and intensity functions (Vere-Jones 1970; Rice 1975). For this study we chose six statistics as described below. Two statistics (S_0, B_0) describe only the spatial distribution of earthquake hypocentres; two (S_ST, B_ST) relate to the separations between events in both space and time, and two [PID(10), PID(100)] relate only to the temporal properties of the earthquake catalogue. We chose these statistics because they were reasonably robust; that is, they were not highly sensitive to small changes within the catalogue. In addition, all of these statistics were relatively simple to calculate. Four of the statistics (S_0, B_0, S_ST, B_ST) are introduced in this paper: they have not been previously published in the literature. These statistics derive from single-link cluster analysis (SLC) of earthquakes (Frohlich & Davis 1990) and provide information on the distribution of interevent distances in space and space-time. Previously we have used SLC to study earthquake nests (Frohlich & Davis 1990), isolated events (Frohlich & Davis 1990), seismic quiescence (Wardlaw, Frohlich & Davis 1990), and earthquake aftershocks (Davis & Frohlich 1990).
Previously debed statistics [PZD(lo), PZD (loo)] A Poisson process has the property that the mean or expected value and the variance of the number of events per unit time interval are equal (Cox & Lewis 1966). The Poisson index of dispersion (PID) is defined as the ratio of the variance to the mean: PID(At) = variance (At) expected value (At) ’ where At is the time length of the intervals. The PID has an expected value of 1 for a sequence generated by a Poisson process, and is significantly greater than 1 for highly clustered sequences. We used PID’s with time intervals of 10 and 100 days (Davis 1989). New statistics (&, &, B,, 43,) Single-link cluster analysis (SLC) is a scheme for joining N events with N - 1 links using a distance metric. SLC provides an objective way of dividing a set of events into natural clusters. By removing the K longest links, one is left with K + 1 ‘clusters’ of events. We refer the reader to Frohlich & Davis (1990) for a more detailed explanation of SLC. We observe that the logarithmic distribution of link lengths is nearly straight for the majority of links (Frohlich & Davis 1990; and Fig. 5). Thus, two statistics suffice to describe the link-length distribution fairly accurately for all but the very smallest and very largest links. These statistics are S, the median link length, and B, the ‘slope’ of the link-length distribution. Because the log of the link lengths is approximately straight when plotted against link number, this slope can be defined by the value D(0.25) B = link length ‘slope’ = D(0.75) ’ where D(f) is the length exceeded by a fraction f of the links. Note that B is not truly the slope, but is proportional to the exponential of the slope on a semi-log plot (Fig. 5). Because the link distribution depends on the chosen distance metric, we examined two sets of statistics relating to the median link length and the link length slope. First, we considered the distribution when the metric is simple 296 S. D . Davk and C. 
Euclidean distance, and we denote these values as S0 and B0. To some extent S0 serves as a scaling factor, as it gives an indication of typical distances between hypocentres in the region of interest (Fig. 6). B0 is a measure of the ratio of long link lengths to short. B0 is always greater than or equal to 1.0, and increases as events become more and more clustered (Fig. 6). Because these two statistics do not provide information on the very longest links (Fig. 5), the distribution of 'isolated' events and clusters will not be well constrained. However, as our goal was to test aftershock identification schemes, the distribution of the longest links should not have greatly affected our results. The slope of the link-length distribution is also affected by the 'dimensionality' of the space in which events occur (Frohlich & Davis 1990). Thus, with the statistics S0 and B0 alone we are unable to distinguish between differences in dimensionality and the degree of clustering (Fig. 6).

Figure 5. (a) Linkage of earthquakes in Fig. 1 (MEXCAM48, C = 0 km/day) as determined by single-link cluster analysis (SLC). The numbers above some links in the map represent the lengths of the 11 longest links (those greater than 150 km). (b) Distribution of link lengths for the earthquakes shown in Fig. 1. The logarithmic link distribution is nearly linear for all but the smallest and largest links. Thus, the central portion of the distribution curve can be characterized by two statistics, the median link length S and the distribution slope B. Let D(f) be the length which a fraction f of the links exceed. We define S0 as the median link length, i.e., D(0.5); here, S0 = 18.6 km. We define the distribution 'slope' as B0 = D(0.25)/D(0.75); here, B0 = (29.85 km)/(11.68 km) = 2.56. The three dots on the curve correspond to the lengths D(0.25), D(0.5) and D(0.75), respectively.
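Both the PID of the previous subsection and the two SLC link-length statistics are straightforward to compute. The sketch below is our own illustration, not the authors' code: the function names, the binning convention, and the O(N²) Prim construction are ours. It relies on the standard fact that the N − 1 single-link joins are exactly the edges of a minimum spanning tree of the pairwise-distance graph.

```python
import numpy as np

def poisson_index_of_dispersion(event_times, dt):
    """PID(dt): variance / mean of event counts in consecutive bins of
    width dt; ~1 for a Poisson process, >1 for clustered sequences."""
    t = np.sort(np.asarray(event_times, dtype=float))
    n_bins = max(int(np.ceil((t[-1] - t[0]) / dt)), 1)
    counts, _ = np.histogram(t, bins=t[0] + dt * np.arange(n_bins + 1))
    return counts.var() / counts.mean()

def mst_link_lengths(points):
    """SLC joins N events with N-1 links; those links are the edges of a
    minimum spanning tree of the pairwise-distance graph (Prim, O(N^2))."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = np.linalg.norm(pts - pts[0], axis=1)  # distance to the growing tree
    links = []
    for _ in range(n - 1):
        best[in_tree] = np.inf
        j = int(np.argmin(best))          # nearest event outside the tree
        links.append(best[j])
        in_tree[j] = True
        best = np.minimum(best, np.linalg.norm(pts - pts[j], axis=1))
    return np.array(links)

def link_length_stats(links):
    """S: median link length.  B = D(0.25)/D(0.75), where D(f) is the length
    exceeded by a fraction f of links, so B >= 1 and grows with clustering."""
    return np.median(links), np.percentile(links, 75) / np.percentile(links, 25)
```

Rescaling all coordinates by a factor k scales S by k and leaves B unchanged, matching the behaviour sketched in Fig. 6(a)-(b).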
However, as the mapped positions of simulated hypocentres strongly resemble those in actual catalogues (Figs 1-4), many of the overall dimensional characteristics of the catalogues will be preserved. By changing the distance metric used to define linkage for SLC, one can study different aspects of clustering (Frohlich & Davis 1990). In the present paper and in a related study (Davis & Frohlich 1990) we used a metric which includes both space and time separation between events. In this way we could identify events that are close in space and time, and hence are likely to share a genetic relationship. This metric was

d_ST = space-time 'distance' = √(d² + C²t²),   (16)

where d is the geographic separation, in 3-D Euclidean space, between events (km), t is the time difference (days), and C is a parameter which relates time to distance. Although the units of d_ST are km, this is not simple geographic distance (except for the special case when C = 0), but instead includes both space and time separation. To avoid confusion we denoted values of d_ST by the units 'ST-km' (space-time km). In addition to S0 and B0, which relate to the spatial properties of events, we considered the space-time distribution of events by examining the statistics S1 and B1, i.e., the median link length and link-length 'slope' when we used a space-time metric with a conversion factor of C km day⁻¹. Larger values of C place more emphasis on the temporal properties of the catalogue. In a later section of this study we show that C = 1 km day⁻¹ is a reasonable value for studying the space-time properties of aftershocks; thus, we used the statistics S1 and B1. These statistics are similar to S0 and B0 except that they provide information on the distribution of events in space-time. Thus, S1 estimates the order of magnitude of typical space-time distances, while B1 provides information on the clustering of events in space and time.
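Equation (16) is a one-liner; a minimal sketch (the function name is ours):

```python
import math

def space_time_distance(d_km, t_days, C=1.0):
    """Equation (16): d_ST = sqrt(d^2 + C^2 t^2), in 'ST-km'.
    d_km: 3-D spatial separation (km); t_days: time separation (days);
    C (km/day) converts time to distance.  C = 0 recovers plain distance."""
    return math.sqrt(d_km ** 2 + (C * t_days) ** 2)

# Two events 3 km and 4 days apart, with C = 1 km/day:
print(space_time_distance(3.0, 4.0))  # → 5.0
```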
These statistics are thus appropriate for catalogues in which aftershock sequences or swarms may be important features.

Comparison of actual and synthetic catalogues: examples

We used these six statistics to help us find a reasonable value for the swarming parameter q for our synthetic catalogue. For example, consider a synthetic catalogue devised to simulate MEXCAM48. If q is too high, we would expect large numbers of closely knit afterevent sequences. In this case, the abundance of short links should produce lower median link lengths (S0, S1) and higher link-length slopes (B0, B1) than those found in catalogue MEXCAM48, while the clustering of events in time should also produce higher values of PID(10) and PID(100) than in the observed catalogue. Alternatively, if q is too low, the indices of dispersion [PID(10), PID(100)] should approach unity, the median link lengths (S0, S1) should be higher, and the link-length slopes (B0, B1) lower than those found in MEXCAM48. The closest match between MEXCAM48 and the synthetics was with a swarming parameter of about q = 0.2 (Table 2). At much lower values (e.g. q = 0.04) there were not enough afterevents. This lack of clustering was evident somewhat in the spatial statistics S0 and B0, and was prominently seen in the statistics which include a temporal component [S1, B1, PID(10), PID(100)]. At higher swarming parameters (e.g. q = 1.0) the synthetic catalogues clearly exhibited too much clustering. Similarly, we found that with swarming parameters of 0.3 and 0.5 we could approximate the characteristics of ALASKA49 and VAN30, respectively (Table 2). For ADAKS20, there was no clear best-fitting value of q, although the statistics suggested a value somewhere between 0.1 and 1.0 as appropriate (Table 2). We chose a value of q = 0.25 for this study.

Figure 6. Schematic of how changes in the pattern of seismicity affect the statistics S0 and B0. (a) 100 events randomly placed in a circular area of radius 43 km (S0 = 5.0 km, B0 = 2.5). (b) The same events as in (a) have been reduced to an area of radius 21.5 km. This has no effect on B0, but decreases the median link length S0 by a factor of 2 (S0 = 2.5 km, B0 = 2.5). The dashed line indicates a circle of radius 43 km. (c) 33 random events in an area of 10 km radius (small circle) have been superimposed on 67 events in an area of 70 km radius (large circle). Note that the presence of clustering increases the link-length slope B0. The dashed line indicates a circle of radius 43 km. (d) The dimensionality of earthquakes also affects the link-length slope. Here, the same events as in (a) have been 'stretched' to fill an ellipse with a semi-major axis of 292 km and a semi-minor axis of 3.6 km, making the distribution of events approximately 1-D (S0 = 5.0 km, B0 = 4.0). Note that with the statistics S0 and B0 alone we cannot discriminate between the effects of clustering (c) and dimensionality (d). A dashed circle of radius 43 km is shown for reference.

AFTERSHOCK IDENTIFICATION SCHEMES

Single-link cluster identification of earthquake aftershocks

To study aftershock sequences, we used the space-time metric d_ST described previously. Although one could choose a different metric for identifying aftershocks, we chose d_ST as it is easy to use, has the appealing attribute of treating time as a fourth spatial dimension, and closely approximates the way the human eye picks out aftershock sequences on earthquake space-time plots. With d_ST as the metric, we define a 'space-time cluster' of size D, or 'D-cluster', as a group of events joined by links of size D ST-km and less. In practice, this method requires the determination of two parameters, C and D. The first parameter, C, is a conversion factor which relates temporal separation to spatial separation.
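The D-cluster definition above, events grouped together whenever a chain of links of length D ST-km or less joins them, amounts to finding the connected components of the graph whose edges are event pairs with d_ST ≤ D. A minimal union-find sketch (our own code, O(N²); the function and variable names are ours):

```python
import numpy as np

def d_clusters(xyz_km, t_days, C=1.0, D=60.0):
    """Group events into 'D-clusters': two events share a cluster when a
    chain of links of space-time length <= D ST-km joins them, i.e. the
    connected components of the graph with an edge wherever d_ST <= D."""
    xyz = np.asarray(xyz_km, dtype=float)
    t = np.asarray(t_days, dtype=float)
    n = len(t)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for i in range(n):
        d = np.linalg.norm(xyz - xyz[i], axis=1)
        d_st = np.sqrt(d ** 2 + (C * (t - t[i])) ** 2)   # equation (16)
        for j in np.nonzero(d_st <= D)[0]:
            parent[find(i)] = find(int(j))
    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values(), key=len, reverse=True)
```

With C = 1 km/day and D = 60 ST-km this reproduces the kind of linking shown for the ISC catalogue in Fig. 7.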
Ideally, one would like to choose a value C = Δx/Δt such that two simultaneous events separated by a distance Δx are as closely 'related' to each other as two events with identical hypocentres occurring a time Δt apart. The second parameter, D, is the 'distance' at which we consider events to be related. If two events can be joined by a series of links of size D or less, then these events belong to the same D-cluster. To identify aftershock sequences, one should choose D large enough to link most aftershocks to their parent events, but small enough so unrelated events are rarely linked. We found that in the ISC catalogue, SLC joins events that the eye picks out as obvious clusters if one uses values of about C = 1 km day⁻¹ and D = 60 ST-km (Fig. 7). When we used SLC to identify aftershocks, we divided the events into distinct space-time clusters. We assumed that each cluster of N events contained one primary event (main shock) and N − 1 secondary events (aftershocks and/or foreshocks). We could then either assume that (1) the first event to occur in a given cluster was the primary event, or (2) the largest magnitude event in a given cluster was the primary event. In the first case, all of the secondary events would be aftershocks; in the second case, the secondary events would be either foreshocks or aftershocks. For this study, we took the largest magnitude event as the main shock of a sequence.

Scoring system for aftershock identification

To determine the accuracy of an aftershock identification scheme on any particular synthetic catalogue, we scored its success in terms of its 'validity' V and 'reliability' R (Reasenberg & Matthews 1988):

'score' = V + R − 1.   (17)

Here, V is the fraction of afterevents which are linked to their parent primary event, and R is the fraction of primary events which are correctly identified as such (i.e., not misidentified as afterevents).
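Under one concrete bookkeeping choice (ours, not specified in the paper), the score of equation (17) can be sketched as follows; each sequence is represented as a (main shock id, set of afterevent ids) pair:

```python
def score(true_sequences, found_sequences):
    """Equation (17): score = V + R - 1.
    V: fraction of actual afterevents linked to their parent main shock.
    R: fraction of actual main shocks not misidentified as afterevents."""
    found_parent = {}                      # event -> determined main shock
    for main, afters in found_sequences:
        for e in afters:
            found_parent[e] = main
    n_after = sum(len(a) for _, a in true_sequences)
    n_hit = sum(1 for main, afters in true_sequences
                for e in afters if found_parent.get(e) == main)
    mains = [main for main, _ in true_sequences]
    n_ok = sum(1 for m in mains if m not in found_parent)
    V = n_hit / n_after if n_after else 1.0
    R = n_ok / len(mains) if mains else 1.0
    return V + R - 1.0

# One main shock (id 0) with aftershocks 1-3, plus isolated mains 4 and 5.
# A scheme that misses aftershock 3 and wrongly attaches main 5 to main 4:
true_seq = [(0, {1, 2, 3}), (4, set()), (5, set())]
found_seq = [(0, {1, 2}), (4, {5})]
print(round(score(true_seq, found_seq), 3))   # → 0.333 (V = R = 2/3)
```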
Note that in order to compute the score, one must keep track of both the actual afterevent sequences (those created by the stochastic model) and the determined afterevent sequences (those picked out by the identification scheme). It can be shown that when applied to synthetic catalogues in which the primary events are both the first and the largest event in their actual afterevent sequences, both V and R are independent of whether an aftershock identification scheme picks the first event or largest event in a determined afterevent sequence as the primary event. The score decreases when too few afterevents are linked (Fig. 8). If too many primary events are incorrectly linked, the score again decreases. Although in theory the score can range from −1 to +1, we rarely observed scores much below zero.

Table 2. Comparison of earthquake statistics of actual catalogues with synthetic catalogues. Space-only statistics are the median link length S0 and the link-length distribution 'slope' B0. Space-time statistics are the median link length S1 and link-length slope B1 computed with a space-time metric using a conversion factor C = 1 km day⁻¹. Time-only statistics are the Poisson index of dispersion PID at time intervals of 10 days and 100 days. File MEXCAM48 contains earthquakes in the Mexico-Central America region (Flinn-Engdahl regions 5 and 6) of magnitude 4.8 and greater (Fig. 1a). File ALASKA49 contains earthquakes in Alaska and the Aleutians (Flinn-Engdahl region 1) of magnitude 4.9 and greater (Fig. 2a). VAN30 contains earthquakes in the central Vanuatu region of magnitude 3.0 and greater (Fig. 3a). File ADAKS20 are earthquakes in the Adak region of magnitude 2.0 and greater (Fig. 4a). We generated synthetic catalogues using the parameters given in the text, varying only the swarming parameter q. For the actual catalogues the statistics are:

File       S0 (km)  B0    S1 (ST-km)  B1    PID(10)  PID(100)
MEXCAM48   18.61    2.56  102.84      3.03  2.12     5.00
ALASKA49   13.29    2.64  66.13       6.05  26.9     41.3
VAN30      4.50     3.03  14.17       3.60  34.0     96.3
ADAKS20    2.67     3.44  18.07       2.42  3.83     20.8

[The original table also lists means and standard deviations of these six statistics for 20 synthetic catalogues at each of q = 0.04, 0.1, 0.2, 0.4 and 1.0; those entries are not legible in this copy.]
Figure 7. Single-link cluster (SLC) identification of aftershocks, from Frohlich & Davis (1990). Shown is a space-time plot of 432 earthquakes in ALASKA49 between 165°W and 175°E from 1964 to 1970. Here we linked earthquakes using the space-time metric described in the text, relating distance to time by C = 1.0 km day⁻¹. Lines denote links of length 60 ST-km or less. Filled circles represent isolated events as well as the largest magnitude event in each sequence. Open circles are linked earthquakes, and +'s are unlinked earthquakes. The Rat Islands earthquake of 1965 February 4 (Ms = 8.2) is linked to two foreshocks and 245 aftershocks which occur over 153 days. Similarly, the Andreanof Islands earthquake of 1969 May 14 (Ms = 7.0) is linked to 21 events which occur over 176 days.

Figure 8. Scoring system to evaluate the success rate of an aftershock identification scheme on a synthetic catalogue with known main shocks and aftershocks. Here, we illustrate the scoring system with a schematic diagram showing a main shock with three aftershocks, and two isolated main shocks. The 'score' = V + R − 1, where V is the validity (fraction of aftershocks correctly identified), and R is the reliability (fraction of main shocks correctly identified). For a single-link cluster analysis scheme, a too small value of D will not identify a sufficient fraction of the aftershocks, and the resulting score will be low.
Conversely, if D is too high, then too many main shocks will be linked together and misidentified as aftershocks.

Figure 9. Plots of score versus D for synthetic catalogues with statistical characteristics similar to MEXCAM48 for several values of C. Each curve represents the average result from four synthetic catalogues, and represents a different conversion parameter C, as follows: C = 0 (lower solid line), C = 0.05 km day⁻¹ (upper solid line), C = 0.25 km day⁻¹ (long dashed line), C = 1 km day⁻¹ (dotted line), C = 3 km day⁻¹ (short dashed line), and C = 10 km day⁻¹ (dashed-dotted line). Davis (1989) presents similar figures for the other catalogues.

For the MEXCAM48 synthetics, the best scores occurred with C = 1 km day⁻¹ and D about 80 ST-km (Fig. 9). With these values, roughly 10 per cent of the afterevents were not identified by the SLC identification scheme, while about the same percentage of primary events were erroneously identified as afterevents. We used the scoring system to address a number of questions. In particular, we wished to know (1) what values of C and D are most useful in finding afterevents for a particular catalogue, and (2) how does the identification of afterevents depend on the background seismicity rate? Results from synthetics of the four catalogues we studied suggest that a value of C = 1 km day⁻¹ is adequate for finding afterevents in most practical applications (see Davis 1989 for details). In MEXCAM48, values of C = 1 km day⁻¹ and D = 80 ST-km yielded an average score of 0.77 (Fig. 9). Similarly, in ALASKA49, we obtained an average score of 0.82 with C = 1 km day⁻¹ and D = 60 ST-km. In ADAKS20, we obtained an average score of 0.44 with C = 1 km day⁻¹ and D = 62 ST-km. In VAN30, we obtained an unusually low best score of about 0.34 with C = 1 km day⁻¹ and D = 12 ST-km. This area includes a well-documented nest of activity known as the Efate Salient (Isacks et al.
1981), and it was difficult to pick out afterevents because of the unusually high background rate. Nevertheless, we obtained the highest possible score with a value of C = 1 km day⁻¹. To investigate the effect of different background rates, we generated several synthetic catalogues with varying primary event occurrence rates λ0. All of the other input parameters were the same as those used to simulate MEXCAM48. An analysis of these catalogues (Fig. 10) suggested that the scheme worked well with a value of C = 1 km day⁻¹ over a wide range of background rates. Naturally, when the background rate is high, one must use a lower value of D to avoid linking numerous unrelated events, and the identification scheme is less efficient. As a general empirical rule, we determined that the value D_best which produces the best score (with C = 1 km day⁻¹) is a function of the median link length in space-time S1 (Fig. 11) and takes the form

D_best(S1) = 9.4 √S1 − 25.2 ST-km,   8 ST-km < S1 < 300 ST-km.   (18)

Figure 10. Plots of score versus D for synthetic catalogues with several different main shock occurrence rates λ0, expressed in terms of λ, the best-fit occurrence rate for actual MEXCAM48 earthquakes. Other input parameters were those used to simulate MEXCAM48 (Table 1); we set the swarming parameter q to 0.2 (Table 2). Each line represents a different conversion parameter C, as follows: C = 0 (lower solid line), C = 0.05 km day⁻¹ (upper solid line), C = 0.25 km day⁻¹ (long dashed line), C = 1 km day⁻¹ (dotted line), C = 3 km day⁻¹ (short dashed line), and C = 10 km day⁻¹ (dashed-dotted line). Each such curve is the average from four synthetic catalogues. In each case, a value of C = 1 km day⁻¹ (dotted line) produces approximately the highest scores with an appropriate value for D (solid squares).

Figure 11. Plot of D_best versus S1, where S1 is the median link length with C = 1 km day⁻¹ and D_best is the value of D which produced the highest score with C = 1 km day⁻¹ (see Fig. 12). Each symbol represents the average results from four synthetic catalogues with input parameters similar to those found for catalogue MEXCAM48, ALASKA49, VAN30, and ADAKS20. Only the main shock occurrence rate was varied. For each set of symbols, the background rates used were, from left to right, 30λ (where λ is the background rate of the true catalogue), 10λ, 3λ, λ, λ/3 and λ/10. The dashed line represents the function D_best = a√S1 + b, where a = 9.4 ST-km^(1/2) and b = −25.2 ST-km (equation 18).

Other aftershock identification schemes

We also applied this scoring system to four other schemes for identifying dependent events, namely, the aftershock identification schemes of Gardner & Knopoff (1974), Shlien & Toksoz (1974), Knopoff et al. (1982), and Reasenberg (1985). In this section we investigate how well these schemes performed when used on the synthetic catalogues developed in this study. Although all of the aftershock identification schemes we considered were based on the premise that two earthquakes or events are related if they fall within some space-time criteria of each other, there were several conceptual differences which require some discussion.
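Before turning to the other schemes, the empirical rule of equation (18) above can be written down directly (a sketch; the function name is ours, and the range check simply reflects the stated fitting interval):

```python
import math

def d_best_trial(S1):
    """Equation (18): trial cluster size D_best = 9.4*sqrt(S1) - 25.2 ST-km,
    where S1 is the median SLC link length with C = 1 km/day.
    The rule was fit for 8 ST-km < S1 < 300 ST-km."""
    if not 8.0 < S1 < 300.0:
        raise ValueError("equation (18) was derived for 8 < S1 < 300 ST-km")
    return 9.4 * math.sqrt(S1) - 25.2
```

For ALASKA49 (S1 = 66.13 ST-km) this gives about 51 ST-km, comparable to the D = 60 ST-km used in Fig. 7.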
Standard space-time windows versus cluster-link schemes

In a standard space-time window, an earthquake belongs to a cluster if it falls within a given time interval and distance interval of the cluster's primary event (either the first or largest earthquake, depending on the scheme), where the size of the intervals depends on the magnitude of the largest earthquake in the cluster. In such a scheme, no earthquake outside of these intervals will belong to the cluster (Knopoff & Gardner 1972). Alternatively, in a cluster-link scheme, an earthquake belongs to a cluster if it falls within some space-time criteria of any earthquake in the cluster (as opposed to only the primary event), making it possible for very distant earthquakes to belong to the same cluster if there is a chain of intervening related earthquakes. Among the schemes studied, those of Gardner & Knopoff (1974) and Knopoff et al. (1982) use standard space-time windows, while the SLC scheme and those of Shlien & Toksoz (1974) and Reasenberg (1985) are cluster-link schemes. However, we found that we obtained higher scores for the space-time window schemes when we treated them as cluster-link schemes, and these are the scores presented in this paper.

Main shock threshold magnitude

In some aftershock identification schemes, events are only considered as possible main shocks if their magnitude exceeds some threshold value; in other schemes, all events are considered as possible main shocks. Of the schemes we studied, only that of Knopoff et al. (1982) employed magnitude thresholds. As the choice for such a threshold is somewhat arbitrary, we did not use any main shock magnitude thresholds in this study.

Magnitude dependence

For several of the schemes, the criteria for determining dependence varied with main shock magnitude.
This is based on the reasonable premise that larger earthquakes effect physical changes at larger distances and over longer time periods. Although this is a realistic assumption, such schemes often have the potentially undesirable property of being sensitive to the magnitudes of a few large events. Consequently, the results depend on the accuracy of magnitudes, which often contain large systematic errors (Habermann 1987; Habermann & Craig 1988).

Direction of search

Magnitude-dependent aftershock identification schemes are of two general classes: forward-only searching schemes, and forward-backward searching schemes. In forward-only searching schemes, the order of the events may affect their dependence. For example, a small shock followed by a large shock may be considered unrelated because the large shock does not fall within the short time window of the small event. However, if the order of events were reversed, the small event would fall within the time window of the large event, and one would consider the events to be related. In forward-backward schemes, the order of the events is not important since the search windows extend both forward and backward equally in time. Note that when the criteria are not magnitude dependent, it does not matter whether the scheme is forward-only searching or forward-backward searching. The SLC scheme described in this paper is an example of a scheme which does not depend on earthquake magnitude. In Table 3 we denote schemes with an 'F' (forward-only searching), an 'F/B' (forward-backward searching), or a '--' (schemes in which the distinction is meaningless).

Table 3. Comparison of different aftershock schemes. See text for explanation of terms. The first column indicates whether an event must exceed some threshold magnitude before it is considered a possible main shock. The second column indicates whether the scheme is magnitude dependent.
The third column lists the direction of search for the scheme; here, 'F' denotes schemes which are forward-only searching, 'F/B' denotes schemes which are forward-backward searching, and '--' denotes schemes in which the distinction is meaningless. The fourth column indicates whether the earthquake clusters identified by the scheme obey the transitive property. The final column gives the basis upon which the scheme determines whether or not events are related, with '(M)' denoting that the basis is dependent on main shock magnitude. Asterisks indicate that we modified the algorithm as described in the text.

Reference                  Threshold  Mag. dep.  Direction  Transitive?  Basis of scheme
this paper                 no         no         --         yes          cluster-link scheme with space-time metric d_ST
Gardner & Knopoff (1974)   no         yes        F          yes          space-time window with spatial cutoff (M) and temporal cutoff (M)
Knopoff et al. (1982)      yes*       yes        F/B        no*          space-time window with spatial cutoff (M) and temporal cutoff (M)
Shlien & Toksoz (1974)     no         no         --         yes          cluster-link scheme with space-time statistic s
Reasenberg (1985)          no         yes        F          yes          cluster-link scheme with spatial cutoff (M) and Omori probability relation for time (M)

Transitivity of relatedness

For many schemes, the 'relatedness' of events forms an equivalence relation; that is, when aftershock sequences are identified, the following properties hold for all earthquakes in the catalogue: (1) each event is in the same sequence as itself ('reflexivity'); (2) if event a is in a sequence which contains b, then event b is in a sequence which contains a ('symmetry'); and (3) if event a is in a sequence which contains b, and b is in a sequence which contains c, then a is in a sequence which contains c ('transitivity'). Most of the schemes described in this paper behave as equivalence relations. However, Knopoff et al.
(1982) treat main shocks and aftershocks in such a way that event b can be an aftershock of both event a and event c, where a and c do not belong in the same aftershock sequence. For the purposes of this study we modified the method of Knopoff et al. (1982) by treating it as if the transitive property held. With respect to these properties, Gardner & Knopoff (1974) employ forward-only searching magnitude-dependent space-time windows; that is, events are related if their time separation Δt and space separation Δr are such that Δt < f(M) and Δr < g(M), where M is the magnitude of the first shock. Similarly, Knopoff et al. (1982) employ a suite of magnitude-dependent space-time windows, although their scheme is forward-backward searching. In addition, they only apply their space-time windows to events above a certain threshold magnitude. For both of these schemes, we generalized their concepts by employing a range of space-time windows which were proportional by a 'scaling factor' to those presented in the original papers. Reasenberg (1985) uses a forward-only searching cluster-link scheme. Here, the spatial cut-off is given by

R = Q r_crack,   (19)

where r_crack is the radius of a circular crack with a stress drop of 30 bar which corresponds with the seismic moment (Kanamori & Anderson 1975), and Q is a constant. For his investigation, Reasenberg uses a value of Q = 10. His time criterion is based on the probability of observing future events in an Omori's Law decay, and consequently depends on the main shock-aftershock separations already determined by the algorithm. In addition, Reasenberg places an upper bound on the time cut-off of T_max = 10 days for events associated with an aftershock sequence, and T_max = 1 day for events not yet in aftershock sequences. As with the previous two schemes, we employed a 'scaling factor' to vary his values for R and T_max.
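The spatial cut-off can be sketched as follows. We assume CGS units (our choice) and the circular-crack relation Δσ = (7/16) M0/r³ from Kanamori & Anderson (1975) cited in the text; the function names are ours:

```python
import math

def crack_radius_km(M0_dyn_cm, stress_drop_bar=30.0):
    """Radius of a circular crack matching seismic moment M0, from
    delta_sigma = (7/16) * M0 / r^3 (Kanamori & Anderson 1975):
    r = (7 M0 / (16 delta_sigma))^(1/3).  CGS units in, km out."""
    stress_drop = stress_drop_bar * 1.0e6        # bar -> dyn/cm^2
    r_cm = (7.0 * M0_dyn_cm / (16.0 * stress_drop)) ** (1.0 / 3.0)
    return r_cm / 1.0e5                          # cm -> km

def spatial_cutoff_km(M0_dyn_cm, Q=10.0):
    """Reasenberg-style spatial cut-off R = Q * r_crack, with Q = 10."""
    return Q * crack_radius_km(M0_dyn_cm)
```

For M0 = 10^25 dyn cm (roughly Mw 6) this gives r_crack ≈ 5 km and a cut-off of about 53 km.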
Shlien & Toksoz (1974) identify dependent events on the basis of their s-statistic:

s = π (Δd)² Δt k(X, Z),   (20)

where Δd is the spatial separation between events, Δt is the time separation, and k(X, Z) is the mean rate of activity per unit area at spatial location X over a sub-region with area Z. They consider events to be related when s < α, Δd ≤ D_max, and 0 ≤ Δt ≤ T_max, where α, A, D_max, and Z are chosen constants. Note that this scheme requires the determination of the mean rate of activity over several spatial sub-regions. In their investigation, they chose values of α = 0.02, A = 100, and D_max = 1.41 degrees, and chose Z = 1 degree for the northern Japan catalogue and Z = 0.5 degrees for the southern California catalogue. For this study, we used values ranging from Z = 0.25 to 1 degree.

Results

Figure 12 summarizes the highest scores produced by each method. For details on how the scores varied with different scaling factors or sub-region sizes, we refer the reader to Davis (1989). Among the teleseismic catalogues, SLC and Shlien & Toksoz's (1974) method produced the highest scores, ranging from 0.76 to 0.85. The space-time windows of Gardner & Knopoff (1974), Knopoff et al. (1982), and Reasenberg (1985) produced somewhat lower scores ranging from 0.56 to 0.74. With the local network data, all four schemes produced roughly similar scores. However, all scores were noticeably lower than with the teleseismic catalogues, averaging 0.22 for VAN30 synthetics and 0.43 for ADAKS20 synthetics. In general, the Gardner & Knopoff scheme produced slightly higher scores for VAN30 synthetics and ADAKS20 synthetics (0.26 and 0.47, respectively). This may result from the fact that their scheme was specifically devised to detect aftershocks in local network catalogues.

Figure 12. Scores produced by single-link cluster analysis (SLC) using D_best (equation 18) compared with the highest scores obtained by other detection schemes (Davis 1989).
Among all four catalogues, SLC had the highest average score (0.568), followed by the methods of Shlien & Toksoz (0.548), Reasenberg (0.520), Gardner & Knopoff (0.505), and Knopoff et al. (0.480).

DISCUSSION

Aftershock identification schemes

In this paper we have proposed a workable scheme for generating synthetic catalogues which closely approximate real catalogues in many ways. These catalogues allow us to study many of the features of actual earthquake catalogues, and we have used them to estimate the success rate of aftershock identification schemes, allowing us to compare different schemes. Synthetic catalogues can be valuable for estimating how well observational techniques work when applied to actual earthquake sequences. We have used synthetics to investigate the validity and reliability of aftershock identification schemes, and in particular to evaluate a new single-link cluster (SLC) scheme proposed in this paper. We found that aftershock identification schemes work better in some catalogues than in others. In synthetics of the teleseismic catalogues studied (MEXCAM48 and ALASKA49), SLC worked fairly well, missing only 10-14 per cent of the afterevents and misidentifying only 5-10 per cent of the primary events. However, when applied to two local catalogues, SLC does not fare as well. With synthetics of ADAKS20, SLC missed 18 per cent of the afterevents and misidentified 37 per cent of the primary events. In the tightly clustered VAN30 catalogue, SLC missed over half of the afterevents while misidentifying 17 per cent of the primary events. The SLC scheme compared favourably with other aftershock identification schemes published in the literature when applied to the four catalogues in this study. First, SLC had the highest average score, although its results were not excessively better than the scores produced by other schemes (Fig. 12).
However, we note that the other schemes presented in this paper were designed with specific catalogues in mind, and with some modification of these schemes it is possible that higher scores could be obtained. For example, it is likely that algorithms which employ space-time windows could obtain higher scores by imposing a minimum spatial cut-off corresponding to the location errors in the synthetic catalogues, or by using separate scaling factors for the spatial windows and time windows. We also note that we modified the Knopoff et al. (1982) scheme by ignoring main shock threshold magnitudes and the non-transitive nature of their approach. Second, the analysis performed on synthetic catalogues in this paper suggests that SLC is easy to use, in that one does not need to make a careful study to find the appropriate input parameters. We recommend assigning a value of C = 1 km day^-1 (Figs 9 and 10) and assigning D on the basis of the median link length in space-time (Fig. 11 and equation 18). Third, the scheme is not magnitude dependent, and thus is not strongly affected by errors in magnitude determination.

Statistics for evaluating catalogues

We have illustrated how we can examine the spatio-temporal properties of earthquake catalogues by using six statistics: S_o, B_o, S_ST, B_ST, PID(10), and PID(100). Four of the statistics (S_o, B_o, S_ST, B_ST) derive from SLC, and were first proposed in this paper. The spatial statistic S_o, the median link length in space, provides a rough measure of the geographic distances between epicentres. For the catalogues studied in this paper, values of S_o ranged from 2.67 km (ADAKS20) to 18.6 km (MEXCAM48). The statistic does not appear to be strongly influenced by the aftershock process except when afterevents dominate the catalogue. This is evident from the dependence of S_o on the swarming parameter q (Table 2).
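The two recommendations above can be made concrete. Single-link cluster analysis joins events through their shortest links, which is equivalent to building a minimum spanning tree under the metric d_ST = sqrt(d^2 + C^2 t^2); the sketch below (planar coordinates, Prim's algorithm, and all function names are our own choices for illustration) returns the sorted SLC link lengths and a trial D taken from the median space-time link.

```python
import math

C = 1.0  # km/day, the coupling constant recommended in the text

def d_st(e1, e2):
    """Space-time separation d_ST = sqrt(d^2 + C^2 t^2); events are
    (x_km, y_km, t_days) tuples in a planar approximation."""
    d = math.dist(e1[:2], e2[:2])
    return math.hypot(d, C * (e1[2] - e2[2]))

def slc_links(events):
    """SLC link lengths equal the edge weights of the minimum
    spanning tree under d_ST (Prim's algorithm, O(n^2))."""
    n = len(events)
    in_tree = [False] * n
    in_tree[0] = True
    best = [d_st(events[0], e) for e in events]
    links = []
    for _ in range(n - 1):
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best[i])
        in_tree[u] = True
        links.append(best[u])
        for v in range(n):
            if not in_tree[v] and d_st(events[u], events[v]) < best[v]:
                best[v] = d_st(events[u], events[v])
    return sorted(links)

def trial_d(events):
    """Trial cut-off D based on the median space-time link length."""
    links = slc_links(events)
    return links[len(links) // 2]
```

The quadratic Prim implementation is adequate for catalogue-sized inputs; a spatially indexed minimum spanning tree would serve for very large catalogues.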
The spatial statistic B_o, the link distribution 'slope', is a measure of the relative sizes of large and small links, and relates to the degree of spatial clustering. In this study, the local networks ADAKS20 and VAN30 (B_o of 3.44 and 3.03, respectively) were more spatially clustered than the teleseismic catalogues MEXCAM48 and ALASKA49 (B_o of 2.56 and 2.64, respectively). This statistic, like S_o, depended weakly on the presence of afterevents (Table 2). Of the statistics described in this paper, the space-time statistics S_ST and B_ST were the most useful for determining the degree of space-time clustering in earthquake catalogues. These statistics were relatively robust and not strongly influenced by random changes in the catalogue. On the other hand, they were sensitive to the presence of afterevents, and showed a strong dependence on the swarming parameter q in the synthetic catalogues (Table 2). The space-time statistic S_ST is a measure of the distance between events in space-time. In the four catalogues studied in this paper we have found values of S_ST ranging from 14.2 ST-km (VAN30) to 103 ST-km (MEXCAM48). Likewise, the space-time 'slope' B_ST provides information on the degree of clustering in space-time. In this study, values of B_ST ranged from 2.42 (ADAKS20) to 6.05 (ALASKA49). The index of dispersion [PID(10), PID(100)] was relatively robust when there was little temporal clustering in the catalogue, but became highly variable for strongly clustered sequences (Table 2). Consequently, the statistics PID(10) and PID(100) were useful for estimating the order of magnitude of temporal clustering, but could not be used to 'fine-tune' a synthetic catalogue and determine the appropriate input parameters with any precision. In this study, the catalogue MEXCAM48 was closest to random [PID(10) = 2.7, PID(100) = 5.0] and the catalogue VAN30 was the most highly clustered [PID(10) = 34, PID(100) = 96].
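As a rough illustration of the index of dispersion, the sketch below computes PID(w) as the variance-to-mean ratio of event counts in consecutive non-overlapping windows of length w; the exact windowing convention used in the paper may differ, so the function and its arguments should be treated as assumptions.

```python
def pid(times, window):
    """Index of dispersion of event counts: variance/mean of the
    counts in consecutive non-overlapping windows of the given
    length.  Near 1 for a Poisson process, much larger for
    temporally clustered sequences."""
    t0, t1 = min(times), max(times)
    nbins = max(1, int((t1 - t0) / window))
    counts = [0] * nbins
    for t in times:
        i = min(int((t - t0) / window), nbins - 1)  # cap the last bin
        counts[i] += 1
    mean = sum(counts) / nbins
    var = sum((c - mean) ** 2 for c in counts) / nbins
    return var / mean
```

A steady one-event-per-day sequence gives a PID near 1, while a sequence with half of its events bunched at a single time gives a PID tens of times larger, mirroring the contrast between MEXCAM48 and VAN30 above.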
Properties of earthquake catalogues

Although the present research applied several aftershock identification schemes only to synthetic earthquake catalogues, for several reasons we expect that the results would be similar if the identification schemes were applied to genuine data. First, in other research projects we have used SLC as a tool to remove aftershocks from earthquake catalogues (Frohlich & Davis 1990; Wardlaw et al. 1990) and obtained 'residual' catalogues which are statistically and visually similar to catalogues generated by a Poisson process. Second, in the present study the synthetic catalogues which we studied were extremely 'realistic', at least insofar as they are indistinguishable from real catalogues when displayed as maps (Figs 1-4), or evaluated using statistics such as the index of dispersion or the space-time statistics presented in this paper. We invite other researchers to suggest additional statistics to make meaningful comparisons between real and synthetic catalogues. One intriguing result of this study is that even in the best cases, aftershock identification schemes fail to identify a surprising number of afterevents. Even in the case with the highest score (SLC on ALASKA49 synthetics), 10 per cent of the afterevents were not identified, and 5 per cent of the primary events were incorrectly identified as afterevents. In the most extreme case, catalogue VAN30, no scheme produced scores higher than 0.25, and all four schemes missed at least half of the afterevents. One explanation for the high failure rate of aftershock schemes is that the phenomenon seismologists refer to as 'aftershocks' is fundamentally different from the concept of 'afterevents' generated by a stochastic process.
By virtue of the long-tailed character of Omori's Law, a fraction of afterevents may be separated from their parent events by such large gaps in time that they would be 'lost' among the background of unrelated events, and most seismologists would no longer consider them to be mechanically related to their parent events. Conversely, even for a Poisson process a small percentage of events will occur so close together in space and time that a seismologist may consider them to be physically related. Thus it seems likely that statistical schemes for identifying aftershocks will seldom be more successful at finding afterevents than the schemes evaluated here. On the other hand, we suggest that with the correct parameters these schemes are able to identify nearly 100 per cent of the earthquakes that seismologists typically call aftershocks. Nevertheless, seismologists cannot be completely certain of the efficiency of their aftershock detection schemes until there is an accepted physical model for the occurrence of earthquake aftershocks.

CONCLUSIONS

(1) We have developed a model for generating synthetic earthquake catalogues which mimic many of the properties of actual earthquake catalogues. These catalogues consist of primary events and afterevents, and may be used to study the reliability and validity of analytical techniques such as aftershock identification schemes. To create these catalogues, one must choose values for 11 variables (Table 1). Several of the variables, e.g. the time span T and the number of events N, are easily derived from the actual catalogues of interest.

(2) We developed four new statistics (S_o, B_o, S_ST, B_ST) based on single-link cluster analysis which evaluate the spatial and temporal properties of pairs of events in earthquake catalogues. These statistics are relatively robust, and are useful for making comparisons between real and synthetic catalogues.
(3) We proposed a new method for identifying related events such as aftershocks using single-link cluster analysis. In this scheme, two earthquakes are related if there is a chain of events linking the two earthquakes, and no link of the chain exceeds a critical space-time distance d_ST = D (equations 16 and 18).

(4) We evaluated five schemes for identifying aftershocks. For synthetics of two teleseismic catalogues, the SLC scheme and Shlien & Toksoz's (1974) scheme produced the highest scores, while for synthetics of two local network catalogues Gardner & Knopoff's (1974) scheme produced the highest scores. In general, when applied to synthetic catalogues, SLC appears to work as well as several other aftershock identification schemes presented in the literature.

(5) The aftershock identification schemes studied here had a surprisingly high failure rate when applied to synthetic catalogues. However, we submit that this is due to the stochastic nature of the synthetic catalogues, and suggest that these schemes identify a higher percentage of the events seismologists would consider to be physically related.

ACKNOWLEDGMENTS

Thanks to Cornell University and ORSTOM for access to the Vanuatu catalogue, and Carl Kisslinger at CIRES for access to data from the ADAK network. We also thank Paul Reasenberg and Jeffrey Park for their critical reviews of the manuscript. Partial funding for this work was provided by the donors of the Petroleum Research Fund, administered by the American Chemical Society, and by National Science Foundation grants EAR-8618406, EAR-8843928, and EAR-8916665.

REFERENCES

Adamopoulos, L., 1975. Some counting and interval properties of the mutually-exciting processes, J. appl. Prob., 12, 78-86.
Anderson, T. W. & Darling, D. A., 1952. Asymptotic theory of certain 'goodness of fit' criteria based on stochastic processes, Ann. Math. Statist., 23, 193-212.
Båth, M., 1978a. A note on the recurrence relations for earthquakes, Tectonophysics, 51, T23-T30.
Båth, M., 1978b. Some properties of earthquake frequency distributions, Tectonophysics, 51, T63-T69.
Båth, M., 1981. Earthquake magnitude - recent research and current trends, Earth Sci. Rev., 17, 315-398.
Bottari, A. & Neri, G., 1983. Some statistical properties of a sequence of historical Calabro-Peloritan earthquakes, J. geophys. Res., 88, 1209-1212.
Burridge, R. & Knopoff, L., 1967. Model and theoretical seismicity, Bull. seism. Soc. Am., 57, 341-371.
Cao, T. & Aki, K., 1985. Seismicity simulation with a mass-spring model and a displacement hardening-softening friction law, Pure appl. Geophys., 122, 10-24.
Chatelain, J., Cardwell, R. K. & Isacks, B. L., 1983. Expansion of the aftershock zone following the Vanuatu (New Hebrides) earthquake on 15 July 1981, Geophys. Res. Lett., 10, 385-388.
Chatelain, J. L., Isacks, B. L., Cardwell, R. K., Prévot, R. & Bevis, M., 1986. Patterns of seismicity associated with asperities in the central New Hebrides island arc, J. geophys. Res., 91, 12497-12519.
Chen, Y. T. & Knopoff, L., 1987. Simulation of earthquake sequences, Geophys. J. R. astr. Soc., 91, 693-709.
Cox, D. R. & Lewis, P. A. W., 1966. The Statistical Analysis of Series of Events, Halsted Press.
Das, S. & Scholz, C. H., 1981. Off-fault aftershock clusters caused by shear stress increase?, Bull. seism. Soc. Am., 71, 1669-1675.
Davis, S. D., 1989. Investigations concerning the nature of earthquake aftershocks and earthquakes induced by fluid injection, PhD thesis, University of Texas at Austin.
Davis, S. D. & Frohlich, C., 1990. Single-link cluster analysis of earthquake aftershocks: decay laws and regional variations, J. geophys. Res., submitted.
De Natale, G., Gresta, S., Patanè, G. & Zollo, A., 1985. Statistical analysis of earthquake activity at Etna volcano (March 1981 eruption), Pure appl. Geophys., 123, 697-705.
Eneva, M. & Pavlis, G. L., 1988.
Application of pair analysis statistics to aftershocks of the 1984 Morgan Hill, California, earthquake, J. geophys. Res., 93, 9113-9125.
Eneva, M. & Hamburger, M. W., 1989. Spatial and temporal patterns of earthquake distribution in Soviet Central Asia: application of pair analysis statistics, Bull. seism. Soc. Am., 79, 1457-1476.
Engdahl, E. R., 1977. Seismicity and plate subduction in the central Aleutians, in Island Arcs, Deep Sea Trenches, and Back-Arc Basins, Maurice Ewing Ser. 1, pp. 259-271, eds Talwani, M. & Pitman, W. C. III, AGU, Washington, DC.
Frohlich, C., 1987. Aftershocks and temporal clustering of deep earthquakes, J. geophys. Res., 92, 13944-13956.
Frohlich, C. & Davis, S., 1985. Identification of aftershocks of deep earthquakes by a new ratios method, Geophys. Res. Lett., 12, 714-716.
Frohlich, C. & Davis, S. D., 1990. Single-link cluster analysis as a method to evaluate spatial and temporal properties of earthquake catalogues, Geophys. J. Int., 100, 19-32.
Frohlich, C., Billington, S., Engdahl, E. R. & Malahoff, A., 1982. Detection and location of earthquakes in the central Aleutian subduction zone using island and ocean bottom seismograph stations, J. geophys. Res., 87, 6853-6864.
Gardner, J. K. & Knopoff, L., 1974. Is the sequence of earthquakes in Southern California, with aftershocks removed, Poissonian?, Bull. seism. Soc. Am., 64, 1363-1367.
Gutenberg, B. & Richter, C. F., 1954. Seismicity of the Earth and Associated Phenomena, Princeton University Press, Princeton, NJ.
Habermann, R. E., 1982. Consistency of teleseismic reporting since 1963, Bull. seism. Soc. Am., 72, 93-111.
Habermann, R. E., 1983. Teleseismic detection in the Aleutian island arc, J. geophys. Res., 88, 5056-5064.
Habermann, R. E., 1987. Man-made changes of seismicity rates, Bull. seism. Soc. Am., 77, 141-159.
Habermann, R. E. & Craig, M. S., 1988.
Comparison of Berkeley and CALNET magnitude estimates as a means of evaluating temporal consistency of magnitudes in California, Bull. seism. Soc. Am., 78, 1255-1267.
Hawkes, A. G. & Adamopoulos, L., 1973. Cluster models for earthquakes - regional comparisons, Bull. Int. Stat. Inst., 45(3), 454-461.
Isacks, B. L., Cardwell, R. K., Chatelain, J., Barazangi, M., Marthelot, J., Chinn, D. & Louat, R., 1981. Seismicity and tectonics of the central New Hebrides Island Arc, in Earthquake Prediction, Maurice Ewing Ser. 4, pp. 93-116, eds Simpson, D. W. & Richards, P. G., AGU, Washington, DC.
Kagan, Y. Y. & Knopoff, L., 1976. Statistical search for non-random features of the seismicity of strong earthquakes, Phys. Earth planet. Inter., 12, 291-318.
Kagan, Y. & Knopoff, L., 1977. Earthquake risk as a stochastic process, Phys. Earth planet. Inter., 14, 97-108.
Kagan, Y. & Knopoff, L., 1978. Statistical study of the occurrence of shallow earthquakes, Geophys. J. R. astr. Soc., 55, 67-86.
Kagan, Y. Y. & Knopoff, L., 1980. Dependence of seismicity on depth, Bull. seism. Soc. Am., 70, 1811-1822.
Kagan, Y. Y. & Knopoff, L., 1981. Stochastic synthesis of earthquake catalogues, J. geophys. Res., 86, 2853-2862.
Kagan, Y. Y. & Jackson, D. D., 1990. Long-term earthquake clustering, Geophys. J. Int., 104, 117-133.
Kanamori, H., 1977. The energy release in great earthquakes, J. geophys. Res., 82, 2981-2987.
Kanamori, H. & Anderson, D. L., 1975. Theoretical basis of some empirical relations in seismology, Bull. seism. Soc. Am., 65, 1073-1095.
King, C. Y. & Knopoff, L., 1968. Stress drop in earthquakes, Bull. seism. Soc. Am., 58, 249-257.
Knopoff, L. & Gardner, J. K., 1972. Higher seismic activity during local night on the raw worldwide earthquake catalogue, Geophys. J. R. astr. Soc., 28, 311-313.
Knopoff, L., Kagan, Y. Y. & Knopoff, R., 1982. b values for foreshocks and aftershocks in real and simulated earthquake sequences, Bull. seism. Soc. Am., 72, 1663-1676.
Mandelbrot, B. B., 1982. The Fractal Geometry of Nature, W. H. Freeman, San Francisco.
Marthelot, J. M., Chatelain, J. L., Isacks, B. L., Cardwell, R. K. & Coudert, E., 1985. Seismicity and attenuation in the central Vanuatu (New Hebrides) islands: a new interpretation of the effect of subduction of the D'Entrecasteaux Fracture Zone, J. geophys. Res., 90, 8641-8650.
Mayer-Rosa, D., Pavoni, N., Graf, R. & Bast, B., 1976. Investigations of intensities, aftershock statistics and the focal mechanism of Friuli earthquakes in 1975 and 1976, Pure appl. Geophys., 114, 1095-1103.
McNally, K. C., 1977. Patterns of earthquake clustering preceding moderate earthquakes, central and southern California, EOS, Trans. Am. geophys. Un., 58, 1195.
Mikumo, T. & Miyatake, T., 1978. Dynamical rupture process on a three-dimensional fault with non-uniform frictions, and near-field seismic waves, Geophys. J. R. astr. Soc., 54, 417-438.
Mikumo, T. & Miyatake, T., 1979. Earthquake sequences on a frictional fault model with non-uniform strength and relaxation times, Geophys. J. R. astr. Soc., 59, 497-522.
Mikumo, T. & Miyatake, T., 1983. Numerical modelling of space and time variations of seismic activity before major earthquakes, Geophys. J. R. astr. Soc., 74, 559-583.
Mogi, K., 1967. Earthquakes and fractures, Tectonophysics, 5, 35-55.
Oakes, D., 1975. The Markovian self-exciting process, J. appl. Prob., 12, 69-77.
Ogata, Y., 1983. Estimation of the parameters in the Modified Omori formula for aftershock frequencies by the maximum likelihood procedure, J. Phys. Earth, 31, 115-124.
Prozorov, A. G. & Dziewonski, A. M., 1982. A method of studying variations in the clustering properties of earthquakes, J. geophys. Res., 87, 2829-2839.
Reasenberg, P., 1985. Second-order moment of central California seismicity, 1969-1982, J. geophys. Res., 90, 5479-5495.
Reasenberg, P. A. & Matthews, M. V., 1988. Precursory seismic quiescence: a preliminary assessment of the hypothesis, Pure appl. Geophys., 126, 373-406.
Reasenberg, P. A. & Jones, L. M., 1989. Earthquake hazard immediately after a mainshock in California, Science, 243, 1173-1176.
Rice, J., 1975. Statistical methods of use in analysing sequences of earthquakes, Geophys. J. R. astr. Soc., 42, 671-683.
Shlien, S. & Toksoz, M. N., 1970. A clustering model for earthquake occurrences, Bull. seism. Soc. Am., 60, 1765-1787.
Shlien, S. & Toksoz, M. N., 1974. A statistical method of identifying dependent events and earthquake aftershocks, Earthquake Notes, 45(3), 3-16.
Stein, R. S. & Lisowski, M., 1983. The 1979 Homestead Valley earthquake sequence, California: control of aftershocks and postseismic deformation, J. geophys. Res., 88, 6477-6490.
Strehlau, J., 1986. A discussion of the depth extent of rupture in large continental earthquakes, in Earthquake Source Mechanics, Maurice Ewing Ser. 6, pp. 131-145, eds Das, S., Boatwright, J. & Scholz, C. H., AGU, Washington, DC.
Tinti, S. & Mulargia, F., 1985. Completeness analysis of a seismic catalogue, Ann. Geophys., 3, 407-414.
Utsu, T., 1961. A statistical study on the occurrence of aftershocks, Geophys. Mag., 30, 521-605.
Utsu, T., 1972. Aftershocks and earthquake statistics (IV): analyses of the distribution of earthquakes in magnitude, time, and space with special consideration to clustering characteristics of earthquake occurrence (2), J. Fac. Sci., Hokkaido Univ., Ser. VII (Geophys.), 4, 1-42.
Vere-Jones, D., 1970. Stochastic models for earthquake occurrences, J. R. Stat. Soc., B32, 1-62.
Vere-Jones, D. & Davies, R. B., 1966. A statistical survey of earthquakes in the main seismic region of New Zealand, part 2: time series analysis, N. Z. J. Geol. Geophys., 9, 251-284.
Wardlaw, R. L., Frohlich, C. & Davis, S. D., 1990. Evaluation of precursory seismic quiescence in sixteen subduction zones using single-link cluster analysis, Pure appl. Geophys., 134, 57-78.
APPENDIX A

Rates of occurrence for primary events, afterevents, and catalogue events as generated by the synthetic model

The number of primary events in the magnitude range (M, M + dM) is N_O(M) dM, where N_O(M) is the primary event occurrence rate. Likewise, N_A(m) and N_C(m) denote, respectively, the occurrence rates of afterevents and of all events in the catalogue (primary events and afterevents). If each event in the catalogue is either a primary event or an afterevent, then

N_C(m) = N_O(m) + N_A(m).  (A1)

For the synthetic model described in the main text, we assumed a Gutenberg-Richter relation (equation 2):

N_O(M) = A 10^(-bM),  (A2)

where b is the primary event b-value and A is a constant related to the total number of events. The afterevent occurrence rate is given by

N_A(m) = integral from m to M_max of N_O(M) n_A(M, m) dM,  (A3)

where n_A(M, m) is the afterevent function (see main text) which describes the mean number of afterevents of magnitude m produced by a primary event of magnitude M. For this study we assume an afterevent function of the form of equation (4) of the main text, n_A(M, m) = q 10^(b_a (M - m)) for m <= M. The solution to the integral is

N_A(m) = qA 10^(-bm) (M_max - m),  b_a = b,  (A4)

N_A(m) = [qA 10^(-b_a m) / (Δb ln 10)] (10^(M_max Δb) - 10^(m Δb)),  b_a ≠ b,  (A5)

where Δb = b_a - b. Utilizing (A1) and (A2) we derive

N_C(m) = A 10^(-bm) [1 + q(M_max - m)],  b_a = b,  (A6)

N_C(m) = A 10^(-bm) + [qA 10^(-b_a m) / (Δb ln 10)] (10^(M_max Δb) - 10^(m Δb)),  b_a ≠ b.  (A7)

We obtain the cumulative rates η_O(M) and η_A(m) by integrating N_O and N_A over all magnitudes above M and m, respectively. With ΔM = M_max - M and Δm = M_max - m, these are

η_O(M) = [A 10^(-b M_max) / (b ln 10)] (10^(b ΔM) - 1),  (A8)

η_A(m) = qA 10^(-b M_max) [Δm 10^(b Δm) / (b ln 10) - (10^(b Δm) - 1) / (b ln 10)^2],  b_a = b,

η_A(m) = [qA 10^(-b M_max) / (Δb ln 10)] [(10^(b_a Δm) - 1) / (b_a ln 10) - (10^(b Δm) - 1) / (b ln 10)],  b_a ≠ b.  (A9)

Finally, the cumulative rate of the catalogue η_C(m) is simply the sum of the terms in η_O(m) and η_A(m).

APPENDIX B

Approximate relation of B_o to dimensionality k for randomly generated events

In Frohlich & Davis (1990) we derive approximate solutions for the cumulative density function (CDF) F_k(r) for the link length distribution of events generated by a Poisson process on a manifold of k dimensions:

F_k(r) = 1 - exp(-ω_k r^k),  (A13)

where

ω_1 = 2 λ_1 t,  ω_2 = π λ_2 t,  ω_3 = (4π/3) λ_3 t.  (A14)
Here, λ_k is the Poisson process rate per unit volume of dimension k and unit time, and t is the total length of the time interval. This equation is based on the distribution of nearest-neighbour links, and therefore does not describe an exact solution to the distribution of links determined by single-link cluster analysis (SLC) for the 2-D and higher dimensional cases. However, we note that at least 50 per cent (and in practice about 70 per cent) of the links in SLC are nearest-neighbour links. The spatial statistic B_o is given by D(0.25)/D(0.75), where D(f) is the length exceeded by a fraction f of the links (equation 6). By solving for F_k[D(0.25)] and F_k[D(0.75)], we find that

B_o = [ln (0.25) / ln (0.75)]^(1/k) = (4.82)^(1/k).

Thus, the expected value of B_o for events generated by a Poisson process is 4.82 for 1-D data, 2.2 for 2-D data, and 1.7 for 3-D data. For example, B_o for the randomly generated events in Fig. 6(a) is 2.5 (close to the expected value of 2.2), and the value increases to 4.0 when these events fill a more 1-D space (Fig. 6d). However, for clustered data (Fig. 6c) the observed value of B_o is higher than for data generated by a Poisson process.
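The 1-D case of this result is easy to check numerically: for points scattered uniformly on a line, the SLC links are simply the gaps between the sorted points, and B_o should come out near (4.82)^(1/1) = 4.82. The simulation below is an illustrative check of our own, not part of the original derivation.

```python
import random

def b_o(links):
    """B_o = D(0.25)/D(0.75), where D(f) is the link length exceeded
    by a fraction f of the links (so D(0.25) is the 75th percentile
    and D(0.75) the 25th percentile)."""
    s = sorted(links)
    n = len(s)
    return s[int(0.75 * n)] / s[int(0.25 * n)]

# Poisson-like 1-D catalogue: uniform points on the unit interval.
# The SLC links are the consecutive gaps between sorted points, and
# the expected quartile ratio is ln(0.25)/ln(0.75), about 4.82.
random.seed(1)
pts = sorted(random.random() for _ in range(100001))
gaps = [b - a for a, b in zip(pts, pts[1:])]
```

With a sample this large the simulated ratio falls within a few per cent of the theoretical value.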