Geophys. J. Int. (1991) 104, 289-306
Single-link cluster analysis, synthetic earthquake catalogues, and
aftershock identification
Scott D. Davis1* and Cliff Frohlich2
1 Department of Geological Sciences, University of Texas at Austin, Austin, TX 78713, USA
2 Institute for Geophysics, University of Texas at Austin, 8701 MoPac Blvd, Austin, TX 78759-8345, USA
Accepted 1990 August 7. Received 1990 August 6; in original form 1989 July 12
SUMMARY
This paper investigates several aspects of synthetic catalogue generation and
aftershock identification schemes. First, we introduce a method for generating
synthetic catalogues of earthquakes. This method produces a catalogue which has
the geographic appearance of an actual catalogue when the hypocentres are plotted
in map view, but allows us to vary the spatial and temporal relationships between
pairs of close events. Second, we discuss six statistics to measure certain characteristics of synthetic and actual catalogues. These include four new statistics S_s, B_s, S_ST
and B_ST, which evaluate the distributions of link lengths between events in space and
space-time as computed by single-link cluster analysis (SLC). Third, we develop a
new scheme for identifying aftershocks in which a group of events forms an
aftershock sequence if each event is within a space-time distance D of at least one
other event in the group. We define the space-time separation of events by
d_ST = √(d² + C²t²), where d is the spatial separation of events, t is the time
separation, and C = 1 km day⁻¹. Our experience with several synthetic catalogues
suggests that an appropriate trial value for D is 9.4 km ln(S_ST) − 25.2 km. Here, S_ST
is the median link length using SLC with the metric d_ST. Fourth, we generate
synthetic catalogues resembling both teleseismic and local network catalogues to
evaluate the validity and reliability of this aftershock identification scheme, as well
as other schemes proposed by Gardner & Knopoff (1974), Shlien & Toksoz (1974),
Knopoff, Kagan & Knopoff (1982), and Reasenberg (1985). Using a simple scoring
method, we find that the SLC method compares favourably with other aftershock
identification algorithms described in the literature.
Key words: aftershocks, aftershock identification, catalogue simulation, earthquake
catalogues, earthquake statistics.
INTRODUCTION
Although seismic activity is considered to be stochastic in
nature (Vere-Jones 1970), there are significant deviations
from a Poisson process. In particular, a major deviation
from a Poisson process is the existence of earthquake
aftershocks. Algorithms for the identification of clustered
earthquakes such as aftershocks include various space-time
windows (Knopoff & Gardner 1972; Gardner & Knopoff
1974; Kagan & Knopoff 1976; Knopoff et al. 1982; Tinti &
Mulargia 1985), likelihood functions (Prozorov & Dziewonski 1982; Bottari & Neri 1983), ratio tests (Frohlich &
Davis 1985), Shlien & Toksoz’s (1974) s-statistic, other
cluster-link schemes (Reasenberg 1985), and single-link
cluster analysis (this paper; Davis & Frohlich 1990; Frohlich
& Davis 1990).
This paper is not a comprehensive review of different
aftershock identification schemes. Instead, we aim to show
how one can use synthetic earthquake sequences with
known clustering properties to test the validity of such
schemes. In the first section, we discuss the various
assumptions needed to generate a synthetic earthquake
catalogue with realistic properties. We recommend several
parameters for generating synthetic catalogues which have
not been proposed previously, but which allow us to obtain
catalogues which are considerably more ‘realistic’ than those
generated previously by simpler schemes and which produce
well-defined sequences of ‘afterevents’ (simulated
aftershocks).
As our goal is to create synthetic catalogues which will
serve as models for actual earthquake catalogues, it is
important to have statistics for quantifying the spatial and
temporal clustering properties of earthquake catalogues. In
Section 2 we use six different statistics to compare several
synthetic and actual earthquake catalogues.
In the third section we provide an example of the use of
these synthetic catalogues. In particular, we look at the
reliability and validity (Reasenberg & Matthews 1988) of
aftershock identification schemes proposed by Gardner &
Knopoff (1974), Shlien & Toksoz (1974), Knopoff et al.
(1982), and Reasenberg (1985), as well as a new method
proposed in a previous study (Frohlich & Davis 1990). By
creating synthetic catalogues with properties similar to
those of four actual catalogues (two teleseismic and two
local network catalogues), we evaluate the success rate of
the aftershock identification schemes on these catalogues.
We are currently unaware of any previous comparisons of
reliability and validity for different aftershock identification
schemes.
One finding of this study is that single-link cluster analysis
(SLC) has many useful properties for evaluating earthquake
catalogues. Several of the most useful ‘new’ statistics we
used for describing earthquake catalogues fall naturally out
of the SLC method. SLC also offers a way of identifying
earthquake aftershock sequences and earthquake swarms
which compares favourably to other methods.
A surprising result of this analysis is that even under
optimum conditions, aftershock identification algorithms
often failed to identify a significant fraction of synthetically
generated ‘afterevents’. We suggest that part of the difficulty
is that the seismologist’s concept of aftershock differs in an
important way from the ‘afterevents’ generated by a
stochastic process. Specifically, the seismologist considers
events to be related if they are close enough together in
space and time so that the physical processes which generate
the earthquakes cannot be independent. In contrast, for a
stochastic process several ‘close’ events may be independent, while a significant fraction of ‘afterevents’ may be
separated from their parent events by large distances and
time intervals.
PARAMETERS FOR GENERATING
SYNTHETIC CATALOGUES
In this section we develop a parametric model of earthquake
behaviour for generating synthetic earthquake catalogues
(Figs 1-4). There are two distinct classes of events in the
model: primary events (simulated main shocks) and
afterevents (simulated aftershocks). Some primary events
are isolated; others are followed by one or more afterevents.
One must consider several aspects of earthquake
behaviour to generate a synthetic catalogue. These include:
(1) catalogue completeness and magnitude ranges, (2) a
recurrence relation for primary event magnitudes, (3) an
afterevent function that specifies numbers and magnitudes
of afterevents, (4) primary event origin times, (5) afterevent
origin times, (6) a primary event hypocentre distribution,
(7) an afterevent hypocentre distribution, and (8) location
errors. We discuss each of these aspects of catalogue
generation in some detail below. For consistency, when we
deal specifically with afterevent magnitudes we use
lower-case letters (m), and upper-case letters (M) for
primary event magnitudes. We shall also use upper case
letters when the event in question may either be a primary
event or afterevent.
It is important to note that this procedure is not based on
any physical model of earthquake occurrence, but rather is
merely a set of simple algorithms for creating a catalogue
which has many of the statistical properties of actual
earthquake catalogues. For each of the topics listed below,
other choices could be made. For example, one could
choose to incorporate forevents (simulated foreshocks) or
seismic quiescence into the model; however, we have not
done so for simplicity. Other researchers have taken
different approaches. Burridge & Knopoff (1967) use a
simple deterministic model based on the physics of a 1-D
array of blocks and springs to generate simple catalogues
with several of the properties of real catalogues. Mikumo &
Miyatake (1978; 1979; 1983) and Cao & Aki (1985) employ
more detailed models to study the effects of fault
heterogeneity and friction laws on earthquake behaviour. In
a series of papers (Kagan & Knopoff 1976, 1977, 1978, 1980)
Kagan & Knopoff utilize synthetic catalogues generated by a
stochastic model which employs a branching process. In a
related paper, Kagan & Knopoff (1981) assume a
self-similar seismic moment release process, and define the
times and magnitudes of earthquakes by the periods in
which moment release exceeds a cut-off threshold. These
models have many appealing features, and in several ways
are more ‘realistic’ than the model presented here.
However, in some ways their very realism makes them
impractical for our purposes. For example, in models which
employ a branching process, aftershocks may generate
second-order aftershocks, which in turn may generate
third-order aftershocks, and so on. Although such a scheme
is intuitively appealing, it precludes the use of a simple
scoring method which tests to see if an afterevent is linked
to its parent event.
1 Catalogue completeness and magnitude range (M_min and M_max)
It is easier to compare observed earthquake distributions to
theoretical distributions when the observed catalogue is
complete; that is, no earthquakes are missed. Habermann
(1982) notes that detection rates are approximately constant
for earthquakes in the ISC catalogue with magnitudes
greater than about 4.7. Generally, global catalogues are
complete for earthquakes with magnitudes of about 5.5 and
greater. Catalogues of earthquakes determined from local
networks may be complete, or at least uniform, to lower
magnitudes. For simplicity, we assumed that our model
catalogues are approximately complete above some lower
cut-off magnitude Mmin.
We also assumed an upper cut-off magnitude M,,,, as
this facilitated calculations for several reasons. First, without
a cut-off magnitude most reasonable models predict more
earthquakes in the largest magnitude ranges than are
observed (e.g. Båth 1978a). Second, if b_a ≥ b, where b_a is
the b-value of events in afterevent sequences and b is the
value for the primary events (see Sections 2 and 3 below),
then the number of afterevents in the catalogue becomes
infinite without an upper cut-off magnitude. Third, because
major rupture zones are limited to sizes of about 1000 km or
less, there are physical constraints that prevent earthquakes
of magnitudes larger than about 9.5. Finally, even if very
large events are possible, a catalogue with a finite number of
events may not contain the largest possible earthquakes, and
the largest clusters will be missing from the population of
observed events.
Both teleseismic and local catalogues are known to
contain systematic errors in the detection of earthquakes as
well as in the determination of magnitudes (Habermann
1982, 1983, 1987; Habermann & Craig 1988). Such errors
can influence the generation of synthetic catalogues. For
example, errors in magnitude determination will affect
estimates of the aftershock function, while changes in
detection rate will alter the temporal characteristics of
catalogues such as the Poisson index of dispersion. In
addition, we have not addressed the possibility of errors in
magnitude determination (Habermann 1987) in these
catalogues. A detailed treatment of how incompleteness and
non-uniformity affect the statistics we use, as well as how
incompleteness and non-uniformity can be detected or
avoided, is beyond the scope of this paper.
2 Recurrence relation for primary event magnitudes (b)
To construct a synthetic catalogue, we need to specify the distribution of earthquake magnitudes. Let N₀(M) be the primary event magnitude distribution such that the number of primary events in the magnitude range (M, M + dM) is N₀(M) dM. The total number η₀(M) of primary events of magnitude M and greater is given by

η₀(M) = ∫_M^{M_max} N₀(x) dx.    (1)

Likewise, N_A(m) and N_C(M) denote the levels of occurrence for afterevents and for events in the catalogue as a whole, respectively, while η_A(m) and η_C(M) denote the number of afterevents or catalogue events of magnitude M (or m) and greater.
To generate synthetic earthquakes, we assumed a Gutenberg-Richter relation (Gutenberg & Richter 1954):

N₀(M) = A 10^(−bM).    (2)

Typical b-values reported in the literature range from 0.8 to 1.2, with a value of approximately 1 being typical (Båth 1978b, 1981). For simplicity, we assumed a b-value of 1.0 for primary events in the catalogue. Analytical solutions for N_A(m), N_C(M), η_A(m), and η_C(M) are given in Appendix A.

3 Afterevent function (q, b_a)

We defined an 'afterevent function' n_A(M, m) as the mean number of afterevents of magnitude m produced by a primary event of magnitude M. One appealing set of afterevent functions is that in which earthquakes are self-similar with respect to magnitude; that is, earthquakes behave in a similar fashion regardless of scale (e.g. Kagan & Knopoff 1981; Mandelbrot 1982). These models take the form

n_A(M, m) = f(M − m).    (3)

One possibility is to assume the function behaves exponentially such that

n_A(M, m) = q 10^(b_a(M−m)),   m ≤ M,
          = 0,                 m > M.    (4)

Here, b_a is a fixed exponential decay rate which is equal to the b-value of individual afterevent sequences in the catalogue (as opposed to the b-value of events in the catalogue as a whole), while q is a 'swarming parameter' which is equal to the probability that a primary event will produce an afterevent of approximately its own magnitude. This swarming parameter is different from the 'positive influence parameter' of Prozorov & Dziewonski (1982) as it develops from a theoretical earthquake population rather than from observed events, and it is normalized in a different manner. Nevertheless, the concept is similar in that higher values of q produce larger and more frequent earthquake swarms and aftershock sequences, and may vary for different tectonic regimes (Mogi 1967; Chen & Knopoff 1987). We have generated synthetic catalogues with many of the statistical properties of actual earthquake catalogues; typical swarming parameters for these catalogues range from 0.2 (magnitude m_b = 4.8 and greater earthquakes in Mexico and Central America) to 0.5 (magnitude m_b = 3.0 and greater earthquakes in the central Vanuatu island arc).

We note that the afterevent function n_A(M, m) chosen for this study (equation 4) is completely arbitrary. We used this form because it is mathematically simple, and because it produced catalogues which appear to be very similar to actual earthquake catalogues. However, we are not aware of any physical or observational basis for this function.

Prozorov & Dziewonski (1982) define an 'intensity function' Λ_A(M) which represents the mean number of afterevents for a primary event of magnitude M, where the number of afterevents follows a Poisson distribution. If the catalogue is complete, we can derive the intensity function from

Λ_A(M) = ∫_{M_min}^{M} n_A(M, x) dx.    (5)

For the aftershock function given by equation (4),

Λ_A(M) = [q / (b_a ln 10)] [10^(b_a(M − M_min)) − 1].    (6)

Thus, the mean number of afterevents per primary event is directly proportional to the swarming parameter q.

In our applications, we used a value of b_a = 1.0 for several reasons. First, the statistical characteristics of the synthetic catalogue were simpler when b_a = b, where b is the Gutenberg-Richter b-value for primary events (see Section 2 and Appendix A). Second, we were able to match actual catalogues adequately with a value of b_a = 1. Third, although there may be some variation, a value of b_a = 1 is representative of many aftershock sequences (Utsu 1961; Reasenberg & Jones 1989).
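To make the recipe concrete, the truncated Gutenberg-Richter magnitudes and the afterevent function of equation (4) are easy to evaluate numerically. The Python sketch below is ours, not the authors' code: the function names are invented, and we treat n_A(M, m) as a density per unit magnitude, so counts are taken over a bin width dm (an assumption).

```python
import math
import random

def sample_primary_magnitude(m_min, m_max, b=1.0, rng=random):
    """Draw one primary-event magnitude from a Gutenberg-Richter
    distribution N0(M) ~ 10**(-b*M), truncated to [m_min, m_max],
    by inverse-CDF sampling."""
    u = rng.random()
    lo, hi = 10 ** (-b * m_min), 10 ** (-b * m_max)
    # CDF: F(M) = (lo - 10**(-b*M)) / (lo - hi); invert for M.
    return -math.log10(lo - u * (lo - hi)) / b

def expected_afterevents(M, m, q, b_a=1.0, dm=0.1):
    """Mean number of afterevents in the magnitude bin [m, m + dm)
    for a primary event of magnitude M, reading equation (4) as a
    density: q * 10**(b_a * (M - m)) for m <= M, zero above M."""
    if m > M:
        return 0.0
    return q * 10 ** (b_a * (M - m)) * dm
```

With q = 0.2 and b_a = 1, a magnitude-6 primary yields ten times as many afterevents per bin near magnitude 5 as near its own magnitude, mirroring the within-sequence b-value.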
4 Primary event origin times (λ₀, T)

The simplest assumption to make for the generation of primary event origin times is that they occur as a Poisson process, i.e., they are independent and have interevent times which are exponentially distributed (Cox & Lewis 1966). Such a process has a mean rate λ₀, and the expected number of primary events in a catalogue which covers a time span T is n₀ = λ₀T.

In practice, we set the primary event rate λ₀ so that the expected total number of events produced (primary events and afterevents) was equal to the observed number of events n_total in the actual catalogue we were trying to simulate. Using the afterevent function described above, with b_a = b and M_max − M_min ≳ 2b, one can show (Davis 1989) that the appropriate value of λ₀ is very closely approximated by

λ₀ = n_total / {T[1 + q(M_max − M_min − 1/(b ln 10))]}.    (7)
Although the assumption of a Poisson process with a
steady rate is somewhat simplistic, it is adequate for most
purposes. If necessary, one could devise more complicated
models to account for changes in detection rate (Habermann
1982), clustering of large main shocks (Kagan & Jackson
1990) or gradual accumulation of strain energy in the time
intervals between large earthquakes (e.g. Utsu 1972).
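A minimal sketch of this step (our own Python, with invented names; it assumes λ₀ takes the form n_total / {T[1 + q(M_max − M_min − 1/(b ln 10))]}, which is our reading of equation 7):

```python
import math
import random

def primary_rate(n_total, T, q, m_min, m_max, b=1.0):
    """Primary-event rate lambda0 chosen so that the expected total
    count (primaries plus afterevents) matches n_total."""
    return n_total / (T * (1.0 + q * (m_max - m_min - 1.0 / (b * math.log(10)))))

def poisson_times(rate, T, rng=random):
    """Origin times of a homogeneous Poisson process on [0, T]:
    cumulative sums of exponential interevent times with mean 1/rate."""
    times, t = [], rng.expovariate(rate)
    while t < T:
        times.append(t)
        t += rng.expovariate(rate)
    return times
```

For the MEXCAM48-style parameters (n_total = 1552, T = 22.2 yr, q = 0.2, M_min = 4.8, M_max = 8.1) this gives a primary rate of roughly 44 events per year.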
5 Afterevent origin times (c, t_c)
For the models we considered, the probability that a given number of afterevents will occur follows a Poisson distribution since afterevents of a given primary event are generated independently. The expected fraction of primary events of magnitude M with L afterevents is P[Λ_A(M), L], where P[u, L] is the Poisson probability of mean u:

P[u, L] = u^L e^(−u) / L!.    (8)
To generate afterevent times we used a trigger model with an inverse time (Omori's Law) decay. Trigger models (Vere-Jones 1970) form a class of processes in which the probability of an afterevent occurring in a time interval (t, t + dt), given a trigger (primary event) at time t₀, is

Λ_A(M) Φ(t − t₀) dt.    (9)

Here, Λ_A(M) is the intensity function, and Φ(t) is a decay function which is proportional to the number of afterevents per unit time produced by the trigger, normalized over time such that

∫₀^∞ Φ(t) dt = 1.    (10)
Trigger models employed in the study of earthquake aftershocks have generally taken the form of an exponential decay (Vere-Jones & Davies 1966; Burridge & Knopoff 1967; Hawkes & Adamopoulous 1973) or inverse time decay (Omori's Law). Most authors have found the latter to provide a better fit. Omori's law takes the form

Φ(t) ∝ 1/(t + c).    (11)

However, Omori's law is difficult to analyse statistically, as the number of expected afterevents from a single primary event increases without bound. Utsu (1961) presented a Modified Omori's law:

Φ(t) ∝ 1/(t + c)^p.    (12)
Previous investigators have reported values of p ranging
from 0.85 (Mayer-Rosa et al. 1976) to 1.3 (Utsu 1961).
While the Modified Omori’s law has been investigated
extensively (e.g. Ogata 1983; Reasenberg & Jones 1989),
one disadvantage of the law is that many of the afterevents
occur long after the primary event. For example, if one chooses the values of c = 0.25 days and p = 1.25 (Vere-Jones & Davies 1966), then more than 12 per cent of the afterevents will occur 1000 days or more after the primary event. Such events are likely to be 'lost' in the background seismicity after these time intervals, and thus many seismologists would not consider them to be aftershocks. Consequently, we instead used a form of Omori's law with a time cut-off t_c such that

Φ(t) ∝ 1/(t + c),   0 ≤ t ≤ t_c,
     = 0,           t > t_c or t < 0.    (13)

For the purposes of this study, we set c = 0.25 days (Vere-Jones & Davies 1966) and t_c = 100 days.
One can also employ trigger models to create foreshocks
by allowing the decay function to be non-zero in the time
range t < 0 (prior to the primary event). However, for the
sake of simplicity we ignored the possibility of foreshocks in
this study.
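The truncated Omori decay of equation (13) has a closed-form inverse CDF, F(t) = ln((t + c)/c) / ln((t_c + c)/c) on [0, t_c], so afterevent delays can be drawn without rejection sampling. A sketch (our Python, not the authors' code; the function name is invented):

```python
import math
import random

def omori_time(c=0.25, t_c=100.0, rng=random):
    """Draw one afterevent delay (in days) from the truncated Omori
    decay phi(t) ~ 1/(t + c) on [0, t_c], by inverting the CDF
    F(t) = ln((t + c)/c) / ln((t_c + c)/c)."""
    u = rng.random()
    # Inverse CDF: t = c * (((t_c + c)/c)**u - 1); u=0 -> 0, u=1 -> t_c.
    return c * (((t_c + c) / c) ** u - 1.0)
```

With the paper's values c = 0.25 days and t_c = 100 days, the median delay works out to roughly 5 days, i.e. most afterevents cluster close behind their primary.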
An alternative to the trigger model approach is one which
incorporates self-exciting or mutually exciting processes,
where afterevents are not independent but may give rise to
afterevents of their own (Hawkes & Adamopoulous 1973;
Adamopoulous 1975; Oakes 1975). However, we chose not
to employ these models because the studies of Vere-Jones
(1970) and Hawkes & Adamopoulous (1973) found that
simple, independent afterevent models described observations as well as, or better than, other branching process
models.
6 Primary event hypocentre distribution
The simplest model for the generation of primary event
hypocentres is to assume the hypocentres occur randomly
and uniformly within some simple geometric zone, e.g., a
rectangular, circular, or spherical region. However, such
models do not accurately describe such features as isolated
events off the subduction zone or clusters of events; nor do
they account for gross changes in spatial clustering along the
subduction zone (e.g., Fig. 1, top). As our goal was to
generate synthetic catalogues which simulate the properties
of real catalogues, we used these real catalogues to
determine the 3-D probability density function for primary
event hypocentres.
To do this, we divided the region of interest into cubical sub-regions with sides of size X_box. We then computed the fraction of actual earthquakes occurring within each
sub-region, and used this value as the probability that a
primary event would occur within that sub-region. Although
this method required slightly more computer time than
randomly assigning primary events in some pre-defined
Figure 1. (a) Epicentres of earthquakes in catalogue MEXCAM48. This catalogue contains 1552 earthquakes of magnitude 4.8 and greater in the Mexico-Central America region from January 1964 to February 1986. (b) Distribution of 1534 epicentres for a synthetic catalogue based on MEXCAM48 (see Table 1 for input parameters).

volume, it avoids the subjective choice of the size of such a volume, and it produced realistic looking hypocentre distributions which preserved the characteristics of spatial clustering and isolated events (Fig. 1, bottom). Essentially, this method fixed the relationship between more distant pairs of synthetic events to be identical to that of the actual catalogue, while allowing us to experiment with different afterevent functions which affected the space-time relationship of close pairs of events. An alternative approach would have been to reorder catalogue events, but this would not have allowed us to generate afterevents and thereby determine which events are primary events and which are afterevents.

For simplicity, we set the size of the sub-regions X_box equal to X_err, our estimate for the relative location errors in that catalogue (see Section 8 below). This value made a logical choice for the size of the sub-region, as it is unlikely we could detect any spatial clustering occurring on a smaller scale.

7 Afterevent hypocentre distribution

To generate afterevent locations, we assumed that afterevents occurred with equal probability density over the rupture area of the primary event. For the majority of primary events (those with M ≤ 8.0) we assumed a circular rupture determined by the relation between rupture area and magnitude given by Kanamori (1977). For the few larger earthquakes (M > 8.0) we assumed a rectangular rupture zone of length L, with a total area constrained by Kanamori's (1977) area-magnitude relation. Here, we obtained L by assuming a ratio of maximum fault slip to fault length of 10⁻⁴, and using a relation between magnitude, fault length, and maximum slip (King & Knopoff 1968).

Although we constrained aftershocks to occur within the rupture zone of the main shock, there is evidence that some off-fault aftershocks are triggered by the strain release of large main shocks (Das & Scholz 1981; Stein & Lisowski 1983; Strehlau 1986). It would be possible to create more detailed synthetics which account for such aftershocks, but for simplicity we have not done so.

8 Location errors (X_err)

Finally, to simulate actual earthquake catalogues, we accounted for relative location errors. In spite of the fact that real catalogues are plagued by such errors as outliers, for simplicity we assigned random Gaussian errors with a standard deviation of X_err km to our simulated hypocentres. We used a relative location error of X_err = 20 km for events in teleseismic catalogues, and a value X_err = 5 km for events in local network catalogues.

Some examples

We generated synthetic catalogues with many of the statistical features of four actual earthquake catalogues. Two of these, MEXCAM48 and ALASKA49, consist of earthquakes located by the International Seismological Centre (ISC) from January 1964 to February 1986. MEXCAM48 consisted of 1552 m_b ≥ 4.8 earthquakes in Mexico and Central America (Fig. 1), and ALASKA49 consisted of 1839 m_b ≥ 4.9 earthquakes along the Alaska-Aleutian trench (Fig. 2). These catalogues are almost certainly incomplete, but should be approximately uniform.

Figure 2. (a) Epicentres of earthquakes in catalogue ALASKA49. This catalogue contains 1839 earthquakes of magnitude 4.9 and greater in Alaska and the Aleutians from January 1964 to February 1986. (b) Distribution of 1786 epicentres for a synthetic catalogue based on ALASKA49 (see Table 1 for input parameters).
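The sub-region scheme of Section 6 can be sketched as follows (our Python, with invented names, not the authors' code): bin the real hypocentres into cubes of side X_box, sample cubes in proportion to their event counts, and place each synthetic primary uniformly within its chosen cube.

```python
import random
from collections import Counter

def sample_primary_hypocentres(catalogue_xyz, x_box, n, rng=random):
    """Draw n synthetic primary hypocentres. catalogue_xyz is a list of
    (x, y, z) positions in km from the actual catalogue; x_box is the
    cube side in km. The fraction of real events per cube serves as
    the sampling probability for that cube."""
    counts = Counter((int(x // x_box), int(y // x_box), int(z // x_box))
                     for x, y, z in catalogue_xyz)
    boxes = list(counts)
    weights = [counts[b] for b in boxes]
    out = []
    for _ in range(n):
        i, j, k = rng.choices(boxes, weights=weights)[0]
        # Uniform position inside the selected cube.
        out.append(((i + rng.random()) * x_box,
                    (j + rng.random()) * x_box,
                    (k + rng.random()) * x_box))
    return out
```

Because only occupied cubes can be drawn, isolated events and gross along-strike clustering of the real catalogue are preserved automatically.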
Figure 3. (a) Epicentres of earthquakes in catalogue VAN30. This catalogue contains 2395 earthquakes of magnitude 3.0 and greater in the central Vanuatu region from January 1979 to December 1980. (b) Distribution of 2388 epicentres for a synthetic catalogue based on VAN30 (see Table 1 for input parameters).
We also studied two catalogues from local networks to see
how these techniques worked when applied to earthquakes
in different magnitude ranges. VAN30 consisted of 2395
earthquakes along the Vanuatu (New Hebrides) trench of
m_b ≥ 3.0 located by a local seismograph array (Chatelain,
Cardwell & Isacks 1983; Marthelot et al. 1985; Chatelain et
al. 1986) from January 1979 to December 1980 (Fig. 3).
Figure 4. (a) Epicentres of earthquakes in catalogue ADAKS20. This catalogue contains 2043 shallow (depth < 70 km) earthquakes of magnitude 2.0 and greater in the Adak Island region from January 1978 to December 1986. (b) Distribution of 2116 epicentres for a synthetic catalogue based on ADAKS20 (see Table 1 for input parameters).
ADAKS20 consisted of 2043 shallow (depth < 70 km) earthquakes of m_b ≥ 2.0 located by the Adak network (Engdahl 1977; Frohlich et al. 1982) between 50.75°N and 52.2°N latitude and between 174.5°W and 179.0°W longitude from January 1978 to December 1986 (Fig. 4). Based on the
magnitude distribution of events in these catalogues, we
chose cut-off magnitudes above which the catalogues should
be approximately complete. However, the preponderance of
larger earthquakes near the edges of the Vanuatu seismic
zone (Fig. 3) indicates that VAN30 may not be complete in
these areas.
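Choosing such a cut-off from the magnitude distribution can be done, for example, with the simple maximum-curvature heuristic sketched below (our Python, with an invented name; this is one common heuristic, not necessarily the procedure the authors used): take M_min at the most populated magnitude bin, below which the frequency-magnitude curve rolls off due to missed events.

```python
from collections import Counter

def completeness_magnitude(mags, dm=0.1):
    """Maximum-curvature heuristic for the completeness cut-off:
    bin the magnitudes with width dm and return the centre of the
    most populated bin. Below this magnitude, counts fall off
    because small events are increasingly missed."""
    bins = Counter(round(m / dm) * dm for m in mags)
    return max(bins, key=bins.get)
```

For a Gutenberg-Richter catalogue that is complete above some threshold, counts decay exponentially above the threshold and drop off below it, so the modal bin sits near the completeness magnitude.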
To simulate the statistical characteristics of MEXCAM48, we set the four input parameters b, b_a, c, and t_c to 1.0, 1.0, 0.25 days and 100 days, respectively, as explained previously (Table 1). We set T = 22.2 yr, the time span of the available ISC catalogue (January 1964-February 1986). We arbitrarily estimated relative location errors in the catalogue to be on the order of 20 km. We used a lower magnitude cut-off of M_min = 4.8; for lower values the catalogue was significantly incomplete (Habermann 1982), while for much higher values of M_min there were not enough data to make the analysis meaningful. As the largest magnitude earthquake reported by the ISC in this region during the time of interest was M_s = 8.1, we used this value for M_max. We tried several values for the swarming parameter q; in each case we chose an appropriate value of λ₀, as described previously (equation 7). To generate synthetics for ALASKA49, VAN30, and ADAKS20, we determined appropriate input parameters in a similar manner (Table 1).
STATISTICS FOR EVALUATING SYNTHETIC AND ACTUAL CATALOGUES

Assuming our synthetic catalogue provides a model of actual earthquake behaviour, how can we
Table 1. Input parameters for synthetic catalogues. The last four columns list the parameters used to model the four actual catalogues chosen for this study. M48 is MEXCAM48, A49 is ALASKA49, V30 is VAN30, and A20 is ADAKS20. The values of M_max were chosen on the basis of the largest magnitude earthquake in the catalogue of interest.

Parameter  Name                                Values                                       M48    A49    V30    A20
b          b-value for main shocks             set to b = 1.0 (see text)                    1      1      1      1
b_a        b-value for aftershock sequences    set to b_a = 1.0 (see text)                  1      1      1      1
c          Omori law constant (days)           set to c = 0.25 days (see text)              0.25   0.25   0.25   0.25
t_c        Omori cutoff time (days)            set to t_c = 100 days (see text)             100    100    100    100
X_box      size of sub-regions for determining
           pdf of synthetic main shock
           locations (km)                      set to X_box = X_err (see text)              20     20     5      5
q          swarming parameter                  defines extent to which events 'cluster'
                                               in space-time                                0.2a   0.3a   0.5a   0.25a
T          time span of catalogue (yr)         varies according to catalogue of interest    22.2   22.2   2.0    8.0
N          expected number of events           varies according to catalogue of interest    1552   1839   2395   2043
X_err      catalogue location errors (km)      varies according to catalogue of interest    20     20     5      5
M_min      lower magnitude cutoff              varies according to catalogue of interest    4.8    4.9    3.0    2.0
M_max      upper magnitude cutoff              varies according to catalogue of interest    8.1    9.0    6.0    5.5

a Several values were tried; number shown is best-fitting value.
determine the best value for q? One way is to find which
values of q produce synthetic catalogues with statistical
properties similar to those of an actual catalogue we wish to
simulate.
Various statistical tests have been used to characterize
deviations from random behaviour, including the Poisson
index of dispersion (Vere-Jones & Davies 1966; Shlien &
Toksoz 1970; McNally 1977), the Kolmogorov-Smirnov test
(Shlien & Toksoz 1970; Reasenberg & Matthews 1988), the
Anderson-Darling test (Anderson & Darling 1952; Frohlich
1987; Reasenberg & Matthews 1988), the E parameter of
Shlien & Toksoz (Shlien & Toksoz 1970; Bottari & Neri
1983; De Natale et al. 1985), second- and higher order moments (Kagan & Knopoff 1976, 1978; Reasenberg 1985), power spectra (Utsu 1972), variance-time curves (Vere-Jones 1970; Rice 1975), event pair-analysis (Eneva & Pavlis 1988; Eneva & Hamburger 1989), and hazard and intensity functions (Vere-Jones 1970; Rice 1975).
For this study we chose six statistics as described below.
Two statistics (S_s, B_s) describe only the spatial distribution of earthquake hypocentres; two (S_ST, B_ST) relate to the separations between events in both space and time, and two [PID(10), PID(100)] relate only to the temporal properties
of the earthquake catalogue. We chose these statistics
because they were reasonably robust; that is, they were not
highly sensitive to small changes within the catalogue. In
addition, all of these statistics were relatively simple to
calculate.
Four of the statistics (S_s, B_s, S_ST, B_ST) are introduced in this
paper: they have not been previously published in the
literature. These statistics derive from single-link cluster
analysis (SLC) of earthquakes (Frohlich & Davis 1990) and
provide information on the distribution of interevent
distances in space and space-time. Previously we have used
SLC to study earthquake nests (Frohlich & Davis 1990),
isolated events (Frohlich & Davis 1990), seismic quiescence
(Wardlaw, Frohlich & Davis 1990), and earthquake
aftershocks (Davis & Frohlich 1990).
Previously defined statistics [PID(10), PID(100)]
A Poisson process has the property that the mean or
expected value and the variance of the number of events per
unit time interval are equal (Cox & Lewis 1966). The
Poisson index of dispersion (PID) is defined as the ratio of
the variance to the mean:
PID(Δt) = variance(Δt) / expected value(Δt),
where At is the time length of the intervals. The PID has an
expected value of 1 for a sequence generated by a Poisson
process, and is significantly greater than 1 for highly
clustered sequences. We used PIDs with time intervals of 10 and 100 days (Davis 1989).
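As a concrete illustration, the PID can be computed by binning event times into consecutive windows of length Δt; the following is a minimal sketch (the function name and binning convention are ours, not from the paper):

```python
from statistics import mean, pvariance

def pid(event_times, dt, t_start, t_end):
    """Poisson index of dispersion: variance / mean of the number of
    events per time interval of length dt (days)."""
    n_bins = int((t_end - t_start) // dt)
    counts = [0] * n_bins
    for t in event_times:
        k = int((t - t_start) // dt)
        if 0 <= k < n_bins:
            counts[k] += 1
    return pvariance(counts) / mean(counts)
```

A Poisson-like sequence gives a PID near 1, while a strongly clustered sequence gives a PID well above 1.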
New statistics (S0, B0, S1, B1)
Single-link cluster analysis (SLC) is a scheme for joining N
events with N - 1 links using a distance metric. SLC
provides an objective way of dividing a set of events into
natural clusters. By removing the K longest links, one is left
with K + 1 ‘clusters’ of events. We refer the reader to
Frohlich & Davis (1990) for a more detailed explanation of
SLC.
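The linking step is equivalent to building a minimum spanning tree over the events; a minimal sketch using Prim's algorithm (names are ours, and the distance metric here is plain Euclidean):

```python
import math

def slc_links(points):
    """Join N events with N - 1 links via Prim's minimum spanning tree.
    Returns (i, j, length) tuples; single-link clusters at any threshold
    can be read off these links by removing the longest ones."""
    n = len(points)
    d = lambda a, b: math.dist(points[a], points[b])
    # best[j] = (shortest distance from j to the growing tree, tree node)
    best = {j: (d(0, j), 0) for j in range(1, n)}
    links = []
    while best:
        j = min(best, key=lambda k: best[k][0])
        dist_j, i = best.pop(j)
        links.append((i, j, dist_j))
        for k in best:
            dk = d(j, k)
            if dk < best[k][0]:
                best[k] = (dk, j)
    return links
```

Removing the K longest of these links leaves the K + 1 clusters described above.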
We observe that the logarithmic distribution of link
lengths is nearly straight for the majority of links (Frohlich
& Davis 1990; and Fig. 5). Thus, two statistics suffice to
describe the link-length distribution fairly accurately for all
but the very smallest and very largest links. These statistics
are S, the median link length, and B, the ‘slope’ of the
link-length distribution. Because the log of the link lengths
is approximately straight when plotted against link number,
this slope can be defined by the value
D(0.25)
B = link length ‘slope’ = D(0.75) ’
where D(f) is the length exceeded by a fraction f of the
links. Note that B is not truly the slope, but is proportional
to the exponential of the slope on a semi-log plot (Fig. 5).
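Given the N - 1 link lengths from SLC, S and B follow directly from the definition of D(f); a small sketch (our naming, using a simple order-statistic estimate of D(f)):

```python
def link_stats(lengths):
    """Median link length S = D(0.5) and 'slope' B = D(0.25) / D(0.75),
    where D(f) is the length exceeded by a fraction f of the links."""
    s = sorted(lengths, reverse=True)      # s[k] is exceeded by ~k/n of links
    n = len(s)
    D = lambda f: s[min(int(f * n), n - 1)]
    return D(0.5), D(0.25) / D(0.75)
```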
Because the link distribution depends on the chosen
distance metric, we examined two sets of statistics relating
to the median link length and the link length slope. First, we
considered the distribution when the metric is simple
296  S. D. Davis and C. Frohlich

Figure 5. (a) Linkage of earthquakes in Fig. 1 (MEXCAM48) as determined by single-link cluster analysis (SLC). The numbers above some links in the map represent the lengths of the 11 longest links (those greater than 150 km). (b) Distribution of link lengths for the earthquakes shown in Fig. 1. The logarithmic link distribution is nearly linear for all but the smallest and largest links. Thus, the central portion of the distribution curve can be characterized by two statistics, the median link length S and the distribution slope B. Let D(f) be the length which a fraction f of the links exceed. We define S0 as the median link length, i.e., D(0.5). Here, S0 = 18.6 km. We define the distribution 'slope' as B0 = D(0.25)/D(0.75). Here, B0 = (29.85 km)/(11.68 km) = 2.56. The three dots on the curve correspond to the lengths D(0.25), D(0.5) and D(0.75), respectively.
Euclidean distance, and we denote these values as S0 and B0. To some extent S0 serves as a scaling factor, as it gives an indication of typical distances between hypocentres in the region of interest (Fig. 6). B0 is a measure of the ratio of long link lengths to short. B0 is always greater than or equal to 1.0, and increases as events become more and more clustered (Fig. 6). Because these two statistics do not provide information on the very longest links (Fig. 5), the distribution of 'isolated' events and clusters will not be well constrained. However, as our goal was to test aftershock identification schemes, the distribution of the longest links should not have greatly affected our results.
The slope of the link length distribution is also affected by the 'dimensionality' of the space in which events occur (Frohlich & Davis 1990). Thus, with the statistics S0 and B0
alone we are unable to distinguish between differences in
dimensionality and the degree of clustering (Fig. 6).
However, as the mapped positions of simulated hypocentres
strongly resemble those in actual catalogues (Figs 1-4),
many of the overall dimensional characteristics of the
catalogues will be preserved.
By changing the distance metric used to define linkage for
SLC, one can study different aspects of clustering (Frohlich
& Davis 1990). In the present paper and in a related study
(Davis & Frohlich 1990) we used a metric which includes
both space and time separation between events. In this way
we could identify events that are close in space and time,
and hence are likely to share a genetic relationship. This
metric was
d_ST = space-time 'distance' = √(d^2 + C^2 t^2),    (16)
where d is the geographic separation, in 3-D Euclidean
space, between events (km), t is the time difference (days),
and C is a parameter which relates time to distance.
Although the units of d_ST are km, this is not simple geographic distance (except for the special case when C = 0), but instead includes both space and time separation. To avoid confusion we denoted values of d_ST by the units 'ST-km' (space-time km).
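Equation (16) is straightforward to evaluate; a minimal sketch (function and argument names are ours):

```python
import math

def d_st(d_km, t_days, c=1.0):
    """Space-time 'distance' of equation (16), in ST-km.
    c is the conversion factor C, in km per day."""
    return math.sqrt(d_km ** 2 + (c * t_days) ** 2)
```

With c = 0 the metric reduces to ordinary geographic distance.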
In addition to S0 and B0, which relate to the spatial properties of events, we considered the space-time distribution of events by examining the statistics S1 and B1, i.e., the median link length and link length 'slope' when we used a space-time metric with a conversion factor of C km day^-1. Larger values of C place more emphasis on the temporal properties of the catalogue. In a later section of this study we show that C = 1 km day^-1 is a reasonable value for studying the space-time properties of aftershocks; thus, we used the statistics S1 and B1. These statistics are similar to S0 and B0 except that they provide information on the distribution of events in space-time. Thus, S1 estimates the order of magnitude of typical space-time distances, while B1 provides information on the clustering of events in space and time. These statistics are thus appropriate for
catalogues in which aftershock sequences or swarms may be
important features.
Comparison of actual and synthetic catalogues: examples
We used these six statistics to help us find a reasonable
value for the swarming parameter q for our synthetic
catalogue. For example, consider a synthetic catalogue
devised to simulate MEXCAM48. If q is too high, we would expect large numbers of closely knit afterevent sequences. In this case, the abundance of short links should produce lower median link lengths (S0, S1) and higher link length slopes (B0, B1) than those found in catalogue MEXCAM48, while the clustering of events in time should also produce higher values of PID(10) and PID(100) than in the observed catalogue. Alternatively, if q is too low, the index of dispersion [PID(10), PID(100)] should approach unity, the median link lengths (S0, S1) should be higher, and the link length slopes (B0, B1) lower than those found in MEXCAM48.

The closest match between MEXCAM48 and the synthetics was with a swarming parameter of about q = 0.2 (Table 2). At much lower values (e.g. q = 0.04) there were not enough afterevents. This lack of clustering was evident somewhat in the spatial statistics S0 and B0, and was prominently seen in the statistics which include a temporal component [S1, B1, PID(10), PID(100)]. At higher swarming parameters (e.g. q = 1.0) the synthetic catalogues clearly exhibited too much clustering. Similarly, we found that with swarming parameters of 0.3 and 0.5 we could approximate the characteristics of ALASKA49 and VAN30, respectively (Table 2). For ADAKS20, there was no clear best-fitting value of q, although the statistics suggested a value somewhere between 0.1 and 1.0 as appropriate (Table 2). We chose a value of q = 0.25 for this study.

SLC, synthetic catalogues, and aftershock identification    297

Figure 6. Schematic of how changes in the pattern of seismicity affect the statistics S0 and B0. (a) 100 events randomly placed in a circular area of radius 43 km. (b) The same events as in (a) have been reduced to an area of radius 21.5 km. This has no effect on B0, but decreases the median link length S0 by a factor of 2. The dashed line indicates a circle of radius 43 km. (c) 33 random events in an area of 10 km radius (small circle) have been superimposed on 67 events in an area of 70 km radius (large circle). Note that the presence of clustering increases the link length slope B0. The dashed line indicates a circle of radius 43 km. (d) The dimensionality of earthquakes also affects the link length slope. Here, the same events as in (a) have been 'stretched' to fill an ellipse with a semi-major axis of 292 km and a semi-minor axis of 3.6 km, making the distribution of events approximately 1-D. Note that with the statistics S0 and B0 alone we cannot discriminate between the effects of clustering (c) and dimensionality (d). A dashed circle of radius 43 km is shown for reference.

AFTERSHOCK IDENTIFICATION SCHEMES

Single-link cluster identification of earthquake aftershocks
To study aftershock sequences, we used the space-time metric d_ST described previously. Although one could choose a different metric for identifying aftershocks, we chose d_ST as it is easy to use, has the appealing attribute of treating time as a fourth spatial dimension, and closely approximates the way the human eye picks out aftershock sequences on earthquake space-time plots.
With d_ST as the metric, we define a 'space-time cluster' of size D, or 'D-cluster', as a group of events joined by links of size D ST-km and less. In practice, this method requires the determination of two parameters, C and D. The first parameter, C, is a conversion factor which relates temporal separation to spatial separation. Ideally, one would like to choose a value C = Δx/Δt such that two simultaneous events separated by a distance Δx are as closely 'related' to each other as two events with identical hypocentres occurring a time Δt apart. The second parameter, D, is the 'distance' at which we consider events to be related. If two events can be joined by a series of links of size D or less, then these events belong to the same D-cluster. To identify aftershock sequences, one should choose D large enough to link most aftershocks to their parent events, but small enough so unrelated events are rarely linked. We found that in the ISC catalogue, SLC joins events that the eye picks out as obvious clusters if one uses values of about C = 1 km day^-1 and D = 60 ST-km (Fig. 7).
When we used SLC to identify aftershocks, we divided
the events into distinct space-time clusters. We assumed
that each cluster of N events contained one primary event
(main shock) and N - 1 secondary events (aftershocks
and/or foreshocks). We could then either assume that (1)
the first event to occur in a given cluster was the primary
event, or (2) the largest magnitude event in a given cluster
was the primary event. In the first case, all of the secondary
events would be aftershocks; in the second case, the
secondary events would be either foreshocks or aftershocks.
For this study, we took the largest magnitude event as the
main shock of a sequence.
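Putting the pieces together, D-clusters can be read off the SLC links by discarding links longer than D and taking connected components; a minimal sketch (names are ours), with the largest-magnitude event of each cluster taken as the main shock, as in the text:

```python
from collections import defaultdict, deque

def d_clusters(n_events, links, D):
    """Split events into 'D-clusters': keep only SLC links of length
    <= D (in ST-km) and take connected components (breadth-first search).
    links is a list of (i, j, length) tuples."""
    adj = defaultdict(list)
    for i, j, length in links:
        if length <= D:
            adj[i].append(j)
            adj[j].append(i)
    seen, clusters = set(), []
    for start in range(n_events):
        if start in seen:
            continue
        seen.add(start)
        comp, queue = [], deque([start])
        while queue:
            e = queue.popleft()
            comp.append(e)
            for nb in adj[e]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        clusters.append(comp)
    return clusters

def main_shocks(clusters, magnitudes):
    """The largest-magnitude event of each cluster is its main shock."""
    return [max(c, key=lambda e: magnitudes[e]) for c in clusters]
```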
Scoring system for aftershock identification
To determine the accuracy of an aftershock identification
scheme on any particular synthetic catalogue, we scored its
success in terms of its ‘validity’ V and ‘reliability’ R
(Reasenberg & Matthews 1988):
'score' = V + R - 1.    (17)
Here, V is the fraction of afterevents which are linked to
their parent primary event, and R is the fraction of primary
events which are correctly identified as such (i.e., not
misidentified as afterevents). Note that in order to compute
the score, one must keep track of both the actual afterevent
sequences (those created by the stochastic model) and the
determined afterevent sequences (those picked out by the
identification scheme). It can be shown that when applied to
synthetic catalogues in which the primary events are both the first and the largest event in their actual afterevent sequences, both V and R are independent of whether an aftershock identification scheme picks the first event or the largest event in a determined afterevent sequence as the main shock. The score is highest when most afterevents are correctly linked (Fig. 8). If too many primary events are incorrectly linked, the score again decreases. Although in theory the score can range from -1 to +1, we rarely observed scores much below zero.

Table 2. Comparison of earthquake statistics of actual catalogues with synthetic catalogues. Space-only statistics are the median link length S0 and the link length distribution 'slope' B0. Space-time statistics are the median link length S1 and link-length slope B1 computed with a space-time metric using a conversion factor C = 1 km day^-1. Time-only statistics are the Poisson index of dispersion PID at time intervals of 10 days and 100 days. File MEXCAM48 contains earthquakes in the Mexico-Central America region (Flinn-Engdahl regions 5 and 6) of magnitude 4.8 and greater (Fig. 1a). File ALASKA49 contains earthquakes in Alaska and the Aleutians (Flinn-Engdahl region 1) of magnitude 4.9 and greater (Fig. 2a). VAN30 contains earthquakes in the central Vanuatu region of magnitude 3.0 and greater (Fig. 3a). File ADAKS20 contains earthquakes in the Adak region of magnitude 2.0 and greater (Fig. 4a). We generated synthetic catalogues using the parameters given in the text, varying only the swarming parameter q. (Cells marked '--' could not be recovered.)

                 SPACE-ONLY               SPACE-TIME               TIME-ONLY
                 S0           B0          S1           B1          PID(10)      PID(100)

File MEXCAM48    18.61        2.56        102.84       3.03        2.12         5.00

MEXCAM48 synthetics (mean and standard deviation of 20 catalogues each)
q = 0.04         23.93 ± .53  2.01 ± .13  122.8 ± 2.4  2.02 ± .53  1.11 ± .16   1.41 ± .21
q = 0.1          23.14 ± .45  2.20 ± .12  116.5 ± 3.4  2.51 ± .80  1.65 ± .35   2.51 ± .80
q = 0.2          21.9 ± 1.2   2.36 ± .29  101.8 ± 1.2  4.4 ± 1.8   3.1 ± 1.2    6.0 ± 2.1
q = 0.4          19.1 ± 1.4   2.16 ± .31  62.9 ± 12.1  1.8 ± 3.1   6.5 ± 3.7    14.2 ± 8.3
q = 1.0          13.5 ± 2.2   3.38 ± .54  23.9 ± 5.1   1.9 ± 2.1   20.9 ± 19.0  41.2 ± 36.1

File ALASKA49    13.29        2.64        66.13        3.60        --           34.0

ALASKA49 synthetics (mean and standard deviation of 20 catalogues each)
q = 0.04         18.65 ± .43  2.16 ± .11  102.6 ± 3.8  2.12 ± .85  1.8 ± 1.5    3.0 ± 3.6
q = 0.1          18.21 ± .56  2.21 ± .16  95.6 ± 8.1   3.0 ± 2.0   5.0 ± 8.3    10. ± 18
q = 0.2          11.2 ± 1.2   2.32 ± .29  18.8 ± 11.5  4.9 ± 4.0   13.0 ± 24    30. ± 30
q = 0.4          15.0 ± 1.8   2.58 ± .43  45.0 ± 16.1  1.4 ± 3.8   31. ± 61     81. ± 163
q = 1.0          10.1 ± 2.1   2.92 ± .59  19.1 ± 5.5   5.9 ± 2.5   51. ± 64     130. ± 178

File VAN30       4.50         3.03        14.17        2.67        3.44         96.3

VAN30 synthetics (mean and standard deviation of 20 catalogues each)
q = 0.04         5.80 ± .19   2.63 ± .09  19.11 ± .44  2.09 ± .11  1.09 ± .17   0.94 ± .54
q = 0.1          5.69 ± .19   2.65 ± .09  18.50 ± .38  2.17 ± .10  1.44 ± .24   2.0 ± 1.2
q = 0.2          5.37 ± .19   2.75 ± .08  17.16 ± .59  2.43 ± .16  2.48 ± .58   4.2 ± 2.0
q = 0.4          4.90 ± .31   2.96 ± .17  15.0 ± 1.0   3.12 ± .40  5.2 ± 1.9    10.0 ± 5.8
q = 1.0          3.88 ± .33   3.46 ± .23  9.9 ± 1.5    4.10 ± .60  14.4 ± 1.7   29. ± 22

File ADAKS20     3.83         2.42        18.07        --          --           20.8

ADAKS20 synthetics (mean and standard deviation of 20 catalogues each)
q = 0.04         4.19 ± .10   2.24 ± .08  22.36 ± .47  1.82 ± .12  1.31 ± .22   1.64 ± .69
q = 0.1          4.08 ± .16   2.32 ± .08  20.1 ± 2.9   2.24 ± 1.10 2.5 ± 1.2    4.4 ± 2.8
q = 0.2          3.79 ± .28   2.50 ± .15  19.1 ± 1.1   2.90 ± .12  6.0 ± 4.0    12.9 ± 9.7
q = 0.4          3.38 ± .50   2.91 ± .42  14.8 ± 3.5   4.1 ± 1.6   13.5 ± 12.5  32.0 ± 30.6
q = 1.0          2.69 ± .74   3.48 ± .56  8.0 ± 3.4    6.1 ± 1.4   41.5 ± 49.9  98.0 ± 116.1
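The bookkeeping behind equation (17) can be sketched as follows (a minimal implementation under our own conventions; the paper's actual scoring code is not shown). It assumes integer event indices and that, as in the text, the largest event of each cluster is taken as the primary event:

```python
def score(true_clusters, found_clusters, magnitudes):
    """'score' = V + R - 1 (equation 17).
    V: fraction of true afterevents placed in the same determined cluster
       as their parent primary event.
    R: fraction of true primary events not misidentified as afterevents,
       i.e. still the largest event of their determined cluster."""
    found_of = {}
    for c in found_clusters:
        for e in c:
            found_of[e] = frozenset(c)
    primaries, after_pairs = [], []
    for c in true_clusters:
        p = max(c, key=lambda e: magnitudes[e])
        primaries.append(p)
        after_pairs += [(e, p) for e in c if e != p]
    V = (sum(p in found_of[e] for e, p in after_pairs) / len(after_pairs)
         if after_pairs else 1.0)
    R = sum(m == max(found_of[m], key=lambda e: magnitudes[e])
            for m in primaries) / len(primaries)
    return V + R - 1
```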
[Figure 7 plot: space-time plot for the Rat Islands region (longitude versus year, 1964-1970), with C = 1 km/day and D = 60 ST-km.]
Figure 7. Single-link cluster (SLC) identification of aftershocks, from Frohlich & Davis (1990). Shown is a space-time plot of 432 earthquakes in ALASKA49 between 165°W and 175°E from 1964 to 1970. Here we linked earthquakes using the space-time metric described in the text, relating distance to time by C = 1.0 km day^-1. Lines denote links of length 60 ST-km or less. Filled circles represent isolated events as well as the largest magnitude event in each sequence. Open circles are linked earthquakes, and +'s are unlinked earthquakes. The Rat Islands earthquake of 1965 February 4 (Ms = 8.2) is linked to two foreshocks and 245 aftershocks which occur over 153 days. Similarly, the Andreanof Islands earthquake of 1969 May 14 (Ms = 7.0) is linked to 21 events which occur over 176 days.
Figure 8. Scoring system to evaluate the success rate of an aftershock identification scheme on a synthetic catalogue with known main shocks and aftershocks. Here, we illustrate the scoring system with a schematic diagram showing a main shock with three aftershocks, and two isolated main shocks. The 'score' = V + R - 1, where V is the validity (fraction of aftershocks correctly identified), and R is the reliability (fraction of main shocks correctly identified). For a single-link cluster analysis scheme, a too small value of D will not identify a sufficient fraction of the aftershocks, and the resulting score will be low. Conversely, if D is too high, then too many main shocks will be linked together and misidentified as aftershocks.
20
'
40
'
60
i o ' 100
D (ST-km)
'
. . . . .
120
140
Figure 9. Plots of score versus D for synthetic catalogues with statistical characteristics similar to MEXCAM48 for several values of C. Each curve represents the average result from four synthetic catalogues, and represents a different conversion parameter C, as follows: C = 0 (lower solid line), C = 0.05 km day^-1 (upper solid line), C = 0.25 km day^-1 (long dashed line), C = 1 km day^-1 (dotted line), C = 3 km day^-1 (short dashed line), and C = 10 km day^-1 (dashed-dotted line). Davis (1989) presents similar figures for the other catalogues.
The best scores occurred with C = 1 km day^-1 and D about 80 ST-km (Fig. 9). With these values, roughly 10 per cent of the afterevents were not identified by the SLC identification scheme, while about the same percentage of primary events were erroneously identified as afterevents.
We used the scoring system to address a number of
questions. In particular, we wished to know (1) what values
of C and D are most useful in finding afterevents for a
particular catalogue, and (2) how does the identification of
afterevents depend on the background seismicity rate?
Results from synthetics of the four catalogues we studied
suggest that a value of C = 1 km day^-1 is adequate for finding afterevents in most practical applications (see Davis 1989 for details). In MEXCAM48, values of C = 1 km day^-1 and D = 80 ST-km yielded an average score of 0.77 (Fig. 9). Similarly, in ALASKA49, we obtained an average score of 0.82 with C = 1 km day^-1 and D = 60 ST-km. In ADAKS20, we obtained an average score of 0.44 with C = 1 km day^-1 and D = 62 ST-km.
In VAN30, we obtained an unusually low best score of about 0.34 with C = 1 km day^-1 and D = 12 ST-km. This area includes a well documented nest of activity known as the Efate Salient (Isacks et al. 1981), and it was difficult to pick out afterevents because of the unusually high background rate. Nevertheless, we obtained the highest possible score with a value of C = 1 km day^-1.
To investigate the effect of different background rates, we generated several synthetic catalogues with varying primary event occurrence rates λ0. All of the other input parameters were the same as those used to simulate MEXCAM48. An analysis of these catalogues (Fig. 10) suggested that the scheme worked well with a value of C = 1 km day^-1 over a wide range of background rates. Naturally, when the
Figure 10. Plots of score versus D for synthetic catalogues with several different main shock occurrence rates λ0, expressed in terms of λ, the best-fit occurrence rate for actual MEXCAM48 earthquakes. Other input parameters were those used to simulate MEXCAM48 (Table 1); we set the swarming parameter q to 0.2 (Table 2). Each line represents a different conversion parameter C, as follows: C = 0 (lower solid line), C = 0.05 km day^-1 (upper solid line), C = 0.25 km day^-1 (long dashed line), C = 1 km day^-1 (dotted line), C = 3 km day^-1 (short dashed line), and C = 10 km day^-1 (dashed-dotted line). Each such curve is the average from four synthetic catalogues. In each case, a value of C = 1 km day^-1 (dotted line) produces approximately the highest scores with an appropriate value for D (solid squares).
background rate is high, one must use a lower value of D to avoid linking numerous unrelated events, and the identification scheme is less efficient. As a general empirical rule, we determined that the value D_best which produces the best score (with C = 1 km day^-1) is a function of the median link length in space-time S1 (Fig. 11) and takes the form

D_best(S1) = 9.4 √S1 - 25.2 ST-km,    8 ST-km < S1 < 300 ST-km.    (18)

Figure 11. Plot of D_best versus S1, where S1 is the median link length with C = 1 km day^-1 and D_best is the value of D which produced the highest score with C = 1 km day^-1 (see Fig. 12). Each symbol represents the average results from four synthetic catalogues with input parameters similar to those found for catalogues MEXCAM48, ALASKA49, VAN30, and ADAKS20. Only the main shock occurrence rate was varied. For each set of symbols, the background rates used were, from left to right, 30λ (where λ is the background rate of the true catalogue), 10λ, 3λ, λ, λ/3 and λ/10. The dashed line represents the function D_best = a√S1 + b, where a = 9.4 ST-km^(1/2) and b = -25.2 ST-km (equation 18).

Other aftershock identification schemes
We also applied this scoring system to four other schemes for identifying dependent events, namely, the aftershock identification schemes of Gardner & Knopoff (1974), Shlien & Toksoz (1974), Knopoff et al. (1982), and Reasenberg
(1985). In this section we investigate how well these schemes
performed when used on the synthetic catalogues developed
in this study. Although all of the aftershock identification
schemes we considered were based on the premise that two
earthquakes or events are related if they fall within some
space-time criteria of each other, there were several
conceptual differences which require some discussion.
Standard space-time windows versus cluster-link schemes
In a standard space-time window, an earthquake belongs to
a cluster if it falls within a given time interval and distance
interval of the cluster's primary event (either the first or
largest earthquake, depending on the scheme), where the
size of the intervals depend on the magnitude of the largest
earthquake in the cluster. In such a scheme, no earthquake
outside of these intervals will belong to the cluster (Knopoff
& Gardner 1972). Alternatively, in a cluster-link scheme, an
earthquake belongs to a cluster if it falls within some
space-time criteria of any earthquake in the cluster (as
opposed to only the primary event), making it possible for
very distant earthquakes to belong to the same cluster if
there is a chain of intervening related earthquakes.
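The chaining behaviour of a cluster-link scheme is the transitive closure of the pairwise 'related' relation; a minimal union-find sketch (names and the example relation are ours):

```python
def cluster_link(events, related):
    """Cluster-link grouping: i and j end up in the same cluster if a
    chain of pairwise-'related' events connects them (transitive closure).
    related(a, b) is any symmetric space-time criterion."""
    n = len(events)
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    for i in range(n):
        for j in range(i + 1, n):
            if related(events[i], events[j]):
                parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

For example, with events at positions 0, 10 and 20 and a criterion of 'within 10 units', all three join one cluster even though the end members are 20 units apart, which is exactly the behaviour that distinguishes cluster-link schemes from standard windows.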
Among the schemes studied, those of Gardner & Knopoff
(1974) and Knopoff et al. (1982) use standard space-time windows, while the SLC scheme and those of Shlien &
Toksoz (1974) and Reasenberg (1985) are cluster-link
schemes. However, we found that we obtained higher scores
for the space-time window schemes when we treated them
as cluster-link schemes, and these are the scores presented
in this paper.
Main shock threshold magnitude
In some aftershock identification schemes, events are only
considered as possible main shocks if their magnitude
exceeds some threshold value; in other schemes, all events
are considered as possible main shocks. Of the schemes we
studied, only that of Knopoff et al. (1982) employed
magnitude thresholds. As the choice for such a threshold is
somewhat arbitrary, we did not use any main shock
magnitude thresholds in this study.
Magnitude dependence
For several of the schemes, the criteria for determining
dependence varied with main shock magnitude. This is
based on the reasonable premise that larger earthquakes
effect physical changes at larger distances and over longer
time periods. Although this is a realistic assumption, such
schemes often have the potentially undesirable property of
being sensitive to the magnitudes of a few large events.
Consequently, the results depend on the accuracy of magnitudes, which often contain large systematic errors (Habermann 1987; Habermann & Craig 1988).
Direction of search
Magnitude-dependent aftershock identification schemes are
of two general classes: forward-only searching schemes, and
forward-backward searching schemes. In forward-only
searching schemes, the order of the events may affect their
dependence. For example, a small shock followed by a large
shock may be considered unrelated because the large shock
does not fall within the short time window of the small
event. However, if the order of events were reversed, the
small event would fall within the time window of the large
event, and one would consider the events to be related. In
forward-backward schemes, the order of the events is not
important since the search windows extend both forward
and backward equally in time.
Note that when the criteria are not magnitude dependent,
it does not matter whether the scheme is forward-only
searching or forward-backward searching. The SLC scheme
described in this paper is an example of a scheme which
does not depend on earthquake magnitude. In Table 3 we denote schemes with an 'F' (forward-only searching), an 'F/B' (forward-backward searching), or a '--' (schemes in which the distinction is meaningless).
Table 3. Comparison of different aftershock schemes. See text for explanation of terms. The first column indicates whether an event must exceed some threshold magnitude before it is considered a possible main shock. The second column indicates whether the scheme is magnitude dependent. The third column lists the direction of search for the scheme; here, 'F' denotes schemes which are forward-only searching, 'F/B' denotes schemes which are forward-backward searching, and '--' denotes schemes in which the distinction is meaningless. The fourth column indicates whether the earthquake clusters identified by the scheme obey the transitive property. The final column gives the basis upon which the scheme determines whether or not events are related, with '(M)' denoting that the basis is dependent on main shock magnitude. Asterisks indicate that we modified the algorithm as described in the text.
Reference                    Main Shock   Magnitude   Direction   Transitive?  Basis of Scheme
                             Magnitude    Dependent?  of Search
                             Threshold?

this paper                   no           no          --          yes          cluster-link scheme with
                                                                               space-time metric d_ST

Gardner and Knopoff (1974)   no           yes         F           yes          space-time window* with
                                                                               spatial cutoff (M) and
                                                                               temporal cutoff (M)

Knopoff et al. (1982)        yes*         yes         F/B         no*          space-time window* with
                                                                               spatial cutoff (M) and
                                                                               temporal cutoff (M)

Shlien and Toksoz (1974)     no           no          --          yes          cluster-link scheme with
                                                                               space-time statistic s

Reasenberg (1985)            no           yes         F           yes          cluster-link scheme with
                                                                               spatial cutoff (M) and
                                                                               Omori probability
                                                                               relation for time (M)
Transitivity of relatedness
For many schemes, the ‘relatedness’ of events forms an
equivalence relation; that is, when aftershock sequences are
identified, the following properties hold for all earthquakes
in the catalogue:
(1) each event is in the same sequence as itself ('reflexivity');
(2) if event a is in a sequence which contains b, then
event b is in a sequence which contains a
(‘symmetry’); and
(3) if event a is in a sequence which contains b, and b is
in a sequence which contains c, then a is in a sequence
which contains c (‘transitivity’).
Most of the schemes described in this paper behave as equivalence relations. However, Knopoff et al. (1982) treat main shocks and aftershocks in such a way that event b can be an aftershock of both event a and event c, where a and c do not belong in the same aftershock sequence. For the purposes of this study we modified the method of Knopoff et al. (1982) by treating it as if the transitive property held.
With respect to these properties, Gardner & Knopoff (1974) employ forward-only searching magnitude-dependent space-time windows; that is, events are related if their time separation Δt and space separation Δr are such that Δt < f(M) and Δr < g(M), where M is the magnitude of the first shock. Similarly, Knopoff et al. (1982) employ a suite of
magnitude-dependent space-time windows, although their
scheme is forward-backward searching. In addition, they
only apply their space-time windows to events above a
certain threshold magnitude. For both of these schemes, we
generalized their concepts by employing a range of
space-time windows which were proportional by a ‘scaling
factor’ to those presented in the original papers.
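For concreteness, a forward-only magnitude-dependent window test might look like the following sketch; the window functions f(M) and g(M) used here are purely illustrative placeholders, not the values tabulated by Gardner & Knopoff (1974) or Knopoff et al. (1982):

```python
import math

def related_window(eq1, eq2, scale=1.0):
    """Forward-only window test: eq2 is related to eq1 if it falls within
    the space-time window of eq1, which occurred first.
    eq = (time_days, x_km, y_km, magnitude).
    The window sizes below are hypothetical, for illustration only."""
    t1, x1, y1, m1 = eq1
    t2, x2, y2, _ = eq2
    g_km = scale * 10 ** (0.5 * m1 - 1.5)     # illustrative spatial cutoff g(M)
    f_days = scale * 10 ** (0.5 * m1 - 1.0)   # illustrative temporal cutoff f(M)
    return 0 <= t2 - t1 < f_days and math.hypot(x2 - x1, y2 - y1) < g_km
```

The 'scaling factor' of the text corresponds to varying the scale argument; note that reversing the order of a small and a large event can change the outcome, since only the first event's window is consulted.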
Reasenberg (1985) uses a forward-only searching
cluster-link scheme. Here, the spatial cut-off is given by

R = Q r_crack,    (19)

where r_crack is the radius of a circular crack with a stress
drop of 30 bar which corresponds with the seismic moment
(Kanamori & Anderson 1975), and Q is a constant. For his
investigation, Reasenberg uses a value of Q = 10. His time criterion is based on the probability of observing future events in an Omori's Law decay, and consequently depends on the main shock-aftershock separations already determined by the algorithm. In addition, Reasenberg places an upper bound on the time cut-off of T_max = 10 days for events already in aftershock sequences, and T_max = 1 day for events not yet associated with an aftershock sequence. As with the previous two schemes, we employed a 'scaling factor' to vary his values for R and T_max.
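The crack radius follows from the circular-crack relation Δσ = (7/16) M0 / r^3 (Kanamori & Anderson 1975). A sketch, assuming the standard moment-magnitude relation log10 M0 = 1.5 M + 9.1 (M0 in N m); that relation is our assumption, not something specified in the text:

```python
def r_crack_km(magnitude, stress_drop_pa=3.0e6):
    """Radius (km) of a circular crack with a 30-bar (3 MPa) stress drop
    matching the seismic moment implied by the magnitude."""
    m0 = 10 ** (1.5 * magnitude + 9.1)               # seismic moment, N m
    return (7.0 * m0 / (16.0 * stress_drop_pa)) ** (1.0 / 3.0) / 1000.0

def spatial_cutoff_km(magnitude, q=10.0):
    """Reasenberg's interaction distance: Q times the crack radius."""
    return q * r_crack_km(magnitude)
```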
Shlien & Toksoz (1974) identify dependent events on the basis of their s-statistic:

s = (Δd)^2 Δt k(X, Z),    (20)

where Δd is the spatial separation between events, Δt is the time separation, and k(X, Z) is the mean rate of activity per unit area at spatial location X over a sub-region with area Z. They consider events to be related when s < α, Δd ≤ D_max, and 0 ≤ Δt ≤ T_max, where α, A, D_max, and Z are chosen constants. Note that this scheme requires the determination of the mean rate of activity over several spatial sub-regions. In their investigation, they chose values of α = 0.02, A = 100, and D_max = 1.41 degrees, and chose Z = 1 degree for the northern Japan catalogue and Z = 0.5 degrees for the southern California catalogue. For this study, we used values ranging from Z = 0.25 to 1 degree.

Figure 12. Scores produced by single-link cluster analysis (SLC) using D_best (equation 18) compared with the highest scores obtained by other detection schemes (Davis 1989).
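As written in equation (20), the dependence test reduces to a few comparisons; a minimal sketch (names are ours, and T_max is passed in directly rather than derived from the constants A and Z):

```python
ALPHA, D_MAX = 0.02, 1.41   # constants chosen by Shlien & Toksoz (1974)

def related_s(dd_deg, dt_days, k_rate, t_max):
    """Dependence test: s < alpha together with the spatial and temporal
    cutoffs described in the text; k_rate is k(X, Z)."""
    s = dd_deg ** 2 * dt_days * k_rate
    return s < ALPHA and dd_deg <= D_MAX and 0 <= dt_days <= t_max
```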
Results
Figure 12 summarizes the highest scores produced by each
method. For details on how the scores varied with different
scaling factors or sub-region sizes, we refer the reader to
Davis (1989). Among the teleseismic catalogues, SLC and
Shlien & Toksoz’s (1974) method produced the highest
scores, ranging from 0.76 to 0.85. The space-time windows
of Gardner & Knopoff (1974), Knopoff et al. (1982), and
Reasenberg (1985) produced somewhat lower scores ranging
from 0.56 to 0.74. With the local network data, all four
schemes produced roughly similar scores. However, all
scores were noticeably lower than with the teleseismic
catalogues, averaging 0.22 for VAN30 synthetics and 0.43
for ADAK20 synthetics. In general, the Gardner & Knopoff
scheme produced slightly higher scores for VAN30
synthetics and ADAK20 synthetics (0.26 and 0.47,
respectively). This may result from the fact that their
scheme was specifically devised to detect aftershocks in local
network catalogues. Among all four catalogues, SLC had
the highest average score (0.568), followed by the methods
of Shlien & Toksoz (0.548), Reasenberg (0.520), Gardner &
Knopoff (0.505), and Knopoff et al. (0.480).
DISCUSSION
Aftershock identification schemes
In this paper we have proposed a workable scheme for
generating synthetic catalogues which closely approximate
real catalogues in many ways. These catalogues allow us to
study many of the features of actual earthquake catalogues,
and we have used them to estimate the success rate of
aftershock identification schemes, allowing us to compare
different schemes.
Synthetic catalogues can be valuable for estimating how well
observational techniques work when applied to actual
earthquake sequences. We have used synthetics to
investigate the validity and reliability of aftershock
identification schemes, and in particular to evaluate a new
single-link cluster (SLC) scheme proposed in this paper.
We found that aftershock identification schemes work
better in some catalogues than in others. In synthetics of the
teleseismic catalogues studied (MEXCAM48 and
ALASKA49), SLC worked fairly well, missing only 10-14
per cent of the afterevents and misidentifying only 5-10
per cent of the primary events. However, when applied to
two local catalogues, SLC does not fare as well. With
synthetics of ADAKS20, SLC missed 18 per cent of the
afterevents and misidentified 37 per cent of the primary
events. In the tightly clustered VAN30 catalogue, SLC
missed over half of the afterevents while misidentifying 17
per cent of the primary events.
The SLC scheme compared favourably with other
aftershock identification schemes published in the literature
when applied to the four catalogues in this study. First, SLC
had the highest average score, although its results were not
excessively better than the scores produced by other
schemes (Fig. 12). However, we note that the other schemes
presented in this paper were designed with specific
catalogues in mind, and with some modification of these
schemes it is possible that higher scores could be obtained.
For example, it is likely that algorithms which employ
space-time windows could obtain higher scores by imposing
a minimum spatial cut-off which corresponded to the
location errors in the synthetic catalogues, or by using
separate scaling factors for the spatial windows and time
windows. We also note that we modified the Knopoff et al.
(1982) scheme by ignoring main shock threshold magnitudes
and the non-transitive nature of their approach. Second, the
analysis performed on synthetic catalogues in this paper
suggests that SLC is easy to use, in that one does not need to
make a careful study to find the appropriate input
parameters. We recommend assigning a value of C = 1 km
day⁻¹ (Figs 9 and 10) and assigning D on the basis of the
median link length in space-time (Fig. 11 and equation 18).
Third, the scheme is not magnitude dependent, and thus is
not strongly affected by errors in magnitude determination.
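To make the procedure concrete, here is a minimal sketch of the SLC grouping step in Python. It assumes events given as (x_km, y_km, t_days) triples, uses the space-time metric d_ST = sqrt(d² + C²t²) with C = 1 km/day as recommended above, and takes the cut-off D as an input (in practice D would be set from the median space-time link length); the brute-force O(n²) pair scan stands in for a proper minimum-spanning-tree construction and is our simplification, not the paper's algorithm.

```python
import math

C = 1.0  # km per day, the recommended coupling constant

def d_st(e1, e2):
    # Space-time separation d_ST = sqrt(d^2 + C^2 * t^2) for two
    # events given as (x_km, y_km, t_days).
    d = math.hypot(e1[0] - e2[0], e1[1] - e2[1])
    t = e1[2] - e2[2]
    return math.sqrt(d * d + C * C * t * t)

def slc_clusters(events, D):
    # Two events belong to the same cluster if a chain of events links
    # them with no link exceeding d_ST = D; this is just the connected
    # components of the graph with edges where d_ST <= D (union-find).
    n = len(events)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if d_st(events[i], events[j]) <= D:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

Within each multi-event cluster the largest event would be taken as the main shock and the remaining members as afterevents.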
Statistics for evaluating catalogues
We have illustrated how we can examine the spatio-temporal
properties of earthquake catalogues by using six
statistics: S_0, B_0, S_T, B_T, PID(10), and PID(100). Four of the
statistics (S_0, B_0, S_T, B_T) derive from SLC, and were first
proposed in this paper.
The spatial statistic S_0, the median link length in space,
provides a rough measure of the geographic distances
between epicentres. For the catalogues studied in this paper,
values of S_0 ranged from 2.67 km (ADAKS20) to 18.6 km
(MEXCAM48). The statistic does not appear to be strongly
influenced by the aftershock process except when afterevents
dominate the catalogue. This is evident from the
dependence of S_0 on the swarming parameter q (Table 2).
The spatial statistic B_0, the link distribution 'slope', is a
measure of the relative sizes of large and small links, and
relates to the degree of spatial clustering. In this study, the
local networks ADAKS20 and VAN30 (B_0 of 3.44 and 3.03,
respectively) were more spatially clustered than the
teleseismic catalogues MEXCAM48 and ALASKA49 (B_0 of
2.56 and 2.64, respectively). This statistic, like S_0, depended
weakly on the presence of afterevents (Table 2).
Of the statistics described in this paper, the space-time
statistics S_T and B_T were the most useful for determining the
degree of space-time clustering in earthquake catalogues.
These statistics were relatively robust and not strongly
influenced by random changes in the catalogue. On the
other hand, they were sensitive to the presence of
afterevents, and showed a strong dependence on the
swarming parameter q in the synthetic catalogues (Table 2).

The space-time statistic S_T is a measure of the distance
between events in space-time. In the four catalogues
studied in this paper we have found values of S_T ranging
from 14.2 ST-km (VAN30) to 103 ST-km (MEXCAM48).
Likewise, the space-time 'slope' B_T provides information on
the degree of clustering in space-time. In this study, values
of B_T ranged from 2.42 (ADAKS20) to 6.05 (ALASKA49).
The index of dispersion [PID(10), PID(100)] was
relatively robust when there was little temporal clustering in
the catalogue, but became highly variable for strongly
clustered sequences (Table 2). Consequently, the statistics
PID(10) and PID(100) were useful for estimating the order
of magnitude of temporal clustering, but could not be used
to 'fine-tune' a synthetic catalogue and determine the
appropriate input parameters with any precision. In this
study, the catalogue MEXCAM48 was closest to random
[PID(10) = 2.7, PID(100) = 5.0] and the catalogue VAN30
was the most highly clustered [PID(10) = 34, PID(100) = 96].
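The index of dispersion itself is straightforward to compute. The sketch below assumes PID(N) denotes the variance-to-mean ratio of event counts over N equal subdivisions of the catalogue's time span (the standard definition; the paper's exact binning convention is not reproduced here), so that a Poisson sequence gives values near 1 and clustered sequences give much larger values.

```python
def pid(times, n_bins):
    # Index of dispersion: variance / mean of event counts in n_bins
    # equal time intervals spanning the catalogue.
    t0, t1 = min(times), max(times)
    width = (t1 - t0) / n_bins
    counts = [0] * n_bins
    for t in times:
        i = n_bins - 1 if width == 0 else min(int((t - t0) / width),
                                              n_bins - 1)
        counts[i] += 1
    mean = sum(counts) / n_bins
    var = sum((c - mean) ** 2 for c in counts) / n_bins
    return var / mean
```

A perfectly regular sequence gives 0, a Poisson sequence roughly 1, and a sequence with all events concentrated in a few bins gives a value of order the bin occupancy.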
Properties of earthquake catalogues
Although the present research applied several aftershock
identification schemes only to synthetic earthquake catalogues, for several reasons we expect that the results would be
similar if the identification schemes were applied to genuine
data. First, in other research projects we have used SLC as a
tool to remove aftershocks from earthquake catalogues
(Frohlich & Davis 1990; Wardlaw et al. 1990) and obtained
‘residual’ catalogues which are statistically and visually
similar to catalogues generated by a Poisson process.
Second, in the present study the synthetic catalogues which
we studied were extremely ‘realistic’ at least in so far as they
are indistinguishable from real catalogues when displayed as
maps (Figs 1-4), or evaluated using statistics such as the
index of dispersion or the space-time statistics presented in
this paper. We invite other researchers to suggest additional
statistics to make meaningful comparisons between real and
synthetic catalogues.
One intriguing result of this study is that even in the best
cases, aftershock identification schemes fail to identify a
surprising number of afterevents. Even in the case with the
highest score (SLC on ALASKA49 synthetics), 10 per cent
of the afterevents were not identified, and 5 per cent of the
primary events were incorrectly identified as afterevents. In
the most extreme case, catalogue VAN30, no scheme
produced scores higher than 0.25, and all four schemes
missed at least half of the afterevents.
One explanation for the high failure rate of aftershock
schemes is that the phenomenon seismologists refer to as
‘aftershocks’ is fundamentally different from the concept of
‘afterevents’ generated by a stochastic process. By virtue of
the long-tailed character of Omori’s Law, a fraction of
afterevents may be separated from their parent events by
such large gaps in time that they would be ‘lost’ among the
background of unrelated events, and most seismologists
would no longer consider them to be mechanically related to
their parent events. Conversely, even for a Poisson process
a small percentage of events will occur so close together in
space and time that a seismologist may consider them to be
physically related. Thus it seems likely that statistical
schemes for identifying aftershocks will seldom be more
successful at finding afterevents than the schemes evaluated
here. Alternatively, we suggest that with the correct
parameters these schemes are able to identify nearly 100 per
cent of the earthquakes that seismologists typically call
aftershocks. Nevertheless, seismologists cannot be completely certain of the efficiency of their aftershock detection
schemes until there is an accepted physical model for the
occurrence of earthquake aftershocks.
CONCLUSIONS
(1) We have developed a model for generating synthetic
earthquake catalogues which mimic many of the properties
of actual earthquake catalogues. These catalogues consist of
primary events and afterevents, and may be used to study the
reliability and validity of analytical techniques such as aftershock identification schemes. To create these catalogues,
one must choose values for 11 variables (Table 1). Several
of the variables, e.g. the time span T and the number of events
N, are easily derived from the actual catalogues of interest.
(2) We developed four new statistics (S_0, B_0, S_T, B_T)
based on single-link cluster analysis which evaluate the
spatial and temporal properties of pairs of events in
earthquake catalogues. These statistics are relatively robust,
and are useful for making comparisons between real and
synthetic catalogues.
(3) We proposed a new method for identifying related
events such as aftershocks using single-link cluster analysis.
In this scheme, two earthquakes are related if there is a
chain of events linking the two earthquakes, and no link of
the chain exceeds a critical space-time distance d_ST = D
(equations 16 and 18).
(4) We evaluated five schemes for identifying aftershocks.
For synthetics of two teleseismic catalogues, the SLC
scheme and Shlien & Toksoz’s (1974) scheme produced the
highest scores, while for synthetics of two local network
catalogues Gardner & Knopoff's (1974) scheme produced the
highest scores. In general, when applied to synthetic
catalogues, SLC appears to work as well as several other
aftershock identification schemes presented in the literature.
(5) The aftershock identification schemes studied here
had a surprisingly high failure rate when applied to synthetic
catalogues. However, we submit that this is due to the
stochastic nature of the synthetic catalogues, and suggest
that these schemes identify a higher percentage of the events
seismologists would consider to be physically related.
ACKNOWLEDGMENTS
Thanks to Cornell University and ORSTOM for access to
the Vanuatu catalogue, and Carl Kisslinger at CIRES for
access to data from the ADAK network. We also thank
Paul Reasenberg and Jeffrey Park for their critical review of
the manuscript. Partial funding for this work was provided
by the donors of the Petroleum Research Fund,
administered by the American Chemical Society, and by
National Science Foundation grants EAR-8618406, EAR-8843928, and EAR-8916665.
REFERENCES
Adamopoulos, L., 1975. Some counting and interval properties of
the mutually-exciting processes, J. appl. Prob., 12, 78-86.
Anderson, T. W. & Darling, D. A., 1952. Asymptotic theory of
certain 'goodness of fit' criteria based on stochastic processes,
Ann. Math. Statist., 23, 193-212.
Båth, M., 1978a. A note on the recurrence relations for
earthquakes, Tectonophysics, 51, T23-T30.
Båth, M., 1978b. Some properties of earthquake frequency
distributions, Tectonophysics, 51, T63-T69.
Båth, M., 1981. Earthquake magnitude - recent research and
current trends, Earth Sci. Rev., 17, 315-398.
Bottari, A. & Neri, G., 1983. Some statistical properties of a
sequence of historical Calabro-Peloritan earthquakes, J.
geophys. Res., 88, 1209-1212.
Burridge, R. & Knopoff, L., 1967. Model and theoretical seismicity,
Bull. seism. Soc. Am., 57, 341-371.
Cao, T. & Aki, K., 1985. Seismicity simulation with a mass-spring
model and a displacement hardening-softening friction law,
Pure appl. Geophys., 122, 10-24.
Chatelain, J., Cardwell, R. K. & Isacks, B. L., 1983. Expansion of
the aftershock zone following the Vanuatu (New Hebrides)
earthquake on 15 July 1981, Geophys. Res. Lett., 10, 385-388.
Chatelain, J. L., Isacks, B. L., Cardwell, R. K., Prévot, R. &
Bevis, M., 1986. Patterns of seismicity associated with
asperities in the central New Hebrides island arc, J. geophys.
Res., 91, 12497-12519.
Chen, Y. T. & Knopoff, L., 1987. Simulation of earthquake
sequences, Geophys. J. R. astr. Soc., 91, 693-709.
Cox, D. R. & Lewis, P. A. W., 1966. The Statistical Analysis of
Series of Events, Halsted Press.
Das, S. & Scholz, C. H., 1981. Off-fault aftershock clusters caused
by shear stress?, Bull. seism. Soc. Am., 71, 1669-1675.
Davis, S. D., 1989. Investigations concerning the nature of
earthquake aftershocks and earthquakes induced by fluid
injection, PhD thesis, University of Texas at Austin.
Davis, S. D. & Frohlich, C., 1990. Single-link cluster analysis of
earthquake aftershocks: decay laws and regional variations, J.
geophys. Res., submitted.
De Natale, G., Gresta, S., Patanè, G. & Zollo, A., 1985. Statistical
analysis of earthquake activity at Etna volcano (March 1981
eruption), Pure appl. Geophys., 123, 697-705.
Eneva, M. & Pavlis, G. L., 1988. Application of pair analysis
statistics to aftershocks of the 1984 Morgan Hill, California,
earthquake, J. geophys. Res., 93, 9113-9125.
Eneva, M. & Hamburger, M. W., 1989. Spatial and temporal
patterns of earthquake distribution in Soviet Central Asia:
Application of pair analysis statistics, Bull. seism. Soc. Am.,
79, 1457-1476.
Engdahl, E. R., 1977. Seismicity and plate subduction in the central
Aleutians, in Island Arcs, Deep Sea Trenches, and Back-Arc
Basins, Maurice Ewing Ser. 1, pp. 259-271, eds Talwani, M. &
Pitman, W. C. III, AGU, Washington, DC.
Frohlich, C., 1987. Aftershocks and temporal clustering of deep
earthquakes, J. geophys. Res., 92, 13 944-13 956.
Frohlich, C. & Davis, S., 1985. Identification of aftershocks of deep
earthquakes by a new ratios method, Geophys. Res. Lett., 12,
714-716.
Frohlich, C. & Davis, S. D., 1990. Single-link cluster analysis as a
method to evaluate spatial and temporal properties of
earthquake catalogues, Geophys. J. Int., 100, 19-32.
Frohlich, C., Billington, S., Engdahl, E. R. & Malahoff, A., 1982.
Detection and location of earthquakes in the central Aleutian
subduction zone using island and ocean bottom seismograph
stations, J. geophys. Res., 87, 6853-6864.
Gardner, J. K. & Knopoff, L., 1974. Is the sequence of earthquakes
in Southern California, with aftershocks removed, Poissonian?,
Bull. seism. Soc. Am., 64, 1363-1367.
Gutenberg, B. & Richter, C. F., 1954. Seismicity of the Earth and
Associated Phenomena, Princeton University Press, Princeton,
NJ.
Habermann, R. E., 1982. Consistency of teleseismic reporting since
1963, Bull. seism. Soc. Am., 72, 93-111.
Habermann, R. E., 1983. Teleseismic detection in the Aleutian
island arc, J. geophys. Res., 88, 5056-5064.
Habermann, R. E., 1987. Man-made changes of seismicity rates,
Bull. seism. Soc. Am., 77, 141-159.
Habermann, R. E. & Craig, M. S., 1988. Comparison of Berkeley
and CALNET magnitude estimates as a means of evaluating
temporal consistency of magnitudes in California, Bull. seism.
Soc. Am., 78, 1255-1267.
Hawkes, A. G. & Adamopoulos, L., 1973. Cluster models for
earthquakes - regional comparisons, Bull. Int. Stat. Inst.,
45(3), 454-461.
Isacks, B. L., Cardwell, R. K., Chatelain, J., Barazangi, M.,
Marthelot, J., Chinn, D. & Louat, R., 1981. Seismicity and
tectonics of the central New Hebrides Island Arc, in
Earthquake Prediction, Maurice Ewing Ser. 4, pp. 93-116, eds
Simpson, D. W. & Richards, P. G., AGU, Washington, DC.
Kagan, Y. Y. & Knopoff, L., 1976. Statistical search for
non-random features of the seismicity of strong earthquakes,
Phys. Earth planet. Inter., 12, 291-318.
Kagan, Y. & Knopoff, L., 1977. Earthquake risk as a stochastic
process, Phys. Earth planet. Inter., 14, 97-108.
Kagan, Y. & Knopoff, L., 1978. Statistical study of the occurrence
of shallow earthquakes, Geophys. J. R. astr. Soc., 55, 67-86.
Kagan, Y. Y. & Knopoff, L., 1980. Dependence of seismicity on
depth, Bull. seism. Soc. Am., 70, 1811-1822.
Kagan, Y. Y. & Knopoff, L., 1981. Stochastic synthesis of
earthquake catalogues, J. geophys. Res., 86, 2853-2862.
Kagan, Y. Y. & Jackson, D. D., 1990. Long-term earthquake
clustering, Geophys. J. Int., 104, 117-133.
Kanamori, H., 1977. The energy release in great earthquakes, J.
geophys. Res., 82, 2981-2987.
Kanamori, H. & Anderson, D. L., 1975. Theoretical basis of some
empirical relations in seismology, Bull. seism. Soc. Am., 65,
1073-1095.
King, C. Y. & Knopoff, L., 1968. Stress drop in earthquakes, Bull.
seism. Soc. Am., 58, 249-257.
Knopoff, L. & Gardner, J. K., 1972. Higher seismic activity during
local night on the raw worldwide earthquake catalogue,
Geophys. J. R. astr. Soc., 28, 311-313.
Knopoff, L., Kagan, Y. Y. & Knopoff, R., 1982. b values for
foreshocks and aftershocks in real and simulated earthquake
sequences, Bull. seism. Soc. Am., 72, 1663-1676.
Mandelbrot, B. B., 1982. The Fractal Geometry of Nature, W. H.
Freeman, San Francisco.
Marthelot, J. M., Chatelain, J. L., Isacks, B. L., Cardwell, R. K. &
Coudert, E., 1985. Seismicity and attenuation in the central
Vanuatu (New Hebrides) islands: a new interpretation of the
effect of subduction of the D'Entrecasteaux Fracture Zone, J.
geophys. Res., 90, 8641-8650.
Mayer-Rosa, D., Pavoni, N., Graf, R. & Bast, B., 1976.
Investigations of intensities, aftershock statistics and the focal
mechanism of Friuli earthquakes in 1975 and 1976, Pure appl.
Geophys., 114, 1095-1103.
McNally, K. C., 1977. Patterns of earthquake clustering preceding
moderate earthquakes, central and southern California, EOS,
Trans. Am. geophys. Un., 58, 1195.
Mikumo, T. & Miyatake, T., 1978. Dynamical rupture process on a
three-dimensional fault with non-uniform frictions, and
near-field seismic waves, Geophys. J. R. astr. Soc., 54,
417-438.
Mikumo, T. & Miyatake, T., 1979. Earthquake sequences on a
frictional fault model with non-uniform strength and relaxation
times, Geophys. J. R. astr. Soc., 59, 497-522.
Mikumo, T. & Miyatake, T., 1983. Numerical modelling of space
and time variations of seismic activity before major
earthquakes, Geophys. J. R. astr. Soc., 74, 559-583.
Mogi, K., 1967. Earthquakes and fractures, Tectonophysics, 5,
35-55.
Oakes, D., 1975. The Markovian self-exciting process, J. appl.
Prob., 12, 69-77.
Ogata, Y., 1983. Estimation of the parameters in the Modified
Omori formula for aftershock frequencies by the maximum
likelihood procedure, J. Phys. Earth, 31, 115-124.
Prozorov, A. G. & Dziewonski, A. M., 1982. A method of studying
variations in the clustering properties of earthquakes, J.
geophys. Res., 87, 2829-2839.
Reasenberg, P., 1985. Second-order moment of central California
seismicity, 1969-1982, J. geophys. Res., 90, 5479-5495.
Reasenberg, P. A. & Matthews, M. V., 1988. Precursory seismic
quiescence: a preliminary assessment of the hypothesis, Pure
appl. Geophys., 126, 373-406.
Reasenberg, P. A. & Jones, L. M., 1989. Earthquake hazard
immediately after a mainshock in California, Science, 243,
1173-1176.
Rice, J., 1975. Statistical methods of use in analysing sequences of
earthquakes, Geophys. J. R. astr. Soc., 42, 671-683.
Shlien, S. & Toksoz, M. N., 1970. A clustering model for
earthquake occurrences, Bull. seism. Soc. Am., 60, 1765-1787.
Shlien, S. & Toksoz, M. N., 1974. A statistical method of
identifying dependent events and earthquake aftershocks,
Earthquake Notes, 45(3), 3-16.
Stein, R. S. & Lisowski, M., 1983. The 1979 Homestead Valley
earthquake sequence, California: control of aftershocks and
postseismic deformation, J. geophys. Res., 88, 6477-6490.
Strehlau, J., 1986. A discussion of the depth extent of rupture in
large continental earthquakes, in Earthquake Source
Mechanics, Maurice Ewing Ser. 6, pp. 131-145, eds Das, S.,
Boatwright, J. & Scholz, C. H., AGU, Washington, DC.
Tinti, S. & Mulargia, F., 1985. Completeness analysis of a seismic
catalogue, Ann. Geophys., 3, 407-414.
Utsu, T., 1961. A statistical study on the occurrence of aftershocks,
Geophys. Mag., 30, 521-605.
Utsu, T., 1972. Aftershocks and earthquake statistics (IV):
analyses of the distribution of earthquakes in magnitude, time,
and space with special consideration to clustering characteristics of earthquake occurrence (2), J. Fac. Sci., Hokkaido
Univ., Ser. VII (Geophys.), 4, 1-42.
Vere-Jones, D., 1970. Stochastic models for earthquake occurrences, J. R. Stat. Soc., B32, 1-62.
Vere-Jones, D. & Davies, R. B., 1966. A statistical survey of
earthquakes in the main seismic region of New Zealand, part
2 - Time series analysis, N.Z. J. Geol. Geophys., 9, 251-284.
Wardlaw, R. L., Frohlich, C. & Davis, S. D., 1990. Evaluation of
precursory seismic quiescence in sixteen subduction zones using
single-link cluster analysis, Pure appl. Geophys., 134, 57-78.
APPENDIX A

Rates of occurrence for primary events, afterevents, and
catalogue events as generated by the synthetic model

The number of primary events in the magnitude range
(M, M + dM) is N_0(M) dM, where N_0(M) is the primary
event occurrence rate. Likewise, N_A(m) and N_C(m) denote,
respectively, the occurrence rates of afterevents and of all
events in the catalogue (primary events and afterevents). If
each event in the catalogue is either a primary event or an
afterevent, then

N_C(m) = N_0(m) + N_A(m).   (A1)
For the synthetic model described in Chapter 2, we
assumed a Gutenberg-Richter relation (equation 2):

N_0(M) = A 10^(-bM),   (A2)

where b is the primary event b-value and A is a constant
related to the total number of events.

The afterevent occurrence rate is given by

N_A(m) = ∫_m^(M_max) n_A(M, m) N_0(M) dM,   (A3)

where n_A(M, m) is the afterevent function (see main text)
which describes the mean number of afterevents of
magnitude m produced by a primary event of magnitude M.
For this study we assume an afterevent function of the form
n_A(M, m) = q 10^(b_a(M - m)) (see equation (4) of main
text). The solution to the integral is

N_A(m) = qA 10^(-bm) (M_max - m),   b_a = b,   (A4)

N_A(m) = [qA/(Δb ln 10)] 10^(-b_a m) [10^(M_max Δb) - 10^(m Δb)],   b_a ≠ b,   (A5)

where Δb = b_a - b. Utilizing (A1) and (A2) we derive

N_C(m) = A 10^(-bm) [1 + q(M_max - m)],   b_a = b,   (A6)

N_C(m) = A 10^(-bm) + [qA/(Δb ln 10)] 10^(-b_a m) [10^(M_max Δb) - 10^(m Δb)],   b_a ≠ b.   (A7)

We obtain the cumulative rates η_0(M) and η_A(m) from the
integrals η_0(M) = ∫_M^(M_max) N_0(M') dM' and
η_A(m) = ∫_m^(M_max) N_A(m') dm', which give

η_0(M) = [A 10^(-b M_max)/(b ln 10)] (10^(b ΔM) - 1),   (A8)

η_A(m) = [qA 10^(-b M_max)/(b ln 10)] [Δm 10^(b Δm) - (10^(b Δm) - 1)/(b ln 10)],   b_a = b,   (A9)

η_A(m) = [qA 10^(-b M_max)/(Δb (ln 10)^2)] [(10^(b_a Δm) - 1)/b_a - (10^(b Δm) - 1)/b],   b_a ≠ b,   (A10)

where

ΔM = M_max - M,   (A11)

Δm = M_max - m.   (A12)

Finally, the cumulative rate of the catalogue η_C(m) is simply
the sum of the terms in η_0(m) and η_A(m).

APPENDIX B

Approximate relation of B_0 to dimensionality k for
randomly generated events

In Frohlich & Davis (1990) we derive approximate solutions
for the cumulative density function (CDF) F_k(r) for the link
length distribution of events generated by a Poisson process
for a manifold of k dimensions:

F_k(r) = 1 - exp(-β_k r^k),   (A13)

where

β_1 = λ_1 t,   β_2 = π λ_2 t,   β_3 = (4/3) π λ_3 t.   (A14)

Here, λ_k is the Poisson process rate per unit volume of
dimension k and unit time, and t is the total length of the
time interval. This equation is based on the distribution of
nearest-neighbour links, and therefore does not describe an
exact solution to the distribution of links determined by
single-link cluster analysis (SLC) for the 2-D and higher
dimensional cases. However, we note that at least 50 per
cent (and in practice about 70 per cent) of the links in SLC
are nearest-neighbour links.
The spatial statistic B_0 is given by D(0.25)/D(0.75),
where D(f) is the length exceeded by a fraction f of the
links (equation 6). By solving for F_k[D(0.25)] and
F_k[D(0.75)], we find that

B_0 = [ln (0.25)/ln (0.75)]^(1/k) = (4.82)^(1/k).   (A15)
Thus, the expected value of B_0 for events generated by a
Poisson process is 4.82 for 1-D data, 2.2 for 2-D data, and
1.7 for 3-D data. For example, B_0 for the randomly
generated events in Fig. 6(a) is 2.5 (close to the expected
value of 2.2), and the value increases to 4.0 when these
events fill a more 1-D space (Fig. 6d). However, for
clustered data (Fig. 6c) the observed value of B_0 is higher
than for data generated by a Poisson process.
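The quartile-ratio result is easy to check numerically. The sketch below computes the theoretical value (ln 0.25/ln 0.75)^(1/k) for k = 1, 2, 3 and, for the 1-D case, simulates it directly: for a 1-D Poisson process the single-link lengths are just the gaps between sorted points, which are (approximately) exponentially distributed, so the simulated ratio should approach 4.82. The sample size and seed are arbitrary choices of ours.

```python
import math
import random

def b0_theory(k):
    # Expected B0 = (ln 0.25 / ln 0.75)**(1/k) for a Poisson process
    # on a k-dimensional manifold.
    return (math.log(0.25) / math.log(0.75)) ** (1.0 / k)

def b0_simulated_1d(n=100000, seed=7):
    # 1-D Monte Carlo: gaps between sorted uniform points behave as
    # exponential link lengths, so D(0.25)/D(0.75) -> about 4.82.
    rng = random.Random(seed)
    pts = sorted(rng.random() for _ in range(n))
    gaps = sorted(b - a for a, b in zip(pts, pts[1:]))
    d25 = gaps[int(0.75 * len(gaps))]  # length exceeded by 25% of links
    d75 = gaps[int(0.25 * len(gaps))]  # length exceeded by 75% of links
    return d25 / d75
```

The theoretical values reproduce the 4.82, 2.2, and 1.7 quoted above, and the 1-D simulation falls within a few per cent of 4.82.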