3.3. Waveform Cross-Correlation, Earthquake Locations and HYPODD
3.3.1 Method
More accurate relative earthquake locations depend on more precise relative phase
arrival observations, so I exploit the similarity of waveforms for pairs of earthquakes in
the frequency domain. Using this technique I try to produce a better resolved picture
of the spatial distribution of the 2001 Enola sequence. The absolute location of the swarm
may not be better resolved (Figure 15). However, this technique helps to determine the
internal spatial structure of the swarm through relative event locations. I again use
HYPODD to relocate the 2001 earthquakes, but this time using relative arrival times of
the principal seismic phases recorded at each station.
Figure 15. Station – swarm distances. 2001 Enola earthquakes are spatially confined in a
small area, in particular with respect to the station distribution (SW cluster marked with
the red circle). Seismic waves leaving from the swarm area are likely to have close to
identical paths at a station like GUR, HOL, WOL, etc.
Two near-by events, with seismic waves radiating along almost identical paths, are
likely to produce very similar waveforms at the same station. Two earthquakes with
similar waveforms are called a doublet. The waveform similarity is illustrated in Figure
16.
Figure 16. Similarity of the waveforms as recorded on the broadband site (GUR), on the
vertical component within an hour of Julian day 182. No filtering applied. There is also at
least one converted phase (or maybe a basement reflector) between the P and S phase
arrivals. Horizontal scale in seconds, vertical in counts.
The site response is assumed to be time invariant for a doublet. The observed spectrum
of an earthquake F(f) at a station is approximately given by:
$$F(f) = S_s(f) \cdot S_p(f) \cdot S_t(f),$$
where Ss(f), Sp(f) and St(f) are the source, propagation and
site response spectra, respectively. Waveform similarity of a doublet requires close to
identical source as well as medium properties. Only doublets recorded at the same station
are correlated (see Figure 15). This technique greatly improves earthquake locations by
reducing the uncertainties introduced when a network analyst picks phases manually.
The results of the waveform cross correlation are the weighted time differences
between the arrival times of each doublet. The time differences of the P and S arrivals
are used separately as input for HYPODD.
Here I briefly focus on the waveform cross correlation technique and a simple testing
procedure. Appendix B contains several others.
There are two Enola 2001 data subsets. One consists of a data set recorded by the
Kinemetrics’ K2 accelerographs recording at 200 samples per second (200 Hz). The other
was recorded using a Guralp CMG-4, a broadband velocity instrument, set up for the
acquisition at 100 samples per second (100 Hz). Even though the sampling rates are
different, re-sampling is not required because each doublet is analyzed at one station at
a time.
To process the waveforms I use the Fast Fourier Transform (FFT) on a windowed P and
S phase. P and S wave waveform cross-correlations are executed separately. Later both
results are combined to further constrain earthquake locations.
For the P phase waveform cross correlation I use the vertical component; for the S
phase I use the N-S component.
The data preparation is as follows:
1. P and S phase arrivals of the 100 largest events (the algorithm for the selection is
explained in Chapter 2) were manually picked.
2. Using HYPOELLIPSE, the earthquakes were located using a velocity model
(Table 3, page 25, Chapter 3.1).
3. P and S arrival times were extracted from the HYPOELLIPSE output files (see
Appendix C for details).
4. The mean (offset) was removed from each seismogram.
5. Using the results from Step 3, each seismogram was windowed starting 0.15 sec
before the P and S arrivals and ending 0.50 sec after the arrival. This step
separates P from S arrivals.
6. The windowed P and S arrivals were band-pass filtered using 1-40 Hz and 1-20 Hz
4th order Butterworth filters, respectively. A Hanning window tapered off
the trace ends.
7. Fast Fourier Transforms of the windowed P arrivals for each doublet were
calculated. The same procedure was conducted for the windowed S arrivals.
Once the data preparation steps were complete I used the spectral waveform cross
correlation procedure as proposed by Poupinet (1984) to produce relative P and S phase
arrivals for each doublet.
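Steps 4-7 of the data preparation can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in this study: the function name `prepare_phase_window`, the use of NumPy/SciPy, and the causal `sosfilt` filtering are choices made for the example only. The window lengths, filter bands and the Hanning taper are taken from the list above, and the order of operations follows that list.

```python
import numpy as np
from scipy.signal import butter, sosfilt


def prepare_phase_window(trace, fs, pick_time, band, pre=0.15, post=0.50, order=4):
    """Demean, window around a pick, band-pass filter, taper and FFT one phase.

    trace     : 1-D array of samples (counts)
    fs        : sampling rate in samples per second
    pick_time : phase arrival in seconds from the start of the trace
    band      : (low, high) corner frequencies in Hz
    Hypothetical helper for illustration only.
    """
    x = np.asarray(trace, dtype=float)
    x = x - x.mean()                              # step 4: remove the mean
    i0 = int(round((pick_time - pre) * fs))       # step 5: window start
    i1 = int(round((pick_time + post) * fs))      # step 5: window end
    if i0 < 0 or i1 > x.size:
        raise ValueError("window extends past the ends of the trace")
    w = x[i0:i1]
    # Step 6: Butterworth band-pass. Note that SciPy's band-pass design of
    # order N has 2N poles; which convention "4th order" refers to in the
    # text is an assumption here.
    sos = butter(order, band, btype="bandpass", fs=fs, output="sos")
    w = sosfilt(sos, w)
    w = w * np.hanning(w.size)                    # step 6: Hanning taper
    spectrum = np.fft.rfft(w)                     # step 7: FFT
    freqs = np.fft.rfftfreq(w.size, d=1.0 / fs)
    return freqs, spectrum, w
```

For a P arrival on a 200 samples-per-second accelerogram the call would look like `prepare_phase_window(trace, 200.0, p_pick, (1.0, 40.0))`; for an S arrival the band would be `(1.0, 20.0)`.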
The cross spectrum S (f) is defined by:
$$S(f) = A_1^{*}(f) \cdot A_2(f) \qquad (8)$$
where A1 and A2 are the spectra of the windowed phases and the asterisk denotes the
complex conjugate. The coherence is calculated using the expression:
$$C(f) = \frac{S^2(f)}{A_1(f) \cdot A_2(f)} \qquad (9)$$
Spectra A1 (f), A2 (f) and S (f) are smoothed using a 4 point running average window.
The coherence is a direct measure of how similar two spectra are so (9) can be used as a
weighting function.
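The cross spectrum and a smoothed coherence can be sketched as follows. The cross spectrum follows (8) directly. For the coherence I use the common magnitude form built from smoothed cross and auto spectra, which is an assumption of this sketch and may differ in detail from (9); the function names and the zero-padded edge handling of the 4 point running average are likewise illustrative choices.

```python
import numpy as np


def running_mean(x, n=4):
    """n-point running average (real or complex input), zero-padded at the edges."""
    kernel = np.ones(n) / n
    return np.convolve(x, kernel, mode="same")


def cross_spectrum_and_coherence(a1, a2, n_smooth=4):
    """Return the raw cross spectrum (8) and a smoothed magnitude coherence.

    a1, a2 : complex spectra of the two windowed phases (same length)
    Assumed coherence form: |<S>| / sqrt(<|A1|^2> <|A2|^2>), where <.> is the
    running average. It lies between 0 and 1.
    """
    s = np.conj(a1) * a2                               # equation (8)
    s_sm = running_mean(s, n_smooth)
    p1 = running_mean(np.abs(a1) ** 2, n_smooth)       # smoothed auto spectrum 1
    p2 = running_mean(np.abs(a2) ** 2, n_smooth)       # smoothed auto spectrum 2
    denom = np.sqrt(p1 * p2)
    coh = np.zeros_like(denom)
    np.divide(np.abs(s_sm), denom, out=coh, where=denom > 0)
    return s, np.clip(coh, 0.0, 1.0)                   # clip guards rounding above 1
```

With identical input spectra this coherence equals 1 at every frequency, which matches the behaviour described for the test with a delayed copy of the same trace only approximately, since a delay rotates the phase within each smoothing window.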
The phase part Φ(f) of the cross spectrum S (f) (8) is used to obtain the time delay ∆t
between the two windowed signals. By fitting a straight line (least squares method) to
Φ(f), starting from the origin, it is possible to evaluate ∆t from:
$$\Phi(f) = 2\pi \cdot \Delta t \cdot f \qquad (10)$$
So the delay ∆t is proportional to the slope of the line. The least-squares fit is weighted
by a factor at each frequency. Using (9) we can define the weight as:
$$\frac{C^2(f)}{1 - C^2(f)} \qquad (11)$$
Moreover, each time difference ∆t between the two waveforms of a doublet is associated
with the arithmetic average of C(f) squared. This number indicates the reliability of the
time difference obtained for the doublet. The function C(f) ranges from 0 (no spectral
similarity) to 1 (identical spectra). The weighting function (11) fails when the compared
waveforms are identical, because C(f) = 1 makes its denominator zero.
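The weighted straight-line fit of (10) with the weights of (11) can be sketched as below. The function name, the frequency cut-off argument `fmax`, and the small constant that keeps the weight finite when C(f) = 1 are assumptions of this example, not details given in the text.

```python
import numpy as np


def delay_from_phase(freqs, cross_spec, coherence, fmax, eps=1e-6):
    """Weighted least-squares fit of Phi(f) = 2*pi*dt*f through the origin.

    freqs      : frequencies in Hz
    cross_spec : complex cross spectrum S(f)
    coherence  : C(f), values between 0 and 1
    fmax       : highest frequency used in the fit (Hz)
    Returns dt as defined by (10). Its sign depends on the FFT convention
    and on which spectrum is conjugated in (8).
    """
    sel = (freqs > 0) & (freqs <= fmax)
    f = freqs[sel]
    phi = np.unwrap(np.angle(cross_spec[sel]))      # phase of S(f), unwrapped
    c2 = np.minimum(coherence[sel] ** 2, 1.0 - eps) # guard against C = 1
    w = c2 / (1.0 - c2)                             # weights, equation (11)
    # Minimising sum(w * (phi - slope * f)**2) gives the slope in closed form.
    slope = np.sum(w * f * phi) / np.sum(w * f * f)
    return slope / (2.0 * np.pi)
```

Fitting through the origin leaves the slope as the only unknown, so no matrix solve is needed. Setting all weights equal reduces this to the unweighted fit.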
The time differences (∆t) are further processed with HYPODD, where additional
reweighting schemes (based on RMS values) may be used. Based on these time
differences for each doublet and the files with the individual HYPOELLIPSE locations,
the double-difference (DD) algorithm can be applied to relocate the 2001 Enola sequence.
The next sequence of figures illustrates a simple test of the waveform cross correlation
algorithm. In this example, a vertical broadband seismogram was delayed by 0.05 sec.
The test consists of running the seismograms through the algorithm in order to obtain the
delay (∆t) based on the P arrival using the presented set of equations. Further tests are
included in Appendix B.
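A test of this kind can be reproduced on synthetic data. The sketch below is not the test shown in the figures: it uses a Ricker wavelet as a stand-in for a P arrival (an assumption), an unweighted fit only, and hypothetical function names. The 0.65 sec window, the Hanning taper, the 200 samples-per-second rate and the 0.05 sec delay follow the text.

```python
import numpy as np


def ricker(t, f0):
    """Ricker wavelet with peak frequency f0 (Hz), centred at t = 0."""
    a = (np.pi * f0 * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)


def estimate_lag(x1, x2, fs, fmax):
    """Lag of x2 behind x1 (seconds) from the phase slope of conj(A1) * A2."""
    taper = np.hanning(x1.size)
    a1 = np.fft.rfft(x1 * taper)
    a2 = np.fft.rfft(x2 * taper)
    freqs = np.fft.rfftfreq(x1.size, d=1.0 / fs)
    sel = (freqs > 0) & (freqs <= fmax)
    f = freqs[sel]
    phi = np.unwrap(np.angle(np.conj(a1[sel]) * a2[sel]))
    slope = np.sum(f * phi) / np.sum(f * f)   # unweighted fit through the origin
    # NumPy's forward FFT uses exp(-i*2*pi*f*t), so a lag of x2 behind x1
    # appears as a negative phase slope; flip the sign to report a positive lag.
    return -slope / (2.0 * np.pi)


fs = 200.0                        # samples per second
t = np.arange(130) / fs           # 0.65 sec window
true_lag = 0.05                   # 10 samples at 200 samples per second
x1 = ricker(t - 0.30, 20.0)
x2 = ricker(t - 0.30 - true_lag, 20.0)
lag = estimate_lag(x1, x2, fs, fmax=50.0)
print(f"recovered lag: {lag:.6f} s (imposed {true_lag} s)")
```

Because the taper treats the two shifted pulses slightly differently, the recovered lag is close to, but not exactly, the imposed one, which is the same kind of small discrepancy reported for the real trace.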
Figure 17. A vertical accelerogram at the BAR site. P phase is isolated using the
information from the manual pick file. The dimensions of the window are: 0.15 sec
before the P arrival and 0.5 sec after the arrival. A 4th order Butterworth band-pass filter,
1 to 40 Hz, was applied before the windowing procedure. I use a simple Hanning window
to taper the ends of the traces before calculating the spectra. The red line is the original
trace and the blue line is the same trace delayed 0.05 sec.
In Figure 17 the red trace is the “original” filtered P arrival. The blue trace is the same
as the red one delayed by 0.05 sec (10 samples at 200 samples per second). This is an
example of a doublet. So how far is the delay calculated with (10) from the imposed
delay of 0.05 sec?
Figure 18. Amplitude spectra of both traces (the color scheme is preserved: the red line is
the spectrum of the red trace and the blue line is the spectrum of the blue trace, both from
Figure 17). The spectra are identical, as expected for the same waveform. Note that
there is a significant amount of energy above 40 Hz even after filtering with the 4th order
band-pass Butterworth filter with the upper cut-off frequency of 40 Hz.
Waveforms (P and S arrivals) of a doublet are windowed based on an operator pick
file, and when plotted on the same axes (as in Figure 17) a certain level of alignment
between the waveforms exists. The quality of the alignment reflects how consistent the
network analyst was while picking phases. The first motion of both windowed P arrivals,
for example, will be perfectly aligned (no delay) if the arrivals were picked based on the
same criteria. Of course, this zero delay is arbitrary. It is based on the pick file. I call it
the relative (plotting) delay.
Figure 19. Cross spectrum (magnitude) of the two waveforms. The spectra of the
two traces peak at the same frequencies (12 and 22 Hz).
The actual time difference (delay) for a doublet depends on the (assumed) small
hypocentral separation between the earthquakes. The earthquakes might have taken place
days/months apart. One earthquake in a doublet (the red trace) defines the time reference
frame. The algorithm calculates the travel time necessary for a P wave originating from
the hypocenter to arrive at the station based on the origin and the observed time,
separately, for both earthquakes in a doublet. The observed time is the operator-picked P
arrival. I call the difference between the two travel times (i.e. the two P arrivals for the
two earthquakes in a doublet) the initial delay.
Figure 20. Coherence function between the two traces as given by (9).
The weight (quality) of the calculated delay (Figure 21) is expressed as the arithmetic
average of the squared coherence, C^2(f), up to 50 Hz (25 Hz for the broadband
seismograms). Here the assigned weight (the “quality” of the waveform cross-correlation)
is 0.99985. That is the maximum weight, obtained for perfectly similar (identical) traces.
It differs from the expected value of 1 due to numerical rounding.
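The quality weight described above amounts to a band-limited mean of the squared coherence. A minimal sketch, assuming the coherence is already available on a frequency axis and that the zero-frequency bin is included in the average (the text does not say either way); the function name is hypothetical:

```python
import numpy as np


def doublet_quality(freqs, coherence, fmax):
    """Arithmetic average of C(f)**2 over all frequencies up to fmax (Hz)."""
    sel = freqs <= fmax
    return float(np.mean(coherence[sel] ** 2))
```

Following the text, `fmax` would be 50 Hz for the accelerograph records and 25 Hz for the broadband records.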
Now the calculated delay (based on P or S arrivals) is the sum of the initial and the
relative delays.
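The bookkeeping of the two delays can be written out explicitly. In this sketch the function names and the sign convention (second event minus first event, with the first event as the time reference) are assumptions made for illustration:

```python
def initial_delay(origin1, pick1, origin2, pick2):
    """Difference of the two picked travel times, event 2 minus event 1 (sec).

    origin1, origin2 : origin times of the two events
    pick1, pick2     : operator-picked arrival times at the same station
    """
    travel_time1 = pick1 - origin1
    travel_time2 = pick2 - origin2
    return travel_time2 - travel_time1


def total_delay(initial, relative):
    """Calculated delay: initial (pick-based) plus relative (cross-correlation) delay."""
    return initial + relative
```

For example, picked travel times of 1.20 sec and 1.23 sec give an initial delay of 0.03 sec; a relative delay of 0.004 sec from the cross correlation then yields a calculated delay of 0.034 sec.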
Figures 17-20 show two input functions (the “original” and the delayed
seismogram), their spectra, the cross spectrum function and the coherence function,
respectively (see captions for details).
Figure 21. The phase of the cross-spectrum function S (f). Red line is the calculated
delay (no weights) between the two traces (Figure 17) as defined by (10). The delay is:
0.049957 sec. Dashed black line (here it coincides with red line) is the weighted best fit
(least squares) for the data given by blue dots (the phase of the cross spectrum). See
Figure 22. The sample rate of the accelerograph is 200 Hz (0.005 sec/sample).
According to Figure 21 the phase stays linear throughout the frequency range.
Figure 22 shows the weights based on the coherence function. It suggests that the weights
are different for different frequencies even though the waveforms are identical. The
amplitude of the coherence function (being larger when the cross spectrum reaches
maxima) causes this difference.
Figure 22. Weighted phase. This time the weights given by (11) are used when
calculating the least squares fit (note the vertical scale difference in Figures 21 and 22).
The delay is now 0.049956 sec. So the unweighted and weighted delays differ from the
imposed delay of 0.05 sec by barely 0.000043 and 0.000044 sec, respectively. Note
that the sampling interval is 0.005 sec. Thus, the precision of the technique (in this “ideal”
case) is ~ 100 times finer than the sampling interval.
Note the order of magnitude of the phase (10^6). The coherence (based on smoothed
spectra) for the identical waveforms barely departs from 1 (Figure 20). Therefore, this
large order of magnitude is due to large weights. Numerical rounding and the numerical
implementation of (11), when the denominator gets very small, give “humps” in the
weighted phase spectrum. The “humps” coincide with the maxima in the cross spectrum
and coherence function (~ 12 and ~ 22 Hz, Figure 19 and Figure 20).
Figure 23 shows the cross-correlation method for a real earthquake doublet.
Figure 23. A test based on two observed earthquakes. The inputs are filtered (0.1-30 Hz) and windowed (P phase)
seismograms from BAR1 station (top left). The sections are aligned with respect to the manual P-picks. The
amplitude spectra (top middle) show a discrete distribution of amplitudes, located around the peak at 22-24 (and ~35)
Hz for both arrivals (see the cross spectrum, top right). The color scheme is preserved. The weighted phase (black
dashed line) of the cross spectrum yields a less steep slope (smaller delay) as opposed to the non-weighted calculation
(red line). The coherence (lower left) behaves as suggested by the cross spectrum plot, from 20-25 (and ~35) Hz it
approaches a maximum value of 1. The increase in coherence is reflected by the increase in the Weighted Phase plot
(lower right). Above the threshold of 45 Hz the numerical noise takes over.
Figure 24. The format of the waveform cross correlation output, which is used as a
HYPODD input file. Bold letters are doublet IDs. In the same line, zeroes indicate the
correction for the origin time, which applies when the cross correlation data are used
together with catalog data (i.e. an analyst's phase picks) and the two data sets
have different origin times stored in the file headers. The first column lists the station
codes where a particular doublet is observed. The second column gives the relative time
differences (in seconds) for doublets as obtained using the cross correlation technique.
The third column contains the “quality factor” for the cross correlation (Figure 20). The
fourth column lists the phase used for the doublet (P for a P phase and S for an S phase).
Weights of 1 are due to numerical rounding (weight ≥ 0.999).
At the very end an output file is automatically generated as a product of the waveform
cross correlation (Figure 24). This is the input file for HYPODD.
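The layout described in the caption of Figure 24 can be produced with a few lines of Python. This is a sketch under assumptions: the `#` marker on the header line follows the HYPODD cross-correlation input format as I understand it from the program documentation and should be checked against the manual, and the function name, column widths, and example values are illustrative only.

```python
def format_dtcc(doublets):
    """Format cross-correlation delays as text for a HYPODD-style input file.

    doublets : iterable of (id1, id2, otc, observations), where
               id1, id2     are the integer IDs of the two events in the doublet,
               otc          is the origin time correction in seconds (0.0 here),
               observations is a list of (station, dt, weight, phase) tuples.
    """
    lines = []
    for id1, id2, otc, observations in doublets:
        # Header line: doublet IDs and origin time correction.
        lines.append(f"# {id1:9d} {id2:9d} {otc:.1f}")
        for station, dt, weight, phase in observations:
            # One line per station: code, time difference, quality, phase.
            lines.append(f"{station:<7s} {dt:9.6f} {weight:6.4f} {phase}")
    return "\n".join(lines) + "\n"
```

A call such as `format_dtcc([(1, 2, 0.0, [("BAR", 0.012, 0.95, "P"), ("GUR", 0.015, 0.90, "S")])])` returns one header line followed by one line per station; the IDs and values in this call are made up for the example.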
3.3.2. Results
This time I use relative phase arrival times produced by the waveform cross-correlation
technique as input to HYPODD (example in Figure 25). The resulting analysis
yields the final picture of the 2001 Enola earthquake locations. Following the data
subdivision scheme of Chapter 3.2.2, I separated the location procedure into:
a) earthquake locations using P waveform cross-correlation data
b) earthquake locations using S waveform cross-correlation data
c) earthquake locations using P and S waveform cross-correlation data
Figure 25. 2001 Enola earthquake locations. The blue triangle is the ENO site. Black dots
are earthquakes located using HYPOELLIPSE on both P and S arrivals (individual-loc).
Red dots are earthquakes located using only waveform cross correlation data for P
arrivals. Error bars are on the order of meters.
It appears that the earthquakes form two clusters, based on separate P and S relative
arrival times (Figure 25 and Figure 26). The one to the north, based on this map view,
seems to be separated from the one to the south by a seismically quiescent zone.
Figure 26. 2001 Enola earthquake locations using S waveform cross correlation
technique. The locations are very similar to the ones in Figure 25 where only P phase
data are used. Location errors (on the order of meters) are too small to be noted in the
figure.
The DD algorithm considered 90 out of 93 individually located earthquakes. Both
clusters (based on the map view) show a certain degree of lineation (striking NW-SE).
Figure 26, where only S arrivals are used, does not reveal significant changes in the
shape of the clusters as compared to Figure 25, where only P arrivals are used. The SW
cluster seems to be moved closer to the NE cluster (referenced to black dots,
HYPOELLIPSE locations in Figure 15).
More important, the internal (epicentral) structure of the clusters changes to some
extent but it is difficult to draw unequivocal conclusions based solely on the map view.
Figure 27 shows earthquake locations constrained using both P and S arrivals. I believe
this to be the final and “most trusted” picture of 2001 Enola relative earthquake locations.
Both P and S relative arrivals are calculated using the highly reliable waveform cross
correlation technique. The location errors (HYPODD) are on the order of meters.
The single event locations obtained using HYPOELLIPSE (black dots, Figure 27)
provide the spatial reference frame and the waveform cross-correlation locations reveal
the internal structure of the 2001 sequence. The HYPOELLIPSE single event locations
give a general picture of the swarm location. The relative locations, on the other hand,
show a very accurate picture (errors on the order of meters) of the swarm’s internal
spatial structure. The earthquakes do not appear to be aligned on a single fault plane. Nor
do the clusters seem to be clearly outlining separate fault planes. However, each cluster,
based on the map view (Figure 27), appears to have a NW-SE trend.
Figure 27. 2001 Enola earthquake locations (total of 90) using both P and S waveform
cross correlation data.
The cross sections (Figure 28) reveal clustered 2001 Enola seismicity. The clusters
seem to be connected by a cloud of seismicity at a depth of ~ 5 km, where the
Paleozoic-Precambrian boundary lies.
Based on waveform similarity more earthquakes could be selected and sorted in groups
with a common source. I speculate it would be possible to get a more comprehensive
picture of the 2001 Enola earthquake clustering if more earthquakes were located. I do
not think more clusters would be revealed. The largest earthquakes define the main
seismicity zones.
Figure 28. NS and EW cross-sections. Earthquake locations are constrained with both P
and S waveform cross correlation data. The major seismicity lies within a
1.5-kilometer-thick layer at the depth of the Precambrian basement. There are two apparent clusters of
seismicity (left plot). The one to the south appears to be deeper. Both clusters seem to
share a common cloud of earthquakes at depth of ~ 6 km.
Adding more locatable earthquakes is not an easy problem. The whole data set of the
2001 sequence should be organized in a database that would ease associating an event
with multiple stations. This is evident even for a relatively small data set such as the 2001
Enola earthquake sequence.
Figure 29. Two location cubes show two clusters of seismicity, in particular the cube to
the left. The deeper SW cluster, to the left in the left 3-D section, (the SW cluster in
Figure 27) may appear as westward ~ 50º dipping, SW-NE striking (appears as a NW-SE
lineation on the map view in Figure 27). Its 3-dimensional shape seems to be tubular
rather than planar.
Figure 29 does not reveal a simple faulting geometry. The SW cluster exhibits a sort of
tubular shape. It is not clear, at this point, whether that shape developed from depth
towards the surface (temporal migration) or it was completely random in time.
The spatial dimension of the Enola seismicity does not allow me to resolve the “single
fault” question in a less opaque manner. However, I cannot completely rule out a
possibility that these two clusters indeed form a fault plane that ruptured in patches with
an aseismic zone between them. Moreover, there could be two separate fault planes
whose dimensions are only indicated by the size of the clusters.
Several factors make an interpretation of the 2001 sequence in a tectonic and structural
context very difficult. These factors include absence of mapped faults in the strict swarm
crustal volume, lack of apparent fault planes (based on the earthquake spatial structure)
and clustering of seismicity both in space and distinctively in time.
It is not likely that the Enola zone would produce an earthquake of magnitude 6 or
higher due to the small, possibly highly fractured source volume and the shallow depth of
the earthquakes (Chiu et al., 1984).
Not only does the entire Enola zone occupy a small crustal volume, some 10 km³, but
the seismicity even within this small volume is tightly clustered. The clustered character
of seismicity stresses the importance of the specific, highly concentrated seismogenic
properties. Moreover, the clusters do not show a definite lineation that might be
interpreted uniquely as a fault plane. This might indicate that the zone at the hypocentral
depths is filled by small scale fractures that failed producing small size earthquakes.
The migration of seismicity, discussed further in the next chapter, indicates a possible
role of fluids in the Enola earthquake factory. Špicák and Horálek (2001) discussed
migration of seismicity during an intraplate swarm, possibly controlled by fluids,
pointing out a possibility that the swarm might not have happened otherwise.
The clusters seem to connect at a depth of 5-6 km by diffuse seismicity. Closer to the
surface the connection disappears. Maybe by locating more earthquakes this apparent
aseismic gap would be filled with smaller size earthquakes. The located largest
earthquakes, in my opinion, define well enough the shape of seismicity of the 2001
sequence. It is possible then that fluids have migrated, originating from depths of ~7-8
km, splitting in two channels at ~5 km and diffusing into the cluster defined zones
helping fractures to fail and produce the earthquakes. Examining the steep dipping
geometry of the cluster could support this hypothesis.
In my opinion the rest of the 2001 Enola seismicity would also group within the two
clusters, organizing some 2,500 earthquakes, separated already by the order of 10s of
meters in a highly concentrated seismogenic zone.
What could be causing this highly localized seismogenic zone still remains
unanswered.
The 1982 sequence was the first seismic episode to be observed instrumentally in the
Enola region. Over a period of 2 years the sequence produced over 30,000 earthquakes.
Now, 20 years later, another 1982-like seismic episode took place within the same 4 x 4
km area, this time a bit less productive in terms of earthquake numbers, producing some
2,500 earthquakes over a period of 2 months. Even though the instrumented time spans
are different, the daily seismicity rates for both swarms reveal that the 1982 sequence was
more abundant in earthquake production. This tremendous number of earthquakes,
relatively small in magnitude (M < 3), was located in a ~ 8 km³ crustal cube centered at ~
4 km depth. Paleozoic sandstones and carbonates occur up to a depth of ~ 5 km where the
Precambrian basement starts (Schweig et al., 1992). Despite numerous faults surrounding
the Enola region that could have been as good, if not more favorable, hosts of the
seismicity, a region without mapped faults produced both sequences. No evidence of a
clear lineation that could be interpreted as a fault or faults has been found for the 1982
sequence (Chiu et al., 1984).
3.4. The Earthquake Chronology
Figure 30. Chronology of the 2001 Enola sequence. The left plot shows the activity up to
Julian day 152 (the second-to-last burst in seismicity; see Figure 4, Chapter 2). The plot on
the right (blue dots) shows the activation of the shallow cluster (to the NE). The “blue”
earthquakes occurred in just two days, days 181 and 182 (the last burst in seismicity; see
Figure 4, Chapter 2).
The 2001 Enola sequence exhibits a peculiar migration in time. The NE cluster was
barely active until the last few days of the network deployment (Figure 30). It
remained active for only two days, producing about 50 large earthquakes (within
the population of the 100 biggest). This cluster also contributes one of the largest earthquakes
in the 2001 sequence. The deeper SW cluster did not have a distinct time pattern. The
earthquakes in this cluster occurred at different depths over a period of about two months.
Chiu et al. (1984) investigated the spatial migration in time for the 1982 sequence,
grouping 88 events in 12-day periods. The epicentral region of each major sequence, i.e.,
a mainshock plus intermediate foreshocks and aftershocks was aseismic before the major
sequence commenced. The seismic activity before each major sequence occurred in the
region surrounding that activity. As the major sequence developed, the surrounding zone
went seismically quiet. They further concluded that these observations indicated that
patterns of strain accumulation and release in the swarm source zone had developed and
changed in short periods of time – hours to days.
Putting together the above observation with the fact that the 2001 sequence also
shows temporal migration (considering the largest selected earthquakes and their
clustering on day 182) brings the temporal similarity of both sequences to our immediate
attention.
What temporal process could lead to the observed clustering for both sequences, separated
by 20 years? Would it be possible that the proposed fractured medium in the strict swarm
area (Chiu et al., 1984; Schweig et al., 1991; Booth et al., 1990) is filled with fluids
that migrated during both sequences in a similar manner and controlled the seismicity
rates both spatially and temporally?
There is one more thing that needs to be mentioned: Selecting the 100 largest
earthquakes and plotting them chronologically does not necessarily mean that the
seismicity of the deep SW cluster “shut off” when the shallow NE cluster emerged. It
only brings out the fact that these last ~ 50 earthquakes on Julian day 182 were larger
than possible smaller magnitude events that might have been filling out the volume of the
deep SW cluster.
Again, these 100 largest earthquakes illustrate the main features but leave the picture
of the 2001 sequence seismicity a bit incomplete. Locating more earthquakes would,
in principle, complete the picture of the spatial and temporal behavior of the 2001 sequence.