Geophys. J. Int. (1997) 131, 495-499

SPECIAL SECTION: ASSESSMENT OF SCHEMES FOR EARTHQUAKE PREDICTION

Earthquake prediction: the null hypothesis

Philip B. Stark
Department of Statistics, University of California, Berkeley, CA 94720-3860, USA. E-mail: [email protected]

Accepted 1997 July 1. Received 1997 June 6; in original form 1997 February 17

SUMMARY
The null hypothesis in assessing earthquake predictions is often, loosely speaking, that the successful predictions are chance coincidences. To make this more precise requires specifying a chance model for the predictions and/or the seismicity. The null hypothesis tends to be rejected not only when the predictions have merit, but also when the chance model is inappropriate. In one standard approach, the seismicity is taken to be random and the predictions are held fixed. 'Conditioning' on the predictions in this way tends to reject the null hypothesis even when it is true, if the predictions depend on the seismicity history. An approach that seems less likely to yield erroneous conclusions is to compare the predictions with those of a 'sensible' random prediction algorithm that uses seismicity up to time t to predict what will happen after time t. The null hypothesis is then that the predictions are no better than those of the random algorithm. Significance levels can be assigned to this test in a more satisfactory way, because the distribution of the success rate of the random predictions is under our control. Failure to reject the null hypothesis indicates that there is no evidence that any extra-seismic information the predictor uses (electrical signals, for example) helps to predict earthquakes.

Key word: earthquake prediction.

1 INTRODUCTION

Suppose we are given a seismicity sequence and a set of earthquake predictions. We seek to assess the predictions statistically to determine whether they have merit.
We want our test to have a significance level α; that is, we want to know that when the (as yet unspecified) null hypothesis is true, we have a chance of at most α of rejecting it erroneously. To reject the null hypothesis is to conclude that the success of the prediction method should not be ascribed to chance coincidence; colloquially, this would be to conclude that the prediction method has merit. A more precise statement is 'either the null hypothesis is false, or an event that has probability < α has occurred'. Throughout this note, we shall take α = 0.1. To conclude that the null hypothesis is false is not the same as concluding that the prediction method works, but this distinction is often neglected. We might conclude that the null hypothesis is false neither because the predictions have merit nor because an event with probability < α occurred, but because the null hypothesis is a probabilistically inadequate model, whether or not the predictions have merit.

It might help to have a simple example in mind. We are given a black box with a button on top and a one-digit display. When we push the button, a number is displayed. We hypothesize that inside the box a coin is tossed five times whenever we push the button, and the display shows the number of times the coin landed 'heads'. We seek to test the null hypothesis that the coin is fair against the alternative that it has probability greater than 50 per cent of landing 'heads'. Under the null hypothesis, the number displayed has a binomial distribution with parameters n = 5, p = 0.5; under the alternative the number is binomial with n = 5, p > 0.5. If we reject the null hypothesis when the display shows 4 or more, we get a test with level α ≈ 0.19. We push the button, and the display shows 9. We therefore reject the null hypothesis. However, under the null hypothesis (and under the alternative), 9 is an impossible outcome!
It is clear that the null hypothesis is false, but not because of the hypothesized value of p; rather, something more fundamental is wrong with our probabilistic model of the black box. (In this case, the alternative hypothesis does not explain the observation either.)

Let us look at a slightly different situation. We have a black box as before, but now we record a history of its output. In the first nine trials, the output has been (4, 3, 2, 4, 2, 3, 3, 5, 4). We propose to test the hypothesis that n = 5, p = 0.5 against the alternative n = 5, p > 0.5, by looking at the number of times in 10 trials that the output is 4 or larger. Under the null hypothesis, that number has a binomial distribution with n = 10, p = 0.19, so if we reject when we observe the number 4 or higher four or more times in 10 trials, we have a test with significance level α ≈ 0.1. We push the button one more time, and 4 shows on the display, so we reject the null hypothesis. This test does not have the significance level claimed. The appropriate computation would have been to find the conditional probability of observing the number 4 or higher, four or more times in 10 trials, given that we had observed 4 or higher four times in the first nine trials. That conditional probability is clearly unity, not 0.1. The point is that whether we compute a probability or a conditional probability in a hypothesis test can matter a great deal.

One common approach to assessing earthquake predictions (e.g. Mulargia & Gasperini 1992; Riedel 1996; Varotsos et al. 1996b) is to model seismicity as a stochastic process, holding the predictions fixed, and then to compare the observed success rate of the predictions on the real seismicity with the success rate of the predictions on random seismicity generated from the stochastic-process model.
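The nominal levels quoted in the two black-box examples above can be verified directly from binomial tail probabilities. A minimal sketch using only the Python standard library (the function name `binom_tail` is mine, not from the paper):

```python
from math import comb

def binom_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# First black box: reject when the display shows 4 or more heads out of 5
# tosses of a fair coin.  The level is P(X >= 4) = 6/32 = 0.1875, about 0.19.
level1 = binom_tail(5, 0.5, 4)

# Second black box: each push shows '4 or more' with probability about 0.19,
# so the number of such pushes in 10 trials is Binomial(10, 0.1875);
# rejecting when that count is 4 or more gives an unconditional level of
# about 0.1.
level2 = binom_tail(10, level1, 4)
```

The conditional level, given four '4 or more' outcomes in the first nine trials, is 1, not 0.1: the unconditional computation above is valid only when the rejection rule does not depend on the observed history.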
If the measured prediction success rate exceeds the 90th percentile of the success rate under the stochastic model, one rejects the null hypothesis (at a significance level α = 0.1) and concludes that the prediction method works. There are several problems with this approach.

(1) We might reject the null hypothesis not because the predictions have merit, but because the stochastic model of seismicity is poor. An event that appears to be unlikely according to our model of seismicity might in actuality be quite likely. This idea is illustrated in the first 'black box' example above.

(2) The significance level of the test is conditional. The nominal significance level is not the probability of a false rejection, but the conditional probability of a false rejection given the predictions (and the stochastic model of seismicity). If the predictions are a function of the seismicity (for example, if the prediction algorithm tries to exploit clustering), this conditional probability can be very misleading, as we shall see. This approach pretends that no matter what the seismicity had been, the predictions would have been the same, while no reasonable geophysicist would ignore recent seismicity in predicting future seismicity. This general concern is illustrated in the second 'black box' example above.

(3) The test does not answer a more important practical question: does the prediction method (which typically uses information extrinsic to the seismicity, such as electrical signals) do a better job of predicting earthquakes than a reasonable seismologist could from the seismicity alone, using a simple model? If the answer to this question is 'no', one might question the utility of the prediction method.

The difficulty of specifying a unique stochastic model for earthquakes (point 1) is well known; see Kagan (1991) and Ogata (1988) for a variety of competing models.
The sensitivity of the conclusions about the earthquake predictions to details of the stochastic model of aftershocks and to the spatial variability of the rate of seismicity is well documented; see e.g. Honkura & Tanaka (1996), Mulargia & Gasperini (1992), Utada (1996) and Wyss & Allmann (1996). This note demonstrates that point (2) can substantially increase the chance of concluding erroneously that the predictions have merit, especially when the seismicity clusters. A similar conclusion was reached by Michael (1997) in evaluating earthquake predictions based on VLF anomalies. Point (3) has also been raised by a number of authors, including Kagan & Jackson (1996) and Mulargia & Gasperini (1996b). The explicit recognition that this approach to testing predictions is a conditional test appears to be new.

An alternative approach is to reformulate the null hypothesis to say that the predictions are no better than those of another, presumably simpler, method. In this approach, one conditions on the observed seismicity, rather than on the predictions. I assert that this approach overcomes all three of the problems just mentioned. One holds the seismicity fixed, and generates random predictions similar in character to the given predictions (for example, in the number of predictions, the length of alarms, etc.). The algorithm for generating random predictions should be sensible and causal: it should use only seismic information available before time t in predicting what will happen at time t. If the observed success rate is regularly exceeded by that of the random predictions, one concludes that the prediction method is not useful, in so far as it does no better than a particular crude automated strategy that uses only seismic information.
This approach does not rely on a stochastic model for seismicity; it allows for the possibility that the predictions are a function of seismicity up to the time of each prediction; and it explicitly compares the success rate with that of other methods that rely exclusively on seismic data, not extrinsic observations such as electrical signals. By deliberately introducing chance into the 'straw-man' prediction algorithm, one can assign a significance level to the test.

The approach was suggested by Stark (1996), but variants have been suggested by others. For example, Kagan (1996) suggested an extreme case of this approach, the 'automatic alarm' strategy, in which one issues an alarm after every sufficiently large event; he also compared the Varotsos, Alexopoulos & Nomicos (VAN; Varotsos et al. 1981) Greek earthquake predictions (Varotsos et al. 1996a) with a machine-learning approach. Mulargia, Marzocchi & Gasperini (1996) also explicitly compared the VAN predictions with a pattern-recognition algorithm that automatically produces an alarm whenever certain conditions are met. Aceves, Park & Strauss (1996) compared predictions derived by randomly sampling the historic catalogue with the predictions to be evaluated (VAN in their case); my principal objections to their approach are the attempt to remove clustering from both the predictions and the historical seismicity, and the fact that the randomization of the catalogue prevents the predictions from exploiting the seismic history.

The primary differences between this work and those just mentioned are the recognition of the conditional nature of the other approach to hypothesis testing; the deliberate introduction of chance into the comparison prediction algorithms in order to obtain a more traditional statistical test; and the rephrasing of the null hypothesis to be 'these predictions are no better than those of a (particular) automated strategy', rather than 'the observed successes of these predictions are chance coincidences'.
The introduction of chance allows one to adjust the prediction rate to match that of the method being tested in a straightforward way, and allows one to assign a significance level to the test.

2 SIMULATION MODEL

The points raised above can be illustrated by simulation. We shall model seismicity as a Gamma renewal process, which is one generalization of a Poisson process. In a Poisson process, the times between events (interevent times) are independent and identically distributed (iid) exponential random variables. In a Gamma renewal process, interevent times are iid Gamma(α, β) random variables. For α = 1, the Gamma distribution coincides with the exponential, yielding a Poisson process as a special case. For other values of α, we can produce synthetic seismicity sequences with more (α < 1) or less (α > 1) clustering than Poisson processes exhibit. Udias & Rice (1975) found that a Gamma renewal process with α = 0.509 (more clustering than Poisson) fits some seismicity sequences better than Poisson processes could. Fig. 1 shows three simulated realizations of each of three Gamma renewal processes, all with a rate of 15 events yr⁻¹. The top three sequences are for a Poisson process, the middle three are for a process with more clustering than a Poisson, and the bottom three are for a process with still more clustering. In spite of their ability to model clustered seismicity, Gamma renewal processes do not have aftershocks per se: a shock does not raise the chance of a new shock by some mechanism; rather, the clock restarts after every event, and the expected rate of seismicity, 1/(αβ), is constant over time. This is in contrast to some stochastic models, such as the epidemic-type aftershock sequence (ETAS) model (see Ogata 1988, 1993).
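A Gamma renewal sequence of this kind can be simulated with the Python standard library alone. In this sketch (the function name `gamma_renewal` is mine), `random.gammavariate(alpha, beta)` draws from the Gamma distribution in the scale parametrization, with mean αβ:

```python
import random

def gamma_renewal(alpha, beta, t_max, seed=None):
    """Event times of a Gamma renewal process on (0, t_max].  Interevent
    times are iid Gamma(alpha, beta) with mean alpha * beta, so the
    long-run rate is 1 / (alpha * beta) events per unit time."""
    rng = random.Random(seed)
    times, t = [], 0.0
    while True:
        t += rng.gammavariate(alpha, beta)  # scale parametrization
        if t > t_max:
            return times
        times.append(t)

# Three processes with the same rate, 15 events per year, as in Fig. 1:
poisson   = gamma_renewal(1.0, 1 / 15, 3.0, seed=0)  # alpha = 1: Poisson
clustered = gamma_renewal(0.5, 2 / 15, 3.0, seed=1)  # more clustering
lumpy     = gamma_renewal(0.1, 2 / 3,  3.0, seed=2)  # still more clustering
```

Each sequence covers 3 years with about 45 events on average; only the arrangement of the events in time differs.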
To give a scale to the simulations, we shall calibrate the model to correspond roughly with Greek seismicity from 1987-1989, as reported by SI-NOA and tabulated by Geller (1996). That tabulation shows an average of 15 events per year with magnitude Ms > 5.0. We thus fix the expected number of events per year, 1/(αβ), in our simulations to be 15. This will allow us to make a crude test of the VAN (Varotsos et al. 1981) earthquake predictions (e.g. Varotsos et al. 1996a), but this paper does not attempt a formal test of the VAN predictions.

Figure 1. Three simulated realizations of each of three Gamma renewal processes with the same rate, 15 events yr⁻¹: Gamma(1, 0.067) (Poisson, top), Gamma(0.5, 0.133) (middle) and Gamma(0.1, 0.66) (bottom). The top three sequences are simulations of a Poisson process, the middle three show more clustering, and the bottom three still more. There are no 'aftershocks' in these models: the interevent times are independent and identically distributed. The difference is in the weight in the tails of the interevent distributions.

Because the observed rate of seismicity is a sufficient statistic for the intensity of a homogeneous Poisson process, if we were to assume that the random seismicity sequences we generate were realizations of a Poisson process, and estimate the intensity by maximum likelihood, the expected value of the estimate would be 15 events yr⁻¹. (Nothing in estimating the intensity would alert us to the fact that the process was not Poisson, and we would estimate the entire family of Gamma renewal processes with fixed 1/(αβ) to be about the same Poisson process.)
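The sufficiency point can be checked numerically: the Poisson maximum-likelihood estimate of the intensity is simply N/T, the event count over the observation time, so it recovers roughly 15 events per year whatever the value of α. A sketch under the same Gamma renewal assumptions as above (the function name is mine):

```python
import random

def mean_poisson_mle(alpha, beta, t_max=3.0, n_runs=400, seed=0):
    """Average of the Poisson maximum-likelihood intensity estimate N/T over
    n_runs simulated Gamma(alpha, beta) renewal sequences.  The estimate
    sees only the event count, so it cannot distinguish renewal processes
    with the same mean interevent time alpha * beta."""
    rng = random.Random(seed)
    total_events = 0
    for _ in range(n_runs):
        t = rng.gammavariate(alpha, beta)
        while t <= t_max:
            total_events += 1
            t += rng.gammavariate(alpha, beta)
    return total_events / (n_runs * t_max)

rate_poisson   = mean_poisson_mle(1.0, 1 / 15)  # ~15 events per year
rate_clustered = mean_poisson_mle(0.5, 2 / 15)  # also ~15 events per year
```

The estimated intensities agree even though the two processes cluster very differently.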
We shall generate 3-year sequences of seismicity, which correspond roughly with the 1987-1989 VAN predictions, treating all of Greece as a single region (that is, ignoring spatial variations in the seismicity rate).

3 SIMULATION 'EXPERIMENTS'

3.1 First simulation

The first simulation illustrates points (1) and (2) of the Introduction. In this experiment, we simulate 3 years of Gamma(0.5, 0.133) seismicity (which tends to cluster more than a Poisson process with the same rate). From this sequence, we generate a random set of earthquake predictions: each time there is an event, we toss a coin. If the coin lands on 'heads', we predict that there will be another event within 2.5 weeks (issue an alarm with duration 2.5 weeks). If the coin lands on 'tails', we do not issue a prediction. If a new prediction is issued during an alarm, we extend the alarm time accordingly, without increasing the number of predictions. Applying this rule to the 'observed' seismicity sequence resulted in 15 predictions whose total duration was 0.83 years (28 per cent of the 3-year period). Note that 15 is somewhat less than 0.5 × 45, which would be the expected number of alarms if the alarm window were vanishingly short, rather than of 2.5 weeks duration. The smaller number of predictions results from the concatenation of overlapping alarms. For this sequence of 15 predictions, the observed success rate, the number of correctly predicted events divided by the total number of events, is 0.2.

Holding those 15 predictions fixed, we now generate 1000 3-year seismicity sequences from two renewal processes that have the same rate, 15 events yr⁻¹: a Gamma(0.5, 0.133) process (the same as the one that generated the 'observed' seismicity) and a Gamma(1, 0.067) process (Poisson).
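The coin-toss prediction rule can be sketched as follows. The function names are mine, and one detail the text leaves open is whether the event that triggers an alarm should itself count as 'predicted'; in this sketch it does not:

```python
import random

ALARM = 2.5 / 52  # 2.5 weeks, in years

def coin_toss_alarms(events, window=ALARM, p_heads=0.5, rng=None):
    """After each event, toss a coin; on 'heads' open an alarm of the given
    duration.  A 'heads' during an open alarm extends it without increasing
    the number of predictions, so the returned intervals are disjoint and
    len(intervals) is the number of predictions."""
    rng = rng or random.Random()
    alarms = []
    for t in events:
        if rng.random() < p_heads:
            if alarms and t <= alarms[-1][1]:
                alarms[-1] = (alarms[-1][0], t + window)  # extend open alarm
            else:
                alarms.append((t, t + window))
    return alarms

def success_rate(events, alarms):
    """Fraction of events occurring strictly inside an already-open alarm."""
    hits = sum(any(a < t <= b for a, b in alarms) for t in events)
    return hits / len(events) if events else 0.0
```

With 45 events and vanishingly short alarms the expected number of predictions would be 0.5 × 45 = 22.5; merging overlapping alarms is what reduces the count (to 15 in the realization reported above).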
From the empirical cumulative distribution functions (ecdfs) of the success rates of the fixed predictions on the two sets of random seismicity, we can estimate the success rate that corresponds to a significance-level 0.1 test of the null hypothesis (the critical value), and compare the observed success rate for the original sequence with the critical value to test the hypothesis. The results of the two sets of simulations are shown in Table 1, in which the symbol F0.9 denotes the 90th percentile of the success rate in 1000 simulations.

Perhaps surprisingly, this test rejects the null hypothesis even when it is true, that is, even when the model of the seismicity used in the test is the same as that used to generate the original 'observed' sequence, and the correct predictions succeed 'by chance'. This results from conditioning on the predictions: the predictions depend on the seismicity history. They exploit clustering in the sequence from which they derive by issuing alarms whose duration is somewhat longer than the median interevent time after some events. While each simulated seismicity sequence has about the same amount of clustering, the clusters are in different places, so the predictions are, on average, less successful. This is one of the reasons that conditioning on the predictions tends to yield erroneous conclusions.

Table 1. Results of simulations testing the hypothesis that successful predictions are chance coincidences, conditional on the predictions, with two different chance models for seismicity. Column 1: process used to model seismicity; column 2: 90th percentile of the fraction of events correctly predicted by the fixed predictions, equivalently, the critical value for a significance-level 0.1 conditional test of the null hypothesis.

Process                 F0.9   reject null hypothesis?   median false alarm rate
True Γ(0.5, 0.133)      0.18   YES                       0.53
Poisson Γ(1, 0.067)     0.17   YES                       0.47
The median false-alarm rate is higher for the true process than for a Poisson process with the same seismicity rate. This also results from clustering: in the extreme limit of clustering, all the simulated events would occur in one cluster, so if we had n alarms, at least n - 1 of them would have to be false alarms. On the other hand, conditional on the number of events, the times of the Poisson-distributed events are uniformly distributed over the 3-year interval, so we have a reasonable chance of no false alarms once the number of events exceeds the number of alarms.

3.2 Second simulation

The second simulation is designed to evaluate the proposed approach to testing the revised null hypothesis; namely, conditioning on the observed seismicity (holding it constant) and comparing the success rate of the observed predictions with that of randomly generated predictions. Using the original seismicity sequence of the first simulation, we generate 1000 sets of random predictions, corresponding to different realizations of the coin-tossing stage of that simulation. We then calculate the empirical cdf of the success rate of those predictions, and use its 90th percentile as the critical value of the test of the null hypothesis. The results of these simulations are shown in Table 2. In this test, the null hypothesis is appropriately not rejected. (Note, however, that the original realization of the coin tosses was particularly lucky!) This test behaves as we would like it to, and addresses point (3) in the Introduction: without using extrinsic information, a simple rule for predicting seismicity does as well as or better than the predictions being evaluated more than 10 per cent of the time. Thus the value of any extrinsic information the predictions use (in this case, there is none) has not been established. The null hypothesis that this method is no better than an automated strategy that uses only seismic information is not rejected.

Table 2.
Results of 1000 simulations to test the revised null hypothesis that the predictions are no better than those of an automatic method that uses no extra-seismic information, conditionally on the observed seismicity, rather than testing the null hypothesis that the successful predictions succeeded by chance, conditionally on the predictions. Column 1: 90th percentile of success rate of random predictions in 1000 trials.

3.3 Third simulation

The third simulation repeats the first two, but with the 'observed' seismicity generated from a Poisson process rather than a more clustered Gamma renewal process. In this case, the particular realization of the process and the coin tosses led to 13 predictions with a total alarm time of 0.69 years (23 per cent). 11 per cent of the events were successfully predicted, and the false-alarm rate was 38 per cent. Note that even for Poisson seismicity, the 90th percentile of the success rate of the random predictions is higher when we condition on the seismicity than when we condition on the predictions.

3.4 Fourth simulation

The fourth simulation apes a test of the VAN predictions (Varotsos et al. 1996a). To make a more accurate test would require accounting for the spatial heterogeneity of the seismicity rate in Greece, which is beyond the scope of the present work. According to the preliminary determination of epicenters (PDE), the number of events in Greece between 1987 and 1989 with magnitude greater than 4.7 is 39 (Geller 1996). During the same interval, the VAN group issued 23 predictions (Varotsos et al. 1996a); the claimed success rate is 38 per cent. The nominal duration of an alarm is about 23 days. Table 4 shows the result of testing the null hypothesis that the successful predictions are chance coincidences by conditioning on the predictions and modelling the seismicity either as a Poisson process or as a Gamma renewal process with the same rate. In both cases, the null hypothesis would have been rejected.
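The critical value of the revised test, namely the 90th percentile of the success rate of random predictions on the fixed observed seismicity, can be estimated by straightforward Monte Carlo. In this sketch the straw-man strategy is deliberately simplified to alarms placed uniformly at random in the observation window, not the causal coin-toss rule of the text, and all names are mine:

```python
import random

def success_rate(events, alarms):
    """Fraction of the fixed events covered by some alarm interval."""
    return sum(any(a <= t <= b for a, b in alarms) for t in events) / len(events)

def critical_value(events, n_alarms, window, t_max=3.0,
                   n_sim=1000, q=0.9, seed=0):
    """Hold the observed seismicity fixed and draw n_sim random sets of
    n_alarms alarms of the given duration; return the empirical q-quantile
    of their success rates.  The method under test must exceed this value
    to reject, at level 1 - q, the null hypothesis that it is no better
    than the random strategy."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_sim):
        starts = [rng.uniform(0.0, t_max) for _ in range(n_alarms)]
        alarms = [(s, s + window) for s in starts]
        rates.append(success_rate(events, alarms))
    return sorted(rates)[int(q * n_sim)]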
Table 5 shows the result of testing the null hypothesis that the VAN predictions are no better than an automated strategy that uses no extra-seismic information, conditional on the observed seismicity. The automatic strategy was to generate random 23-day 'alarms' from the observed seismicity, repeating the steps of the earlier simulations, but using a biased coin with probability p = 23/39 of landing on 'heads' (so that the expected number of predictions, assuming no overlapping alarms, agrees with the actual number of VAN predictions). This resulted in an average of 10 predictions with an average total alarm time of 0.95 years (32 per cent of the interval).

Table 3. Comparison of testing conditionally on the predictions, and conditionally on the seismicity, when the seismicity is simulated from a Poisson process.

Conditional on   F0.9   reject null hypothesis?   median false alarm rate
predictions      0.15   NO                        0.46
seismicity       0.17   NO                        0.45

Table 4. Test of an idealization of the null hypothesis that the successful VAN predictions, which are based on electrical signals, are chance coincidences, holding the predictions fixed and comparing with random seismicity. Column 1: 90th percentile of success rate of the predictions on random seismicity.

Seismicity model    F0.9   reject null hypothesis?   median false alarm rate
Poisson             0.17   YES                       0.30
Gamma(0.5, 0.15)    0.19   YES                       0.40

Table 5. Test of an idealization of the null hypothesis that the VAN predictions, which use electrical signals, are no better than an automated prediction method that uses no extra-seismic information, holding the seismicity fixed and comparing the VAN success rate with that of random predictions. Column 1: 90th percentile of success rate of random 23-day predictions on the true seismicity.

F0.9   reject null hypothesis?   median false alarm rate
0.21   NO                        0.33
0.49   NO                        0.30
The 90th percentile of the success rate of the random predictions was 49 per cent, so the null hypothesis would not be rejected.

4 DISCUSSION

The basic phenomena exhibited here, that is, that using a conditional hypothesis test or an inappropriate null hypothesis can be misleading, arise in other geophysical problems as well. For example, Hide & Malin (1970) found apparently statistically significant correlations between the geoid and the geomagnetic field. Eckhardt (1984) showed that those correlations are not meaningful, and that the impression of statistical significance came from an inappropriate null hypothesis, which included the assumption that the spectrum of the fields was white (rather than red, as most geophysical fields are), and from the fact that the significance level did not account for the fact that certain parameters were fitted to maximize the apparent correlation (in Hide & Malin's case, a rotation between the fields). Not accounting for the fact that certain parameters have been adjusted on the basis of the data amounts to a conditional test of the null hypothesis, which we have seen can be quite misleading. Similarly, Morelli & Dziewonski (1987) found that the correlations between core-mantle topography models inferred from PcP and PKP traveltimes were apparently significant. They used the same (inappropriate) null hypothesis as Hide & Malin, namely that the spectrum of CMB heterogeneity is white, and in assigning a significance level to the observed correlation they did not account for the calibration of various parameters, such as the number of singular functions retained in their damped least-squares estimates of topography (Stark 1995). These same effects are present in some assessments of earthquake predictions, for example the VAN predictions: Varotsos et al.
(1996b) explicitly advocate the assumption that the times, locations and magnitudes of events are jointly independent, which is almost certainly false; the null hypothesis has in many cases included the assumption that seismicity (possibly after some data processing to 'decluster' the sequences) has a Poisson distribution, which seems implausible to me; and the geographical, magnitude and temporal windows of the predictions were adjusted several times by the VAN group to improve the apparent success rate of the predictions, which has not generally been accounted for in determining the significance level (Kagan & Jackson 1996; Mulargia & Gasperini 1996a). Mulargia (1997) shows that if one accounts for the optimization of those parameters in evaluating the VAN predictions, the apparent significance disappears.

ACKNOWLEDGMENTS

I am grateful to S. N. Evans and J. R. Rice for helpful discussions. I received support from the NSF and NASA.

REFERENCES

Aceves, R.L., Park, S.K. & Strauss, D.J., 1996. Statistical evaluation of the VAN method using the historic earthquake catalog in Greece, Geophys. Res. Lett., 23, 1425-1428.
Eckhardt, D.H., 1984. Correlations between global features of terrestrial fields, Math. Geol., 16, 155-171.
Geller, R.J., 1996. Debate on evaluation of the VAN method: Editor's introduction, Geophys. Res. Lett., 23, 1291-1293.
Hide, R. & Malin, S.R.C., 1970. Novel correlations between global features of the Earth's gravitational and magnetic fields, Nature, 225, 605-609.
Honkura, Y. & Tanaka, N., 1996. Probability of earthquake occurrences in Greece with special reference to the VAN predictions, Geophys. Res. Lett., 23, 1417-1420.
Kagan, Y.Y., 1991. Likelihood analysis of earthquake catalogues, Geophys. J. Int., 106, 135-148.
Kagan, Y.Y., 1996. VAN earthquake predictions: an attempt at statistical evaluation, Geophys. Res. Lett., 23, 1315-1318.
Kagan, Y.Y. & Jackson, D.D., 1996.
Statistical tests of VAN earthquake predictions: comments and reflections, Geophys. Res. Lett., 23, 1433-1436.
Michael, A.J., 1997. The evaluation of VLF guided waves as possible earthquake precursors, Geophys. Res. Lett., in press.
Morelli, A. & Dziewonski, A.M., 1987. Topography of the core-mantle boundary and lateral homogeneity of the liquid core, Nature, 325, 678-683.
Mulargia, F., 1997. Retrospective validation of the time association of precursors, Geophys. J. Int., 131, 500-504 (this issue).
Mulargia, F. & Gasperini, P., 1992. Evaluating the statistical validity of 'VAN' earthquake precursors, Geophys. J. Int., 111, 32-44.
Mulargia, F. & Gasperini, P., 1996a. Precursor candidacy and validation: the VAN case so far, Geophys. Res. Lett., 23, 1323-1326.
Mulargia, F. & Gasperini, P., 1996b. VAN: candidacy and validation with the latest laws of the game, Geophys. Res. Lett., 23, 1327-1330.
Mulargia, F., Marzocchi, W. & Gasperini, P., 1996. Rebuttal to replies I and II by Varotsos et al., Geophys. Res. Lett., 23, 1339-1340.
Ogata, Y., 1988. Statistical models for earthquake occurrences and residual analysis for point processes, J. Am. Stat. Assn, 83, 9-27.
Ogata, Y., 1993. Fast likelihood computation of epidemic type aftershock-sequence model, Geophys. Res. Lett., 20, 2143-2146.
Riedel, K.S., 1996. Statistical tests for evaluating earthquake prediction methods, Geophys. Res. Lett., 23, 1407-1409.
Stark, P.B., 1995. Reply to Comment by Morelli and Dziewonski, J. geophys. Res., 100, 15 399-15 402.
Stark, P.B., 1996. A few considerations for ascribing statistical significance to earthquake predictions, Geophys. Res. Lett., 23, 1399-1402.
Udias, A. & Rice, J.R., 1975. Statistical analysis of microearthquake activity near San Andreas Geophysical Observatory, Bull. seism. Soc. Am., 65, 809-828.
Utada, H., 1996. Difficulty of statistical evaluation of an earthquake prediction method, Geophys. Res. Lett., 23, 1391-1394.
Varotsos, P., Alexopoulos, K.
& Nomicos, K., 1981. Seismic electric currents, Prakt. Akad. Athenon, 56, 277-286.
Varotsos, P., Eftaxias, K., Lazaridou, M., Dologlou, E. & Hadjicontis, V., 1996a. Reply to 'Probability of chance correlations of earthquakes with predictions in areas of heterogeneous seismicity rate: the VAN case', by M. Wyss and A. Allmann, Geophys. Res. Lett., 23, 1311-1314.
Varotsos, P., Eftaxias, K., Vallianatos, F. & Lazaridou, M., 1996b. Basic principles for evaluating an earthquake prediction method, Geophys. Res. Lett., 23, 1295-1298.
Wyss, M. & Allmann, A., 1996. Probability of chance correlations of earthquakes with predictions in areas of heterogeneous seismicity rate: the VAN case, Geophys. Res. Lett., 23, 1307-1310.