Eruptions of the Old Faithful Geyser

A geyser is a hot spring that occasionally becomes unstable and erupts hot water and steam into the air. The Old Faithful Geyser at Yellowstone National Park in Wyoming is probably the most famous geyser in the world. Visitors to the park try to arrive at the geyser site to see it erupt without waiting too long; the name of the geyser comes from the fact that eruptions follow a relatively stable pattern. The National Park Service erects a sign at the geyser predicting when the next eruption will occur, and also posts these predictions at the Old Faithful Geyser WebCam page (http://www.nps.gov/archive/yell/OldFaithfulcam.htm), which also includes a photo of the geyser that is updated roughly every 30 seconds (see the last page of this handout for an example). Thus, it is of interest to understand and predict the interval time until the next eruption.

The following analysis is based on a sample of 222 intereruption times taken during August 1978 and August 1979 (data source: Applied Linear Regression, 2nd ed., by S. Weisberg). The histogram below shows that, in fact, Old Faithful isn't as "faithful" as you might think: times between eruptions range between 40 and 100 minutes, with two apparent subgroups in the data (in fact, the geyser has become so popular not because it is the largest or most regular geyser in Yellowstone Park, but rather because it erupts more frequently than any of the other large geysers in the park).

© 2015, Jeffrey S. Simonoff

[Figure: histogram of time interval until next eruption (40 to 100 minutes) against frequency.]

Times between eruptions apparently center around 55 minutes roughly one-third of the time, and around 80 minutes roughly two-thirds of the time. The existence of two subgroups in this type of data is rare, but not unheard of; J.S. Rinehart, in a 1969 paper in the Journal of Geophysical Research, provides a mechanism for this pattern based on the temperature level of the water at the bottom of a geyser tube at the time the water at the top reaches boiling temperature.

A readily available characteristic of the geyser that might be used to forecast the time until the next eruption is the duration of the previous eruption. A scatter plot of time interval until the next eruption on duration of previous eruption looks quite linear, suggesting the use of a linear model relating the two variables:

[Figure: scatter plot of time interval until next eruption (40 to 100 minutes) versus duration of previous eruption (2 to 5 minutes).]

That a shorter eruption would be followed by a shorter time interval until the next eruption (and a longer eruption would be followed by a longer time interval) is also consistent with Rinehart's geyser model, since a short eruption is characterized by having more water at the bottom of the geyser being heated short of boiling temperature, and left in the tube. This water has been heated somewhat, however, so it takes less time for the next eruption to occur. A long eruption results in the tube being emptied, so the water must be heated from a colder temperature, which takes longer. A. Azzalini and A.W. Bowman provide further discussion of statistical analysis based on this model in a 1990 paper in Applied Statistics.

Here are the results of a regression of time interval until the next eruption on previous eruption duration:

Regression Analysis: Interval versus Duration

Analysis of Variance

Source          DF  Adj SS   Adj MS  F-Value  P-Value
Regression       1   27860  27859.9   734.56    0.000
  Duration       1   27860  27859.9   734.56    0.000
Error          220    8344     37.9
  Lack-of-Fit   32    1658     51.8     1.46    0.065
  Pure Error   188    6686     35.6
Total          221   36204

Model Summary

      S    R-sq  R-sq(adj)  R-sq(pred)
6.15853  76.95%     76.85%      76.57%

Coefficients

Term        Coef  SE Coef  T-Value  P-Value   VIF
Constant   33.97     1.43    23.79    0.000
Duration  10.358    0.382    27.10    0.000  1.00

Regression Equation

Interval = 33.97 + 10.358 Duration

Durbin-Watson statistic = 2.50204

The regression is very significant, with previous duration accounting for 77% of the variability in time until the next eruption. However, these data form a time series, so we need to check whether there is evidence of autocorrelation in the errors. Here is a time series plot of the standardized residuals:

[Figure: standardized residuals versus observation order (response is Interval).]

There isn't any obvious cyclical pattern here, so that seems good. We need to check the various tests for autocorrelation, however. The first is given with the regression output: a Durbin–Watson value of 2.50. The sample size here is pretty large, so we can construct an approximate z-statistic for this value,

$z = (DW/2 - 1)\sqrt{n} = (1.25 - 1)\sqrt{222} = 3.72.$

This is highly significant and positive, indicating negative autocorrelation in the errors. Note that this reinforces the difficulty in identifying negative autocorrelation from a time series plot, as it doesn't show up as a cyclical effect. Let's look at an ACF plot to see what the autocorrelation structure looks like.

[Figure: autocorrelation function for SRES1, lags 1 to 20.]

Lag   Corr      T    LBQ
  1  -0.26  -3.81  14.69
  2   0.13   1.79  18.37
  3  -0.02  -0.26  18.45
  4  -0.03  -0.40  18.65
  5   0.06   0.87  19.55
  6   0.04   0.61  20.01
  7   0.05   0.72  20.64
  8   0.07   0.98  21.83
  9   0.01   0.14  21.85
 10   0.06   0.82  22.69
 11   0.01   0.08  22.70
 12   0.02   0.25  22.78
 13  -0.03  -0.47  23.06
 14   0.06   0.83  23.96
 15  -0.07  -0.90  25.02
 16   0.02   0.23  25.09
 17   0.14   1.83  29.57
 18   0.02   0.22  29.64
 19   0.06   0.77  30.46
 20  -0.01  -0.14  30.49

The first-order autocorrelation is −.256, which is not overwhelmingly large, but is significantly negative. The evidence either for or against an AR(1) process is marginal; the second- and third-order autocorrelations are positive and negative, respectively, as desired, so the AR(1) model is probably not too bad an assumption. It might be difficult to see the structure in the graphical ACF plot, so here is a nongraphical version.

Autocorrelation Function

ACF of SRES1

           -1.0 -0.8 -0.6 -0.4 -0.2  0.0  0.2  0.4  0.6  0.8  1.0
             +----+----+----+----+----+----+----+----+----+----+
  1  -0.256                     XXXXXXX
  2   0.128                           XXXX
  3  -0.019                           X
  4  -0.029                          XX
  5   0.063                           XXX
  6   0.045                           XX
  7   0.052                           XX
  8   0.072                           XXX
  9   0.010                           X
 10   0.060                           XX
 11   0.006                           X
 12   0.019                           X
 13  -0.034                          XX
 14   0.061                           XXX
 15  -0.066                         XXX
 16   0.017                           X
 17   0.136                           XXXX
 18   0.016                           X
 19   0.058                           XX
 20  -0.011                           X

Finally, here is a runs test of the residuals, which also agrees that there is negative autocorrelation (too many runs):

Runs test for SRES1

Runs above and below K = 0

The observed number of runs = 128
The expected number of runs = 111.423
103 observations above K, 119 below
P-value = 0.025

Since AR(1) looked reasonable here, let's try the Cochrane–Orcutt procedure. After forming the "*" variables (remember that we need to add .256 times the lagged variable here, since the estimated autocorrelation is negative), here is the Cochrane–Orcutt regression:

Analysis of Variance

Source          DF  Adj SS   Adj MS  F-Value  P-Value
Regression       1   18975  18975.1   539.10    0.000
  durstar        1   18975  18975.1   539.10    0.000
Error          219    7708     35.2
  Lack-of-Fit  163    5935     36.4     1.15    0.276
  Pure Error    56    1773     31.7
Total          220   26683

Model Summary

      S    R-sq  R-sq(adj)  R-sq(pred)
5.93277  71.11%     70.98%      70.64%

Coefficients

Term       Coef  SE Coef  T-Value  P-Value   VIF
Constant  45.58     1.92    23.75    0.000
durstar   9.709    0.418    23.22    0.000  1.00

Regression Equation

intstar = 45.58 + 9.709 durstar

Durbin-Watson statistic = 2.04644

The regression is slightly less significant, but still strong. The fitted regression is Interval = 36.287 + 9.7086 × Duration (after correcting the constant term, that is, dividing the fitted constant by 1 − ρ̂ = 1.256), which represents a small increase in the constant and small decrease in the slope coefficient, and implies that each additional minute's duration of the previous eruption is associated with an estimated expected 9.7 additional minutes until the next eruption. Note that to apply this model and make predictions you must use this formula in the calculator directly, as predictions that come from within the Cochrane–Orcutt regression will not be correct.

Has the autocorrelation been removed? The time series plot and ACF plot of the residuals look good:

[Figure: standardized residuals versus observation order (response is intstar).]

[Figure: autocorrelation function for SRES2, lags 1 to 20.]

Lag   Corr      T    LBQ
  1  -0.03  -0.39   0.15
  2   0.09   1.30   1.87
  3  -0.01  -0.22   1.92
  4  -0.02  -0.33   2.04
  5   0.05   0.80   2.71
  6   0.07   1.01   3.79
  7   0.07   1.06   4.99
  8   0.08   1.20   6.57
  9   0.04   0.57   6.93
 10   0.07   0.98   8.01
 11   0.01   0.21   8.06
 12   0.02   0.26   8.13
 13  -0.03  -0.47   8.39
 14   0.04   0.65   8.87
 15  -0.07  -1.01  10.05
 16   0.05   0.67  10.59
 17   0.13   1.88  14.80
 18   0.07   1.03  16.10
 19   0.06   0.89  17.09
 20  -0.02  -0.30  17.20

Autocorrelation Function

ACF of SRES2

           -1.0 -0.8 -0.6 -0.4 -0.2  0.0  0.2  0.4  0.6  0.8  1.0
             +----+----+----+----+----+----+----+----+----+----+
  1  -0.026                          XX
  2   0.087                           XXX
  3  -0.015                           X
  4  -0.022                          XX
  5   0.054                           XX
  6   0.069                           XXX
  7   0.072                           XXX
  8   0.083                           XXX
  9   0.040                           XX
 10   0.068                           XXX
 11   0.015                           X
 12   0.018                           X
 13  -0.033                          XX
 14   0.045                           XX
 15  -0.070                         XXX
 16   0.047                           XX
 17   0.132                           XXXX
 18   0.073                           XXX
 19   0.064                           XXX
 20  -0.021                          XX

The Durbin–Watson test is not significant ($z = (2.05/2 - 1)\sqrt{221} = .37$), and neither is the runs test:

Runs test for SRES2

Runs above and below K = 0

The observed number of runs = 102
The expected number of runs = 111.317
106 observations above K, 115 below
P-value = 0.208

Finally, a residual versus fitted plot indicates some slight nonconstant variance, but otherwise there don't appear to be any problems:

[Figure: standardized residuals versus fitted values (response is intstar).]

[Figure: normal probability plot of the standardized residuals (response is intstar).]
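
The autocorrelation test statistics quoted in this handout can be reproduced from the printed summary numbers alone. The following Python sketch is an illustration, not part of the original Minitab analysis: the function names are hypothetical, and the runs-test variance is the usual large-sample formula, which is assumed (the handout does not say) to be what the software uses.

```python
import math

def dw_z(dw, n):
    """Approximate z-statistic for a Durbin-Watson value: (DW/2 - 1) * sqrt(n)."""
    return (dw / 2.0 - 1.0) * math.sqrt(n)

def runs_test(n_runs, n_above, n_below):
    """Large-sample runs test (assumed formula).

    Returns (expected number of runs, z-statistic, two-sided p-value).
    """
    n = n_above + n_below
    two_ab = 2.0 * n_above * n_below
    expected = two_ab / n + 1.0
    variance = two_ab * (two_ab - n) / (n ** 2 * (n - 1))
    z = (n_runs - expected) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return expected, z, p_value

# OLS residuals (SRES1): rounded DW = 2.50 on n = 222; 128 runs, 103 above / 119 below
print(dw_z(2.50, 222), runs_test(128, 103, 119))

# Cochrane-Orcutt residuals (SRES2): rounded DW = 2.05 on n = 221; 102 runs, 106 / 115
print(dw_z(2.05, 221), runs_test(102, 106, 115))
```

Under these assumptions the printed values agree, to rounding, with the handout: z-statistics of about 3.72 and .37, expected runs of 111.423 and 111.317, and runs-test p-values of about 0.025 and 0.208. A positive runs-test z (more runs than expected) points the same way as a Durbin–Watson value above 2, namely toward negative autocorrelation.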

Row     SRES2        HI2      COOK2
  1         *          *          *
  2  -0.06888  0.0059557  0.0000142
  3  -1.21031  0.0058103  0.0042805
  4  -0.15962  0.0059429  0.0000762
  5   1.68232  0.0045307  0.0064406
  6   1.75803  0.0057982  0.0090123
  7  -1.11728  0.0109818  0.0069305
  8   1.50206  0.0076968  0.0087500
  9   0.85784  0.0170279  0.0063739
 10  -1.23435  0.0080758  0.0062023
 11   0.54326  0.0162340  0.0024351
 12  -0.94893  0.0060030  0.0027191
 13   0.66322  0.0045633  0.0010082
 14   0.57954  0.0068265  0.0011543
 15   0.63161  0.0186940  0.0037999
 16   1.12708  0.0046435  0.0029631
 17  -0.28909  0.0047412  0.0001991
 18  -1.72482  0.0054976  0.0082229
 19   2.10828  0.0049836  0.0111311
 20  -1.35300  0.0183074  0.0170693
 21   2.00986  0.0045500  0.0092319
 22   0.17931  0.0178870  0.0002928
 23   1.85113  0.0079380  0.0137092
 24   0.21255  0.0200963  0.0004632
 25   0.10315  0.0062935  0.0000337
 26  -0.75234  0.0130769  0.0037499
 27  -0.87564  0.0058799  0.0022675
 28   1.15510  0.0060954  0.0040913
 29   1.34962  0.0077734  0.0071350
 30  -0.69286  0.0104149  0.0025262
 31   1.92303  0.0045755  0.0084990
 32  -0.07669  0.0175117  0.0000524
 33   0.52384  0.0062935  0.0008690
 34  -1.31246  0.0158828  0.0139003
 35   0.64940  0.0067620  0.0014356
 36  -0.20470  0.0155014  0.0003299
 37  -0.27955  0.0061450  0.0002416
 38  -1.01452  0.0144302  0.0075348
 39   1.73533  0.0057832  0.0087584
 40   0.48717  0.0053449  0.0006377
 41   0.12848  0.0047976  0.0000398
 42  -0.84575  0.0046481  0.0016701
 43  -1.92309  0.0073751  0.0137390
 44   2.06020  0.0047462  0.0101206
 45   0.43925  0.0047920  0.0004645
 46   0.25661  0.0049230  0.0001629
 47   0.38638  0.0049230  0.0003693
 48   1.00064  0.0096617  0.0048843
 49   0.91621  0.0066255  0.0027994
 50  -0.85636  0.0074114  0.0027378
 51   0.88401  0.0048604  0.0019084
 52  -1.93311  0.0048542  0.0091141
 53   2.25204  0.0045928  0.0117003
 54   0.02621  0.0052448  0.0000018
 55   1.24228  0.0112750  0.0087993
 56   0.70011  0.0057622  0.0014204
 57   0.07085  0.0072207  0.0000183
 58   0.83656  0.0081978  0.0028923
 59   1.51139  0.0045642  0.0052369
 60   0.21446  0.0179654  0.0004207
 61  -0.88106  0.0050408  0.0019664
 62  -1.64480  0.0057123  0.0077713
 63  -0.01817  0.0081366  0.0000014
 64   0.47416  0.0144302  0.0016459
 65   0.05791  0.0053329  0.0000090
 66  -0.04551  0.0046113  0.0000048
 67   0.80049  0.0046481  0.0014962
 68   0.41952  0.0196076  0.0017600
 69  -0.95180  0.0061450  0.0028006
 70   0.20459  0.0047002  0.0000988
 71  -0.74726  0.0053449  0.0015003
 72   1.60182  0.0047976  0.0061846
 73  -0.00490  0.0213885  0.0000003
 74   2.25516  0.0060030  0.0153570
 75   0.30717  0.0174347  0.0008371
 76   1.08464  0.0045396  0.0026825
 77   0.34003  0.0183074  0.0010781
 78   0.68550  0.0052089  0.0012303
 79  -0.46033  0.0151615  0.0016311
 80  -0.32160  0.0062935  0.0003275
 81   2.25055  0.0053685  0.0136691
 82   2.39998  0.0048287  0.0139739
 83   0.00744  0.0171417  0.0000005
 84   1.73613  0.0050408  0.0076354
 85  -1.18241  0.0170658  0.0121369
 86  -0.09986  0.0045500  0.0000228
 87   1.31411  0.0178870  0.0157256
 88  -1.19477  0.0067620  0.0048591
 89  -0.73568  0.0086700  0.0023667
 90  -0.28371  0.0056957  0.0002305
 91  -0.39396  0.0077734  0.0006080
 92   0.12193  0.0170658  0.0001291
 93   1.07190  0.0056273  0.0032511
 94  -1.04561  0.0134139  0.0074325
 95  -0.34384  0.0047703  0.0002833
 96  -1.01959  0.0093492  0.0049054
 97  -1.29061  0.0072207  0.0060574
 98   1.75015  0.0066879  0.0103116
 99   0.49492  0.0060820  0.0007494
100   1.16295  0.0065234  0.0044403
101   0.04542  0.0072459  0.0000075
102  -0.56352  0.0077159  0.0012346
103   0.93597  0.0144302  0.0064132
104   0.44439  0.0057503  0.0005711
105  -0.75373  0.0134139  0.0038621
106  -0.69250  0.0078836  0.0019054
107   0.36702  0.0080223  0.0005447
108   1.12507  0.0066879  0.0042613
109   0.75279  0.0073932  0.0021104
110  -0.54619  0.0103779  0.0015642
111   0.04232  0.0147926  0.0000134
112   0.06887  0.0069359  0.0000166
113  -1.47294  0.0127465  0.0140055
114  -1.37786  0.0071164  0.0068036
115  -0.18155  0.0075977  0.0001262
116  -0.49365  0.0055321  0.0006778
117  -1.31791  0.0096031  0.0084206
118  -1.15149  0.0098635  0.0066043
119  -1.74640  0.0088832  0.0136680
120  -0.89170  0.0118107  0.0047517
121  -0.92055  0.0118231  0.0050695
122   1.17035  0.0080960  0.0055899
123  -1.35648  0.0080223  0.0074403
124  -1.13356  0.0060820  0.0039315
125  -0.19884  0.0059429  0.0001182
126   0.33259  0.0088609  0.0004945
127   0.24002  0.0072207  0.0002095
128  -0.64337  0.0081978  0.0017107
129  -1.04058  0.0063799  0.0034762
130   1.27417  0.0054618  0.0044580
131  -2.04268  0.0049464  0.0103709
132  -0.27925  0.0079427  0.0003122
133   0.82585  0.0109546  0.0037770
134  -1.32010  0.0074968  0.0065816
135   0.13080  0.0130949  0.0001135
136   0.19253  0.0106685  0.0001999
137  -0.13495  0.0083149  0.0000764
138  -0.21362  0.0098882  0.0002279
139   0.26297  0.0053643  0.0001865
140   0.39638  0.0056841  0.0004491
141   0.23489  0.0065234  0.0001811
142  -0.09010  0.0178870  0.0000739
143  -0.70973  0.0052089  0.0013188
144   0.31348  0.0065385  0.0003234
145  -1.00221  0.0124828  0.0063483
146  -1.10221  0.0113654  0.0069831
147  -0.67211  0.0127158  0.0029090
148  -1.61388  0.0094594  0.0124368
149  -1.40123  0.0115626  0.0114841
150   0.42158  0.0109431  0.0009832
151  -0.02521  0.0055865  0.0000018
152   1.80438  0.0075710  0.0124188
153  -0.20277  0.0096031  0.0001993
154  -0.28322  0.0112472  0.0004562
155  -0.34754  0.0083149  0.0005064
156  -1.05507  0.0137249  0.0077454
157   0.61148  0.0069359  0.0013058
158  -1.60721  0.0155014  0.0203362
159   1.41389  0.0074784  0.0075313
160  -1.33303  0.0124225  0.0111760
161  -0.12588  0.0054107  0.0000431
162  -0.17728  0.0082271  0.0001303
163   0.60297  0.0055321  0.0010112
164   0.34057  0.0096031  0.0005623
165  -0.08123  0.0151615  0.0000508
166  -1.20173  0.0069359  0.0050433
167  -1.15250  0.0096272  0.0064558
168  -0.32793  0.0115464  0.0006281
169  -0.35233  0.0074968  0.0004688
170  -1.37297  0.0093592  0.0089046
171  -0.99730  0.0069526  0.0034817
172  -0.56347  0.0045256  0.0007217
173  -0.11126  0.0060555  0.0000377
174  -0.75358  0.0058399  0.0016679
175   0.66287  0.0081366  0.0018023
176  -0.84999  0.0045256  0.0016423
177  -0.62252  0.0060555  0.0011805
178   0.83642  0.0078204  0.0027571
179  -1.78426  0.0075153  0.0120533
180   1.47592  0.0052723  0.0057728
181   1.23528  0.0196076  0.0152589
182   1.55052  0.0067620  0.0081836
183   0.69415  0.0118396  0.0028866
184  -1.43357  0.0112354  0.0116763
185  -0.88078  0.0151466  0.0059655
186  -0.92324  0.0124225  0.0053609
187  -0.69250  0.0078836  0.0019054
188   1.10789  0.0137249  0.0085404
189   0.66696  0.0069359  0.0015535
190  -0.70752  0.0127465  0.0032315
191  -1.01764  0.0107815  0.0056435
192   0.63335  0.0107103  0.0021714
193   0.40465  0.0144158  0.0011975
194   1.16116  0.0048630  0.0032944
195  -0.16808  0.0175889  0.0002529
196  -1.12141  0.0066100  0.0041839
197  -0.44710  0.0068754  0.0006919
198  -1.17590  0.0124828  0.0087393
199   0.59300  0.0113654  0.0020213
200  -0.52561  0.0049259  0.0006838
201   0.14019  0.0060422  0.0000597
202  -0.36837  0.0127772  0.0008781
203  -0.44018  0.0073033  0.0007127
204  -1.77468  0.0118396  0.0188677
205   0.79171  0.0178479  0.0056953
206   1.82168  0.0046300  0.0077182
207   0.38157  0.0084198  0.0006181
208  -0.30627  0.0186940  0.0008935
209   0.59768  0.0051178  0.0009188
210  -0.00677  0.0080023  0.0000002
211  -1.17299  0.0118520  0.0082515
212   1.07970  0.0074968  0.0044027
213  -0.34709  0.0070536  0.0004279
214  -0.98962  0.0183074  0.0091319
215   0.82259  0.0067620  0.0023033
216  -0.03482  0.0155014  0.0000095
217  -0.24113  0.0056273  0.0001645
218   0.69115  0.0121346  0.0029339
219   0.85232  0.0048301  0.0017629
220  -1.30085  0.0131083  0.0112382
221  -0.84667  0.0122602  0.0044489
222   1.11647  0.0111919  0.0070543

Thus, the given line seems reasonable as a way to predict the time until the next eruption of the geyser, using the easily available duration of the previous eruption. A rough 95% prediction interval for the time until the next eruption would be

$\widehat{\text{Interval}} \pm 2\hat{\sigma}/\sqrt{1 - \hat{\rho}^2} = \widehat{\text{Interval}} \pm (2)(5.933)/\sqrt{1 - (-.256)^2} = \widehat{\text{Interval}} \pm (2)(6.14) = \widehat{\text{Interval}} \pm 12.28.$

A rough 90% prediction interval would be $\widehat{\text{Interval}} \pm (1.65)(6.14) = \widehat{\text{Interval}} \pm 10.13$. You might have noted that the interval given on the Old Faithful WebCam site was of the form Predicted time ± 10, so apparently it corresponds to a 90% prediction interval.

We can verify the usefulness of this model by validating it on a separate set of observations. The 1990 Azzalini and Bowman article includes data for 296 eruptions in August 1985.
If the OLS and GLS (Cochrane–Orcutt) equations are applied to these new data, the errors have the following statistics:

Descriptive Statistics: GLS error, OLS error

Variable    N  N*   Mean  Median  TrMean  StDev  SE Mean
GLS erro  230  68  2.012   1.223   1.745  6.823    0.450
OLS erro  230  68  2.066   1.369   1.833  6.735    0.444

Variable  Minimum  Maximum      Q1     Q3
GLS erro  -10.762   32.879  -3.711  6.032
OLS erro  -11.865   32.600  -2.871  6.397

In this case there is little to choose between the two models. Both underestimate the time intervals by roughly two minutes on average, with standard deviations slightly higher than the estimated standard deviation of errors from the original data.

As is typical, the Prais–Winsten procedure gives a similar result:

Analysis of Variance

Source          DF  Adj SS   Adj MS  F-Value  P-Value
Regression       1   19033  19032.8   534.14    0.000
  durstar2       1   19033  19032.8   534.14    0.000
Error          220    7839     35.6
  Lack-of-Fit  164    6066     37.0     1.17    0.253
  Pure Error    56    1773     31.7
Total          221   26872

Model Summary

      S    R-sq  R-sq(adj)  R-sq(pred)
5.96933  70.83%     70.69%      70.36%

Coefficients

Term       Coef  SE Coef  T-Value  P-Value   VIF
Constant  45.46     1.93    23.55    0.000
durstar2  9.722    0.421    23.11    0.000  1.00

Regression Equation

intstar2 = 45.46 + 9.722 durstar2

You might have wondered about the possibility of just using the lagged version of Interval as a predictor in a regression, as we've done earlier. In fact, the regression on just lagged interval is clearly inferior to using the duration of the previous eruption, although the autocorrelation is addressed:

Analysis of Variance

Source               DF  Adj SS   Adj MS  F-Value  P-Value
Regression            1   14797  14797.0   151.72    0.000
  Lagged interval     1   14797  14797.0   151.72    0.000
Error               219   21358     97.5
  Lack-of-Fit        47    3291     70.0     0.67    0.948
  Pure Error        172   18067    105.0
Total               220   36155

Model Summary

      S    R-sq  R-sq(adj)  R-sq(pred)
9.87547  40.93%     40.66%      39.96%

Coefficients

Term                Coef  SE Coef  T-Value  P-Value   VIF
Constant          116.44     3.75    31.05    0.000
Lagged interval  -0.6399   0.0519   -12.32    0.000  1.00

Regression Equation

Interval = 116.44 - 0.6399 Lagged interval

Adding the duration variable (resulting in two predictors based on the previous eruption) yields a model that is comparable to the earlier models, but it is based on two predictors, rather than one, and would thus be harder to implement "on the fly":

Analysis of Variance

Source               DF   Adj SS   Adj MS  F-Value  P-Value
Regression            2  28448.8  14224.4   402.40    0.000
  Duration            1  13651.9  13651.9   386.20    0.000
  Lagged interval     1    635.6    635.6    17.98    0.000
Error               218   7706.1     35.3
  Lack-of-Fit       189   7196.2     38.1     2.17    0.008
  Pure Error         29    509.8     17.6
Total               220  36154.9

Model Summary

      S    R-sq  R-sq(adj)  R-sq(pred)
5.94550  78.69%     78.49%      78.14%

Coefficients

Term                Coef  SE Coef  T-Value  P-Value   VIF
Constant           50.14     4.06    12.35    0.000
Duration           9.159    0.466    19.65    0.000  1.59
Lagged interval  -0.1673   0.0395    -4.24    0.000  1.59

Regression Equation

Interval = 50.14 + 9.159 Duration - 0.1673 Lagged interval

We should note a mistake that we've made here: since these data have gaps corresponding to new days, cases 14, 27, 40, 54, 68, 82, 95, 108, 122, 136, 150, 164, 180, 194, and 208 should be considered missing in the Cochrane–Orcutt fit, since the lagged duration and interval are not known for those cases (case 1 is of course taken as missing). If this is done the results don't change appreciably.
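
The Cochrane–Orcutt steps used in this handout (forming the "*" variables, correcting the constant term, and building the rough prediction interval) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Minitab procedure itself: the function names are hypothetical, the autocorrelation estimate is taken as given rather than re-estimated, and the day-boundary gaps noted above are ignored.

```python
import math

def ols(x, y):
    """Least-squares fit of y on a single predictor x; returns (intercept, slope)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def star(series, rho):
    """Form the '*' variable v_t - rho * v_(t-1); the first case is lost."""
    return [cur - rho * prev for prev, cur in zip(series, series[1:])]

def cochrane_orcutt(y, x, rho):
    """Fit on the '*' variables, then return (intercept, slope) on the original scale."""
    b0_star, b1 = ols(star(x, rho), star(y, rho))
    return b0_star / (1.0 - rho), b1  # only the constant needs correcting

def rough_half_width(sigma_hat, rho, multiplier=2.0):
    """Half-width of the rough prediction interval: multiplier * sigma / sqrt(1 - rho^2)."""
    return multiplier * sigma_hat / math.sqrt(1.0 - rho ** 2)

# Values printed in the handout's Cochrane-Orcutt output
rho = -0.256           # lag-1 autocorrelation of the OLS residuals
b0 = 45.58 / (1.0 - rho)   # corrected constant, close to the reported 36.287
b1 = 9.709             # slope carries over unchanged

def predict_interval(duration):
    """Predicted minutes until the next eruption for a given previous duration."""
    return b0 + b1 * duration

print(predict_interval(4.0), rough_half_width(5.933, rho), rough_half_width(5.933, rho, 1.65))
```

With these inputs a 4-minute eruption gives a predicted wait of roughly 75 minutes, with half-widths of about 12.3 minutes (95%) and 10.1 minutes (90%), matching the arithmetic in the text. Because rho is negative, `star` adds .256 times the lagged value, as the handout reminds the reader.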