Auxiliary Material
To deepen the analysis of the forecasting results obtained from the best models for each river,
scatterplots and the relationships between residuals and model outputs are shown in Fig. 1. For
one-day ahead forecasting of Helge River streamflow, peak forecasts by the SETAR model
are closer to the 45-degree fit line than those of the other models. Additionally, the residual plots
for the models show that the points are well distributed on both sides of the red horizontal line
at zero ordinate, which indicates the mean of the residuals. This implies independence and
random distribution of the errors, a necessary requirement for a good model (Singh et al. 2009).
Scatter plots of residuals for Helge River show that all models fit the observations well for 1-day
ahead forecasting. For 7-day ahead forecasting, the residuals cluster above the zero line,
indicating a correlated and non-normal error distribution. This is further supported by the
corresponding coefficients of determination: $R^2 = 0.001$, $R^2 = 0.005$, and $R^2 = 0.002$ for 1-day
ahead forecasting with SETAR, k-nnT, and ANN-PSR, respectively, and $R^2 = 0.007$, $R^2 = 0.030$, and
$R^2 = 0.071$, respectively, for 7-day ahead forecasting with the same models.
For Ljusnan River, all models gave reasonable and similar results for 1-day ahead
forecasting, with ANN-P being the best predictor. For 7-day ahead forecasting, the SETAR
model forecasts are closer to the 45-degree fit line. However, there is a decreasing trend in the
residual plots resulting from under- and over-estimation. This phenomenon is clearer in the 7-day
ahead forecasts. For instance, the clustered SETAR model residuals indicate that forecasts
below the threshold value, r = 243 m³ s⁻¹, are over-estimated and forecasts above the
threshold are under-estimated. The complex dynamics of this streamflow
amplify such behavior of the models. The peak flows of this river are over-estimated, leading
to a weaker relationship for flows above 400 m³ s⁻¹. Thus, plotting residuals as a function of
forecasted values helps to examine the behavior of the models over specific ranges of flow.
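The residual diagnostic discussed above can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: it assumes that residuals are defined as observed minus forecasted flow and that the reported $R^2$ values come from a straight-line fit of residuals against forecasts, neither of which the text states explicitly. The function name `residual_r2` is hypothetical.

```python
from statistics import mean


def residual_r2(observed, forecast):
    """R^2 of a straight-line fit of residuals against forecasts.

    Assumption: residual = observed - forecast. A value near zero means
    the residuals carry no linear dependence on the forecasted flow.
    """
    residuals = [o - f for o, f in zip(observed, forecast)]
    f_bar, r_bar = mean(forecast), mean(residuals)
    sxx = sum((f - f_bar) ** 2 for f in forecast)
    syy = sum((r - r_bar) ** 2 for r in residuals)
    sxy = sum((f - f_bar) * (r - r_bar) for f, r in zip(forecast, residuals))
    if sxx == 0 or syy == 0:
        # Constant forecasts or constant residuals: no linear relation to measure.
        return 0.0
    return sxy ** 2 / (sxx * syy)
```

A trend in the residual plot, such as the decreasing trend described for Ljusnan River, would show up as a value clearly above zero.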
Fig. 1 1-day and 7-day ahead forecasting performance for the best models and scatter plots of
residuals with forecasts for Helge, Ljusnan, and Kalix Rivers.
Helge River 1-day ahead forecasts: (a) SETAR, (b) k-nnT, (c) ANN-PSR
Helge 7-day ahead forecasts: (d) SETAR, (e) k-nnT, (f) ANN-PSR
Ljusnan 1-day ahead forecasts: (g) SETAR, (h) k-nnT, (i) ANN-P
Ljusnan 7-day ahead forecasts: (j) SETAR, (k) k-nnT, (l) ANN-P
Kalix 1-day ahead forecasts: (m) SETAR, (n) k-nnA, (o) ANN-P
Kalix 7-day ahead forecasts: (p) SETAR, (q) k-nnT, (r) ANN-PSR
Finally, for Kalix River all models show superior performance, including for peak flows, and
the residuals are evenly distributed around the zero line. This indicates that the models are
adequate and reliable. For this river, the best model for 7-day ahead forecasting is SETAR,
since its forecasts are closer to the fit line and its residuals are evenly
distributed. Above about 700 m³ s⁻¹, all model forecasts are poor except those of SETAR, since this
model includes a separate model above the threshold value of 690 m³ s⁻¹. Values
beyond the threshold are not well captured by the other models: their residuals
display more scatter around the zero line, which indicates under- or over-estimation of the
forecasts.
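To make the role of the threshold concrete, the following is a minimal sketch of a two-regime SETAR one-step forecast. Only the threshold of 690 m³ s⁻¹ comes from the text; the function name, the coefficient layout, the delay parameter, and the coefficient values used in the example are hypothetical and are not the fitted models of this study.

```python
def setar_forecast(history, threshold, low_coefs, high_coefs, delay=1):
    """One-step forecast from a two-regime SETAR model (illustrative sketch).

    history    : past flows, most recent value last
    threshold  : value r that separates the two regimes
    low_coefs  : (intercept, phi_1, ..., phi_p) used when the
                 threshold variable is at or below r
    high_coefs : same layout, used when the threshold variable exceeds r
    delay      : lag of the threshold variable (assumed to be 1 here)
    """
    regime = high_coefs if history[-delay] > threshold else low_coefs
    intercept, phis = regime[0], regime[1:]
    # Autoregressive sum over the most recent len(phis) observations.
    return intercept + sum(phi * history[-k] for k, phi in enumerate(phis, start=1))


# Hypothetical coefficients; 690 is the Kalix River threshold quoted in the text.
high_flow = setar_forecast([600.0, 700.0], 690.0, (10.0, 1.0), (100.0, 0.5, 0.25))
low_flow = setar_forecast([600.0, 650.0], 690.0, (10.0, 1.0), (100.0, 0.5, 0.25))
```

Because each regime has its own autoregressive coefficients, flows above the threshold are forecast by a model fitted only to high-flow conditions, which is consistent with the better behavior of SETAR reported above 700 m³ s⁻¹.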
Fig. 2 CE (-) and MAPE (%) performance indices obtained for the testing period of 1 January 2010 to 31 December 2012 as a function of forecast horizon for Helge, Ljusnan, and Kalix Rivers.
In Fig. 2, the highest and most stable CE indices were obtained for Kalix River, and the lowest
and least persistent CE indices were obtained for Ljusnan River, the rivers for which the lowest and highest
complexities were found, respectively. The variation of CE with lead
time is most drastic for Ljusnan River, which has the highest complexity. The MAPE indices of Ljusnan and
Kalix Rivers behave similarly, although those of Kalix River are smaller; the highest MAPE indices were
obtained for Helge River. For 7-day ahead forecasting, the best model performance was
obtained for Kalix River, which has the lowest complexity, and the worst was
obtained for Ljusnan River.
Generally speaking, CE values remain high across lead times for SETAR models, as
well as for AR models, owing to the long-term temporal persistence of observed discharge.
The gradient of CE and MAPE values with lead time is steeper for the ANN and k-nn models than
for SETAR, due to error accumulation for the ANN and error propagation resulting from
nearest neighbor estimates for the k-nn.
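For readers who want to reproduce indices of this kind, a minimal sketch follows. It assumes that CE denotes the Nash-Sutcliffe coefficient of efficiency and that MAPE is the mean absolute percentage error in its usual form; the auxiliary material does not define either index, so both definitions are assumptions, and the function names are hypothetical.

```python
def coefficient_of_efficiency(observed, forecast):
    """CE, assumed here to be the Nash-Sutcliffe efficiency:
    1 - SSE / (sum of squared deviations of observations from their mean)."""
    obs_mean = sum(observed) / len(observed)
    sse = sum((o - f) ** 2 for o, f in zip(observed, forecast))
    sst = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - sse / sst


def mape(observed, forecast):
    """Mean absolute percentage error in percent (observed flows must be non-zero)."""
    n = len(observed)
    return 100.0 / n * sum(abs((o - f) / o) for o, f in zip(observed, forecast))
```

Evaluating both functions separately for each forecast horizon gives curves of the kind described for Fig. 2, where a steeper decline in CE or rise in MAPE indicates faster loss of skill with lead time.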
The error distribution of forecasts at various lead times is an important tool in assessing
the performance of the models. In assessing the performance of a forecasting model at larger
lead times, besides examining the distribution of errors, it is important to evaluate the average
prediction error (Nayak et al. 2005). Average absolute relative error (AARE) is a useful tool
for testing the effectiveness of a model since the performance indices based on correlation
between the observed and the forecasted values might be informative in estimating the
continuous behavior of models through lead time. For this purpose, AARE and the
threshold statistic (TS) were calculated (Aqil et al. 2007; Nayak et al.
2005). This was done to gain insight into both the models' performance and
the behavior of error accumulation with lead time. The criterion was calculated
as (Nayak et al. 2005):
$$\mathrm{AARE} = \frac{1}{N} \sum_{t=1}^{N} \left| RE_t \right| \qquad (1)$$
where $N$ is the total number of testing patterns. The relative error $RE_t$ at time $t$ is calculated as:
$$RE_t\,(\%) = \frac{Q_t^{o} - Q_t^{f}}{Q_t^{o}} \times 100 \qquad (2)$$
where $Q_t^{o}$ is the observed and $Q_t^{f}$ is the forecasted streamflow at time $t$. The threshold statistic at the $x\%$
level ($TS_x$) can then be defined as:
$$TS_x = \frac{Q_x}{N} \times 100 \qquad (3)$$
where $Q_x$ is the number of forecasted streamflow values, out of the $N$ computed, for which the
absolute relative error is less than $x\%$. To obtain clear results, cumulative
frequencies are estimated from TS statistics in increments of 10.
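Equations (1) to (3) translate directly into code. The sketch below follows the definitions as given above; the function names are hypothetical, and observed flows are assumed to be non-zero so that the relative error is defined.

```python
def relative_errors(observed, forecast):
    """Relative error RE_t in percent for each time step, Eq. (2)."""
    return [100.0 * (o - f) / o for o, f in zip(observed, forecast)]


def aare(observed, forecast):
    """Average absolute relative error in percent, Eq. (1)."""
    errors = relative_errors(observed, forecast)
    return sum(abs(e) for e in errors) / len(errors)


def threshold_statistic(observed, forecast, x):
    """TS_x, Eq. (3): percentage of forecasts whose absolute
    relative error is strictly less than x percent."""
    errors = relative_errors(observed, forecast)
    q_x = sum(1 for e in errors if abs(e) < x)
    return 100.0 * q_x / len(errors)


# Small made-up example (not data from the study).
obs = [100.0, 200.0, 400.0, 500.0]
fc = [95.0, 230.0, 400.0, 350.0]
ts_curve = [threshold_statistic(obs, fc, x) for x in range(10, 101, 10)]
```

Evaluating `threshold_statistic` over a range of levels, as in `ts_curve`, gives a cumulative frequency curve of the kind plotted against TS level.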
As can be seen from Fig. 3, at the 10% relative error level the ANN model forecasted the highest share of
the total number of flow values at one-day ahead forecasting for all rivers. The poorest
performances in daily forecasting at the 10% TS level were obtained with the k-nnA model for Helge and Ljusnan Rivers, and with the k-nnT model for Kalix River.
The best AARE score was obtained for Kalix River for all models at 1-day lead time.
Almost all models forecasted 85% of the total flows with less than 5% relative error
(results not shown) at 1-day ahead forecasting and about 97% of the total flows with less than
10% relative error (Fig. 3).
For 7-day ahead forecasting, the highest AARE score was obtained with the SETAR model for
all rivers. The lowest AARE score was obtained with the k-nnA model for Helge and Ljusnan
Rivers, and with the ANN-PSR model for Kalix River. As stated by Aqil et al. (2007) and St-Hilaire et al. (2012), the poor generalization of models as the lead time increases might
result from error accumulation at previous steps. The largest error accumulation as lead time
increases was for the k-nnA model, since its AARE score is the lowest while the TS variable
approaches 100%. The poor performance of the k-nn models for 7-day ahead forecasting of
streamflow might be due to the phase-space reconstruction, which extends a single variable to a multi-dimensional phase space to represent the underlying dynamics and
might cause rapid loss of information.
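The phase-space reconstruction mentioned above can be illustrated with a generic time-delay embedding followed by a nearest-neighbor forecast. This is a minimal sketch under stated assumptions: the embedding dimension `dim`, delay `tau`, neighbor count `k`, the Euclidean distance, and the equal-weight averaging are all illustrative choices, and the sketch does not reproduce the k-nnT or k-nnA variants, which this auxiliary material does not define.

```python
def delay_embed(series, dim, tau=1):
    """Delay vectors [x_t, x_(t-tau), ..., x_(t-(dim-1)*tau)] for every valid t."""
    start = (dim - 1) * tau
    return [[series[t - j * tau] for j in range(dim)]
            for t in range(start, len(series))]


def knn_forecast(series, dim, k, tau=1):
    """One-step forecast: average the successors of the k delay vectors
    closest (Euclidean distance) to the most recent delay vector."""
    start = (dim - 1) * tau
    vectors = delay_embed(series, dim, tau)
    query = vectors[-1]
    candidates = []
    # The last vector is the query itself and has no known successor.
    for i, vec in enumerate(vectors[:-1]):
        dist = sum((a - b) ** 2 for a, b in zip(vec, query)) ** 0.5
        successor = series[start + i + 1]
        candidates.append((dist, successor))
    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:k]
    return sum(s for _, s in nearest) / len(nearest)
```

A forecast several days ahead can be obtained by appending each one-step forecast to the series and repeating the call. Every such step feeds an estimate rather than an observation back into the delay vectors, which is one way errors can propagate with lead time.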
Fig. 3 Average absolute relative error and threshold statistics for 1- and 7-day ahead forecasts
for Helge, Ljusnan, and Kalix Rivers. Panels: (a) Helge, (b) Ljusnan, (c) Kalix.
REFERENCES
Aqil M, Kita I, Yano A, Nishiyama S (2007) Neural networks for real time catchment flow
modeling and prediction. Water Resources Management: 21(10), 1781-1796. doi:
10.1007/s11269-006-9127-y
Nayak P, Sudheer K, Rangan D, Ramasastri K (2005) Short-term flood forecasting with a
neurofuzzy model. Water Resources Research: 41(4), W04004. doi:
10.1029/2004WR003562
Singh KP, Basant A, Malik A, Jain G (2009) Artificial neural network modeling of the river
water quality—a case study. Ecological Modelling: 220(6), 888-895.
St-Hilaire A, Ouarda TBMJ, Bargaoui Z, Daigle A, Bilodeau L (2012) Daily river water
temperature forecast model with a k-nearest neighbour approach. Hydrological
Processes: 26(9), 1302-1310.