Journal of Hydrology 509 (2014) 379–386
A comparative study of artificial neural network, adaptive neuro fuzzy
inference system and support vector machine for forecasting river flow
in the semiarid mountain region
Zhibin He a,b,⇑, Xiaohu Wen b, Hu Liu a,b, Jun Du a,b
a Linze Inland River Basin Research Station, Chinese Ecosystem Research Network, China
b Key Laboratory of Ecohydrology of Inland River Basin, Cold and Arid Regions Environmental and Engineering Research Institute, Chinese Academy of Sciences, Lanzhou 730000, China
Article info
Article history:
Received 18 June 2012
Received in revised form 10 April 2013
Accepted 27 November 2013
Available online 2 December 2013
This manuscript was handled by Andras
Bardossy, Editor-in-Chief, with the
assistance of K.P. Sudheer, Associate Editor
Keywords:
River flow forecasting
Artificial neural network
Adaptive neuro fuzzy inference system
Support vector machine
Summary
Data driven models are very useful for river flow forecasting when the underlying physical relationships are not fully understood, but it is not clear whether these data driven models still perform well in small river basins of semiarid mountain regions with complicated topography. In this study, the potential of three different data driven methods, artificial neural network (ANN), adaptive neuro fuzzy inference system (ANFIS) and support vector machine (SVM), was explored for forecasting river flow in a semiarid mountain region of northwestern China. The models analyzed different combinations of antecedent river flow values, and the appropriate input vector was selected based on an analysis of residuals. The performance of the ANN, ANFIS and SVM models on the training and validation sets was compared with the observed data. The model which consists of three antecedent values of flow was selected as the best-fit model for river flow forecasting. To evaluate the results of the ANN, ANFIS and SVM models more rigorously, four quantitative standard statistical performance evaluation measures, the coefficient of correlation (R), root mean squared error (RMSE), Nash–Sutcliffe efficiency coefficient (NS) and mean absolute relative error (MARE), were employed. The results indicate that the performance of ANN, ANFIS and SVM in terms of the different evaluation criteria does not vary substantially between the training and validation periods; the performance of all three models in river flow forecasting was satisfactory. A detailed comparison of overall performance indicated that the SVM model performed better than ANN and ANFIS in river flow forecasting on the validation data sets. The results also suggest that the ANN, ANFIS and SVM methods can be successfully applied to establish river flow forecasting models in semiarid mountain regions with complicated topography.
© 2013 Elsevier B.V. All rights reserved.
1. Introduction
River flow forecasting is very important for water resources system planning and management, especially in arid areas where water resources are scarce and forecasts support the temporal and spatial planning and distribution of water resources. River flow forecasting has been studied by various scientists during the past few decades. Generally, river flow models can be classed into two main groups: physically based models and data driven models. Typically, physically based models are complex and require sophisticated mathematical tools, a significant amount of calibration data, and some degree of expertise and experience with the models (Aqil et al., 2007). While data driven models do not provide any information on the physics of the hydrologic processes, they are very useful for river flow forecasting where the main concern is accurate prediction of runoff (Nayak et al., 2005; Chau et al., 2005; Wu et al., 2009). Recently, three data driven methods have gained popularity as emerging computational technologies: artificial neural networks (ANNs), the adaptive neuro fuzzy inference system (ANFIS) and the support vector machine (SVM). These methods offer advantages over conventional modeling, including the ability to handle large amounts of noisy data from dynamic and nonlinear systems, especially when the underlying physical relationships are not fully understood.

⇑ Corresponding author at: Linze Inland River Basin Research Station, Chinese Ecosystem Research Network, China. Tel.: +86 931 4967165. E-mail address: [email protected] (Z. He).
0022-1694/$ - see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.jhydrol.2013.11.054
In the past few decades, ANN and ANFIS methods have been extensively used in a wide range of engineering applications, including hydrology: rainfall–runoff simulation (Nourani et al., 2009; Talei et al., 2010; Wu and Chau, 2011), groundwater modeling (Kuo et al., 2004; Daliakopoulos et al., 2005; Sahoo et al., 2005; Ghose et al., 2010; Taormina et al., 2012), river flow forecasting (El-Shafie et al., 2006; Shu and Ouarda, 2008) and water quality modeling (Singh et al., 2009; Yan et al., 2010). Recently, SVMs have been gaining recognition in hydrology (Moradkhani et al., 2004; Yu et al., 2006; Lin et al., 2006; Wu et al., 2008; Lin et al., 2009; Chen et al., 2010; Yoon et al., 2011).
But for some catchments that have very few meteorological observatories and complicated topography, it is not clear whether these data driven models still perform well.

In this study, the ANN, ANFIS and SVM were used to forecast river flow in a small catchment in the Qilian Mountains of northwestern China, and the results obtained were compared with each other. The purpose of this study is to investigate the accuracy of three different data driven models, ANN, ANFIS and SVM, in modeling daily river flow, and to evaluate their performance in a small river basin of a semiarid mountain region with complicated topography.
2. Methodology

2.1. Artificial neural network (ANN)

ANN is a massively parallel distributed information processing system that has certain performance characteristics resembling the biological neural networks of the human brain (Haykin, 1999). A neural network is characterized by its architecture, which represents the pattern of connections between nodes, its method of determining the connection weights, and the activation function. The most commonly used neural network structure is the feed-forward hierarchical architecture. A typical three-layered feed-forward neural network is composed of multiple elements, also called nodes, and the connection pathways that link them. The nodes are the processing elements of the network and are normally known as neurons, reflecting the fact that the neural network model is based on the biological neural network of the human brain. A neuron receives an input signal, processes it, and transmits an output signal to other interconnected neurons.

In the hidden and output layers, the net input to unit i is of the form

Z = \sum_{j=1}^{k} w_{ji} y_j + \theta_i \quad (1)

where w_{ji} is the weight from unit j to unit i, k is the number of neurons in the layer above the layer that includes unit i, y_j is the output from unit j, and \theta_i is the bias of unit i. This weighted sum Z, which is called the incoming signal of unit i, is then passed through a transfer function f to yield the estimate \hat{y}_i for unit i. The sigmoid function is continuous, differentiable everywhere, and monotonically increasing. The sigmoid transfer function of unit i is of the form

\hat{y}_i = \frac{1}{1 + e^{-Z}} \quad (2)

A training algorithm is needed to solve a neural network problem. Since many types of algorithms are available for training a network, an algorithm that provides the best fit to the data must be selected. The Levenberg–Marquardt learning algorithm was used here because of its good performance and learning speed together with a simple structure.

2.2. Levenberg–Marquardt algorithm

The Levenberg–Marquardt algorithm (LMA) is similar to the quasi-Newton method in that a simplified form of the Hessian matrix (second derivatives) is used. The Hessian matrix can be approximated as

H = J^T J \quad (3)

and the gradient can be computed as

g = J^T e \quad (4)

in which J is the Jacobian matrix, which contains the first derivatives of the network errors with respect to the weights and biases, and e is a vector of network errors. An iteration of this algorithm can be written as

x_{k+1} = x_k - [J^T J + \mu I]^{-1} J^T e \quad (5)

where \mu is the learning rate and I is the identity matrix (Dedecker et al., 2004). During training, the learning rate \mu is incremented or decremented by a scale factor at weight updates. When \mu is zero, this is just Newton's method using the approximate Hessian matrix; when \mu is large, it becomes gradient descent with a small step size.

2.3. Adaptive neuro fuzzy inference system (ANFIS)

ANFIS, first introduced by Jang (1993), is a universal approximator and as such is capable of approximating any real continuous function on a compact set to any degree of accuracy (Jang et al., 1997). ANFIS is functionally equivalent to fuzzy inference systems; specifically, the ANFIS system of interest here is functionally equivalent to the Sugeno first-order fuzzy model (Drake, 2000). Below, the hybrid learning algorithm, which combines gradient descent and the least-squares method, is introduced.

As a simple example, assume a fuzzy inference system with two inputs x and y and one output z. For the first-order Sugeno fuzzy model, a typical rule set with two fuzzy If-Then rules can be expressed as

Rule 1: If x is A_1 and y is B_1, then f_1 = p_1 x + q_1 y + r_1 \quad (6)

Rule 2: If x is A_2 and y is B_2, then f_2 = p_2 x + q_2 y + r_2 \quad (7)
where p1, q1, r1 and p2, q2, r2 are the parameters in the then-part
(consequent part) of the first-order Sugeno fuzzy model. The architecture of ANFIS consists of five layers (Fig. 1), and a brief introduction of the model is as follows.
Layer 1: Each node of this layer generates the membership grades of the inputs in the appropriate fuzzy sets using membership functions:

O_{1,i} = \mu_{A_i}(x), \quad i = 1, 2 \quad (8)

O_{1,i} = \mu_{B_{i-2}}(y), \quad i = 3, 4 \quad (9)

where x and y are the crisp inputs to node i, and A_i and B_{i-2} are the fuzzy sets associated with this node, characterized by the shape of the membership functions (MFs) in this node, which can be any appropriate functions that are continuous and piecewise differentiable, such as Gaussian, generalized bell shaped, trapezoidal shaped and triangular shaped functions. The membership functions for A and B are generally described by generalized bell functions, e.g.

\mu_{A_i}(x) = \frac{1}{1 + |(x - c_i)/a_i|^{2b_i}} \quad (10)

where \{a_i, b_i, c_i\} is the parameter set that changes the shape of the MFs, with maximum equal to 1 and minimum equal to 0.

Layer 2: This layer consists of the nodes labeled \Pi, which multiply the incoming signals and send the product out. For instance,

O_{2,i} = w_i = \mu_{A_i}(x)\,\mu_{B_{i-2}}(y), \quad i = 1, 2 \quad (11)
Fig. 1. Architecture of the ANFIS.

Layer 3: Every node in this layer is a fixed node labeled N. The ith node calculates the ratio of the ith rule's firing strength to the sum of all rules' firing strengths:

O_{3,i} = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2 \quad (12)
Layer 4: Node i in this layer computes the contribution of the ith rule to the model output, with the node function

O_{4,i} = \bar{w}_i f_i = \bar{w}_i (p_i x + q_i y + r_i) \quad (13)

where \bar{w}_i is the output of layer 3 and \{p_i, q_i, r_i\} is the consequent parameter set.

Layer 5: The single node in this layer computes the overall output of the ANFIS as

O_{5,i} = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i} \quad (14)
The distinguishing characteristic of the approach is that ANFIS
applies a hybrid-learning algorithm, the gradient descent method
and the least-squares method, to update parameters. The gradient
descent method is employed to tune premise non-linear parameters {ai, bi, ci}, while the least-squares method is used to identify
consequent linear parameters {pi, qi, ri}. As shown in Fig. 1, the circular nodes are fixed (i.e., not adaptive) nodes without parameter
variables, and the square nodes have parameter variables (the
parameters are changed during training). The learning procedure has two steps. In the first step, the least-squares method is used to identify the consequent parameters, while the antecedent parameters (membership functions) are assumed to be fixed for the current cycle through the training set. Then the error signals propagate backward: the gradient descent method is used to update the premise parameters by minimizing the overall quadratic cost function, while the consequent parameters remain fixed. The detailed algorithm and mathematical background of the hybrid learning algorithm were introduced in detail by Jang (1993).
In this study, various membership functions for the ANFIS structure were used to demonstrate the effect of choice of membership
functions on the model performance. Among the different possible
types of membership functions, the triangular membership function as a simple straight line function was found to give better results and was thus adopted. In addition, the type of membership
function used was not found to be a critical consideration in ANFIS
model performance (Nayak et al., 2004). The number of membership functions chosen for each input reflects the complexity of the ANFIS model. In each application, different numbers of membership functions were tried, and the configuration that gave the minimum mean squared error (MSE) was selected. Two or three triangular membership functions per input were found to be sufficient for the ANFIS models in this study.
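To make the layer-by-layer computation of Eqs. (8)–(14) concrete, the forward pass of a two-rule first-order Sugeno model (the structure that ANFIS trains) can be sketched as below. This is an illustration, not the authors' code; all membership-function parameters and rule coefficients are made-up values for demonstration.

```python
# Forward pass of a two-rule first-order Sugeno fuzzy model, Eqs. (8)-(14).
# All parameter values below are illustrative, not fitted to any data.

def bell_mf(x, a, b, c):
    """Generalized bell membership function, Eq. (10)."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))

def sugeno_forward(x, y, mf_params, consequents):
    # Layer 1: membership grades of x and y, Eqs. (8)-(9)
    (ax1, bx1, cx1), (ax2, bx2, cx2), (ay1, by1, cy1), (ay2, by2, cy2) = mf_params
    muA = [bell_mf(x, ax1, bx1, cx1), bell_mf(x, ax2, bx2, cx2)]
    muB = [bell_mf(y, ay1, by1, cy1), bell_mf(y, ay2, by2, cy2)]
    # Layer 2: firing strengths w_i = muA_i * muB_i, Eq. (11)
    w = [muA[0] * muB[0], muA[1] * muB[1]]
    # Layer 3: normalized firing strengths, Eq. (12)
    wbar = [wi / (w[0] + w[1]) for wi in w]
    # Layer 4: rule contributions f_i = p_i*x + q_i*y + r_i, Eq. (13)
    f = [p * x + q * y + r for (p, q, r) in consequents]
    # Layer 5: overall output as the weighted sum, Eq. (14)
    return sum(wb * fi for wb, fi in zip(wbar, f))

# Illustrative parameters: two bell MFs per input, two rules
mf_params = [(1.0, 2.0, 0.0), (1.0, 2.0, 1.0),   # A1, A2 on x
             (1.0, 2.0, 0.0), (1.0, 2.0, 1.0)]   # B1, B2 on y
consequents = [(0.5, 0.2, 0.1), (1.0, -0.3, 0.4)]  # (p_i, q_i, r_i)

z = sugeno_forward(0.3, 0.7, mf_params, consequents)
```

A useful sanity check on Eq. (14): because the normalized firing strengths sum to one, two rules with identical constant consequents always reproduce that constant regardless of the inputs.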
2.4. Support vector machine (SVM)
SVM is a state-of-the-art machine learning technique based on statistical learning theory (Vapnik, 1995). The structure of an SVM is not determined a priori; input vectors supporting the model structure are selected through the model training process described below.

Given a set of training data \{(x_i, d_i)\}_{i=1}^{N} (x_i is the input vector, d_i is the desired value and N is the total number of data patterns), the regression function of SVM is formulated as follows:

f(x) = w\,\phi(x) + b \quad (15)

where w is a weight vector and b is a bias. \phi denotes a nonlinear transfer function that maps the input vectors into a high-dimensional feature space in which, theoretically, a simple linear regression can cope with the complex nonlinear regression of the input space. Vapnik (1995) introduced the following convex optimization problem with an \varepsilon-insensitive loss function to obtain the solution to Eq. (15):

\text{Minimize:} \quad \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} (\xi_i + \xi_i^{*}) \quad (16)

\text{Subject to:} \quad
\begin{cases}
w\,\phi(x_i) + b - d_i \le \varepsilon + \xi_i, & i = 1, 2, \ldots, N \\
d_i - w\,\phi(x_i) - b \le \varepsilon + \xi_i^{*}, & i = 1, 2, \ldots, N \\
\xi_i, \xi_i^{*} \ge 0, & i = 1, 2, \ldots, N
\end{cases} \quad (17)

where \xi_i and \xi_i^{*} are slack variables that penalize training errors beyond the error tolerance \varepsilon, and C is a positive trade-off parameter that determines the degree of penalization of the empirical error in the optimization problem. Eq. (17) is usually solved in a dual form using Lagrangian multipliers and imposing the Karush–Kuhn–Tucker (KKT) optimality conditions. The input vectors that have nonzero Lagrangian multipliers under the KKT conditions support the structure of the estimator and are called support vectors.
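The role of the \varepsilon-insensitive loss and the slack variables in Eq. (16) can be illustrated with a short sketch (an illustration only, not the paper's implementation): residuals inside the \varepsilon tube cost nothing, while larger residuals are penalized linearly and weighted by C.

```python
# Illustration of the epsilon-insensitive loss and the primal
# objective of Eq. (16). All numeric values are illustrative.

def eps_insensitive_loss(d, f, eps):
    """Vapnik's epsilon-insensitive loss: zero inside the eps tube,
    linear outside it (the quantity the slack variables measure)."""
    return max(0.0, abs(d - f) - eps)

def svr_objective(w_norm_sq, pairs, C, eps):
    """Primal objective of Eq. (16): flatness term 0.5*||w||^2 plus
    the C-weighted sum of tube violations (xi_i + xi_i*)."""
    return 0.5 * w_norm_sq + C * sum(eps_insensitive_loss(d, f, eps)
                                     for d, f in pairs)

# (desired, predicted) pairs: only the second pair leaves the 0.1 tube
pairs = [(1.0, 1.05), (2.0, 1.7), (3.0, 3.0)]
obj = svr_objective(w_norm_sq=0.5, pairs=pairs, C=10.0, eps=0.1)
```

Increasing C makes tube violations more expensive relative to the flatness term, which is why C acts as the empirical-error trade-off parameter described above.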
A number of algorithms have been suggested for solving the
dual optimization problem of the SVM. An overview of these algorithms is found in Shevade et al. (2000). Conventional quadratic
programming algorithms require extremely large memory for the
kernel matrix computation and have difficulties in their implementation. Therefore, they are not suitable for large problems. To overcome this problem, subset selection methods have been developed.
The optimization problem is solved in a selected subset to give a
set of support vectors, and then a new subset is selected using
these support vectors. This process continues until all the input
vectors satisfy the KKT conditions. The sequential minimal optimization (SMO) algorithm, introduced by Platt (1999), puts the subset
selection algorithm to the extreme by selecting a subset of size two
and optimizing the estimation function with respect to them. The
main advantage of the SMO is that an analytical solution of a subset can be obtained directly without invoking a quadratic optimizer. In this study, the SMO algorithm was employed to train
the SVM model for river flow predictions. The detailed procedures
of the SMO algorithm are found in Platt (1999). A more detailed
discussion of the theory of SVM technique can be found in the literature (Cristianini and Shawe-Taylor, 2000; Yu et al., 2006; Chen
and Yu, 2007; Wang et al., 2009).
In this paper, the radial basis function (RBF) was selected as the kernel function; the penalty parameter C and the kernel parameter c of the SVM were determined through a grid-search algorithm with cross-validation, as described by Hsu et al. (2010). First, a coarse grid search was used to determine the best region of the parameter grid; then a finer grid search was conducted to find the optimal parameters. Fivefold cross-validation was employed in this study. The calibration and subsequent forecasting work were performed using the Library for Support Vector Machines (LIBSVM) software (Chang and Lin, 2011).
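The coarse-then-fine search can be sketched as follows. The `cv_score` function here is a hypothetical stand-in for the fivefold cross-validated error that a real run would obtain from LIBSVM, and the exponentially spaced grids are the conventional choice (an assumption, not taken from the paper).

```python
# Coarse-to-fine grid search for (C, gamma), as commonly done for
# RBF-kernel SVMs. cv_score is a hypothetical placeholder.
import itertools

def grid_search(score, C_grid, g_grid):
    """Return the (C, gamma) pair with the lowest score on the grid."""
    return min(itertools.product(C_grid, g_grid), key=lambda p: score(*p))

def cv_score(C, gamma):
    # Stand-in for fivefold cross-validated RMSE; a real run would
    # train an RBF-kernel SVM here (e.g. via LIBSVM) for each pair.
    return (C - 16) ** 2 / 256 + (gamma - 0.25) ** 2

# Coarse pass over exponentially spaced values...
coarse_C = [2 ** k for k in range(-2, 10, 2)]   # 0.25 ... 256
coarse_g = [2 ** k for k in range(-6, 2, 2)]    # 0.015625 ... 1
C0, g0 = grid_search(cv_score, coarse_C, coarse_g)
# ...then a finer pass around the best coarse cell
fine_C = [C0 * f for f in (0.5, 0.75, 1.0, 1.5, 2.0)]
fine_g = [g0 * f for f in (0.5, 0.75, 1.0, 1.5, 2.0)]
C_best, g_best = grid_search(cv_score, fine_C, fine_g)
```

The two-stage search keeps the number of expensive cross-validation runs small: the coarse pass locates the promising region, and the fine pass refines only within it.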
2.5. Study area
Pailugou catchment (100°17′E, 38°24′N) is located in the Qilian Mountains, in northwestern China's Gansu province. The catchment's total area is 2.91 km², and elevations range from 2650 to 3800 m. Mean annual rainfall is 375.5 mm at an elevation of 2700 m, and precipitation generally increases with elevation by about 4.3% per 100 m. About 65% of the precipitation falls during the summer (July to September). Mean annual temperature decreases with increasing elevation, from 2 °C at the base of the catchment (2650 m) to −6.3 °C near the summit (3800 m). Permanently and seasonally frozen soils are widespread at middle and higher elevations. The main parent material is calcareous rock,
with a relatively thin soil layer above the rock, a coarse soil texture,
an intermediate organic matter content, and a pH ranging from 7
to 8. Owing to the steep temperature and precipitation gradients,
vegetation is present as a mosaic of patches of grassland, scrubland, and forest. Forests are mostly found on shaded and semi-shaded slopes at intermediate elevations (i.e., 2650–3450 m), whereas sunny slopes are mostly occupied by grasslands. In the Pailugou catchment, Picea crassifolia is the only tree species, found primarily between elevations of 2650 and 3450 m; alpine shrub communities are found at elevations between 3200 and 3650 m. The dominant shrub species, which grow primarily where trees are not found, are Salix oritrepha, Rhododendron przewalskii, and Caragana jubata. The dominant understory species, which grow under both trees and shrubs, are Potentilla fruticosa, Potentilla glabra, Lonicera microphylla, Kobresia bellardii, and Polygonum viviparum. The dominant species on sunny slopes are Stipa purpurea, Kobresia humilis, and Agropyron cristatum.
3. Model development
3.1. Description of data
In this study, the performance of ANN, ANFIS and SVM was examined on daily flow. To achieve this, 6 years of flow data were available, from 2001 to 2003 and from 2009 to 2011; in total, flow data were available for 2190 days. The data were divided into two sets: a training data set consisting of years 2001–2003 and a validation data set consisting of years 2009–2011. Using full years in the identification period enabled inclusion of the various hydrological conditions observable during different seasons; in this way the model becomes robust to the different hydrological conditions that prevail in the total time series (Kişi, 2006). The daily statistical parameters of the flow data are given in Table 1. It can be seen that the flow data show a significantly skewed distribution (skewness of 2.81 and 2.78 for the training and test data sets, respectively). First, several input combinations were tried using ANN, ANFIS and SVM to estimate daily flow. The number of lags was selected according to the partial auto-correlation function (PCF) of the daily flow data, shown in Fig. 2. It is clear from this figure that the first three lags have a significant effect on Qt+1. Thus, three previous lags were considered as inputs to the models in this study. The inputs represent the previous flow (times t, t−1 and t−2), and the output layer node corresponds to the flow at time t+1. Thus, the following combinations of input data were evaluated:

(1) Qt,
(2) Qt and Qt−1,
(3) Qt, Qt−1 and Qt−2.
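Constructing the three candidate input sets from a daily flow series amounts to building lagged vectors paired with the next day's flow; a minimal sketch (illustrative, not the authors' code, with made-up flow values):

```python
# Build lagged (input, target) samples for 1-day-ahead forecasting.

def make_lagged_samples(flow, n_lags):
    """Each input holds the n_lags most recent flows (Q_t, Q_{t-1}, ...)
    and the target is the next value Q_{t+1}."""
    X, y = [], []
    for t in range(n_lags - 1, len(flow) - 1):
        X.append([flow[t - k] for k in range(n_lags)])  # Q_t, Q_{t-1}, ...
        y.append(flow[t + 1])                           # target Q_{t+1}
    return X, y

flow = [5.0, 6.0, 8.0, 7.0, 9.0, 12.0]  # illustrative daily flows
# Combination (3): inputs Q_t, Q_{t-1}, Q_{t-2}
X3, y3 = make_lagged_samples(flow, 3)
```

Combinations (1) and (2) are obtained the same way with `n_lags` set to 1 or 2; note that each extra lag shortens the usable sample by one day at the start of the series.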
3.2. Data preprocessing
In order to achieve effective network training, the data need to be transformed toward a normal distribution using an appropriate transformation method. Shanker et al. (1996) and Luk et al. (2000) reported that networks trained on transformed data generally achieve better performance and faster convergence. Similarly, Aqil et al. (2007) preprocessed the data before applying ANN and ANFIS models with a log sigmoid activation function. In this study, the transformation is performed on all time-series data independently using the following equation:

z = a \log_{10}(G + b) \quad (18)

where z is the transformed value of river flow, a is an arbitrary constant, and b was set to 1 to avoid taking the logarithm of zero river flow. The final forecast results were then back-transformed using the following equation:

G = 10^{z/a} - b \quad (19)
3.3. Models performance criteria
The performances of the models developed in this study were
assessed using various standard statistical performance evaluation
criteria. The statistical measures considered were coefficient of
correlation (R), root mean squared error (RMSE), Nash–Sutcliffe
efficiency coefficient (NS) and mean absolute relative error
(MARE). The R measures the degree to which two variables are linearly related. RMSE and MARE provide different types of information about the predictive capabilities of the model. The RMSE
measures the goodness-of-fit relevant to high flow values whereas
the MARE not only gives the performance index in terms of predicting flow but also the distribution of the prediction errors.
Coefficient of correlation (R) is defined as the degree of correlation between the observed and predicted values:
R = \frac{\sum_{i=1}^{n} (Q_i^o - \bar{Q}^o)(Q_i^p - \bar{Q}^p)}{\sqrt{\sum_{i=1}^{n} (Q_i^o - \bar{Q}^o)^2 \sum_{i=1}^{n} (Q_i^p - \bar{Q}^p)^2}} \quad (20)
Table 1
The flow statistical parameters of each data set.

          Qmin   Qmax      Qmean    Qstdev    Qske
Training  4.00   9604.22   617.65   1046.72   2.81
Test      7.28   8841.12   559.74   1082.47   2.78

Qmin, Qmax, Qmean, Qstdev, Qske denote the minimum, maximum, mean, standard deviation and skewness coefficient of the flow data for the training and test data sets.
Fig. 2. Partial auto-correlation function of daily flow data.
The root mean square error (RMSE) can be calculated as follows:

RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (Q_i^o - Q_i^p)^2} \quad (21)

The Nash–Sutcliffe efficiency coefficient (NS) can be calculated as follows:

NS = 1 - \frac{\sum_{i=1}^{n} (Q_i^o - Q_i^p)^2}{\sum_{i=1}^{n} (Q_i^o - \bar{Q}^o)^2} \quad (22)

The mean absolute relative error (MARE) can be calculated as follows:

MARE = \frac{1}{n} \sum_{i=1}^{n} \frac{|Q_i^o - Q_i^p|}{Q_i^o} \times 100 \quad (23)

where n is the number of samples; Q_i^o and Q_i^p are the observed and predicted flow at time i; \bar{Q}^o and \bar{Q}^p are the means of the observed and predicted river flow. The best fit between observed and calculated values would have R = 1, RMSE = 0, NS = 1 and MARE = 0.
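The four criteria of Eqs. (20)–(23) can be computed directly; a minimal sketch (standard formulas, not the authors' code, with made-up flow values):

```python
# Performance criteria of Eqs. (20)-(23).
import math

def metrics(obs, pred):
    """Return (R, RMSE, NS, MARE) for observed and predicted flows."""
    n = len(obs)
    mo = sum(obs) / n
    mp = sum(pred) / n
    # Eq. (20): coefficient of correlation
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    R = cov / math.sqrt(sum((o - mo) ** 2 for o in obs) *
                        sum((p - mp) ** 2 for p in pred))
    # Eq. (21): root mean squared error
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / n)
    # Eq. (22): Nash-Sutcliffe efficiency coefficient
    ns = 1.0 - (sum((o - p) ** 2 for o, p in zip(obs, pred)) /
                sum((o - mo) ** 2 for o in obs))
    # Eq. (23): mean absolute relative error, in percent
    mare = 100.0 / n * sum(abs(o - p) / o for o, p in zip(obs, pred))
    return R, rmse, ns, mare

obs = [4.0, 10.0, 6.0, 8.0]          # illustrative observed flows
R, rmse, ns, mare = metrics(obs, obs)  # perfect forecast as a sanity check
```

As stated above, a perfect forecast yields R = 1, RMSE = 0, NS = 1 and MARE = 0; any disagreement moves RMSE and MARE above zero and NS below one.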
3.4. Network architecture for ANN model
In this study, the focus is on the performance of the ANN model in 1-day-ahead prediction of river flow. Three input combinations based on daily flow of previous periods were evaluated. If Qt denotes the river flow at time t, the input combinations evaluated in this study are (1) Qt, (2) Qt and Qt−1, and (3) Qt, Qt−1 and Qt−2. In all cases, the output layer has only one neuron, the river flow Qt+1.

After the input and output variables were selected, the ANN architecture was investigated. The next step in the development of the ANN model was the determination of the optimum number of neurons (N) in the hidden layer, which was identified using a trial and error procedure in which the number of hidden neurons was varied from 2 to 15. The cross-validation method was used, and the number of hidden neurons was selected based on the RMSE. When the numbers of hidden neurons are 7, 3 and 5, the training error is closest to the testing error; therefore, 7, 3 and 5 hidden neurons were selected for input combinations (1), (2) and (3), respectively.
4. Results and discussion
The ANN, ANFIS and SVM models with different inputs were compared based on their performance in the training and validation sets. The results are summarized in Tables 2–4. It is apparent that the performances of these models are similar during training as well as validation. The model with three antecedent flow values in the input had the smallest RMSE as well as the highest R and NS in both the training and validation periods, so it was selected as the best-fit model for predicting river flow in this study. In order to obtain an effective evaluation of model performance, the best model structures were used to compare the ANN, ANFIS and SVM models. For the best-fit models, the difference between the values of the statistical indices of the training and validation sets does not vary substantially. All three models generally gave low values of RMSE and MARE as well as high R and NS; the performance of the ANN, ANFIS and SVM models in river flow forecasting was satisfactory.
A model can be claimed to produce a perfect estimation if the NS criterion is equal to 1; normally, a model can be considered accurate if the NS criterion is greater than 0.8 (Shu and Ouarda, 2008). The NS values for the ANN, ANFIS and SVM models in this study are all above 0.8, which indicates that all three models achieved acceptable results. The NS values for the SVM model's flow predictions were higher than those for the ANFIS and ANN models, which indicates that, according to NS, the overall quality of estimation of the SVM model is better than that of the ANFIS and ANN models. From the RMSE and R viewpoints, the SVM model also performed somewhat better than both the ANN and the ANFIS models: the SVM model produced a lower RMSE as well as a higher R, making it the best according to these criteria.
While assessing the performance of any model for predicting river flow, it is important to evaluate not only the average prediction error but also the distribution of prediction errors. The statistical performance evaluation criteria employed so far in this study are global statistics and do not provide any information on the distribution of errors. Therefore, to test the robustness of the models developed, it is important to use an additional performance evaluation criterion such as the mean absolute relative error (MARE), which provides an indication of whether a model tends to overestimate or underestimate. The analysis based on the MARE index suggests that the SVM model performed better than the ANN and ANFIS models: the errors obtained when using the SVM model are more symmetric around zero, but show more dispersion, than those obtained when using the ANFIS and ANN models. For the SVM model, 59.94% of the estimates had a relative error below 5% during the validation period, compared with 27.90% for the ANN and 26.80% for the ANFIS. Furthermore, 72.65% of the SVM estimates had a relative error below 10% during the validation period, compared with 64.92% for the ANN and 66.57% for the ANFIS. The SVM model thus performs better than the other models from the relative error viewpoint, and was the most effective model for forecasting flow accurately in the validation set.
Figs. 3–5 show the hydrographs and scatter plots of the observed data and the predictions obtained using the ANN, SVM and ANFIS models for the validation period. It can be clearly seen from the hydrographs and scatter plots that the SVM estimates were closer to the corresponding observed flow values than those of the other models. As seen from the fit line equations (of the form y = ax + b) in the scatter plots, the a and b coefficients for the SVM model are closer to 1 and 0, respectively, with a higher R value of 0.935, than those of the ANN and ANFIS models.

The ANN, ANFIS and SVM models showed good prediction accuracy for low values of flow but were unable to maintain this accuracy for high values of flow (Figs. 3–5). However, a significant
Table 2
The structure and the performance statistics of the ANN models during the training and validation periods.

                              Training                          Validation
Input            Structure    R      RMSE     MARE    NS        R      RMSE     MARE    NS
Qt               1–7–1        0.919  415.434  16.015  0.843     0.940  391.486  14.214  0.869
Qt, Qt−1         2–3–1        0.911  434.613  15.748  0.828     0.931  433.635  13.186  0.840
Qt, Qt−1, Qt−2   3–5–1        0.906  444.670  16.118  0.820     0.938  388.255  12.802  0.871
Table 3
The structure and the performance statistics of the ANFIS models during the training and validation periods.

                                 Training                          Validation
Input            Number of MF    R      RMSE     MARE    NS        R      RMSE     MARE    NS
Qt               3               0.907  442.048  15.835  0.822     0.947  368.255  12.031  0.884
Qt, Qt−1         3, 3            0.920  413.239  15.580  0.844     0.946  381.155  14.296  0.876
Qt, Qt−1, Qt−2   2, 2, 3         0.930  387.444  15.868  0.863     0.936  392.530  13.637  0.869
Table 4
The structure and the performance statistics of the SVM models during the training and validation periods.

                                   Training                          Validation
Input            Parameter (C, γ)  R      RMSE     MARE    NS        R      RMSE     MARE    NS
Qt               2, 0.5            0.911  432.134  14.627  0.830     0.937  404.258  11.431  0.861
Qt, Qt−1         2, 0.125          0.909  437.332  14.973  0.826     0.947  371.484  11.371  0.882
Qt, Qt−1, Qt−2   16, 0.25          0.925  399.641  14.443  0.854     0.947  364.555  11.713  0.887
Fig. 3. The observed and forecasted flow values by ANN in the validation period.
Fig. 4. The observed and forecasted flow values by SVM in the validation period.
Fig. 5. The observed and forecasted flow values by ANFIS in the validation period.
improvement is observed for the ANFIS in peak flow prediction
compared with the ANN and SVM. Although all three models underestimated the peak flow, the ANFIS underestimated it by 12.18%, as
opposed to 36.24% and 41.76% for the ANN and SVM, respectively.
Overall, the ANN, ANFIS and SVM models give good prediction
performance and can be applied successfully to establish forecasting models that provide accurate and reliable daily flow predictions.
The results suggest that the SVM model was superior to the ANN and ANFIS for river flow forecasting.
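For illustration, the sketch below fits an RBF-kernel ε-SVR with the best parameters reported in Table 4 for the three-lag model (C = 16, γ = 0.25). It uses scikit-learn, whose SVR wraps LIBSVM; the synthetic series, the [0, 1] normalization, the ε value and the train/validation split are all illustrative assumptions rather than the study's actual setup:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
t = np.arange(400)
# Synthetic seasonal "flow" series standing in for the observed record.
flow = 200.0 + 150.0 * np.sin(2 * np.pi * t / 365.0) + rng.normal(0.0, 10.0, t.size)

# Normalize to [0, 1], a common preprocessing step before SVM training.
flow_s = (flow - flow.min()) / (flow.max() - flow.min())

# Inputs: three antecedent values (Q_t, Q_{t-1}, Q_{t-2}); target: Q_{t+1}.
X = np.column_stack([flow_s[2:-1], flow_s[1:-2], flow_s[:-3]])
y = flow_s[3:]

# RBF-kernel epsilon-SVR with the best (C, gamma) from Table 4;
# epsilon and the 300/97 split are illustrative choices, not the paper's.
model = SVR(kernel="rbf", C=16, gamma=0.25, epsilon=0.01)
model.fit(X[:300], y[:300])
pred = model.predict(X[300:])
```

Because the RBF kernel is sensitive to input scale, γ values such as those in Table 4 only make sense on normalized data, which is why the series is rescaled before fitting.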
5. Conclusions
In this study, ANN, ANFIS and SVM models were developed for
short-term river flow forecasting based on antecedent values of
river flow. To this end, the Pailugou station in the Heihe River
basin was selected as the case study. The outputs of the ANN,
ANFIS and SVM models were compared with observed values and
evaluated on their training and validation performance. The results
demonstrated that ANN, ANFIS and SVM can be applied successfully
to establish accurate and reliable river flow forecasting models.
The model that uses three antecedent values of river flow was
selected as the best-fit forecasting model. Comparing the three
models, the R and NS values of the SVM models were higher than
those of the ANN and ANFIS models, and the RMSE values of the
SVM models were lower; the SVM model therefore improved on the
accuracy of the ANN and ANFIS models. All three models showed
good prediction accuracy for low flow values but were unable to
maintain that accuracy for high flow values, although the ANFIS
achieved a significant improvement in peak flow prediction
compared with the ANN and SVM. Overall, the analysis presented
in this study shows that the SVM method was superior to the ANN
and ANFIS for river flow forecasting.
Although the results presented here are promising and these
data driven models can be applied successfully to establish river
flow forecasting models in small basins with complicated topography,
the models significantly underestimate flow under flood conditions.
Further research is therefore necessary to improve the prediction
accuracy, especially for high flow values, by combining models or
improving model parameters.
Acknowledgments
The authors thank the editors and anonymous reviewers for
their critical review and comments on this manuscript. This work
was supported by the Major Research Plan of the National Natural
Science Foundation of China (91025014), the Hundred Talents
Program of the Chinese Academy of Sciences (29Y127D11), the
National Natural Science Foundation of China (41271524) and
the Open Foundation of Key Laboratory of Ecohydrology of Inland
River Basin (90Y229F51).