Journal of Hydrology 509 (2014) 379–386

A comparative study of artificial neural network, adaptive neuro fuzzy inference system and support vector machine for forecasting river flow in the semiarid mountain region

Zhibin He a,b,*, Xiaohu Wen b, Hu Liu a,b, Jun Du a,b

a Linze Inland River Basin Research Station, Chinese Ecosystem Research Network, China
b Key Laboratory of Ecohydrology of Inland River Basin, Cold and Arid Regions Environmental and Engineering Research Institute, Chinese Academy of Sciences, Lanzhou 730000, China

Article history: Received 18 June 2012; received in revised form 10 April 2013; accepted 27 November 2013; available online 2 December 2013. This manuscript was handled by Andras Bardossy, Editor-in-Chief, with the assistance of K.P. Sudheer, Associate Editor.

Keywords: River flow forecasting; Artificial neural network; Adaptive neuro fuzzy inference system; Support vector machine

Summary

Data driven models are very useful for river flow forecasting when the underlying physical relationships are not fully understood, but it is not clear whether these models still perform well in the small river basins of semiarid mountain regions with complicated topography. In this study, the potential of three different data driven methods, artificial neural network (ANN), adaptive neuro fuzzy inference system (ANFIS) and support vector machine (SVM), was investigated for forecasting river flow in a semiarid mountain region of northwestern China. The models analyzed different combinations of antecedent river flow values, and the appropriate input vector was selected based on the analysis of residuals. The performance of the ANN, ANFIS and SVM models on the training and validation sets was compared with the observed data.
The model which consists of three antecedent values of flow was selected as the best fit model for river flow forecasting. To obtain a more accurate evaluation of the results of the ANN, ANFIS and SVM models, four standard statistical performance evaluation measures, the coefficient of correlation (R), root mean squared error (RMSE), Nash–Sutcliffe efficiency coefficient (NS) and mean absolute relative error (MARE), were employed to evaluate the performances of the various models developed. The results indicate that the performance of ANN, ANFIS and SVM in terms of the different evaluation criteria during the training and validation periods does not vary substantially; the performance of all three models in river flow forecasting was satisfactory. A detailed comparison of the overall performance indicated that the SVM model performed better than ANN and ANFIS on the validation data sets. The results also suggest that the ANN, ANFIS and SVM methods can be successfully applied to establish river flow forecasting models in semiarid mountain regions with complicated topography.

© 2013 Elsevier B.V. All rights reserved.

1. Introduction

River flow forecasting is very important for water resources system planning and management, especially in arid areas where water resources are scarce; there, river flow forecasting supports the temporal and spatial planning and distribution of water resources. River flow forecasting has been studied by various scientists during the past few decades. Generally, river flow models can be classified into two main groups: physically based models and data driven models. Typically, physically based models are complex and require sophisticated mathematical tools, a significant amount of calibration data, and some degree of expertise and experience with the models (Aqil et al., 2007).
While data driven models do not provide any information on the physics of the hydrologic processes, they are very useful for river flow forecasting where the main concern is accurate prediction of runoff (Nayak et al., 2005; Chau et al., 2005; Wu et al., 2009). Recently, three data driven methods have gained popularity as emerging computational technologies: artificial neural networks (ANNs), adaptive neuro fuzzy inference systems (ANFIS) and support vector machines (SVM). These methods offer advantages over conventional modeling, including the ability to handle large amounts of noisy data from dynamic and nonlinear systems, especially when the underlying physical relationships are not fully understood. In the past few decades, ANN and ANFIS methods have been extensively used in a wide range of engineering applications including hydrology, such as rainfall–runoff simulation (Nourani et al., 2009; Talei et al., 2010; Wu and Chau, 2011), groundwater modeling (Kuo et al., 2004; Daliakopoulos et al., 2005; Sahoo et al., 2005; Ghose et al., 2010; Taormina et al., 2012), river flow forecasting (El-Shafie et al., 2006; Shu and Ouarda, 2008) and water quality modeling (Singh et al., 2009; Yan et al., 2010). More recently, SVMs have been gaining recognition in hydrology (Moradkhani et al., 2004; Yu et al., 2006; Lin et al., 2006; Wu et al., 2008; Lin et al., 2009; Chen et al., 2010; Yoon et al., 2011).

* Corresponding author at: Linze Inland River Basin Research Station, Chinese Ecosystem Research Network, China. Tel.: +86 931 4967165. E-mail address: [email protected] (Z. He). http://dx.doi.org/10.1016/j.jhydrol.2013.11.054
But for some catchments that have very few meteorological observatories and complicated topography, it is not clear whether these data driven models still perform well. In this study, ANN, ANFIS and SVM were used to forecast river flow in a small catchment in the Qilian Mountains of northwestern China, and the results were compared with each other. The purpose of this study is to investigate the accuracy of three different data driven models, ANN, ANFIS and SVM, in modeling daily river flow, and to evaluate their performance in a small river basin of a semiarid mountain region with complicated topography.

2. Methodology

2.1. Artificial neural network (ANN)

ANN is a massively parallel distributed information processing system that has certain performance characteristics resembling the biological neural networks of the human brain (Haykin, 1999). A neural network is characterized by its architecture, which represents the pattern of connections between nodes, its method of determining the connection weights, and the activation function. The most commonly used neural network structure is the feed-forward hierarchical architecture. A typical three-layered feed-forward neural network is composed of multiple elements, also called nodes, and the connection pathways that link them. The nodes are the processing elements of the network and are normally known as neurons, reflecting the fact that the neural network model is based on the biological neural network of the human brain. A neuron receives an input signal, processes it, and transmits an output signal to other interconnected neurons. In the hidden and output layers, the net input to unit i is of the form

Z = \sum_{j=1}^{k} w_{ji} y_j + \theta_i    (1)

where w_{ji} is the weight vector of unit i, k is the number of neurons in the layer above the layer that includes unit i, y_j is the output from unit j, and \theta_i is the bias of unit i. This weighted sum Z, which is called the incoming signal of unit i, is then passed through a transfer function f to yield the estimate \hat{y}_i for unit i. The sigmoid function is continuous, differentiable everywhere, and monotonically increasing. The sigmoid transfer function of unit i is of the form

\hat{y}_i = 1 / (1 + e^{-Z})    (2)

A training algorithm is needed to solve a neural network problem. Since many types of algorithms are available for training a network, selection of an algorithm that provides the best fit to the data is required. The Levenberg–Marquardt learning algorithm was used because of its better performance and learning speed with a simple structure.

2.2. Levenberg–Marquardt algorithm

The Levenberg–Marquardt algorithm (LMA) is similar to the quasi-Newton method in that a simplified form of the Hessian matrix (second derivatives) is used. The Hessian matrix can be approximated as

H = J^T J    (3)

and the gradient can be computed as

g = J^T e    (4)

in which J is the Jacobian matrix, which contains the first derivatives of the network errors with respect to the weights and biases, and e is a vector of network errors. An iteration of this algorithm can be written as

x_{k+1} = x_k - [J^T J + \mu I]^{-1} J^T e    (5)

where \mu is the learning rate and I is the identity matrix (Dedecker et al., 2004). During training, the learning rate \mu is incremented or decremented by a scale factor at weight updates. When \mu is zero, this is just Newton's method using the approximate Hessian matrix. When \mu is large, this becomes gradient descent with a small step size.

2.3. Adaptive neuro fuzzy inference system (ANFIS)

ANFIS, first introduced by Jang (1993), is a universal approximator and, as such, is capable of approximating any real continuous function on a compact set to any degree of accuracy (Jang et al., 1997). ANFIS is functionally equivalent to fuzzy inference systems. Specifically, the ANFIS system of interest here is functionally equivalent to the Sugeno first-order fuzzy model (Drake, 2000). Below, the hybrid learning algorithm, which combines gradient descent and the least-squares method, is introduced. As a simple example, we assume a fuzzy inference system with two inputs x and y and one output z. For the first-order Sugeno fuzzy model, a typical rule set with two fuzzy If-Then rules can be expressed as

Rule 1: If x is A_1 and y is B_1, then f_1 = p_1 x + q_1 y + r_1    (6)

Rule 2: If x is A_2 and y is B_2, then f_2 = p_2 x + q_2 y + r_2    (7)

where p_1, q_1, r_1 and p_2, q_2, r_2 are the parameters in the then-part (consequent part) of the first-order Sugeno fuzzy model. The architecture of ANFIS consists of five layers (Fig. 1), and a brief introduction of the model is as follows.

Layer 1: Each node of this layer generates the membership grades to which the inputs belong to each of the appropriate fuzzy sets, using membership functions:

O_{1,i} = \mu_{A_i}(x)    for i = 1, 2    (8)

O_{1,i} = \mu_{B_{i-2}}(y)    for i = 3, 4    (9)

where x and y are the crisp inputs to node i, and A_i and B_i are the fuzzy sets associated with this node, characterized by the shape of the membership functions (MFs) in this node, which can be any appropriate functions that are continuous and piecewise differentiable, such as Gaussian, generalized bell shaped, trapezoidal and triangular functions. The membership functions for A and B are generally described by generalized bell functions, e.g.

\mu_{A_i}(x) = 1 / (1 + [((x - c_i)/a_i)^2]^{b_i})    (10)

where {a_i, b_i, c_i} is the parameter set that changes the shapes of the MFs, with maximum equal to 1 and minimum equal to 0.

Layer 2: This layer consists of the nodes labeled \Pi, which multiply the incoming signals and send the product out.
For instance,

O_{2,i} = w_i = \mu_{A_i}(x) \mu_{B_{i-2}}(y),    i = 1, 2    (11)

Fig. 1. Architecture of the ANFIS.

Layer 3: Every node in this layer is a fixed node labeled N. The ith node calculates the ratio of the ith rule's firing strength to the sum of all rules' firing strengths:

O_{3,i} = \bar{w}_i = w_i / (w_1 + w_2),    i = 1, 2    (12)

Layer 4: Node i in this layer computes the contribution of the ith rule towards the model output, with the following node function:

O_{4,i} = \bar{w}_i f_i = \bar{w}_i (p_i x + q_i y + r_i)    (13)

where \bar{w}_i is the output of layer 3 and {p_i, q_i, r_i} is the parameter set.

Layer 5: The single node in this layer computes the overall output of the ANFIS as:

O_{5,i} = \sum_i \bar{w}_i f_i = (\sum_i w_i f_i) / (\sum_i w_i)    (14)

The distinguishing characteristic of the approach is that ANFIS applies a hybrid learning algorithm, combining the gradient descent method and the least-squares method, to update the parameters. The gradient descent method is employed to tune the non-linear premise parameters {a_i, b_i, c_i}, while the least-squares method is used to identify the linear consequent parameters {p_i, q_i, r_i}. As shown in Fig. 1, the circular nodes are fixed (i.e., not adaptive) nodes without parameter variables, and the square nodes have parameter variables (the parameters are changed during training). The learning procedure has two steps: in the first step, the least-squares method is used to identify the consequent parameters, while the antecedent parameters (membership functions) are assumed to be fixed for the current cycle through the training set. Then, the error signals propagate backward, and the gradient descent method is used to update the premise parameters by minimizing the overall quadratic cost function, while the consequent parameters remain fixed. The detailed algorithm and mathematical background of the hybrid learning algorithm were introduced in detail by Jang (1993).
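The five-layer computation above can be sketched concretely for the two-input, two-rule example of Eqs. (6)-(14). This is an illustrative forward pass only; the membership-function parameters and consequent coefficients below are arbitrary placeholders, not values fitted by the hybrid learning algorithm:

```python
# Forward pass through the five ANFIS layers for the two-rule Sugeno
# example of Eqs. (6)-(14). All parameter values are arbitrary
# illustrations, not trained values.

def bell(x, a, b, c):
    # Generalized bell membership function, Eq. (10):
    # 1 / (1 + [((x - c)/a)^2]^b), maximum 1 at x = c.
    return 1.0 / (1.0 + (((x - c) / a) ** 2) ** b)

def anfis_forward(x, y, A, B, conseq):
    # Layer 1: membership grades of x in A1, A2 and of y in B1, B2.
    mu_A = [bell(x, *p) for p in A]
    mu_B = [bell(y, *p) for p in B]
    # Layer 2: rule firing strengths w_i = mu_Ai(x) * mu_Bi(y), Eq. (11).
    w = [mu_A[i] * mu_B[i] for i in range(2)]
    # Layer 3: normalized firing strengths, Eq. (12).
    wsum = w[0] + w[1]
    wbar = [wi / wsum for wi in w]
    # Layer 4: first-order consequents f_i = p_i*x + q_i*y + r_i, Eqs. (6)-(7).
    f = [p * x + q * y + r for (p, q, r) in conseq]
    # Layer 5: overall output as the normalized weighted sum, Eq. (14).
    return sum(wb * fi for wb, fi in zip(wbar, f))

# Hypothetical MF parameters {a, b, c} and consequent parameters {p, q, r}.
A = [(2.0, 2.0, 0.0), (2.0, 2.0, 5.0)]
B = [(2.0, 2.0, 0.0), (2.0, 2.0, 5.0)]
conseq = [(1.0, 1.0, 0.0), (0.5, -0.2, 3.0)]
z = anfis_forward(1.0, 2.0, A, B, conseq)
```

Because layer 3 normalizes the firing strengths, the output z is always a convex combination of the two rule consequents f_1 and f_2.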
In this study, various membership functions for the ANFIS structure were used to demonstrate the effect of the choice of membership function on model performance. Among the different possible types, the triangular membership function, a simple straight-line function, was found to give better results and was thus adopted. In addition, the type of membership function was not found to be a critical consideration in ANFIS model performance (Nayak et al., 2004). The number of membership functions chosen for each input reflects the complexity of the ANFIS model. In each application, different numbers of membership functions were tried, and the one that gave the minimum squared error (MSE) was selected. Two or three triangular membership functions per input were found to be sufficient for the ANFIS models.

2.4. Support vector machine (SVM)

SVM is a state-of-the-art machine learning technique based on statistical learning theory (Vapnik, 1995). The structure of an SVM is not determined a priori; the input vectors supporting the model structure are selected through the model training process described below. Given a set of training data {(x_i, d_i)}_{i=1}^{N} (x_i is the input vector, d_i is the desired value and N is the total number of data patterns), the regression function of SVM is formulated as follows:

f(x) = \omega \phi(x) + b    (15)

where \omega is a weight vector and b is a bias. \phi denotes a nonlinear transfer function that maps the input vectors into a high-dimensional feature space in which, theoretically, a simple linear regression can cope with the complex nonlinear regression of the input space. Vapnik (1995) introduced the following convex optimization problem with an \epsilon-insensitivity loss function to obtain the solution to Eq. (15):

Minimize:  (1/2) ||\omega||^2 + C \sum_{i=1}^{N} (\xi_i + \xi_i^*)    (16)

Subject to:
  \omega \phi(x_i) + b - d_i <= \epsilon + \xi_i,      i = 1, 2, ..., N
  d_i - \omega \phi(x_i) - b <= \epsilon + \xi_i^*,    i = 1, 2, ..., N
  \xi_i, \xi_i^* >= 0,                                 i = 1, 2, ..., N    (17)

where \xi_i and \xi_i^* are slack variables that penalize training errors beyond the error tolerance \epsilon, and C is a positive trade-off parameter that determines the degree of the empirical error in the optimization problem. Eq. (17) is usually solved in a dual form using Lagrangian multipliers and imposing the Karush–Kuhn–Tucker (KKT) optimality conditions. The input vectors that have nonzero Lagrangian multipliers under the KKT conditions support the structure of the estimator and are called support vectors. A number of algorithms have been suggested for solving the dual optimization problem of the SVM; an overview of these algorithms is found in Shevade et al. (2000). Conventional quadratic programming algorithms require extremely large memory for the kernel matrix computation and are difficult to implement, so they are not suitable for large problems. To overcome this problem, subset selection methods have been developed. The optimization problem is solved in a selected subset to give a set of support vectors, and then a new subset is selected using these support vectors. This process continues until all the input vectors satisfy the KKT conditions. The sequential minimal optimization (SMO) algorithm, introduced by Platt (1999), takes subset selection to the extreme by selecting a subset of size two and optimizing the estimation function with respect to it. The main advantage of SMO is that an analytical solution for a subset can be obtained directly without invoking a quadratic optimizer. In this study, the SMO algorithm was employed to train the SVM model for river flow predictions. The detailed procedures of the SMO algorithm are found in Platt (1999).
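The two ingredients above, the \epsilon-insensitive loss of Eq. (16) and the kernel expansion over support vectors that results from the dual problem, can be sketched as follows. The support vectors, dual coefficients and parameter values here are made-up illustrations, not the result of an actual SMO run:

```python
import math

def eps_insensitive_loss(d, f, eps):
    # Vapnik's epsilon-insensitive loss: errors within the tolerance
    # eps cost nothing; larger errors are penalized linearly.
    return max(0.0, abs(d - f) - eps)

def rbf_kernel(x, z, gamma):
    # Radial basis function kernel K(x, z) = exp(-gamma * ||x - z||^2),
    # the kernel used later in this study.
    sq = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq)

def svr_predict(x, support_vectors, coeffs, b, gamma):
    # Dual-form SVR estimate: only support vectors (nonzero Lagrangian
    # multipliers) contribute: f(x) = sum_i c_i K(x_i, x) + b,
    # where c_i = alpha_i - alpha_i*.
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, coeffs)) + b

# Hypothetical support vectors and dual coefficients.
svs = [(0.0,), (1.0,), (2.0,)]
coeffs = [0.5, -0.2, 0.7]
f = svr_predict((1.0,), svs, coeffs, b=0.1, gamma=0.25)
```

In a trained model, the coefficients would come from solving the dual of Eqs. (16) and (17), e.g. by SMO; the sketch only shows the shape of the resulting estimator.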
A more detailed discussion of the theory of the SVM technique can be found in the literature (Cristianini and Shawe-Taylor, 2000; Yu et al., 2006; Chen and Yu, 2007; Wang et al., 2009). In this paper, the radial basis function (RBF) was selected as the kernel function, and the penalty parameter C and the kernel parameter \gamma of the SVM were determined through a grid-search algorithm with cross-validation, as described by Hsu et al. (2010). First, a coarse grid search was used to determine the best region of the grid; then, a finer grid search was conducted to find the optimal parameters. Fivefold cross-validation was employed in this study. The calibration and the subsequent forecasting work were performed using the programming codes of the Library for Support Vector Machines (LIBSVM) software (Chang and Lin, 2011).

2.5. Study area

The Pailugou catchment (100°17′E, 38°24′N) is located in the Qilian Mountains, in northwestern China's Gansu province. The catchment's total area is 2.91 km², and elevations range from 2650 to 3800 m. Mean annual rainfall is 375.5 mm at an elevation of 2700 m, and precipitation generally increases with increasing elevation, by about 4.3% per 100 m. About 65% of the precipitation falls during the summer (July to September). Mean annual temperatures decrease with increasing elevation, from 2 °C at the base of the catchment (2650 m) to −6.3 °C near the summit (3800 m). Permanently and seasonally frozen soils are widespread at middle and higher elevations. The main parent material is calcareous rock, with a relatively thin soil layer above the rock, a coarse soil texture, an intermediate organic matter content, and a pH ranging from 7 to 8. Owing to the steep temperature and precipitation gradients, vegetation is present as a mosaic of patches of grassland, scrubland, and forest.
Forests are mostly found on shaded and semi-shaded slopes at intermediate elevations (i.e., 2650–3450 m), whereas sunny slopes are mostly occupied by grasslands. In the Pailugou catchment, Picea crassifolia is the only tree species, found primarily between elevations of 2650 and 3450 m; alpine shrub communities are found at elevations between 3200 and 3650 m. The dominant shrub species, which grow primarily where trees are not found, are Salix oritrepha, Rhododendron przewalskii and Caragana jubata. The dominant understory species, which grow under both trees and shrubs, are Potentilla fruticosa, Potentilla glabra, Lonicera microphylla, Kobresia bellardii and Polygonum viviparum. The dominant species on sunny slopes are Stipa purpurea, Kobresia humilis and Agropyron cristatum.

3. Model development

3.1. Description of data

In this study, the performance of ANN, ANFIS and SVM was examined on daily flow. Six years of flow data were available, from 2001 to 2003 and from 2009 to 2011, giving 2190 daily values in total. The data were divided into two sets: a training data set consisting of the years 2001–2003 and a validation data set of the years 2009–2011. Using full-year data sets in the identification period enabled the inclusion of the various hydrological conditions observable during different seasons of the year; in this way the model becomes robust to the different hydrological conditions that prevail in the total time series (Kişi, 2006). The daily statistical parameters of the flow data are given in Table 1. It can be seen that the flow data show a highly skewed distribution (skewness coefficients of 2.81 and 2.78 for the training and test data sets, respectively). First, several input combinations were tried using ANN, ANFIS and SVM to estimate daily flow. The number of lags was selected according to the partial auto-correlation function (PCF) of the daily flow data. The PCF of the daily data is shown in Fig. 2.
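The partial auto-correlation analysis used here for lag selection can be sketched as follows. This is an illustrative computation on a synthetic autoregressive series, not the Pailugou flow data, and the Durbin-Levinson recursion is one standard way of obtaining partial autocorrelations from sample autocorrelations:

```python
# Sketch of partial auto-correlation computation via the
# Durbin-Levinson recursion, applied to a synthetic AR(1) series.

def autocorr(x, k):
    # Sample autocorrelation of series x at lag k.
    n = len(x)
    m = sum(x) / n
    c0 = sum((v - m) ** 2 for v in x)
    return sum((x[i] - m) * (x[i + k] - m) for i in range(n - k)) / c0

def pacf(x, max_lag):
    # Durbin-Levinson recursion: phi[k][k] is the partial
    # autocorrelation at lag k; pac[0] is 1 by convention.
    phi = [[0.0] * (max_lag + 1) for _ in range(max_lag + 1)]
    pac = [1.0]
    if max_lag >= 1:
        phi[1][1] = autocorr(x, 1)
        pac.append(phi[1][1])
    for k in range(2, max_lag + 1):
        num = autocorr(x, k) - sum(phi[k - 1][j] * autocorr(x, k - j)
                                   for j in range(1, k))
        den = 1.0 - sum(phi[k - 1][j] * autocorr(x, j) for j in range(1, k))
        phi[k][k] = num / den
        for j in range(1, k):
            phi[k][j] = phi[k - 1][j] - phi[k][k] * phi[k - 1][k - j]
        pac.append(phi[k][k])
    return pac

# Synthetic AR(1) series x_t = 0.8 x_{t-1} + e_t, with noise from a
# fixed linear-congruential generator so the example is reproducible.
u, series, x_prev = 1, [], 0.0
for _ in range(500):
    u = (1103515245 * u + 12345) % 2 ** 31
    e = u / 2 ** 31 - 0.5
    x_prev = 0.8 * x_prev + e
    series.append(x_prev)
pac = pacf(series, 3)
```

For such an AR(1) process the lag-1 partial autocorrelation is large while higher-order partials are near zero, which is the kind of cutoff pattern inspected in Fig. 2 to choose the number of input lags.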
It is clear from this figure that the first three lags have a significant effect on Qt+1. Thus, three previous lags were considered as inputs to the models in this study. The inputs represent the previous flows (at times t, t−1 and t−2) and the output layer node corresponds to the flow at time t+1. Thus, the following combinations of input flow data were evaluated: (1) Qt; (2) Qt and Qt−1; (3) Qt, Qt−1 and Qt−2.

3.2. Data preprocessing

To achieve effective network training, the data need to be transformed toward a normal distribution using an appropriate transformation method. Shanker et al. (1996) and Luk et al. (2000) reported that networks trained on transformed data generally achieve better performance and faster convergence. In addition, Aqil et al. (2007) preprocessed their data with a logarithmic transformation before applying ANN and ANFIS models. In this study, the transformation was performed on all time-series data independently using the following equation:

z = a log10(G + b)    (18)

where z is the transformed value of river flow, G is the original flow, a is an arbitrary constant, and b was set to 1 to avoid passing zero river flow to the log function. The final forecast results were then back-transformed using the following equation:

G = 10^{z/a} − b    (19)

3.3. Model performance criteria

The performances of the models developed in this study were assessed using several standard statistical performance evaluation criteria: the coefficient of correlation (R), root mean squared error (RMSE), Nash–Sutcliffe efficiency coefficient (NS) and mean absolute relative error (MARE). R measures the degree to which two variables are linearly related. RMSE and MARE provide different types of information about the predictive capabilities of the model: the RMSE measures the goodness-of-fit relevant to high flow values, whereas the MARE not only gives a performance index for predicting flow but also reflects the distribution of the prediction errors.
The coefficient of correlation (R) measures the degree of correlation between the observed and predicted values:

R = \sum_{i=1}^{n} (Q_i^o - \bar{Q}^o)(Q_i^p - \bar{Q}^p) / \sqrt{ \sum_{i=1}^{n} (Q_i^o - \bar{Q}^o)^2 \sum_{i=1}^{n} (Q_i^p - \bar{Q}^p)^2 }    (20)

The root mean squared error (RMSE) is calculated as:

RMSE = \sqrt{ (1/n) \sum_{i=1}^{n} (Q_i^o - Q_i^p)^2 }    (21)

The Nash–Sutcliffe efficiency coefficient (NS) is calculated as:

NS = 1 - \sum_{i=1}^{n} (Q_i^o - Q_i^p)^2 / \sum_{i=1}^{n} (Q_i^o - \bar{Q}^o)^2    (22)

The mean absolute relative error (MARE) is calculated as:

MARE = (1/n) \sum_{i=1}^{n} |Q_i^o - Q_i^p| / Q_i^o \times 100    (23)

where n is the number of samples, Q_i^o and Q_i^p are the observed and predicted flow at time i, and \bar{Q}^o and \bar{Q}^p are the means of the observed and predicted river flow. A perfect fit between observed and calculated values would have R = 1, RMSE = 0, NS = 1 and MARE = 0.

Table 1. The flow statistical parameters of each data set.

            Qmin   Qmax      Qmean    Qstdev    Qske
Training    4.00   9604.22   617.65   1046.72   2.81
Test        7.28   8841.12   559.74   1082.47   2.78

Qmin, Qmax, Qmean, Qstdev and Qske denote the minimum, maximum, mean, standard deviation and skewness coefficient of the flow data for the training and test data sets.

Fig. 2. Partial auto-correlation function of daily flow data.

3.4. Network architecture for the ANN model

In this study, the focus is on the performance of the ANN model in 1-day-ahead prediction of river flow. Three input combinations based on daily flow of previous periods were evaluated. Let Qt denote the river flow at time t; the input combinations evaluated in the study are (1) Qt, (2) Qt and Qt−1, and (3) Qt, Qt−1 and Qt−2.
In all cases, the output layer has only one neuron, representing the river flow Qt+1. After the input and output variables were selected, the ANN architecture was investigated. The next step in the development of the ANN model was the determination of the optimum number of neurons in the hidden layer, which was identified by a trial-and-error procedure, varying the number of hidden neurons from 2 to 15. Cross-validation was used, and the number of hidden neurons was selected based on the RMSE. The training error was closest to the testing error with 7, 3 and 5 hidden neurons, so those numbers were selected for input combinations (1), (2) and (3), respectively.

4. Results and discussion

The ANN, ANFIS and SVM models with different inputs were compared based on their performance on the training and validation sets. The results are summarized in Tables 2–4. It is apparent that the performances of these models are similar during training as well as validation. The model with three antecedent flow values as input had the smallest RMSE as well as the highest R and NS in both the training and validation periods, so it was selected as the best-fit model for predicting river flow in this study. To obtain an effective evaluation of the ANN, ANFIS and SVM models, the best model structures were used to compare the three models. For the best-fit models, the difference between the values of the statistical indices of the training and validation sets does not vary substantially.
It was observed that all three models generally gave low values of RMSE and MARE as well as high R and NS; the performance of the ANN, ANFIS and SVM models in river flow forecasting was satisfactory. A model can be claimed to produce a perfect estimation if the NS criterion is equal to 1; normally, a model can be considered accurate if the NS criterion is greater than 0.8 (Shu and Ouarda, 2008). The NS values for the ANN, ANFIS and SVM models in this study are all over 0.8, which indicates that all three models achieved acceptable results. The NS values for the SVM model's flow predictions were higher than those for the ANFIS and ANN models, indicating that the overall quality of estimation of the SVM model is better according to NS. Comparing the ANN, ANFIS and SVM models from the RMSE and R viewpoints, the SVM model performed slightly better than both the ANN and ANFIS models; concretely, the SVM model produced a lower RMSE as well as a higher R, making it the best according to these criteria. While assessing the performance of any model for its applicability in predicting river flow, it is important to evaluate not only the average prediction error but also the distribution of prediction errors. The statistical performance evaluation criteria employed so far in this study are global statistics and do not provide any information on the distribution of errors. Therefore, to test the robustness of the models developed, it is important to test them using another performance evaluation criterion such as the mean absolute relative error (MARE). The MARE index provides an indication of whether a model tends to overestimate or underestimate. The analysis based on the MARE index suggests that the SVM model performed better than the ANN and ANFIS models.
This indicates that the errors obtained when using the SVM model are more symmetric around zero, though with more dispersion, than those obtained when using the ANFIS and ANN models. During the validation period, 59.94% of the SVM estimates were within 5% relative error, compared with 27.90% for the ANN and 26.80% for the ANFIS. Furthermore, 72.65% of the SVM estimates were within 10% relative error, compared with 64.92% for the ANN and 66.57% for the ANFIS. The SVM model thus performs better than the other models from the relative-error viewpoint and was the most effective model in terms of forecasting flow accurately on the validation set. Figs. 3–5 show the hydrographs and scatter plots of the observed data and the predictions obtained by the ANN, SVM and ANFIS models for the validation period. It is evident from the hydrographs and scatter plots that the SVM estimates were closer to the corresponding observed flow values than those of the other models. As seen from the fit-line equations (of the form y = ax + b) in the scatter plots, the a and b coefficients for the SVM model are closer to 1 and 0, respectively, with a higher R value of 0.935, than those of the ANN and ANFIS models. The ANN, ANFIS and SVM models showed good prediction accuracy for low values of flow but were unable to maintain their accuracy for high values of flow (Figs. 3–5).
However, a significant improvement is observed for the ANFIS in peak flow prediction compared to the ANN and SVM. Although all three models underestimated the peak flow, the ANFIS underestimated it by 12.18%, as opposed to 36.24% and 41.76% for the ANN and SVM, respectively. Overall, the ANN, ANFIS and SVM models give good prediction performance and could be successfully applied to establish forecasting models that provide accurate and reliable daily flow predictions. The results suggest that the SVM model was superior to the ANN and ANFIS in river flow forecasting.

Table 2. The structure and the performance statistics of the ANN models during the training and validation periods.

Input             Structure   Training                            Validation
                              R       RMSE      MARE     NS       R       RMSE      MARE     NS
Qt                1–7–1       0.919   415.434   16.015   0.843    0.940   391.486   14.214   0.869
Qt, Qt−1          2–3–1       0.911   434.613   15.748   0.828    0.931   433.635   13.186   0.840
Qt, Qt−1, Qt−2    3–5–1       0.906   444.670   16.118   0.820    0.938   388.255   12.802   0.871

Table 3. The structure and the performance statistics of the ANFIS models during the training and validation periods.

Input             Number of MFs   Training                            Validation
                                  R       RMSE      MARE     NS       R       RMSE      MARE     NS
Qt                3               0.907   442.048   15.835   0.822    0.947   368.255   12.031   0.884
Qt, Qt−1          3, 3            0.920   413.239   15.580   0.844    0.946   381.155   14.296   0.876
Qt, Qt−1, Qt−2    2, 2, 3         0.930   387.444   15.868   0.863    0.936   392.530   13.637   0.869

Table 4. The structure and the performance statistics of the SVM models during the training and validation periods.

Input             Parameters (C, γ)   Training                            Validation
                                      R       RMSE      MARE     NS       R       RMSE      MARE     NS
Qt                2, 0.5              0.911   432.134   14.627   0.830    0.937   404.258   11.431   0.861
Qt, Qt−1          2, 0.125            0.909   437.332   14.973   0.826    0.947   371.484   11.371   0.882
Qt, Qt−1, Qt−2    16, 0.25            0.925   399.641   14.443   0.854    0.947   364.555   11.713   0.887

Fig. 3. The observed and forecasted flow values by ANN in the validation period.
Fig. 4. The observed and forecasted flow values by SVM in the validation period.
Fig. 5. The observed and forecasted flow values by ANFIS in the validation period.
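For concreteness, the four evaluation measures of Eqs. (20)-(23) reported in Tables 2-4 can be sketched in a few lines; the sample flow values below are illustrative only, not the study's data:

```python
import math

def r_coefficient(obs, pred):
    # Coefficient of correlation, Eq. (20).
    mo, mp = sum(obs) / len(obs), sum(pred) / len(pred)
    num = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    den = math.sqrt(sum((o - mo) ** 2 for o in obs) *
                    sum((p - mp) ** 2 for p in pred))
    return num / den

def rmse(obs, pred):
    # Root mean squared error, Eq. (21).
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def nash_sutcliffe(obs, pred):
    # Nash-Sutcliffe efficiency coefficient, Eq. (22).
    mo = sum(obs) / len(obs)
    return 1.0 - (sum((o - p) ** 2 for o, p in zip(obs, pred)) /
                  sum((o - mo) ** 2 for o in obs))

def mare(obs, pred):
    # Mean absolute relative error in percent, Eq. (23).
    return 100.0 / len(obs) * sum(abs(o - p) / o for o, p in zip(obs, pred))

# Illustrative observed and predicted flows (arbitrary values).
obs = [10.0, 12.0, 9.0, 14.0]
pred = [11.0, 12.5, 8.5, 13.0]
scores = (r_coefficient(obs, pred), rmse(obs, pred),
          nash_sutcliffe(obs, pred), mare(obs, pred))
```

A perfect forecast gives R = 1, RMSE = 0, NS = 1 and MARE = 0, which is a convenient sanity check for any implementation of these measures.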
5. Conclusions

In this study, ANN, ANFIS and SVM models were developed for short-term river flow forecasting based on antecedent values of river flow. To this end, the Pailugou station on the Heihe River was selected as the case study. The results of the ANN, ANFIS and SVM models were compared with observed values and evaluated on their training and validation performance. The results demonstrate that ANN, ANFIS and SVM can be applied successfully to establish accurate and reliable river flow forecasts. The model that uses three antecedent values of river flow was selected as the best-fit forecasting model. Comparing the ANN, ANFIS and SVM models, the R and NS values of the SVM models were higher than those of the ANN and ANFIS models, and the RMSE values of the SVM models were lower. The SVM model therefore improves accuracy over the ANN and ANFIS models. The results also show that all three models achieve good prediction accuracy for low flow values but are unable to maintain that accuracy for high flow values, although the ANFIS shows a significant improvement in peak flow prediction compared with the ANN and SVM. Overall, the analysis presented in this study indicates that the SVM method was superior to the ANN and ANFIS in river flow forecasting. Although the results presented here are promising and these data driven models can be successfully applied to establish river flow forecasting models for regions with complicated topography, the models significantly underestimate flow under flood conditions. Further research is necessary to improve prediction accuracy, especially for high flow values, for example by combining models or improving model parameters.

Acknowledgments

The authors thank the editors and anonymous reviewers for their critical review and comments on this manuscript.
This work was supported by the Major Research Plan of the National Natural Science Foundation of China (91025014), the Hundred Talents Program of the Chinese Academy of Sciences (29Y127D11), the National Natural Science Foundation of China (41271524) and the Open Foundation of the Key Laboratory of Ecohydrology of Inland River Basin (90Y229F51).

References

Aqil, M., Kita, I., Yano, A., et al., 2007. A comparative study of artificial neural networks and neuro-fuzzy in continuous modeling of the daily and hourly behaviour of runoff. J. Hydrol. 337, 22–34.
Chang, C.C., Lin, C.J., 2011. LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2 (3), 1–27. <http://www.csie.ntu.edu.tw/cjlin/libsvm>.
Chau, K.W., Wu, C.L., Li, Y.S., 2005. Comparison of several flood forecasting models in Yangtze River. J. Hydrol. Eng. 10, 485–491.
Chen, S.T., Yu, P.S., 2007. Pruning of support vector networks on flood forecasting. J. Hydrol. 347, 67–78.
Chen, S.T., Yu, P.S., Tang, Y.H., 2010. Statistical downscaling of daily precipitation using support vector machines and multivariate analysis. J. Hydrol. 385, 13–22.
Cristianini, N., Shawe-Taylor, J., 2000. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge University Press, Cambridge.
Daliakopoulos, I.N., Coulibaly, P., Tsanis, I.K., 2005. Groundwater level forecasting using artificial neural networks. J. Hydrol. 309, 229–240.
Dedecker, A.P., Goethals, P.L.M., Gabriels, W., et al., 2004. Optimization of artificial neural network (ANN) model design for prediction of macroinvertebrates in the Zwalm river basin (Flanders, Belgium). Ecol. Model. 174, 161–173.
Drake, J.T., 2000. Communications phase synchronization using the adaptive network fuzzy inference system. Ph.D. Thesis. New Mexico State University, Las Cruces, New Mexico, USA.
El-Shafie, A., Taha, M.R., Noureldin, A., 2006. A neuro-fuzzy model for inflow forecasting of the Nile river at Aswan high dam. Water Resour. Manage. 21, 533–556.
Ghose, D.K., Panda, S.S., Swain, P.C., 2010. Prediction of water table depth in western region, Orissa using BPNN and RBFN neural networks. J. Hydrol. 394, 296–304.
Haykin, S., 1999. Neural Networks: A Comprehensive Foundation. Prentice-Hall, Englewood Cliffs, NJ.
Hsu, C.W., Chang, C.C., Lin, C.J., 2010. A practical guide to support vector classification. <http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf>.
Jang, J.S.R., 1993. ANFIS: adaptive-network-based fuzzy inference system. IEEE Trans. Syst. Man Cybern. 23, 665–685.
Jang, J.S.R., Sun, C.T., Mizutani, E., 1997. Neuro-fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence. Prentice-Hall, New Jersey.
Kişi, Ö., 2006. Daily pan evaporation modelling using a neuro-fuzzy computing technique. J. Hydrol. 329, 636–646.
Kuo, Y.M., Liu, C.W., Lin, K.H., 2004. Evaluation of the ability of an artificial neural network model to assess the variation of groundwater quality in an area of blackfoot disease in Taiwan. Water Res. 38, 148–158.
Lin, J.Y., Cheng, C.T., Chau, K.W., 2006. Using support vector machines for long-term discharge prediction. Hydrolog. Sci. J. 51, 599–612.
Lin, G.F., Chen, G.R., Huang, P.Y., et al., 2009. Support vector machine-based models for hourly reservoir inflow forecasting during typhoon-warning periods. J. Hydrol. 372, 17–29.
Luk, K.C., Ball, J.E., Sharma, A., 2000. A study of optimal model lag and spatial inputs to artificial neural network for rainfall forecasting. J. Hydrol. 227, 56–65.
Moradkhani, H., Hsu, K.-L., Gupta, H.V., et al., 2004. Improved streamflow forecasting using self-organizing radial basis function artificial neural networks. J. Hydrol. 295, 246–262.
Nayak, P.C., Sudheer, K.P., Rangan, D.M., Ramasastri, K.S., 2004. A neuro-fuzzy computing technique for modeling hydrological time series. J. Hydrol. 291, 52–66.
Nayak, P.C., Sudheer, K.P., Rangan, D.M., et al., 2005. Short-term flood forecasting with a neurofuzzy model. Water Resour. Res. 41, 2517–2530.
Nourani, V., Komasi, M., Mano, A., 2009. A multivariate ANN-wavelet approach for rainfall–runoff modeling. Water Resour. Manage. 23, 2877–2894.
Platt, J.C., 1999. Fast training of support vector machines using sequential minimal optimization. In: Schölkopf, B., Burges, C.J.C., Smola, A.J. (Eds.), Advances in Kernel Methods—Support Vector Learning. MIT Press, Cambridge, Massachusetts, USA.
Sahoo, G.B., Ray, C., Wade, H.F., 2005. Pesticide prediction in ground water in North Carolina domestic wells using artificial neural networks. Ecol. Model. 183, 29–46.
Shanker, M., Hu, M.Y., Hung, M.S., 1996. Effect of data standardization on neural network training. Int. J. Manage. Sci. 24, 385–397.
Shevade, S.K., Keerthi, S.S., Bhattacharyya, C., et al., 2000. Improvements to the SMO algorithm for SVM regression. IEEE Trans. Neural Networks 11, 1188–1193.
Shu, C., Ouarda, T.B.M.J., 2008. Regional flood frequency analysis at ungauged sites using the adaptive neuro-fuzzy inference system. J. Hydrol. 349, 31–43.
Singh, K.P., Basant, A., Malik, A., et al., 2009. Artificial neural network modeling of the river water quality—a case study. Ecol. Model. 220, 888–895.
Talei, A., Chua, L.H.C., Wong, T.S.W., 2010. Evaluation of rainfall and discharge inputs used by Adaptive Network-based Fuzzy Inference Systems (ANFIS) in rainfall–runoff modeling. J. Hydrol. 391, 248–262.
Taormina, R., Chau, K.W., Sethi, R., 2012. Artificial neural network simulation of hourly groundwater levels in a coastal aquifer system of the Venice lagoon. Eng. Appl. Artif. Intel. 25, 1670–1676.
Vapnik, V., 1995. The Nature of Statistical Learning Theory. Springer, New York.
Wang, W.C., Chau, K.W., Cheng, C.T., et al., 2009. A comparison of performance of several artificial intelligence methods for forecasting monthly discharge time series. J. Hydrol. 374, 294–306.
Wu, C.L., Chau, K.W., 2011. Rainfall–runoff modeling using artificial neural network coupled with singular spectrum analysis. J. Hydrol. 399, 394–409.
Wu, C.L., Chau, K.W., Li, Y.S., 2008. River stage prediction based on a distributed support vector regression. J. Hydrol. 358, 96–111.
Wu, C.L., Chau, K.W., Li, Y.S., 2009. Predicting monthly streamflow using data-driven models coupled with data-preprocessing techniques. Water Resour. Res. 45, W08432.
Yan, H., Zou, Z., Wang, H., 2010. Adaptive neuro fuzzy inference system for classification of water quality status. J. Environ. Sci. 22, 1891–1896.
Yoon, H., Jun, S.C., Hyun, Y., et al., 2011. A comparative study of artificial neural networks and support vector machines for predicting groundwater levels in a coastal aquifer. J. Hydrol. 396, 128–138.
Yu, P.S., Chen, S.T., Chang, I.F., 2006. Support vector regression for real-time flood stage forecasting. J. Hydrol. 328, 704–716.