energies Article k-Nearest Neighbor Neural Network Models for Very Short-Term Global Solar Irradiance Forecasting Based on Meteorological Data Chao-Rong Chen and Unit Three Kartini * Department of Electrical Engineering, National Taipei University of Technology, 1, Section 3, Zhong-Xiao (Chung-Hsiao) E. Rd., Da’an Dist., Taipei 106, Taiwan; [email protected] * Correspondence: [email protected] or [email protected]; Tel.: +886-2-27712171 (ext. 2112) Academic Editor: Silvio Simani Received: 24 November 2016; Accepted: 1 February 2017; Published: 8 February 2017 Abstract: This paper proposes a novel methodology for very short term forecasting of hourly global solar irradiance (GSI). The proposed methodology is based on meteorology data, especially for optimizing the operation of power generating electricity from photovoltaic (PV) energy. This methodology is a combination of k-nearest neighbor (k-NN) algorithm modelling and artificial neural network (ANN) model. The k-NN-ANN method is designed to forecast GSI for 60 min ahead based on meteorology data for the target PV station which position is surrounded by eight other adjacent PV stations. The novelty of this method is taking into account the meteorology data. A set of GSI measurement samples was available from the PV station in Taiwan which is used as test data. The first method implements k-NN as a preprocessing technique prior to ANN method. The error statistical indicators of k-NN-ANN model the mean absolute bias error (MABE) is 42 W/m2 and the root-mean-square error (RMSE) is 242 W/m2 . The models forecasts are then compared to measured data and simulation results indicate that the k-NN-ANN-based model presented in this research can calculate hourly GSI with satisfactory accuracy. Keywords: global solar irradiance (GSI); photovoltaic (PV); very short term; forecasting; k-nearest neighbor (k-NN); artificial neural network (ANN) 1. Introduction Nowadays, forecasting global solar irradiance (GSI) is an essential task, particularly related to the increased use of photovoltaic (PV) solar energy as a power source. Forecasting GSI can be executed in different terms: long-term, medium-term, and short-term. Since solar power is categorized as an intermittent energy source, forecasting is paramount to regulate electricity loads in power networks. It also functions to optimize power delivery and unit commitment and by extension, it helps minimize the operating costs of power systems [1]. With a forecast, it is expected that plant operation control systems can be improved, so as to balance power generation and load. Moreover, distribution of load, electric energy storage, and energy supply will be maximized and more reliable. The performance of PV systems is heavily influenced by meteorological conditions such as temperature, global irradiation, humidity, wind speed and wind direction [2]. The relation is clear: electrical energy generated by the PV solar depends on the amount of the GSI received by the PV panels. Solar irradiance absorbed in each PV panel varies depending on geographic location, time, and the absorption capacity of the PV panels. Previous studies have presented a variety of mathematical models for GSI forecasting in relation to meteorological variables. GSI forecasting with k-nearest neighbor (k-NN) statistical methods has been described [3,4]. Hocaoğlu [3] and Pedro and Coimbra [4] presented modeling of solar irradiation with stochastic methods using a k-NN artificial neural network Energies 2017, 10, 186; doi:10.3390/en10020186 www.mdpi.com/journal/energies Energies 2017, 10, 186 2 of 18 (ANN) at a PV station. Solar irradiation prediction is an important problem in geosciences, with direct applications in renewable energy, where the data in the form of time series can also be analyzed using a regression model as described in reference [5]. Salcedo-Sanz et al. [5] worked on the prediction of daily global irradiation using a temporal Gaussian process, in which the study explains the suitability of Gaussian regression (GPR) for the estimation of solar irradiation compared to other machine learning regression algorithms. The GSI forecasting is not only used in stochastic modeling, but in other studies [6,7] attempted forecasting was analyzed using exponential smoothing combined with decomposition methods and least absolute shrinkage and selection operator model. Yang et al. [6,7] studied the forecasting of global horizontal irradiance by exponential smoothing using decompositions, while on another study, they developed the least absolute shrinkage and selection operator model using irradiance very short-term forecasting. Combining a forecasting model with GSI is important to get a better result, and forecasting GSI by the spatiotemporal pattern recognition method, ANN method, parametric models and decomposition models, has been described in [8–10]. Spatiotemporal pattern recognition and nonlinear principal component analysis (PCA) for global horizontal irradiance forecasting has been proposed as well by Licciardi et al. [8]. Amrouche and Le Pivert [9] have presented an ANN based on daily local forecasting for global solar radiation, describing a novel methodology for local forecasting of daily global horizontal irradiance (GHI). The methodology is a combination of spatial modelling and ANNs algorithm. Wong et al. [10], have presented solar radiation models based on parametric models and decomposition models for predicting the average daily and hourly global radiation, beam radiation and diffuse radiation. Obtaining GSI forecasting for a PV station is also the subject of several studies. In the previous references the GSI forecasting was carried out without being influenced by the location of the PV station, but in [11,12] the GSI forecasting was very influenced by the Mediterranean location studied. To achieve multi-horizon irradiation forecasting for Mediterranean locations, time series models have been proposed by Paolia et al. [11]. Lorenz et al. [12] presented irradiance forecasting for the power prediction of grid-connected PV systems. In addition to GSI forecasting using grid-connected systems there are also other studies that similarly use the grid-connected method but different locations, as explained in [13,14]. Wang et al. [13] studied a short term solar irradiance forecasting model based on an ANN using statistical feature parameters. Another study on GSI forecasting using an ANN was performed by Mellit and Pavan [14], which they applied the prediction to a grid-connected PV plant located at Trieste, Italy. The application of a statistical method to detect the motion of cloud structures for surface irradiance is widely used for forecasting GSI as in the previous reference, and therefore in [15] optimization and operational validation of the GSI forecasting, i.e., short term forecasting of solar radiation using statistical methods to determine cloud motion vector fields have been proposed by Hammer et al. Xiao and Chaovalitwongse [16] have presented optimization models for decomposed nearest neighbor feature selection. Validation of short and medium term operational solar radiation forecasts in the US was studied by Perez et al. [17]. Many of the development models done by other researchers for forecasting GSI, tried GSI forecasting with statistical methods of stochastic learning and the development of analytical models with time series as in [18,19], i.e., forecasting of global and direct solar irradiance using stochastic learning methods, ground experiments and the national weather service’s (NWS) database have been proposed by Marquez et al. [18]. Martin et al. [19] have presented GSI forecasting methods based on time series analysis that have been used to predict half daily values of solar irradiance for the next three days. GSI forecasting models developed by performing functional and fuzzy approach spatiotemporal model development have been used too, as described in [20,21]. Boata and Gravila [20] have presented a functional fuzzy approach for forecasting daily global solar irradiation and very short term forecasting of the global horizontal irradiance using a spatiotemporal autoregressive model has been proposed by Dambreville et al. [21]. Of the various modeling approaches for forecasting GSI in the previous references, namely developing and combining forecasting models for short-term forecasts, [22–24] also describe several methods for forecasting GSI, namely using support vector machines and space exponential smoothing models. A review of solar Energies 2017, 10, 186 3 of 18 irradiance forecasting methods and a proposition for small-scale insular grids has been provided by Diagne et al. [22]. Short term solar power prediction using a support vector machine has been proposed Zeng et al. [23]. Dong et al. [24] have presented a short term solar irradiance forecasting method using an exponential smoothing state space model. A comparative empirical study based on a short term wind speed forecating model has been presented by Ren et al. [25]. Mellit et al. [26] and Farhad et al. [27] have presented ANN models for the prediction of solar radiation and a new bloggers Energies 2017, 10, 186 17 classification approach with a hybrid k-NN and ANN model. Probabilistic solar power3 offorecasting approaches k-NN kernel and method selection of input parameters to model direct irradiance by termbased solar on irradiance forecasting using an exponential smoothing state spacesolar model. A using ANN have been proposed by Zhang andterm Wang [28] andforecating López etmodel al. [29]. Thepresented aforementioned comparative empirical study based on a short wind speed has been by Ren et al.[25]. et al. [26] and Farhad al. [27] PV havestation, presentedbut ANN models prediction studies elaborated on Mellit GSI forecasting at oneettarget have yetfor tothe forecast GSI at PV of solar radiation and a new bloggers classification approach with a hybrid k-NN and ANN model. stations surrounded by other PV stations. Probabilistic solar power forecasting approaches based on k-NN kernel and selection of input This paper presents part of a study in process seeking to estimate and predict one hour or 60 min parameters to model direct solar irradiance by using ANN have been proposed by Zhang and Wang ahead global solar irradiation PVaforementioned stations for energy production. This part focuses howPVto predict [28] and López et al. [29].at The studies elaborated on GSI forecasting at oneon target hourly GSI for the station GSI with availability of a local database and based on meteorology station, but target have yetPV to forecast at the PV stations surrounded by other PV stations. This paper presents partmethodology of a study in process seeking to estimate and predict and one hour or modelling 60 data. In this study, a new hybrid that combines k-NN modelling ANN min ahead global solar irradiation at PV stations for energy production. This part focuses on how to algorithm has been developed. A k-NN-ANN method is used to forecast GSI at the target PV station by predict hourly GSI for the target PV station with the availability of a local database and based on means ofmeteorology calculatingdata. k-NNs based on the Euclidean distance and then do the testing and training data. In this study, a new hybrid methodology that combines k-NN modelling and ANN Themodelling remainder of the paper organized as follows:method Section 2 describes theatmodelling algorithm has beenisdeveloped. A k-NN-ANN is used to forecast GSI the target PVand data from a PVstation station, Section describesk-NNs the methodology used, i.e., the k-NN anddo neural network by means of 3calculating based on the Euclidean distance and then the testing and models, training4 data. while Section presents modelling study cases for very short-term forecasting and their measured The remainder of the paper is organized as follows: Section 2 describes the modelling and data errors. Finally, Section 5 presents some concluding remarks. from a PV station, Section 3 describes the methodology used, i.e., the k-NN and neural network models, while Section 4 presents modelling study cases for very short-term forecasting and their 2. Modelling and Data Description measured errors. Finally, Section 5 presents some concluding remarks. The GSI measurements were performed continuously every 5 min for four hours. The dataset 2. Modelling and Data Description thus contains four hours of data (from 5:20 a.m. to 8:00 a.m.) collected on 8 June 2012 at nine PV station The GSI performed 5 min for ◦four hours. TheStation dataset D (140◦ , locations: Station Ameasurements (0◦ , 8.3 km),were Station B (36◦continuously , 10.5 km), every Station C (93 , 10.8 km), thus contains four◦ hours of data (from 5:20 a.m. to 8:00 a.m.) collected on 8 ◦June 2012 at nine PV ◦ 10.5 km), Station E (180 , 10 km), Station F (250 , 4.3 km), Station G (280 , 5.4 km), Station H (310◦ , station locations: Station A (0°, 8.3 km), Station B (36°, 10.5 km), Station C (93°, 10.8 km), Station D ◦ , 0 km), located in Taipei, Taiwan. For this study, the very short-term forecasts 9.1 km) and Station S (0Station (140°, 10.5 km), E (180°, 10 km), Station F (250°, 4.3 km), Station G (280°, 5.4 km), Station H of GSI were only provided by PV Station which located inFor thethis center. nearest neighbouring (310°, 9.1 km) and Station S (0°, 0 km), S located in was Taipei, Taiwan. study,The the very short-term of GSI were onlySprovided by PV Station S which was located in the center. The nearest locationsforecasts surrounding Station were Station A, Station B, Station C, Station D, Station E, Station F, neighbouring locations surrounding Station S were Station A, Station B, Station C, Station D, Station Station G and Station H (Figure 1). E, Station F, Station G and Station H (Figure 1). Figure 1. Modelling of Station S which is surrounded by eight other photovoltaic (PV) stations. ST: Station. Figure 1. Modelling of Station S which is surrounded by eight other photovoltaic (PV) stations. ST: Station. Figure 2 illustrates the GSI values as measured at Station S. These monitored data have been used to evaluate the methodology algorithm using k-NN-ANN as the proposed model for GSI forecasting. Energies 2017, 10, 186 4 of 18 Figure 2 illustrates the GSI values as measured at Station S. These monitored data have been used to evaluate the methodology algorithm using k-NN-ANN as the proposed model for GSI 4forecasting. Energies 2017, 10, 186 of 17 700 GSI actual 600 GSI (W/m2) 500 400 300 200 100 0 00.00 05.35 06.00 06.25 06.50 07.10 07.40 Time Figure 2. Measured global solar irradiance (GSI) at Station S from 5:20 a.m. to 8:00 a.m. on 8 June 2012. Figure 2. Measured global solar irradiance (GSI) at Station S from 5:20 a.m. to 8:00 a.m. on 8 June 2012. 3. k-Nearest Neighbor and Artificial Neural Network Model for Forecasting Global Solar 3. k-Nearest Neighbor and Artificial Neural Irradiance Based on Meteorological Data Network Model for Forecasting Global Solar Irradiance Based on Meteorological Data This section explains the basic idea of the construct methodology for GSI prediction, namely the k-NN-ANN model. Inthe thisbasic study,idea the subject the centralmethodology station, which for it is GSI surrounded by several This section explains of the isconstruct prediction, namely the other PV stations. The first purpose of this study is the improvement of forecasting results using the k-NN-ANN model. In this study, the subject is the central station, which it is surrounded by several k-NN method combined with an ANN model method, and the process is then used to predict GSI output other PV stations. The first purpose of this study is the improvement of forecasting results using the result of a PV station one hour or 60 min ahead based on meteorological data. k-NN method combined ANN model and the is then used to predict GSI Simulation of thewith k-NNan neural networks canmethod, be programmed in aprocess few minutes after the recording of output result of a PV station one hour or 60 min ahead based on meteorological data. the first measurements. For the GSI forecasting, k-NN-ANN method employs past meteorological data (GSI, temperature, humidity, wind speed and direction). The forecastinhorizon four hours 5 min Simulation of the k-NN neural networks can be programmed a few is minutes afterinthe recording increments. The first step developing a k-NN-ANN method is method to developemploys the database features of the first measurements. Forinthe GSI forecasting, k-NN-ANN pastofmeteorological that will be used for comparison with the current conditions and the forecast GSI a few ahead. data (GSI, temperature, humidity, wind speed and direction). The forecast horizon is four hours in 5 min increments. The first step in developing a k-NN-ANN method is to develop the database of 3.1. k-Nearest-Neighbors features that will be used for comparison with the current conditions and the forecast GSI a few ahead. The k-NN method is one of the simplest machine learning algorithm methods. The k-NN algorithm is a non-parametric method used for classification and regression. The output depends on 3.1. k-Nearest-Neighbors whether k-NN is used for classification or regression: in used k-NN model classification is the value a class membership. object validation is classified by a majority vote ofThe its neighbors, Theoutput k-NNismethod is one of theAn simplest machine learning algorithm methods. k-NN algorithm with the object being assigned to the class most common among its k-NNs. If k = 1, then the object is a non-parametric method used for classification and regression. The output depends on is whether simply assigned to the class of that single nearest neighbor. In a k-NN regression model, the value k-NN is used for classification or regression: in used k-NN model classification is the value output output is the property value for the object. This value result output training is the average of the is a classvalues membership. An object is classified byclassification a majorityofvote of its neighbors, with the of its k-NNs. The k-NN validation model is applied to perform objects based on learning object being assigned to the class most common among its k-NNs. If k = 1, then the object is data that were located closest to the object, and the method is considered the simplest among other simply [2]. The mainsingle idea isnearest that the neighbor. k-NN algorithm uses a regression training set for data modelling. Then, is the assignedmethods to the class of that In a k-NN model, the value output prediction new points be the average of the values of Theof variables employed propertythe value for theofobject. Thiscan value result output training is its thek-NNs. average the values of its k-NNs. for modelling very short term forecasting have been described in the previous section. The k-NN The k-NN model is applied to perform classification of objects based on learning data that were located predict is computed using the features assembled in the matrices in a two-step process. In the first closest to the object, and the method is considered the simplest among other methods [2]. The main step, we have been calculating the pre-defined distance between the variables in the new dataset (the idea is that the k-NNoralgorithm a training modelling. thedataset. prediction optimalization the testing uses sets and training set sets)for anddata the features in theThen, previous For a of new 1 n points can be set theofaverage of the its k-NNs. Thewith variables for distances modelling given featuresof S =the {p values , ..., p } in new dataset lengthsemployed N1, ..., Nn, the of very the short previous data arebeen calculated. In theinsecond step, choosing k-NNs havepredict k smallest from term forecasting have described the previous section. Theand k-NN is distances computed using the test in[26]. distance is sorted in ascending order, and first k elements featurestraining assembled the The matrices in a D two-step process. In the first step, wethe have been calculating the ( ) their associated time stamps [2] D D ≤ D between ≤ ... ≤ D the and {τ1 ,..., τ k } are extracted pre-defined variables in the newk dataset (the optimalization or the testing sets S distance S ,1 S ,2 S ,k 1 n } in and training sets) and the features in the previous dataset. For a given set of features S = {p , ..., p Equation (1). To find the k-NN based on the Euclidean distance, this mathematical equation is used: the new dataset with lengths N1 , ..., Nn , the distances of the previous data are calculated. In the second N 2 d ( x, y ) = distances w2j ( x j − yfrom 1, 2, 3,.... test [26]. The distance D (1)is sorted j ) , j =training step, choosing k-NNs and have k smallest j =1 in ascending order, and the first k elements DS ( DS,1 ≤ DS,2 ≤ ... ≤ DS,k ) and their associated k time Energies 2017, 10, 186 5 of 18 stamps {τ1 , ..., τk } are extracted [2] Equation (1). To find the k-NN based on the Euclidean distance, this mathematical equation is used: v uN u 2 d( x, y) = t ∑ w2j x j − y j ,j = 1, 2, 3, . . . . (1) j =1 where d is the number of forecast instances in the optimization set. We can calculate the distance between two scenarios using some distance function d(x, y), where x, y are the matrix scenarios composed of N features x = { x1 , ..., x N }, y = {y1 , ..., y N }, N is the length of data, and the distance between the current performance and previous condition, wj is the weight value of the dependent variable members of k-NN (kernel function) and j is the order of the k-NN based on their distance from the current performance condition and which the nearest with used the lowest order (j = 1, ..., K). k-Nearest Neighbor Modelling In this research, to obtain the k-NN model forecast using the algorithm model proposed described above, several parameters need to be specified and the k-NN modelling process can be divided into five calculation stages. The procedure of k-NN for regression is as follows: (1) (2) The matrix scenarios composed modelling stage which includes form d-dimensional feature vectors C or nxy from the historical data x: c = c1 , c2 , ..., c p and nxy = nxy1 , nxy2 , ..., nxy p or x: C = nxy = [ xt , xt−1 , ..., xt−d+1 ]; Their corresponding successors are denoted as xh . They are given two pieces point c and nxy in a space vector of n-dimensional c (c1 , c2 , ..., cn ) and nxy (nxy1 , nxy2 , ..., nxyn ). The distance calculate vector stage which includes form n-dimensional distance vector dsij (c, nxy) = Di for each testing vector C or nxy by calculating the Euclidean distance between dsij (c, nxy) = Di and the remaining: C : dsij (c, nxy) = k Di − D j k , j 6= i (2) where Di is the value of the dependent variable historical data set and Dj is the value of the dependent the nearest neighbor based on distance: v u k 2 u dsij (c, nxy) = t ∑ cip − nxyipj , p = 1, 2, 3, ..., n (3) q =1 where dsij (c, nxy) is value of the dependent variable GSI, nxyipj is the position coordinat PV (3) (4) station (magnitude of nearest neighbors) cip is the d-dimensional feature vectors, The index j is the current condition, i is the historical dataset, k = K the number of elements in the nearest neighbors, q is the historical dataset and where x, y are scenarios composed of k features and p is the number of the k-NN based on their distance from the current condition (j) in which the nearest have the lowest order (p = 1, ..., k). Select the value distance of the k-NN stage, which includes the sort dsij (c, nxy) in ascending order and select the first K entries as the nearest neighbors Dk , k ∈ {1, ..., K}. Select the best value of k used in modelling the k-NN stage, because a high k value will reduce the effect of noise on the classification, but it will make the boundaries between each classification becomes increasingly blurred. Form a kernel function: K ( j) = 1 j K ∑ j =1 or k i = 1 j 1 i = 1, 2, 3, 4, 5 dis(i ) (4) Energies 2017, 10, 186 (5) 6 of 18 where K(j) = ki is the value k-NN (kernel function), i is the historical data set and the index j is the number of the k-NN based on their distance from the current condition (i) in which the nearest have the lowest order (j = 1, ..., K), and K is the length data sets, and the distance between the current and previous condition [25]. Calculate the final estimation stage, using Equation (5): k sumd = ∑ K j xhj j =1 Energies 2017, 10, 186 (5) 6 of 17 h where x hj is the magnitude of nearestsumd neighbor the order of the nearest neighbors = j =1j,K jjxis (5) based j on their distance from the current condition (h) in which the nearest have the lowest order h nearest neighbor j, j is the of theand nearest based where (j = 1, ..., k),xsumd is magnitude the karnel of function, k is the length of order data sets, Kj isneighbors the kernel function. j is the k on their distance from the current condition (h) in which the nearest have the lowest order (j = 1, 3.2. Artificial ..., k),Neural sumd Network is the karnel function, k is the length of data sets, and Kj is the kernel function. ANN is a mathematical method inspired by the structure and information processing of biological 3.2. Artificial Neural Network neural networks. ANNs are intelligent systems that have the capacity to learn, memorize and create ANN is a mathematical method inspired by the structure and information processing of relationships among data [30]. ANN model is a combination of pattern recognition, deductive biological neural networks. ANNs are intelligent systems that have the capacity to learn, memorize reasoning and numerical computations to simulate learning in the human brain. ANN consists and create relationships among data [30]. ANN model is a combination of pattern recognition, of andeductive interconnected many groups namely neurons, and its main task is processes information using reasoning and numerical computations to simulate learning in the human brain. ANN a connection approach to computation. ANN consists of an interconnected namely consists of an interconnected many groups namely neurons, and its main many task isgroups processes neurons, and its main task is processes information using a connection approach to computation. information using a connection approach to computation. ANN consists of an interconnected manyANN models havenamely been used to predict data [24]. The methodology is a promising alternative groups neurons, and itssolar mainradiation task is processes information using a connection approach to computation. ANN models have been used predict solar dataradiation [24]. The methodology is are to traditional approaches for forecasting GSI,toespecially in radiation cases where measurements a promising alternative traditional approaches comprise for forecasting GSI,connected especially neurons in cases where not readily available. ANN to models fundamentally multiple and nodes. radiationnetworks measurements are not readily available. ANN fundamentally techniques comprise multiple The neural are considered as the member of models the non-parametric which are connected neurons and nodes. The neural networks are considered as the member of the nonusually used for estimation and classification [25]. The neurons have five basic components, i.e., input, parametric techniques which are usually used for estimation and classification [25]. The neurons have weight-bias, threshold, summing junction and output, as illustrated in Figure 3. Neurons are arranged five basic components, i.e., input, weight-bias, threshold, summing junction and output, as illustrated in three layers3.which consist of input, andwhich output. in Figure Neurons are arranged in hidden three layers consist of input, hidden and output. Figure 3. Basic structure architecture of a simple artificial neuron model [27]. Figure 3. Basic structure architecture of a simple artificial neuron model [27]. Artificial Neural Network Modelling Artificial Neural Network Modelling The ANN algorithm model can be divided into four step: (1) the design model and input pattern step which algorithm includes the choicecan of be thedivided ANN model, the number its design layers ANN, number of The ANN model into four step: (1)ofthe modelthe and input pattern in the eachchoice layer ANN, inputsmodel, and outputs, the choice and the validation step neuron which groups includes of theitsANN the number of of itstraining layers set ANN, number of set groups samples in which k-NN design; the training set andthe testing set ANN forecasting based on neuron eachuse layer ANN, its (2) inputs and outputs, choice of training set and validation meteorology data are presented to the ANN and the weights layer are adjusted accordingly till a on set samples which use k-NN design; (2) the training set and testing set ANN forecasting based predetermined condition is more better; (3) the simulation test passed step, in which the result ANN forecasting model is tested using measurement data at nine station PV which are not treated during the training step; and finally (4) the performance evaluation stage. Layered perceptron ANN model with different topology designs were considered in order to obtain the best mapping between the ANN algorithm inputs and outputs. The proposed model in this research is used to predict the GSI values for the next hours or one hour ahead, forecasting based Energies 2017, 10, 186 7 of 18 meteorology data are presented to the ANN and the weights layer are adjusted accordingly till a predetermined condition is more better; (3) the simulation test passed step, in which the result ANN forecasting model is tested using measurement data at nine station PV which are not treated during the training step; and finally (4) the performance evaluation stage. Layered perceptron ANN model with different topology designs were considered in order to obtain the best mapping between the ANN algorithm inputs and outputs. The proposed model in this research is used to predict the GSI values for the next hours or one hour ahead, forecasting based on meteorology data and the GSI data from other PV stations. The GSI forecasting data are normalized 10, 186 7 ofnumber 17 to theEnergies limit2017, of [0, 1] to avoid neuron groups saturation during the learning progress. The of neuron groups in the first hidden layer is 10 and we have to consider an ANN with 1475 inputs normalized to the limit of [0, 1] to avoid neuron groups saturation during the learning progress. The for GSI forecasting. neuron active function of we hidden and output number of neuronThe groups in thegroups first hidden layer is 10 and have to consider anlayer ANNare withtansig 1475 and purelin, respectively. Performance test data set is used to evaluate GSI forecast accuracy of the ANN inputs for GSI forecasting. The neuron groups active function of hidden and output layer are tansig algorithm. The ANN models inputs acts directly on the training period duration for GSI forecasting and purelin, respectively. Performance test data set is used to evaluate GSI forecast accuracy of the [9]. ANN the algorithm. The ANN models acts directly on the training for GSI To choose best design modeling forinputs forecasting, we investigated the period role of duration meteorology data in forecasting To choose bestwe design modeling forecasting, we investigated role of good the ANN model[9]. effectivity andthe then researched the for impact of the data mode on thethe training meteorology in the ANN model effectivity and then we researched the impact of the data mode performance anddata on the forecasting accuracy. on the training good performance and on will the forecasting accuracy. In the first reason, the ANN models learn about the global solar irradiance profile of the In the first reason, the ANN models will learn about the global solar irradiance profile of the target location. At the end of the learning process, the k-NN-ANN model will be able to give the GSI target location. At the end of the learning process, the k-NN-ANN model will be able to give the GSI forecast results at the target PV station based on meteorology data. When compared to actual data, forecast results at the target PV station based on meteorology data. When compared to actual data, ANNANN also also presents thethe whole GSI four-hourwindow, window, giving ANN possibility to presents whole GSIvalues valuesfor for the the four-hour giving ANN the the possibility to learnlearn the existing relationship in between them and GSI evolution evolutionduring during the the existing relationship in between them andtotodevelop developan anidea idea about about GSI time the window. In the second reason,reason, the entire GSI data are used for for very short term forecasting time window. In the second the entire GSI data are used very short term forecastingusing using the k-NN-ANN model based on meteorological at PV the station, target PV station, which is by the k-NN-ANN model based on meteorological data at thedata target which is surrounded surrounded by eight other PV stations. The proposed k-NN-ANN model has to forecast GSI eight other PV stations. The proposed k-NN-ANN model has to forecast GSI values for thevalues next hour theahead next hour or 60 min by taking into account the based forecasting data based ondata. meteorology or 60for min by taking intoahead account the forecasting data on meteorology data. The specific feature of this k-NN-ANN model is that measured data and hourly weather The specific feature of this k-NN-ANN model is that measured data and hourly weather meteorology data (temperature, humidity, global irradiance wind speed and direction) forecasts meteorology data (temperature, humidity, global irradiance wind speed and direction) forecasts for for the nearest eight surrounding stations are used as input data information, as explain by Figure 4. the nearest eight surrounding stations are used as input data information, as explain by Figure 4. To To model betweenk-NN k-NNmodel model prediction actual values ofatGSI modelthe theavailable availablerelationship relationship between prediction andand the the actual values of GSI theat the pivotpivot location, an ANN based model is is used. ANNmodels modelscan canbebeprogrammed programmed in location, an ANN based model used.The Thetraining training of of the the ANN one hours after the of first measurements. in one ahead hours ahead afterrecording the recording of first measurements. Figure 4. The proposedforecasting forecasting model model using neighbor and artificial neural neural networknetwork (kFigure 4. The proposed usinga k-nearest a k-nearest neighbor and artificial NN-ANN) model. (k-NN-ANN) model. 3.3. k-Nearest Neighbor and Artificial Neural Network Modelling The forecasting problem in this research was forecasting based on meteorological data study in progress to forecast GSI at a PV station for energy production by one hour or 60 min ahead by using the hybrid k-NN-ANN model. The procedure of k-NN-ANN model for GSI forecasting based on meteorological data has the following steps: Energies 2017, 10, 186 8 of 18 3.3. k-Nearest Neighbor and Artificial Neural Network Modelling The forecasting problem in this research was forecasting based on meteorological data study in progress to forecast GSI at a PV station for energy production by one hour or 60 min ahead by using the hybrid k-NN-ANN model. The procedure of k-NN-ANN model for GSI forecasting based on meteorological data has the following steps: Step 1: Calculate the distance of each data parameter: r dsGI j ( c, nxy ) = k ∑ q =1 k Ho c Ho p − nxy pj ∑ q =1 k Ws cWs p − nxy pj k r ds Ta j ( c, nxy ) = r ds jHo (c, nxy) = r dsWs j ( c, nxy ) = r dsWd j ( c, nxy ) = 2 k ∑ q =1 ∑ q =1 ∑ q =1 GI cGI p − nxy pj Ta c Ta p − nxy pj 2 , p = 1, 2, ..., n j = 1, 2, ..., n (6) , p = 1, 2, ..., n j = 1, 2, ..., n (7) 2 , p = 1, 2, ..., n j = 1, 2, ..., n 2 Wd cWd p − nxy pj (8) , p = 1, 2, ..., n j = 1, 2, ..., n 2 (9) , p = 1, 2, ..., n j = 1, 2, ..., n (10) and the weight distance is: 1 d21 GIx = sumd Ta x = Ho x = Ws x = Wd x = 1 d21 sumd 1 d21 sumd 1 d21 sumd 1 d21 sumd 1 d22 · GIi + sumd · Tai + · Hoi + · Wsi + · Wdi + · GIi + 1 d22 sumd 1 d22 sumd 1 d22 sumd 1 d22 sumd · Tai + · Hoi + · Wsi + · Wdi + 1 d23 sumd 1 d23 sumd 1 d23 sumd 1 d23 sumd 1 d23 sumd where dsGI j ( c, nxy ) is the Euclidan distance GSI ds Ta j ( c, nxy ) is the Euclidan distance temperature ds jHo (c, nxy) is the Euclidan distance humidity dsWs j ( c, nxy ) is the Euclidan distance wind speed dsWd j ( c, nxy ) is the Euclidan distance wind direct GIx is the weight distance at GSI Ta x is the weight distance at temperature Ho x is the weight distance at humidity Ws x is the weight distance at wind speed Wd x is the weight distance at wind direct GI is the global irradiance Ta is the temperature Ho is the humidity 1 d24 · GIi + · Tai + · GIi + sumd 1 d24 sumd · Hoi + · Wsi + · Wdi + · Tai + 1 d24 sumd 1 d24 sumd 1 d24 sumd 1 d25 sumd 1 d25 sumd · Hoi + · GIi (11) · Tai (12) 1 d25 sumd · Wsi + · Wdi + · Hoi (13) · Wsi (14) · Wdi (15) 1 d25 sumd 1 d25 sumd Energies 2017, 10, 186 9 of 18 Ws is the wind speed Wd is the wind direct C pGI is the d dimensional feature vectors GSI C pTa is the d dimensional feature vectors temperature C pHo is the d dimensional feature vectors humidity CWs p is the d dimensional feature vectors wind speed CWd p is the d dimensional feature vectors wind direct nxyipj is the position coordinat PV station (magnitude of nearest neighbors) cip is the d-dimensional feature vectors sumd is the kernel function: i is the historical data set j is the current condition, i is the historical dataset, k = K the number of elements in the nearest neighbors q is the historical dataset and where x, y are scenarios composed of k features and p is the number of the k-NN based on their distance from the current condition (j) in which the nearest have the lowest order (p = 1, ..., k). Step 2: Calculate number of nearest neighbors: k i +n = 1 i = 1, 2, 3, ..., n = 130 dis(i + n) (16) where k is the distance nearest neighbor, i is the historical dataset and n is the total number of features (i = 1, 2, ..., n) Step 3: Calculate the final estimation as: k sumd = ∑ K( j)xhj (17) j =1 where x hj is the value magnitude of the k-NN j, j is the order of the nearest neighbors based on distance (h) in which the nearest have the lowest order (j = 1, ..., k), sumd is the karnel function, k is the length of data sets, and the distance between the current and previous condition and Kj is the kernel function. Step 4: Training data and testing data using ANN method are obtained from the following steps: a. b. c. d. e. f. g. The final estimation of GSI from k-NN method is into training sets, validation sets, and test sets for ANN model. Architecture model and training sets parameters (create and configure the neural network, initialize the weights layer and biases layer) are selected Model GSI forecast using the training set are run Model forecasting is validated using the validation set Step b–e are repeated using different architectures model forecasting and training set parameters The optimum model is chosen and inserted into training process using data from the training set and validation set The ultimate forecasting model is assessed using the test set Step 5: Perform predictions for future GSI data at the Station S with k-NN-ANN Energies 2017, 10, 186 10 of 18 The forecast model designed for k-NN is described in Figure 5a. If the validation test for forecasting the GSI is successful and get more better the result, the forecast model could perform its designed function, otherwise one or more changes should be made during the previous process. The procedures are shown in Figure 5b. GSI forecasting process using hybrid k-NN-ANN algorithm can also be divided Energies 2017, 10, 186 10 of 17 into two process: (1) k-NN modelling to determine the d-dimensional feature and n-dimensional distance forANN input process; data at the(2) ANN (2) the ANNfor modelling for GSI forecasting. distance dimensional for input data at the theprocess; ANN modelling GSI forecasting. The procedures The procedures in Figure 5c. In this research, and ANN structure construction are displayed in Figureare 5c.displayed In this research, k-NN and ANNk-NN structure construction was programmed was programmed using MATLAB (R2013a) programming. MATLAB is been provided with some using MATLAB (R2013a) programming. MATLAB is been provided with some ANN tools. ANN tools. Figure 5. Flowchart k-NN-ANN method very short term forecasting: (a) k-NN process; (b) ANN process; Figure 5.and Flowchart k-NN-ANN method very short term forecasting: (a) k-NN process; (b) ANN (c) hybrid k-NN-ANN model. process; and (c) hybrid k-NN-ANN model. 3.4. Normalization 3.4. Normalization The normalization of data input is very important to obtain good results in the ANN method [9]. In this research for analysis, it needs normalization data for training process the GSI forecasting The60 normalization of data[31], input is very toEquation obtain good min ahead, as defined which can beimportant calculated by (18): results in the ANN method [9]. In this research for analysis, it needs normalization data for training process the GSI forecasting 60 min GSI dn − GSI min n ahead, as defined [31], which can be calculated by Equation (18): GSI dnorm = (18) GSI max − GSI min GSIdn − GSImin n GSIdnorm = (18) GSImax − GSImin Energies 2017, 10, 186 11 of 18 n Let us denote GSIdnorm and GSIdn be normalized target feature at frame index n and the d-GSI output, respectively. Let us also denote GSImax and GSImin be the maximum value and minimum values of the GSI, respectively. 4. Results and Discussion This section discusses the result of k-NN-ANN model used to forecast future GSI by using a one hour ahead forecasting procedure. The procedure is described in Figure 5. We tested our model using previously described databases based on meteorology data, i.e., wind direction, wind speed, GI, temperature, and humidity, during the 1 h or 60 min ahead process. Data Using the k-NN-ANN model, it is expected that a valid GSI forecast result will be produced. The design forecast is divided into two stages: (a) (b) The first stage is calculating d-dimensional feature and n-dimensional distance based on the Euclidean (k-NN model) every hour for all of the PV stations. Data shown in Table 1 are the order parameters of each PV station, i.e., angle, distance and position coordinates. The table summarizes all possible combinations of variables to be considered as an input for k-NN method. The second stage uses the ANN based on the assumption that the existing data input is a combination of the results obtained from the k-NN model. The research model proposed in this study seeks to estimate and predict a PV station production 60 min ahead, which position is located at the center and surrounded by eight other PV stations. Table 1. Data position and coordinates of the PV stations. No. Station Angle (◦ ) Distance (d) Coordinate (xi , yi ) 1 2 3 4 5 6 7 8 9 A B C D E F G H S 0 36 93 140 180 250 280 310 0 8.3 10.5 10.8 10.5 10 4.3 5.4 9.1 0 (8.3, 0) (8.5, 6.2) (−0.6, 10) (−8, 6.7) (−10, 0.1) (−5.5, −5) (1, −5.3) (5.8, −7) (0.1, 0.1) The d-dimensional feature and n-dimensional distances based on the Euclidean (k-NN model) every hour for all of PV station are shown in Figure 6. From the simulation results using the k-NN method based on meteorological data consisting of global irradiance, temperature, humidity, wind speed and wind direction values, respectively. All of them are used as pre-processing data in the ANN method, which can be calculated by the polynomial Equation (19): f ( x, y) = p00 + p10 x + p01 y + p20 x2 + p11 xy + p02 y2 (19) where f (x, y) is the GSI value of the k-NN method and variable x, y is the coordinate value’s position of the PV station. Example Figure 6a can be produced by polynomial Equation (20): f ( x, y) = 10.74 + 0.05105x + 0.01365y + (−0.0014) x2 + 0.0015xy + 0.002923y2 (20) Example Figure 6b can be produced by polynomial Equation (21): f ( x, y) = 520.4 − 1.678x − 0.7583y + (−0.3799) x2 + 0.2108xy + 0.07159y2 (21) Energies 2017, 10, 186 12 of 18 Energies 2017, 186 17 In which the10, polynomial equation above is the result of the simulation on 5:20 a.m. 12 forof Figure 6a and 8:00 a.m. hours for Figure 6b using the k-NN method for Station S. Moreover that polynomial In which the polynomial equation above is the result of the simulation on 5:20 a.m. for Figure 6a and equation also canhours be implemented thethe other PV stations. 8:00 a.m. for Figure 6b for using k-NN method for Station S. Moreover that polynomial To equation validatealso thecan proposed method, data PV of 60 min ahead of a PV station has been calculated be implemented forGSI the other stations. To validate the proposed method, GSI results data of 60 ahead of a PV station hasactual been calculated using the process described in Section 3. The aremin then compared with the data of the GSI using the process described in Section 3. The results are then compared with the actual data the the GSI at target PV station as described in Section 4. Table 2 shows the optimal k-NN parametersoffor GSI at target PV station as described in Section 4. Table 2 shows the optimal k-NN parameters for the forecast with meteorology data every hours from 5:20 a.m. to 8:00 a.m. on 8 June 2012. GSI forecast with meteorology data every hours from 5:20 a.m. to 8:00 a.m. on 8 June 2012. 08.00 05.20 800 G S I (W /m 2) G S I (W /m 2 ) 14 12 10 8 10 0 -10 -10 d-distance y (a) -5 0 5 d-distance x 10 600 400 200 10 0 d-distance y -10 -10 -5 5 0 10 d-distance x (b) Figure 6. (a) d-Distance PV station based on the Euclidean (k-NN model) in time 5:20 a.m. on 8 June Figure 6. (a) d-Distance PV station based on the Euclidean (k-NN model) in time 5:20 a.m. on 8 June 2012; 2012; and (b) d-Distance PV station based on the Euclidean (k-NN model) in time 8:00 a.m. on 8 June 2012. and (b) d-Distance PV station based on the Euclidean (k-NN model) in time 8:00 a.m. on 8 June 2012. Table 2. Optimal k-NN parameters for the GSI forecast with meteorology data. Table 2. Optimal k-NN parameters for the GSI(x,forecast with meteorology data. Station y)* Time Time5:20 5:25 5:30 5:205:35 5:255:40 5:305:45 5:355:50 5:405:55 5:456:00 5:506:05 5:556:10 6:15 6:00 6:20 6:05 6:25 6:10 6:30 6:15 6:35 6:20 6:40 6:256:45 6:306:50 6:356:55 6:407:00 6:457:05 6:507:10 6:557:15 7:007:20 7:057:25 7:107:35 7:157:40 7:207:45 7:257:50 7:357:55 7:408:00 7:45 7:50 7:55 8:00 ST-S (0.1, 0.1) 11 ST-S15 (0.1, 0.1) 22 11 27 15 37 22 42 27 33 37 32 42 40 33 59 32 77 101 40 139 59 169 77 179 101 197 139 216 169230 179240 197252 216269 230292 240308 252324 269345 292357 308370 324362 345360 357344 370427 362475 360 344 427 475 ST-A (8.3, 0) 11 ST-A 15 (8.3, 22 0) 29 11 39 15 44 22 35 29 32 39 41 44 61 35 82 32 106 41 132 61 152 82 158 106 186 132 200 152 217 158 229 186 238 200 256 217 275 229 290 238 302 256 322 275 334 290 351 302 331 322 323 334 316 414 351 457 331 ST-B ST-C ST-D ST-E ST-F ST-G (1, ST-H (8.5, 6.2) (−0.6, 10) Station (−8, 6.7) (−5.5, −5) −5.3) (5.8, −7) (x, y)(−10, * 0.1) 11 11 10 10 10 11 11 ST-B ST-C ST-D ST-G ST-H 15 15 15 15ST-E 15ST-F 15 15 (8.5, −10, 0.1) (22 −5.5, −5) (1,22−5.3) (5.8, 22 6.2) (− 220.6, 10) (− 228, 6.7) (22 22 −7) 2711 2511 24 10 25 10 27 10 2811 3011 3715 3315 33 15 35 15 38 15 3915 4115 4422 4222 41 22 40 22 41 22 4222 4322 3827 3825 35 24 31 25 30 27 3128 3130 3137 3133 31 33 32 35 32 38 3239 3241 3644 3342 35 41 40 40 44 41 4442 4643 53 45 48 57 65 67 70 38 38 35 31 30 31 31 76 67 66 72 79 83 88 31 31 31 32 32 32 32 100 91 90 95 103 107 111 36 33 35 40 44 44 46 113 110 127 148 160 155 155 53 45 48 57 65 67 70 133 140 165 189 195 183 178 76 67 66 72 79 83 88 153 172 194 204 197 181 170 100 91 90 95 103 107 111 184 196 207 211 206 197 191 113 110 127 148 160 155 155 202 221 234 235 225 213 203 133 140 165 189 195 183 178 219 236 246 246 237 226 218 153 172 181 170 233 248 256194 254204 245197 235 228 184 196 197 191 243 263 272207 269211 256206 245 235 202 221 213 203 260 278 287234 285235 274225 263 254 219 236 226 218 275 294 310246 314246 304237 290 280 233 248 235 228 290 310 326256 330254 320245 305 295 243 263 245 235 302 327 347272 351269 338256 321 307 260 278 263 254 320 345 366287 373285 361274 343 330 275 294 290 280 334 360 381310 385314 372304 354 340 290 310 305 295 354 379 394326 395330 380320 365 351 302 327 321 307 334 371 397347 400351 380338 355 336 320 345 343 330 327 372 402366 405373 380361 351 327 334 360 354 340 328 367 385381 378385 352372 329 310 414 429 440394 443395 436380 426 418 354 379 365 351 450 467 486397 497400 492380 478 469 334 371 355 336 323* (x, y) are327 372 position 402values of405 380 Note: the coordinate the PV stations. 316 328 367 385 378 352 414 414 429 440 443 436 457 450 467 486 497 492 Note: * (x, y) are the coordinate position values of the PV stations. 351 329 426 478 327 310 418 469 Energies 2017, 10, 186 13 of 18 Energies 2017, 10, 186 13 of 17 The k-NN-ANN modelling approach is unique since the neural network algorithm is firstly The k-NN-ANN modelling approach is unique since the neural network algorithm is firstly developed using the k-NN model based on meteorology data. Results from the k-NN model are then developed using the k-NN model based on meteorology data. Results from the k-NN model are then divided it into two sets of data: the training and the validation set. After the simulation test, the ANN divided it into two sets of data: the training and the validation set. After the simulation test, the ANN model based on the k-NN results will be ready to use in 60 min ahead GSI forecasting. As explained model based on the k-NN results will be ready to use in 60 min ahead GSI forecasting. As explained previously, the optimization parameters for GSI forecasting were obtained from a dataset based on previously, the optimization parameters for GSI forecasting were obtained from a dataset based on meteorology data. The calculated values of GSI were then compared with measured values ( GSI ) of meteorology data. The calculated values of GSI were then compared with measured values (GSI ) each station: Station S, Station A, Station B, Station C, Station D, Station E, Station F, Station G, Station of Station S, Station A,and Station B, Station C, Station D, Station Station F, Station G, H. each Meanstation: absolute bias error (MABE) root-mean-square error (RMSE) wereE, used as error statistical Station H. Mean absolute bias (MABE) andmodel root-mean-square error (RMSE) were used as error indicators. The estimation fromerror the k-NN-ANN was then compared with the k-NN model and statistical indicators. The estimation from the k-NN-ANN model was then compared with the k-NN mean average of meteorological data prediction. model and average k-NN-ANN of meteorological prediction. For themean comparison, modeldata performed only 500 iterations for each learning period. For the comparison, k-NN-ANN model performed 500 dataset; iterations each learning period. Figure 7 shows the all the data for the three subsets: (a) only training (b)for validation dataset; and Figure 7 shows the all the data for the three subsets: (a) training dataset; (b) validation dataset; and (c) test dataset. In this figure the plots show for GSI versus the time. For the ANN algorithm is initially (c) test dataset. figurebased the plots show for model GSI versus theitsANN algorithm is initially constructed for In thethis training on the k-NN data,the andtime. afterFor this, training is periodically as constructed for the training based on the k-NN model data, and after this, its training is periodically the database expands over time. as the database expands over time. 0.8 0.8 0.6 GSI G S I in p u t 0.6 0.4 1 0.8 0.6 G S I te s t 1 0.4 0.4 0.2 0.2 0 0 500 1000 0 0 1500 0.2 100 200 0 0 300 100 200 Periode Periode Periode (a) (b) (c) 300 Figure Figure 7. 7. (a) (a) GSI GSI for for the the training training set; set; (b) (b) GSI GSI for for the the validation validation set; set; and and (c) (c) GSI GSI for for the the testing testing set. set. The k-NN model database provides pretraining data, and then the database divided it into two The k-NN model database provides pretraining data, and then the database divided it into two sets of data; the training data set and the validations data set, after the validation test, the k-NN-ANN sets of data; the training data set and the validations data set, after the validation test, the k-NN-ANN based model will be ready to use for forecasting the GSI, and the the accuracy of the model can be based model will be ready to use for forecasting the GSI, and the the accuracy of the model can be determined using the testing set data that has been gained from the training data process. For the determined using the testing set datathe that has model been gained the training data process. Fordata the present forecast application program, ANN has to from start learning based on the training present forecast application program, the ANN model has to start learning based on the training set, and subsequently construct and sharpen the knowledge while ensuring the continuity of its task. datathe set,process and subsequently construct sharpen the knowledge ensuring thefrom continuity of its For of testing the accuracyand of the forecasting model thatwhile has been gained the training task. Forusing the process of testing the accuracy the forecasting thatwas has 90% beenof gained from the process backpropagation method. The of amount of testingmodel data used the total. Note training process using backpropagation method. The amount of testing data used was 90% of the total. that for this validation, the k-NN-ANNs were trained only while data processing is done on the Note thatdata for this validation, k-NN-ANNs werelearning trained rate onlyof while dataa processing is done on the training patterns with a the target error of 0.001, 0.1 and maximum of 50 epochs. training data patterns with a target error of 0.001, learning rate of 0.1 and a maximum of 50 epochs. Figure 8 shows a comparison of the two methods for the GSI data between actual data and kFigure 8 shows a comparison two methods for and the GSI data between dataterm and NN-ANN method, where: (a) there isofa the better between actual forecasting data foractual very short k-NN-ANN method, where: (a)S;there is ashows betterthe between actualGSI and forecasting data for very GSI forecasting at Station target and (b) normalized curve using the k-NN-ANN short term GSI forecasting at Station target S; and (b) shows the normalized GSI curve using the method. k-NN-ANN method. 700 0.8 Ac k-NN-ANN 0.7 500 Normalized value G lobal S olar Irradianc e (W /m 2) 600 0.9 Actual k-NN-ANN 400 300 200 0.6 0.5 0.4 0.3 0.2 100 0 00.00 0.1 05.35 06.00 06.25 06.50 Time (a) 07.10 07.40 0 00.00 05.35 06.00 06.25 06.50 Time (b) 07.10 07.40 that for this validation, the k-NN-ANNs were trained only while data processing is done on the training data patterns with a target error of 0.001, learning rate of 0.1 and a maximum of 50 epochs. Figure 8 shows a comparison of the two methods for the GSI data between actual data and kNN-ANN method, where: (a) there is a better between actual and forecasting data for very short term Energies 2017, 10, 186at Station target S; and (b) shows the normalized GSI curve using the k-NN-ANN 14 of 18 GSI forecasting method. 700 0.9 Actual k-NN-ANN 0.8 Ac k-NN-ANN 0.7 500 Normalized value G lobal S olar Irradianc e (W /m 2) 600 400 300 200 0.6 0.5 0.4 0.3 0.2 100 0.1 0 00.00 05.35 06.00 06.25 06.50 07.10 0 00.00 07.40 Time Energies 2017, 10, 186 05.35 06.00 (a) 06.25 06.50 Time 07.10 (b) 07.40 14 of 17 Figure 8. Comparison of the two methods for the GSI data: (a) very short term GSI forecasting using Figure 8. Comparison of the two methods for the GSI data: (a) very short term GSI forecasting using the k-NN-ANN model versus actual data for station S; and (b) the normalized GSI curve using the the k-NN-ANN model versus actual data for station S; and (b) the normalized GSI curve using the kk-NN-ANN method. NN-ANN method. Figure Figure 99 illustrates illustrates the the comparison comparison of of GSI GSI forecasting forecasting in in a four hour window (5:20 a.m.–8:00 a.m.) on basedon onthe thek-NN-ANN k-NN-ANNmodel, model,actual actual data, and k-NN method. result shows on 88 June June 2012, based data, and thethe k-NN method. TheThe result shows that that very short term forecasting simulation using k-NN-ANN method during a four hour window very short term forecasting simulation using k-NN-ANN method during a four hour window gives better gives results to the k-NN method. resultsbetter compared tocompared the k-NN method. 700 Global Solar Irradiance (W/m2) 600 Actual k-NN k-NN-ANN 500 400 300 200 100 0 00.00 05.35 06.00 06.25 06.50 07.10 07.40 Time Figure 9. 9. Very Veryshort shortterm termGSI GSIforecasting forecasting in in5-h 5-hwindow windowbased based k-NN-ANN k-NN-ANN model, model, actual actual data, data, and and Figure k-NN method. k-NN method. Global Solar Irradiance (W/m2) Figure 10 illustrates a very short-term (60 min ahead) GSI forecast using k-NN-ANN and its Figure 10 illustrates a very short-term (60 min ahead) GSI forecast using k-NN-ANN and its comparison with actual data. It is evident that the k-NN-ANN model is in a good agreement with the comparison with actual data. It is evident that the k-NN-ANN model is in a good agreement with the measured data at the object station. measured data at the object station. 700 the performance of the models, a statistical error measurement was used in the To evaluate experiment, namely theActual MABE, and RMSE. To evaluate the accuracyTheoff orecasting eachGSImethod to forecast the GSI f or 600 k-NN-ANN 1 hourly or 60 minute ahead values, MABE, and RMSE coefficient between results of k-NN-ANN and actual ground measurements 500 were calculated. These statistical error indicators validation forecasting are calculated according to Equations (22)400and (23) and the results statistical error are shown in Table 3. 300 200 100 0 00.00 05.35 06.00 06.25 06.50 07.10 07.40 09.00 Time Figure 10. Very short term (60 min ahead) GSI forecast using k-NN-ANN versus actual data. To evaluate the performance of the models, a statistical error measurement was used in the experiment, namely the MABE, and RMSE. To evaluate the accuracy of each method to forecast the GSI Figure 9. Very short term GSI forecasting in 5-h window based k-NN-ANN model, actual data, and k-NN method. Figure 10 illustrates a very short-term (60 min ahead) GSI forecast using k-NN-ANN and its comparison with Energies 2017, 10, 186 actual data. It is evident that the k-NN-ANN model is in a good agreement with 15 ofthe 18 measured data at the object station. 700 Global Solar Irradiance (W/m2) 600 Actual k-NN-ANN The f orecasting GSI f or 1 hourly or 60 minute ahead 500 400 300 200 100 0 00.00 05.35 06.00 06.25 06.50 07.10 07.40 09.00 Time Figure 10. Very short term term (60 (60 min min ahead) ahead) GSI GSI forecast forecast using using k-NN-ANN k-NN-ANN versus versus actual actual data. data. Figure 10. Very short To evaluate the performance of the models, a statistical error measurement was used in the experiment, the MABE, and RMSE. themodels. accuracy of each method to forecast the GSI Error statistical indicators of the To GSIevaluate forecasting MABE: mean absolute bias error; Table 3. namely values, MABE, and RMSE coefficient between results of k-NN-ANN and actual ground measurements RMSE: root-mean-square error. were calculated. These statistical error indicators validation forecasting are calculated according to Model Equations (22) and (23) and theError results statistical error are shownk-NN-ANN in Table 3. k-NN Indicators 44 42 mean absolute bias error; Table 3. Error statistical indicators models. MABE: MABE (W/m2 ) of the GSI forecasting 251 242 RMSEerror. (W/m2 ) RMSE: root-mean-square Model Error Indicators 2) MABE is calculated according to Equation (21): MABE (W/m RMSE (W/m2) MABE = k-NN 44 251 k-NN-ANN 42 242 √ −b ± b2 − 4ac 1 N G f ,i − Gm,i ∑ N i =1 2a (22) RMSE is calculated according to Equation (22): " RMSE(k ) = 1 N 2 e (t + k|t) N i∑ =1 #1/2 (23) where e(t) is the forecasting data and k(t) is the measured (observed) data: p(t + k|t) = P(t + k) − P(t + k|t) where G f ,i is forecasted value GSI and Gm,i is measured value GSI, (i = 1, 2, ..., N), N is the number of the GSI data, i is the number index variations. Figure 11 illustrates the MABE and RMSE coefficients for the actual data and the very short term (60 min ahead) forecasting performance using k-NN and the k-NN-ANN model. The error statistical indicators of the k-NN model are MABE 44 W/m2 and RMSE 251 W/m2 . On the other hand, the error statistical indicators for the proposed model (k-NN-ANN model) are MABE 42 W/m2 and RMSE 242 W/m2 . Note that the highest RMSE was 191 (W/m2 ) during the thirty-two period, while the lowest a value was found to be 9 (W/m2 ) for the same location. It is evident that k-NN-ANN model displays better predictions than the k-NN model. (60 min ahead) forecasting performance using k-NN and the k-NN-ANN model. The error statistical indicators of the k-NN model are MABE 44 W/m2 and RMSE 251 W/m2. On the other hand, the error statistical indicators for the proposed model (k-NN-ANN model) are MABE 42 W/m2 and RMSE 242 W/m2. Note that the highest RMSE was 191 (W/m2) during the thirty-two period, while the lowest a 2 value was Energies 2017,found 10, 186 to be 9 (W/m ) for the same location. It is evident that k-NN-ANN model displays 16 of 18 better predictions than the k-NN model. 45 250 k-NN model k-NN-ANN model k-NN-ANN model k-NN model 40 200 35 R M S E (W / m 2 ) M A B E (W / m 2 ) 30 150 100 25 20 15 10 50 5 0 0 5 10 15 20 Periode (a) 25 30 35 0 0 5 10 15 20 25 30 35 Periode (b) Figure 11. (a) (a) MABE MABE coefficients coefficientsbetween betweenactual actualdata dataand andGSI GSI forecasts using k-NN-ANN model Figure 11. forecasts using thethe k-NN-ANN model on on the test data set; and (b) RMSE coefficients between actual data and GSI forecasts using the k-NNthe test data set; and (b) RMSE coefficients between actual data and GSI forecasts using the k-NN-ANN ANN testset. data set. modelmodel on theon testthe data 5. Conclusions 5. Conclusions A new methodology for very short term (60 min ahead) GSI forecasting of a target PVstation has A new methodology for very short term (60 min ahead) GSI forecasting of a target PVstation been introduced. In this work we propose a novel methodology for GSI forecasting using a has been introduced. In this work we propose a novel methodology for GSI forecasting using a combination of k-NN modelling and an ANN. The model estimates and predicts the GSI profiles of combination of k-NN modelling and an ANN. The model estimates and predicts the GSI profiles of PV PV stations in very short term (60 min ahead) based on hourly meteorology data from eight stations in very short term (60 min ahead) based on hourly meteorology data from eight surrounding surrounding PV stations. The following conclusions can be drawn from this research: PV stations. The following conclusions can be drawn from this research: A different formulation for very short term GSI forecasting using k-NN-ANN modelling based • A formulation for very The short term GSI forecasting using k-NN-ANN modelling on different meteorology data is proposed. proposed model attempting to shape the patterns of a based on meteorology data is proposed. The proposed model attempting to shape the polynomial equation shows that the proposed model forecasting is more better. The variable patterns of a data polynomial shows the proposed model is more better. meteorology weatherequation is of great very that importance and affects theforecasting resulting GSI forecasting The variable meteorology data weather is of great very importance and affects the resulting GSI output. forecasting output. The new model proposed in this study is a combination of k-NN modelling and an ANN model. • The new model proposed this study a combination of term k-NNperiod modelling andahead) an ANN model. The model is employed toin forecast GSI is data for very short (60 min based on The model is employed to forecast GSI data for very short term period (60 min ahead) based on meteorology data. This research concerns how to predict GSI data at a target PV station, which is surrounded by eight other PV stations. The study also considers the availability of a local measured database. It clearly shows that the GSI forecasting using a different k-NN-ANN model for every hour based on meteorological data giving a better result output, which means the GSI forecasting largely depends on variable meteorological data, where the meteorology data variables consist of GI, wind speed, wind direct, humidity and temperatures. • This paper utilises k-NN-ANN modelling to determine the d-dimensional features and n-dimensional distances. The results demonstrate that the results of the k-NN-ANN model are closely matched with the actual data, and are better than data obtained from k-NN models. The novelty of this article is to predict GSI at a PV station which position is at the center surrounded by eight other adjacent PV stations. The proposed model is able to learn the characteristics of meteorology weather data for the past four hours and use the data as model input. In this paper, the proposed k-NN-ANN model has better approximation compared to k-NN model. The very short term forecast evaluations of the GSI using the k-NN-ANN model are performed for only four hours and the results show that the k-NN-ANN method is better than the k-NN method. The error statistical indicators of the k-NN model are 44 W/m2 for the MABE and 251 W/m2 for the RMSE. On the other hand, the error statistical indicators for the proposed model (k-NN-ANN model) are 42 W/m2 (MABE) and 242 W/m2 (RMSE). We noted that the highest RMSE was 191 (W/m2 ) during the thirty-two hour Energies 2017, 10, 186 17 of 18 period, while the lowest value was found to be 9 (W/m2 ) for the same location. The performance of the proposed k-NN-ANN method is more better compared with the k-NN model. The proposed k-NN-ANN model can therefore be used effectively to forecast very short term GSI data while giving closer result outputs and better matches with actual measured data. Author Contributions: Chao-Rong Chen and Unit Three Kartini conceived and designed the experiments; Chao-Rong Chen performed the experiments; Chao-Rong Chen and Unit Three Kartini analyzed the data; Chao-Rong Chen and Unit Three Kartini contributed reagents/materials/analysis tools; Unit Three Kartini wrote the paper. Conflicts of Interest: The authors declare no conflict of interest. References 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18. Yang, H.T.; Yang, P.C.; Huang, C.L. Optimization of Unit Commitment Using Parallel Structures of Genetic Algorithm. In Proceedings of the 1995 International Conference on Energy Management and Power Delivery, Singapore, 21–23 November 1995. Chiang, C.-T.; Lee, Y.-S.; Li, X.R.; Liao, C.-C. A RSCMAC Based Forecasting for Solar Irradiance from Local Weather Information. In Proceedings of the WCCI 2012 IEEE World Congress on Computational Intelligence, Brisbane, Australia, 10–15 June 2012. Hocaoğlu, F.O. Stochastic approach for daily solar radiation modeling. Sol. Energy 2011, 85, 278–287. [CrossRef] Pedro, H.T.C.; Coimbra, C.F.M. Nearest-neighbor methodology for prediction of intra-hour global horizontal and direct normal irradiances. Renew. Energy 2015, 80, 770–782. [CrossRef] Salcedo-Sanz, S.; Casanova-Mateo, C.; Munoz-Marí, J.; Camps-Valls, G. Prediction of Daily Global Solar Irradiation Using Temporal Gaussian Processes. IEEE Geosci. Remote Sens. Lett. 2014, 11, 1936–1940. [CrossRef] Yang, D.; Sharmaa, V.; Ye, Z.; Lim, L.I.; Zhao, L.; Aryaputera, A.W. Forecasting of global horizontal irradiance by exponential smoothing using decompositions. Energy 2015, 81, 111–119. [CrossRef] Yang, D.; Ye, Z.; Lim, L.H.I.; Dong, Z. Very short term irradiance forecasting using the lasso. Sol. Energy 2015, 114, 314–326. [CrossRef] Licciardi, G.A.; Dambreville, R.; Chanussot, J.; Dubost, S. Spatiotemporal Pattern Recognition and Nonlinear PCA for Global Horizontal Irradiance Forecasting. IEEE Geosci. Remote Sens. Lett. 2015, 12, 284–288. [CrossRef] Amrouche, B.; Le Pivert, X. Artificial neural network based daily local forecasting for global solar radiation. Appl. Energy 2014, 130, 333–341. [CrossRef] Wong, L.T.; Chow, W.K. Solar radiation model. Appl. Energy 2001, 69, 191–224. [CrossRef] Voyant, C.; Paolia, C.; Musellia, M.; Niveta, M.-L. Multi-horizon Irradiation Forecasting for Mediterranean Locations Using Time Series Models. Energy Procedia 2014, 57, 1354–1363. Lorenz, E.; Hurka, J.; Heinemann, D.; Beyer, H.G. Irradiance Forecasting for the Power Prediction of Grid-Connected Photovoltaic Systems. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2009, 2, 2–10. [CrossRef] Wang, F.; Mi, Z.; Su, S.; Zhao, H. Short-Term Solar Irradiance Forecasting Model Based on Artificial Neural Network Using Statistical Feature Parameters. Energies 2012, 5, 1355–1370. [CrossRef] Mellit, A.; Pavan, A.M. A 24-h forecast of solar irradiance using artificial neural network: Application for performance prediction of a grid-connected PV plant at Trieste Italy. Sol. Energy 2010, 84, 807–821. [CrossRef] Hammer, A.; Heinemann, D.; Lorenz, E.; Luckehe, B. Short-Term Forecasting of Solar Radiation: A Statistical Approach Using Satellite Data. Sol. Energy 1999, 67, 139–150. [CrossRef] Xiao, C.; Chaovalitwongse, W.A. Optimization Models for Feature Selection of Decomposed Nearest Neighbor. IEEE Trans. Syst. Man Cybern. Syst. 2016, 46, 2168–2216. [CrossRef] Perez, R.; Kivalov, S.; Schlemmer, J.; Hemker, K., Jr.; Renné, D.; Hoff, T.E. Validation of short and medium term operational solar radiation forecasts in the US. Sol. Energy 2010, 84, 2161–2172. [CrossRef] Marquez, R.; Coimbra, C.F.M. Forecasting of global and direct solar irradiance using stochastic learning methods, ground experiments and the NWS database. Sol. Energy 2011, 85, 746–756. [CrossRef] Energies 2017, 10, 186 19. 20. 21. 22. 23. 24. 25. 26. 27. 28. 29. 30. 31. 18 of 18 Martin, L.; Zarzalejo, L.F.; Polo, J.; Navarro, A.; Marchante, R.; Cony, M. Prediction of global solar irradiance based on time series analysis: Application to solar thermal power plants energy production planning. Sol. Energy 2010, 84, 1772–1781. [CrossRef] Boata, R.S.; Gravila, P. Functional fuzzy approach for forecasting daily global solar irradiation. Atmos. Res. 2012, 112, 79–88. [CrossRef] Dambreville, R.; Blanc, P.; Chanussot, J.; Boldo, D. Very short term forecasting of the Global Horizontal Irradiance using a spatio-temporal autoregressive model. Renew. Energy 2014, 72, 291–300. [CrossRef] Diagne, M.; David, M.; Lauret, P.; Boland, J.; Schmutz, N. Review of solar irradiance forecasting methods and a proposition for small-scale insular grids. Renew. Sustain. Energy Rev. 2013, 27, 65–76. [CrossRef] Zeng, J.; Qiao, W. Short-term solar power prediction using a support vector machine. Renew. Energy 2013, 52, 118–127. [CrossRef] Dong, Z.; Yang, D.; Reindl, T.; Walsh, W.M. Short-term solar irradiance forecasting using exponential smoothing state space model. Energy 2013, 55, 1104–1113. [CrossRef] Ren, Y.; Suganthan, P.N.; Srikanth, N. A Comparative Study of Empirical Mode Decomposition-Based Short-Term Wind Speed Forecasting Methods. IEEE Trans. Sustain. Energy 2015, 6, 236–244. [CrossRef] Mellit, A.; Benghanem, M.; Bendekhis, M. Artificial Neural Network Model for Prediction Solar Radiation Data: Application for Sizing Stand-Alone Photovoltaic Power System. In Proceedings of the 2005 IEEE Power Engineering Society General Meeting, San Francisco, CA, USA, 12–16 June 2005. Farhad, S.G.; Khaze, S.R.; Malekl, I. A new Approach in Bloggers Classification with Hybrid of K-Nearest Neighbor and Artificial Neural Network Algorithms. Indian J. Sci. Technol. 2015, 8, 237–246. Zhang, Y.; Wang, J. GEFCom2014 Probabilistic Solar Power Forecasting based on k-Nearest Neighbor and Kernel Density Estimator. In Proceedings of the 2015 IEEE Power & Energy Society General Meeting, Denver, CO, USA, 26–30 July 2015. López, G.; Batlles, F.J.; Tovar-Pescador, J. Selection ofinput parameters to model direct solar irradiance by using artificial neural networks. Energy 2005, 30, 1675–1684. [CrossRef] Anil, K.; Jain, J.; Mao, K. Mohiuddin, Artificial Neural Networks: A Tutorial. Computer 1996, 29, 31–44. Wu, B.; Li, K.; Yang, M.; Lee, C.-H. A Reverberation-Time-Aware Approach to Speech Dereverberation Based on Deep Neural Networks. IEEE/ACM Trans. Audio Speech Lang. Process. 2017, 25, 102–111. [CrossRef] © 2017 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
© Copyright 2026 Paperzz