Solar Radiation Data Modeling with a Novel
Surface Fitting Approach
F. Onur Hocaoğlu, Ömer Nezih Gerek, Mehmet Kurban
Anadolu University, Dept. of Electrical and Electronics Eng., Eskisehir, Turkey
{fohocaoglu,ongerek,mkurban}@anadolu.edu.tr
Abstract. In this work, one year of hourly solar radiation data is analyzed and modeled. Using a 2-D surface fitting approach, a novel model
is developed for the general behavior of the solar radiation, and the mathematical formulation of the 2-D surface model is obtained. The accuracy
of the analytical surface model is tested and compared with that of another
surface model obtained from a feed-forward Neural Network (NN). The two models are compared in the sense
of Root Mean Square Error (RMSE). The NN surface
model gives more accurate results, with a smaller RMSE. However,
unlike the data-specific NN surface model, the analytical surface model
provides an intuitive and more generalized form that can be suitable for
several other locations on earth.
1 Introduction
Solar radiation is the principal energy source for physical, biological and chemical
processes. Accurate knowledge and an insightful model of the solar radiation
data at a particular geographical location are of vital importance. Such knowledge
is a prerequisite for the simulation and design of solar energy systems. Architects, agriculturalists, air conditioning engineers and energy-conscious designers
of buildings also require such information. In many cases, solar energy applications involve tilted surfaces. To compensate for the effect of radiation on
tilted surfaces, knowledge of both the diffuse and direct components of the global radiation falling on a horizontal surface is required [1]. Menges et al. [2] reviewed
and compared the available solar-radiation models for a region in detail. The
majority of the models developed for the prediction of solar radiation are based
on existing climatic parameters, such as sunshine duration, cloud cover, relative
humidity, and minimum and maximum temperatures [3–5]. Unfortunately, for
many developing countries, solar-radiation measurements are not easily available
because of the expensive measuring equipment and techniques required. In this
study, a novel solar radiation model is developed for one year of solar radiation
data collected between August 1, 2005 and July 30, 2006 at the Iki Eylul campus
of Anadolu University, using the 2-D approach described in Section 2.
The model is based on a surface fitting approach using the data rendered in 2-D. It is observed that the hourly variation of solar radiation within
a day has a Gaussian shape; hence the 2-D data along the hour axis
are fitted to Gaussian functions. The trust-region algorithm, also described in
Section 2, is used to calculate the parameters of the Gaussian functions. An NN
model is also developed for the 2-D data, as described in Section 3. Finally, the models
are compared in the sense of RMSE, and the results are presented in Section 4.
The NN provides a more “specific” model for the data; hence it yields a better
prediction model. However, the 2-D surface model is more generic and insightful. Therefore, it can also be used as a global model for places with similar yearly
solar radiation conditions, without the need for data collection and training.
2 Determination and Estimation of Surface Model Structure and Parameters
The first stage in data fitting is to determine a plausible model, among known
mathematical models, that characterizes the data accurately. After setting the
mathematical model, the coefficients of the model must be estimated. Recently, a
novel 2-D interpretation approach developed by Hocaoglu et al. [6]
indicated that the “rendering” or “interpretation” of the data (i.e. its transformation)
is also critical, even before proceeding to the modeling. In this approach,
the solar radiation time series is rendered and presented in 2-D, and it
is shown that this representation format has significant advantages over 1-D
time series approaches. In this work, starting from the mentioned 2-D rendered
representation, a novel surface model is proposed.
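The 2-D rendering and the cross sections discussed here can be illustrated with a minimal sketch. It assumes the hourly series starts at hour 0 of the first day and has no gaps, and it stores one row per day; the function names `render_2d`, `hour_section` and `day_section` are hypothetical, and the orientation used in [6] may differ.

```python
def render_2d(series, hours_per_day=24):
    """Fold a 1-D hourly series into a list of rows, one row per day."""
    if len(series) % hours_per_day != 0:
        raise ValueError("series length must be a whole number of days")
    return [series[start:start + hours_per_day]
            for start in range(0, len(series), hours_per_day)]


def hour_section(image, day):
    """Cross section along the hour axis: all hours of one day."""
    return list(image[day])


def day_section(image, hour):
    """Cross section along the day axis: one fixed hour over all days."""
    return [row[hour] for row in image]


# Toy series of three days; the value encodes the hour index since the start.
toy_series = list(range(72))
toy_image = render_2d(toy_series)
```

With real measurements, `hour_section` would return the bell-shaped daily profile and `day_section` the slowly varying seasonal profile.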
To determine the structure of the model to be fitted to the data, transverse
sections are taken from the 2-D representation along the “hour” and the “day” axes, as given
in Fig. 1.
Fig. 1. Plots of cross sections along “hour” and “days” axes, respectively, for a two
year data. (Axes: Day, Hour, Solar radiation (W/m^2).)
Examining Fig. 1, it can be deduced that the cross section along the “hour”
axis is similar to a Gaussian function for all days. Conversely, the cross section
along the “days” axis exhibits an oscillatory behavior (seasons) that can be modeled with a sinusoidal function. The hourly variation function was chosen to be
Gaussian due to its shape-wise resemblance and simple calculation, and the daily
variation was chosen as a sinusoid due to its capability of physically explaining
the seasonal variation phenomenon. Once the model of the data is determined,
the fitting process must be applied. The result of the fitting process is an estimate of the “true” but unknown coefficients of the mathematical model. The method
of least squares is the basic method that can be used for linear estimation. In
this method, the sum of squared residuals is minimized. The residual for the ith
data point is obtained as the difference between the actual value and the fitted
value, as given in equation 1.
$$e_i = y_i - \hat{y}_i \tag{1}$$
The summed square error (SSE), therefore, is given by equation 2
$$\mathrm{SSE} = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \tag{2}$$
where n is the number of data points included in the fit. Common types of
least squares fitting include
linear least squares, weighted linear least squares, robust least squares and
nonlinear least squares. Although the linear least squares method can be used to fit
a linear (polynomial) model to data, it is not suitable for models that are nonlinear
in their coefficients, such as Gaussians and sinusoids. In general, a surface model may be a nonlinear
model, which is defined in matrix form as in equation 3
$$y = f(X, \beta) + \varepsilon, \tag{3}$$
where y is an n-by-1 vector of responses, f is a function of β and X, β is an
m-by-1 vector of coefficients, X is the n-by-m design matrix for the model, and ε
is an n-by-1 vector of errors. Nonlinear models are more difficult to
fit than linear models because the coefficients cannot be estimated using simple
matrix techniques. Instead, an iterative approach is required that
consists of the following steps:
1. Start with an initial estimate for each coefficient. For some nonlinear models,
a heuristic approach can produce reasonable starting values. For
other models, random values on the interval [0,1] can be used.
2. Produce the fitted curve for the current set of coefficients b. The fitted response
value ŷ is given by equation 4
$$\hat{y} = f(X, b) \tag{4}$$
3. Adjust the coefficients and determine whether the fit improves.
4. Iterate the process by returning to step 2 until the fit reaches the specified
convergence criteria.
The above iteration involves the calculation of the Jacobian of f (X, b), which
is defined as the matrix of partial derivatives taken with respect to the coefficients.
The direction and magnitude of the adjustment in step 3 depend on the fitting
algorithm.
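The four steps and the role of the Jacobian can be made concrete with a small sketch. It fits the Gaussian form a·exp(−((x−b)/c)²) to synthetic data using a damped Gauss-Newton (Levenberg-Marquardt style) update, chosen here only because it is short; it is not the trust-region routine used in this study. All function names, the starting values and the synthetic data are assumptions made for illustration.

```python
import math


def gaussian(x, coeffs):
    """Model g(x) = a * exp(-((x - b) / c)^2)."""
    a, b, c = coeffs
    return a * math.exp(-((x - b) / c) ** 2)


def jacobian_row(x, coeffs):
    """Partial derivatives of g with respect to a, b and c at one x."""
    a, b, c = coeffs
    z = (x - b) / c
    e = math.exp(-z * z)
    return [e, 2.0 * a * e * z / c, 2.0 * a * e * z * z / c]


def sse(xs, ys, coeffs):
    """Sum of squared residuals, as in equation 2."""
    return sum((y - gaussian(x, coeffs)) ** 2 for x, y in zip(xs, ys))


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    sol = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * sol[k] for k in range(i + 1, n))
        sol[i] = (aug[i][n] - tail) / aug[i][i]
    return sol


def fit(xs, ys, start, max_iter=200, tol=1e-12):
    coeffs = list(start)                       # step 1: initial estimate
    best = sse(xs, ys, coeffs)
    damping = 1e-3
    for _ in range(max_iter):                  # step 4: iterate
        rows = [jacobian_row(x, coeffs) for x in xs]
        resid = [y - gaussian(x, coeffs) for x, y in zip(xs, ys)]  # step 2
        jtj = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
               for i in range(3)]
        jtr = [sum(r[i] * e for r, e in zip(rows, resid)) for i in range(3)]
        for i in range(3):
            jtj[i][i] *= 1.0 + damping         # damp the normal equations
        step = solve(jtj, jtr)
        trial = [c + s for c, s in zip(coeffs, step)]  # step 3: adjust
        trial_sse = sse(xs, ys, trial)
        if trial_sse < best:                   # keep only improving steps
            gain = best - trial_sse
            coeffs, best = trial, trial_sse
            damping /= 10.0
            if gain < tol:
                break
        else:
            damping *= 10.0
            if damping > 1e12:
                break
    return coeffs, best


# Synthetic, noise-free "day" with hypothetical true coefficients.
hours = list(range(24))
observed = [gaussian(h, (500.0, 12.5, 4.0)) for h in hours]
initial_sse = sse(hours, observed, (400.0, 11.0, 3.0))
estimate, final_sse = fit(hours, observed, (400.0, 11.0, 3.0))
```

A step is accepted only when it lowers the SSE, so the fit never gets worse from one iteration to the next; the damping factor controls how far each adjustment moves.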
There are several algorithms for estimating the parameters of nonlinear models. Among these, the best known are the trust-region and Levenberg-Marquardt algorithms. The Levenberg-Marquardt [7] algorithm has been used for many years
and has proved to work most of the time for a wide range of linear and nonlinear
models with relatively good initial values. On the other hand, the trust-region algorithm is more powerful for solving difficult nonlinear problems, and
it represents an improvement over the popular Levenberg-Marquardt algorithm.
Therefore, the trust-region method is used for obtaining the Gaussian parameters of the
surface functions in this study. The “days” axis is not optimized by any method,
because its behavior is obtained analytically from geographical facts, such as its
period being 365 days and its extrema corresponding to June XX and Dec. XX.
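As a hedged illustration of fitting one day's Gaussian with a trust-region solver, the sketch below uses SciPy's `least_squares` with `method="trf"` (Trust Region Reflective). This is an assumption made for the example: the text does not state which implementation was used, and the helper names, starting values and synthetic data are hypothetical.

```python
import numpy as np
from scipy.optimize import least_squares


def gaussian(x, a, b, c):
    """Gaussian day profile a * exp(-((x - b) / c)^2)."""
    return a * np.exp(-((x - b) / c) ** 2)


def residuals(params, x, y):
    """Residual vector (fitted minus observed) that the solver squares and sums."""
    a, b, c = params
    return gaussian(x, a, b, c) - y


def fit_day(hours, radiation, start=(400.0, 12.0, 3.0)):
    """Estimate (a, b, c) for one day with a trust-region least-squares solver."""
    result = least_squares(residuals, start, args=(hours, radiation),
                           method="trf")
    return result.x


# Synthetic, noise-free day with hypothetical true parameters.
hours = np.arange(24, dtype=float)
synthetic = gaussian(hours, 500.0, 12.5, 4.0)
a_hat, b_hat, c_hat = fit_day(hours, synthetic)
```

Calling `fit_day` once per row of the day-by-hour data would yield one parameter set per day.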
3 NN Model for 2-D Data
To test and compare the accuracy of the 2-D model, an NN structure is also
built. In this structure, the model does not yield a global, unified and analytical
surface function. Instead, the result is a surface function that is more specifically
trained to the available data. Although it has no clear analytical closed form,
the NN provides a dedicated and better surface model with a smaller RMSE. Since
the proposed surface model has two inputs (hour and day numbers) and one
output (solar radiation), the NN structure is constructed with two inputs and one
output. The input-output pairs are normalized to fall in the range [-1,1].
Simulations show that using 5 neurons in the hidden layer is appropriate.
Due to its fast convergence, the Levenberg-Marquardt
learning algorithm is used in the learning process of the NN. The network is trained
using 1 year of solar radiation data, and a surface model of the data is obtained
in this way. The outputs of both the hidden and output layers are calculated
from their net inputs using the Tan-Sigmoid transfer function. The network is trained for 50
epochs. The results are obtained and compared with the global and analytical
surface model in Section 4.
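The network structure described in this section (two inputs, 5 hidden neurons, one output, tanh-type transfer functions, inputs scaled to [-1,1]) can be sketched as a forward pass. The weights below are random placeholders rather than trained values, the input ranges (days 1 to 365, hours 0 to 23) are assumptions, and the Levenberg-Marquardt training step is not shown.

```python
import math
import random


def scale(value, low, high):
    """Map a value from [low, high] linearly onto [-1, 1]."""
    return 2.0 * (value - low) / (high - low) - 1.0


def forward(day, hour, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a 2-input, one-hidden-layer, 1-output tanh network."""
    x = [scale(day, 1, 365), scale(hour, 0, 23)]
    hidden = [math.tanh(w[0] * x[0] + w[1] * x[1] + b)
              for w, b in zip(w_hidden, b_hidden)]
    # The output is also passed through tanh, so it lies in [-1, 1] and
    # would have to be mapped back to W/m^2 with the inverse of the scaling.
    return math.tanh(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


# Placeholder (untrained) weights for 5 hidden neurons.
rng = random.Random(0)
w_hidden = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(5)]
b_hidden = [rng.uniform(-1, 1) for _ in range(5)]
w_out = [rng.uniform(-1, 1) for _ in range(5)]
b_out = rng.uniform(-1, 1)

normalized_prediction = forward(50, 5, w_hidden, b_hidden, w_out, b_out)
```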
4 Numerical Results
The hourly solar radiation data along one day is considered as a Gaussian function as in equation 5
$$g(x) = a\,e^{-(x-b)^2/c^2} \tag{5}$$
where a is the height of the Gaussian peak, b is the position of the center of
the peak and c is related to the full width at half maximum of the peak. Hourly
radiation data are fitted to the Gaussian function for “all” days by determining
the Gaussian parameters a, b and c using the trust-region algorithm. In total, 365
parameter sets (a, b and c) are obtained for one year of recorded data. Then, to
form the generic and global surface model of the data, the variation of the parameter
sets a, b and c along the days is explored. Since the daily behavior of the data is
expected to have a sinusoidal form, as explained in Section 2, the parameters a
and c are modeled with sinusoidal functions with periods equal to 365 days. For
each Gaussian function, the position of the center of the peak should be
around 12.5, which corresponds to the center of the daytime for the whole
year. As a result, the parameter b is taken to be 12.5. The other
coefficients, a and c, are determined as the sinusoids in equations 6 and 7
$$a(\text{day}) = 364 \times \sin(2\pi \times \text{day}/720) + 162.1 \tag{6}$$
$$c(\text{day}) = 2.117 \times \sin(2\pi \times \text{day}/712) + 2.644 \tag{7}$$
Finally the analytical surface that models the data is obtained as given in
equation 8.
$$\text{Surface}(\text{day}, \text{hour}) = a(\text{day}) \times e^{-\left((\text{hour} - 12.5)/c(\text{day})\right)^2} \tag{8}$$
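Equations 6 to 8 can be evaluated directly. The following sketch simply transcribes the printed coefficients; the function names `peak_height`, `width` and `surface` are hypothetical, and the day is assumed to be counted from January 1, as stated in the Conclusion.

```python
import math


def peak_height(day):
    """Equation 6: height a of the daily Gaussian as a function of the day."""
    return 364.0 * math.sin(2.0 * math.pi * day / 720.0) + 162.1


def width(day):
    """Equation 7: width parameter c of the daily Gaussian."""
    return 2.117 * math.sin(2.0 * math.pi * day / 712.0) + 2.644


def surface(day, hour, center=12.5):
    """Equation 8: modeled solar radiation (W/m^2) at a given day and hour."""
    return peak_height(day) * math.exp(-((hour - center) / width(day)) ** 2)


# Example from the text: 50th day of the year, 5 o'clock.
example_value = surface(50, 5)
noon_midyear = surface(180, 12.5)
```

At hour 12.5 the exponential equals one, so the surface value coincides with the peak height a(day) for that day.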
As a visual comparison, the obtained surface model and the 2-D plot of the actual
data are given in Fig. 2. The error data, calculated by subtracting the actual data from
the analytical surface model for each hour, are given in Fig. 3.
Fig. 2. 2-D plot of actual data and obtained analytical surface model (Axes: Day, Hour, Solar radiation (W/m^2).)
The accuracy of the analytical surface model is tested and compared with
the surface function generated by the NN. A two-input, one-output feed-forward neural
network is built, as shown in Fig. 4.
To numerically compare the NN surface with the analytical surface model,
the input-output pairs of the network are chosen to be compatible with the surface model,
as hour versus day versus solar radiation. For instance, to
estimate the solar radiation on the 50th day of the year at 5 o'clock,
the input of the network is taken as (50,5), which also corresponds to
the coordinates of the surface model. Various numbers of neurons are tried in the
hidden layer to determine the optimal number, and it is observed experimentally that
using 5 neurons gives more accurate prediction
values. The network is trained for 50 epochs. The plot of epoch number versus total
RMS error is given in Fig. 5.
Fig. 3. Error surface of the model (Axes: Day, Hour, Solar radiation (W/m^2).)
Fig. 4. The adopted NN structure (Inputs: Number of Day, Number of Hour. Layers: Input Layer, Hidden Layer, Output Layer. Output: Predicted solar radiation at desired day, desired hour.)
Fig. 5. Plot of performance versus epoch number (Panel title: Performance is 0.0307894, Goal is 0. Axes: Epoch number, Performance on a logarithmic scale.)
It is clear from Fig. 5 that a great deal of learning is already achieved within
10 epochs. The surface obtained by the NN and the plot of the actual 2-D data are given in
Fig. 6.
Fig. 6. 2-D plot of the solar radiation data, and the surface function obtained by NN. (Axes: Day, Hour, Solar radiation (W/m^2).)
The correlation coefficient (R) and RMSE values between the actual and predicted values of solar radiation data, obtained from both the analytical surface model
and the NN surface model, are calculated and presented in Table 1.
Table 1. RMSE values for the proposed structures and correlation coefficients (R) between actual values and predicted values of solar radiation data
Model                      RMSE     R
Developed Surface Model    57.24    0.936
NN Surface Model           51.91    0.947
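The two figures of merit in Table 1 can be computed as in the sketch below. It assumes that R denotes the Pearson correlation coefficient between actual and predicted values, which the text does not state explicitly; the function names and the toy data are hypothetical and do not reproduce the values in Table 1.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def correlation(actual, predicted):
    """Pearson correlation coefficient between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Toy data for illustration only.
toy_actual = [0.0, 100.0, 300.0, 500.0, 300.0, 100.0, 0.0]
toy_predicted = [10.0, 90.0, 320.0, 480.0, 310.0, 120.0, 5.0]
toy_rmse = rmse(toy_actual, toy_predicted)
toy_r = correlation(toy_actual, toy_predicted)
```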
5 Conclusion
In this work, using the 2-D interpretation approach, surface models for solar
radiation data are developed. The developed models have two inputs:
the number of days counted from January 1 of the year and the hour
within the day. For these models, the hourly data variation within a day is fitted
to Gaussian functions. The parameters of the Gaussian functions are obtained for
each day. In the analytical approach to surface modeling, the behavior of the solar
radiation data along the days corresponding to the same hour is observed to have
a sinusoidal oscillation. Therefore, the parameters related to the height and
width of the Gaussian are fitted to separate sinusoidal functions, and finally the
analytical model of the surface is obtained. Alternatively, an NN structure is built
with the same input-output data pairs in the 2-D form, and a nonlinear and nonanalytical surface model of the whole data is obtained. The two models are compared
using the RMSE relative to the original data. Due to its specificity, the
NN model provides a more accurate surface model with a smaller RMSE. On the
other hand, the NN surface model is not analytical, and it cannot be generalized
to other places. Conversely, the analytical surface model is very intuitive with
simple seasonal parameters, and it provides a global view of the solar radiation
phenomenon. Therefore, it can be easily adapted to other places in the world
without a long data collection period.
References
1. Muneer T., Younes S., Munawwar S., Discourses on solar radiation modeling, Renewable and Sustainable Energy Reviews, Vol. 11, (2007) 551-602.
2. Menges H.O., Ertekin C., Sonmete M.H., Evaluation of global solar radiation models for Konya, Turkey, Energy Conversion and Management, Vol. 47, (2006) 3149-3173.
3. Trabea A.A., Shaltout M.A., Correlation of global solar-radiation with meteorological parameters over Egypt, Renew Energy, Vol. 21, (2000) 297-308.
4. Badescu V., Correlations to estimate monthly mean daily solar global-irradiation: application to Romania, Energy, Vol. 24, (1999) 883-93.
5. Hepbasli A., Ulgen K., Prediction of solar-radiation parameters through the clearness index for Izmir, Turkey, Energy Source, Vol. 24, (2002) 773-85.
6. Hocaoglu F.O., Gerek Ö.N., Kurban M., A Novel 2-D Model Approach for the Prediction of Hourly Solar Radiation, LNCS Springer, Vol. 4507, (2007) 741-749.
7. Marquardt D., An Algorithm for Least Squares Estimation of Nonlinear Parameters, SIAM J. Appl. Math, Vol. 11, (1963) 431-441.