Spatial Electric Load Distribution Forecasting
using Simulated Annealing
HÉCTOR GUSTAVO ARANGO AND GERMANO LAMBERT TORRES
Electrical Engineering Institute
Federal University of Itajuba
Av. BPS, 1303 – Pinheirinho – Itajubá/MG – 37500-903
BRAZIL
Abstract: - In this work, a simulation model of electric market expansion that takes time and space into account is proposed. The model is based on spatio-temporal disorder ideas. The spatial load evolution is simulated using a dynamic system whose main components are the interaction among consumers and the migration of consumption from dense regions. The model produces various possible configurations of the distribution market. To this end, it works with a set of parameters that model the type of load evolution by means of a spatial disorder algorithm. To adjust the simulation results to a real data set of load evolution, a set of simulations is run for each set of test parameters. The best set of parameters is obtained using a simulated annealing algorithm, whose evaluation function is calculated from the distance between the simulated and real data sets. For this, the result of each simulation set is converted into a number using a composition of the first- and second-order statistical moments. The search for the optimal parameters is carried out in a non-linear hyperspace. The results of several tests show a parameter set able to generate a spatial disorder similar to that of the real market, according to a pre-defined statistical criterion.
Key-Words: - Spatio-temporal Disorder; Dynamic Systems; Spatial Forecasting; Simulated Annealing
1 Introduction
Planning electric distribution systems requires knowing the magnitude of the load that will have to be delivered, in order to plan the investment expansion and the efficient operation of the equipment. Beyond this knowledge, however, estimating the spatial distribution of the electric load increases is fundamental [1]. In a previous work [2], a contribution to spatio-temporal disorder was presented by proposing a methodology to model the load evolution process. This methodology was presented through an algorithm and a computational program that includes routines for pattern localization. These routines are based on the concept of composing strong and light attractions, as commonly used in studies of regional economics and urban growth. This approach allows modeling what kind of disorder (or order) is generated. Some authors, such as [3], use the term order to refer to all kinds of configurations; in this sense, disorder is one kind of order.
In this work, the result of applying a test set to the load growth is shown for a hypothetical case, treated in the experiment as a potentially real set of spatio-temporal evolution. Data are generated by the spatio-temporal disorder program described in [2]. The evaluation function for the search hyperspace, defined by the parameters of the spatial disorder program, uses a composition of the mean localization error and the variability of the load forecasting errors, and is optimized by means of the process called simulated annealing. This choice was made because the functions that generate the spatial disorder are unknown. Some parameters of the disorder program were fixed, such as the number of starting (load) points, which corresponds to the initial distribution system configuration, the rate of load growth, and the scale. The main parameters, however, were left free so they could be fitted to the real load data set. Finally, several tests were carried out in order to assess the robustness of the search process.
2 Simulated Annealing Applied to
Fitting Parameters
Simulated annealing is a global optimization algorithm, originally developed for combinatorial optimization [7], able to distinguish between different local extremes.
In this case, the problem is to find the minimum of a function that represents the difference between the real and the estimated data sets.
Traditionally, the simulated annealing algorithm
starts from an initial point, and in each step the
function is evaluated. When the goal is minimizing a
function, any downhill step is accepted and the
process repeats from this new point. However, in
order to escape from local optima, some uphill step
may be accepted. This uphill decision is made by a probabilistic rule, such as the Metropolis criterion [4]. This criterion estimates the likelihood of accepting a step of the optimization, using a density function of the function differential (gain) and an algorithm parameter called temperature, T. The likelihood function has the form

$p_f = e^{-\Delta f / T}$,

where $T$ denotes the temperature and $\Delta f = f(x_{i+1}) - f(x_i)$ is the process gain. While the optimization process
proceeds, the temperature and the length of the steps
decline until the algorithm closes on the global
optimum. The algorithm is quite robust with respect to non-quadratic surfaces, since it makes very few assumptions about the function being optimized. The degree of robustness can be adjusted by changing the starting parameters of the algorithm.
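As an illustration, the following is a minimal Python sketch of the Metropolis acceptance rule described above (the function name is ours, not taken from the cited routines):

    import math
    import random

    def metropolis_accept(delta_f, T):
        # Downhill steps (delta_f <= 0) are always accepted; uphill steps are
        # accepted with probability p_f = exp(-delta_f / T), as defined above.
        if delta_f <= 0:
            return True
        return random.random() < math.exp(-delta_f / T)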
The implementation of the simulated annealing algorithm, modified in order to test the features of this problem, was based on the routines developed and implemented by Goffe [5] and Corana [6]. It starts from a high temperature that decreases at a constant rate, sometimes called the Boltzmann constant. The default values are usually high; otherwise, it is necessary to check which values should be considered high in each problem.
The process stops either when the stop criterion is satisfied, which can be a minimum gain, or when the process cools down after successive temperature reductions. Note that, in the latter case, the idea is that the likelihood of escaping a minimum, as the density function has been defined, becomes practically zero, so the algorithm must converge to a minimum, whether global (if the choice of the parameters is right) or local.
With regard to the robustness of the process, it varies with the choice of the annealing parameters. Theoretically, among the set of parameters to be chosen, the variation rate of the temperature and the number of iterations before reducing the temperature are related to the "quickness" of the process, which is directly related to the dimension of the search space. Thus, if the process is quick (both parameters set to low values), the chance of detecting a global singularity is smaller.
We said that the algorithm robustness is connected to the initial parameters of the annealing. Let R be the set of these parameters. The input data for the conventional simulated annealing algorithm are the following (a sketch of the complete loop appears after this list):
Starting value: vector with the coordinates in Rk, where k is the number of parameters (variables) to be optimized. This number depends on the kind of problem under analysis. Normally the search is done on an n-dimensional surface. The number of parameters to be estimated in our problem is four.
The search process starts in a certain region of Rk and is carried out around this region. It is not guaranteed that all of Rk will be covered by the search; in other words, the annealing process can stop before the whole space has been explored. Nevertheless, tests with known functions show that, in general, it is possible to find an R that leads to the global optimum at the end of the process.
Initial temperature: this parameter influences the
step length over which the algorithm searches
for optima. In our case, the initial temperature
was T = 5000.
This parameter determines the extent of the search space, which is one reason why it is suggested to start with a high temperature. Although the search also depends on the other parameters, a high temperature gives the process enough energy to traverse a large portion of Rk. Besides, T influences the step length, that is, the size of the changes in the parameters being optimized on the way to an extreme. Thus, low temperatures determine small steps, or small changes to be evaluated in the parameters.
Boltzmann's number (or temperature reduction rate, RT): this is the value by which T is multiplied during the process. The value can be defined in the interval 0 < RT < 1; however, a value between 0.90 and 0.99 is generally used.
This modeling parameter is really important for understanding the philosophy of the algorithm. Note that, when the process begins, the high temperatures give the algorithm a great capacity to identify singularities over a large area of the search space. That is, a search in amplitude over Rk is made initially, and a high likelihood of escaping local extremes is guaranteed.
As T is successively reduced by the action of the reduction rate, the search narrows in Rk and the process starts to search preferentially in depth.
Below a certain value of T, the process can no longer gather enough energy to escape the area surrounding a given extreme, tending to continue only in the direction of this singularity.
Also remember that high temperatures produce significant changes in the parameters, favoring the search in amplitude, while low temperatures, at the end of the process, carefully evaluate the space in order to refine the extreme being evaluated.
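As a worked check of this schedule, using the Test 01 settings from Section 4 ($T_0 = 5000$, $RT = 0.85$), the temperature after k reductions is

$T_k = T_0 \cdot RT^k, \qquad k = \frac{\ln(T_k / T_0)}{\ln(RT)} = \frac{\ln(3.3329 / 5000)}{\ln(0.85)} \approx 45,$

so about 45 temperature reductions are needed to reach the final temperature TempF = 3.3329 reported in Table 1.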
Step length: the size of the change in each variable, that is, the variation of the parameter vector towards the extreme.
Number of cycles: the number of random moves taken along each coordinate of the search space. Remember that the acceptance or the rejection of each new point is decided according to the Metropolis criterion. The best point found in each cycle is kept.
Number of iterations: the number of times the set of cycles is repeated, equivalent to the number of times the solution given at the end of a cycle is evaluated. Calling N the number of variables, NS the number of cycles, and NT the number of iterations, the total number of function evaluations before a temperature reduction is N·NS·NT.
Maximum error accepted: stop criterion based
on the maximum difference between the
function value before and after a temperature
reduction.
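A minimal Python sketch of how these inputs drive the loop is given below. It is an illustration under our own naming, loosely following the Goffe/Corana scheme [5, 6]; the adaptive step-length control of [6] is simplified here to a geometric decay, mirroring the statement above that the temperature and the step length decline together.

    import math
    import random

    def simulated_annealing(f, x0, lb, ub, T, rt, ns, nt, eps, max_evals):
        # f: evaluation function to minimize over R^k within the bounds [lb, ub];
        # T: starting temperature; rt: temperature reduction rate;
        # ns: number of cycles; nt: iterations before each temperature reduction;
        # eps: maximum accepted error (stop criterion); max_evals: budget (MAXEVL).
        x, fx = list(x0), f(x0)
        best_x, best_f = list(x), fx
        step = [u - l for l, u in zip(lb, ub)]        # initial step lengths
        evals = 1
        while evals < max_evals:
            f_before = best_f
            for _ in range(nt):                        # iterations
                for _ in range(ns):                    # cycles
                    for i in range(len(x)):            # one trial move per variable
                        cand = list(x)
                        cand[i] += random.uniform(-1.0, 1.0) * step[i]
                        if not (lb[i] <= cand[i] <= ub[i]):
                            cand[i] = random.uniform(lb[i], ub[i])  # resample inside bounds
                        fc = f(cand)
                        evals += 1
                        # Metropolis criterion: downhill always, uphill with prob e^(-df/T)
                        if fc < fx or random.random() < math.exp(-(fc - fx) / T):
                            x, fx = cand, fc
                            if fx < best_f:
                                best_x, best_f = list(x), fx
            T *= rt                                    # temperature reduction
            step = [s * rt for s in step]              # narrow the search with T
            if abs(f_before - best_f) < eps:           # minimum-gain stop criterion
                break
        return best_x, best_f

Each temperature stage performs N·NS·NT evaluations of f, matching the expression given in the list above.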
3 Error Evaluation Function (EEF) of
the Simulated Annealing (SA)
Of all possible (infinitely many) sets of parameters, consider a generic set p_h. After using this set, the spatial disorder program yields Â_h(1), a vector (set output) that represents a spatial disorder generated by the parameter set p_h. Its dimension for the one-dimensional case is I x 1, where I is the number of points of the spatial disorder, calculated by multiplying the number of iterations of the SD program by the number of new points of each iteration. Running the SD program N times results in
Â_h(1), Â_h(2), ..., Â_h(N).
These vectors can be joined in an I x N matrix. Note that the matrix columns are the result of the state space generated by the N tests performed. The error in each of the N tests, when compared to the column vector A (which represents the real spatial time series), can be evaluated by computing the square of the difference between each value of A and the corresponding value in the columns of the matrix of the N vectors Â_h: the subtraction yields a matrix e, and each element of e is then squared. Thus the matrix e² (I x N) is obtained, containing the quadratic errors of each test at each point of the one-dimensional space.
This matrix is then converted into a scalar that represents the error set of the N tests with respect to the real state space. This scalar is obtained as follows:
1. Sum all the elements of each column of the e² matrix. The result is a row vector with N elements:

$E_j = \sum_{i=1}^{I} e_{ij}^2$
2. Calculate the first moment (m1) of the vector described in 1:

$m_1 = \frac{1}{N} \sum_{j=1}^{N} E_j$

The first moment can be called the mean quadratic error of the parameter set p_h with respect to the observation set A.
The second moment (m2) is

$m_2 = \frac{1}{N-1} \sum_{j=1}^{N} \left( E_j - m_1 \right)^2$
The evaluation function is composed of such moments, as described above. An evaluation up to the second moment includes sensitivity to the variability of the results obtained with the same set of parameters; up to the third order, error concentrations to the left or to the right of the mean error (m1) of the density distribution can be identified. The evaluation function to be used has the form

$F_{ER} = \alpha \, m_1 + \beta \, m_2$,

with equal weights at first.
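Assuming equal weights (α = β = 1), a minimal NumPy sketch of this computation is given below (all names are illustrative):

    import numpy as np

    def eef(A_hat, A, alpha=1.0, beta=1.0):
        # A_hat: I x N matrix whose columns are the N simulated series Â_h(j);
        # A: real series of length I.
        e2 = (A_hat - A.reshape(-1, 1)) ** 2   # matrix e squared elementwise, I x N
        E = e2.sum(axis=0)                     # column sums E_j, one per test
        m1 = E.mean()                          # first moment: mean quadratic error
        m2 = E.var(ddof=1)                     # second moment, N - 1 in the denominator
        return alpha * m1 + beta * m2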
In order to get an idea of the error function behavior, 30 simulations were made beginning with the parameter set
P0 = {10, 2, 0.05, 0.05}
The results had a mean of 245.85 and a standard deviation of 68.9731.
4 Test Results
A set of tests was made to verify the sensitivity of the final results to the simulated annealing parameters, using a simple initial load configuration. The spatio-temporal series used as "real" data is, in fact, a synthetic series giving the spatial position of the load points after t = 20. The spatial disorder parameters to be adjusted are the maximum attraction value, M, the attraction influence, d, the random component, p, and the migration rate, m [1].
In order to understand the influence of each parameter, the process was repeated changing the temperature chronogram and the number of evaluations per cycle, using several values of the temperature, the temperature reduction rate, the number of cycles, and the number of temperature reductions. Both sets of parameters are connected with the search process.
4.1 Results Changing the Temperature
Chronogram, Variable RT
a. Input data
Simulated annealing parameters
N=4
MAXEVL = 20000
RT = 0.85
NS = 5
NT = 2
T = 5000
Vectors:
LB = (0, 0, 0, 0)
UB = (20, 20, 1, 1)
P0 = (5.000, 1.000, 0.030, 0.020)
where: N = dimension or number of variables of the optimization process; MAXEVL = maximum number of evaluations of the EEF function; RT = temperature reduction rate after N·NS·NT evaluations of the EEF function; NS = number of cycles; NT = number of iterations before each temperature reduction; T = starting temperature; LB = lower bound; UB = upper bound; P0 = starting point (starting values of the parameters of the spatial disorder program).
Parameters of the spatial disorder simulator
NCSP = 30
Xini = 50
Domain: 0 <= x <= 100
Scale: 1/100
Number of points per iteration = 1
Number of starting points = 1 (at x = 50)
Iteration number = 20
where: NCSP = number of simulations for each set of parameters; Xini = coordinate of the seed where the spatial disorder process starts.
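For illustration only, these inputs could be fed to the annealing sketch of Section 2 roughly as follows. The true objective runs the spatial disorder simulator NCSP = 30 times per candidate, so a purely hypothetical quadratic stand-in is used here:

    def toy_eef(p):
        # Hypothetical stand-in for the real EEF (which would run 30 disorder
        # simulations per candidate and compare them with the "real" series).
        target = (10.0, 2.0, 0.05, 0.05)    # arbitrary reference point
        return sum((v - t) ** 2 for v, t in zip(p, target))

    lb = [0.0, 0.0, 0.0, 0.0]
    ub = [20.0, 20.0, 1.0, 1.0]
    p0 = [5.0, 1.0, 0.030, 0.020]
    p_star, f_opt = simulated_annealing(toy_eef, p0, lb, ub,
                                        T=5000.0, rt=0.85, ns=5, nt=2,
                                        eps=1e-4, max_evals=20000)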
b. Results after SA
Number of function evaluations = 1841
Accepted = 1129
Out of bounds = 3989
Final temperature = 3.332
Stop criterion: IER = 0
Final value of the evaluation function: FOPT = 27.325
Optimal parameter vector: P* = (30.172, 24.478, 0.2800, 0.0730)

c. Evaluation function evolution after successive temperature reductions
The following graph shows the behavior of the evaluation function of the simulated annealing program at the end of each temperature-reduction period. In this test, N·NS·NT = 40 evaluations of the function are made before each temperature reduction.
Figure 1 - Results of the evaluation function after successive temperature reductions (Test 01: EVL; x-axis: number of temperature reductions)

d. Evolution of the Spatial Disorder Parameters During the Annealing Process
Parameter M
Figure 2 - Results for the parameter M after successive temperature reductions (Test 01)
Parameter d
Figure 3 - Results for the parameter d after successive temperature reductions (Test 01)
Parameter p
Figure 4 - Results for the parameter p after successive temperature reductions (Test 01)
Parameter m
Figure 5 - Results for the parameter m after successive temperature reductions (Test 01)
e. Synthesis of the results changing the Temperature Chronogram, variable RT
The results of changing the RT variable, ceteris paribus, are shown in Table 1.

Table 1 - Results of changing the RT variable
RT    NEVL   NEA   NEOB   TempF   CRP    FOPT0    FOPT*
0.85  1841   1129  3989   3.3329  IER=0  116.296  27.325
0.93  3561   2483  7767   8.4231  IER=0  240.094  25.013
0.97  8281   5741  18004  9.4178  IER=0  79.6523  24.727
0.99  20000  *     *      *       MXEV   *        24.197

Obs.: (*) The test was discarded because the stop criterion did not allow the process to end. Increasing MAXEVL to 30000 gives: RT = 0.99, NEVL = 21841, NEA = 17318, NEOB = 46050, TempF = 20.9003, CRP = MXEV, FOPT0 = 71.997, and FOPT* = 20.864,
where: NEVL = number of evaluations; NEA = number of accepted evaluations; NEOB = number of evaluations out of bounds; TempF = final temperature; CRP = stop criterion; FOPT0 = starting value of the evaluation function; and FOPT* = final value of the evaluation function.

4.2 Results Changing the Temperature Chronogram, Variable T
In order to extend the search space, the temperature was increased ten times. The results found, however, do not show a gain in the process. Moreover, the process "lost time" oscillating strongly among several regions of the space, and this time is afterwards needed to refine the final solution. A synthesis of the results obtained with T = 50000, RT = 0.85 and MAXEVL = 20000 is shown in Table 2:

Table 2 - Results Changing the Temperature Chronogram
RT    NEVL  NEA   NEOB  TempF   CRP    FOPT0   FOPT*
0.85  2321  1685  5108  4.7408  IER=0  266.78  29.753

The result is poorer than the one found with T = 5000.

4.3 Results Changing the Number of Evaluations per Cycle, Variables NS and NT
The goal of increasing the number of function evaluations before each temperature reduction is to obtain a detailed recognition of the regions of the hyperspace around the boundaries of the test area. Of course, the effort and the computational time increase. In this case, the number of evaluations per temperature stage is ten times larger than before, employing NS = 20 and NT = 5, as checked below.
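With N = 4 variables, the evaluation count of Section 2 gives, per temperature stage,

$N \cdot NS \cdot NT = 4 \cdot 20 \cdot 5 = 400$

function evaluations, against $4 \cdot 5 \cdot 2 = 40$ in Test 01.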
In order to make up for the added computational effort, MAXEVL was increased to 100000, while the temperature and the Boltzmann number employed were the best ones from the preceding tests, T = 5000 and RT = 0.99.
The results of changing NS and NT, with RT = 0.99, are shown in Table 3.
Table 3 - Results changing the number of evaluations per cycle (NS = 5, varying NT)
NT  NEVL    NEA    NEOB    TempF    CRP    FOPT0    FOPT*
20  100000  97828  224515  409.387  IER=1  233.552  14.097
10  100000  85322  220921  33.184   IER=1  159.265  14.097
In fact, the process did not end through the stop criterion in either test. However, it can be observed that, in the tests shown, the additional search effort was not compensated by a smaller EEF value.
5 Conclusions
Before the final remarks, it is important to keep in mind that the conclusions presented here concern the application of the simulated annealing process to the problem of fitting the spatial disorder simulation parameters of a dynamic process. Therefore, they should not be taken as general conclusions applicable to any problem.
In order to check the quality of the results found with the optimization process, a test with the EEF was made using 30 simulations for each set of parameters, with the same weight for both moments. The starting point was placed in a region showing lower EEF values with respect to the set of values used to compare each simulation. The set of "real" values used in the comparison was the same one used in the parameter fitting test. The test corresponds to a sample of 30 values of the EEF and makes it possible to verify the following:
a) the density function of the EEF turned out to be close to Normal;
b) with the parameters employed, the mean was 245.85 and the standard deviation 68.971.
Therefore, it would be expected that, starting from this vector, the process would lead the solution to a region of the hyperspace with a lower EEF (how much lower remains an open question).
In general, it was observed that it is possible to approach the best solution using the simulated annealing process. However, the choice of the simulated annealing parameters is very important for the final result to correspond to a large gain in the fit. It is important to remember that, before the results shown in this paper were obtained, several tests were made in order to achieve robust results.
It has also been noted that the successive "gains" achieved by the process are obtained by wandering among very different regions of the space; that is, a small decrease in the EEF value often requires migrating to a place far from the last optimum.
Another very important observation is that, although the fitting results depend very much on the parameters employed in the simulated annealing, the region of the space where the search begins must be carefully chosen. This is basically because the topography of the search hyperspace is very rough, presenting big variations for relatively small changes in the parameters that generate the disorder. Hence, in practice it is difficult to start from an arbitrary vector and still converge to regions with acceptable EEF minima, even though the process has a theoretical proof of convergence.
In short, we can say that:
a) The temperature chronogram must work with a high temperature and a slow annealing, in other words, high T and high RT.
b) The starting vector (with the parameters) must lie in an acceptable region. In this case, understanding the meaning of the parameters of the dynamic process is fundamental. The random component and the migration rate must have adequate values; for example, p = 1 is in the EEF domain, but it turns the disorder process completely random, which is certainly not a good fit for the load evolution.
c) The number of function evaluations in each cycle must be high. However, when this number exceeds about 50 per variable, the added computational effort is apparently not compensated by significantly better results. It seems convenient to use about 200 evaluations per temperature reduction in this case, with NT = 5 and NS = 10.
d) It is possible to fit a set of parameters to a hypothetical (and plausibly real) situation and reproduce the load growth across space, using a dynamic process based on development pole theory, as happens in the real world.
References
[1] H.L. Willis - “Spatial Electric Load Forecasting”,
Marcel Dekker Inc, New York, 1996.
[2] H.G. Arango, G. Lambert Torres, A.P. Alves da
Silva - “Load Evolution for Power System
Distribution Planning Studies: An Approach
using Spatial Disorder”, IEEE International
Conference On Systems Man and Cybernetics,
SMC’96, Beijing, China, Vol. 3, pp. 1910-1915,
October 14-17, 1996.
[3] R.C. Hilborn - “Chaos and Nonlinear Dynamics”,
Oxford University Press, New York, 1994.
[4] N. Metropolis, A. Rosenbluth, M. Rosenbluth, A. Teller & E. Teller - "Equation of State Calculations by Fast Computing Machines", Journal of Chem. Phys., Vol. 21, pp. 1087-1090, 1953.
[5] W.L. Goffe, G.D. Ferrier & J. Rogers - “Global
Optimization of Statistical Functions with
Simulated Annealing”, Journal of Econometrics,
Vol. 60, pp. 65-99. North-Holland, 1994.
[6] A. Corana, M. Marchesi, C. Martini & S. Ridella - "Minimizing Multimodal Functions of Continuous Variables with the Simulated Annealing Algorithm", ACM Transactions on Mathematical Software, Vol. 13, No. 3, pp. 262-280, September 1987.
[7] S. Kirkpatrick, C.D. Gelatt & M.P. Vecchi - "Optimization by Simulated Annealing", Science, Vol. 220, No. 4598, pp. 671-680, May 1983.