
Journal of Systems Science and Complexity, Vol. 18, No. 1, Jan. 2005
A GREEDY GENETIC ALGORITHM FOR
UNCONSTRAINED GLOBAL OPTIMIZATION∗
ZHAO Xinchao
(Key Laboratory of Mathematics Mechanization, Institute of Systems Science, AMSS, Chinese
Academy of Sciences, Beijing 100080, China. Email: [email protected])
∗Received June 13, 2003. This research is supported by the 2004 Graduate Students Science and Social
Practice Stake (Innovation Research) of the Graduate School of the Chinese Academy of Sciences.
Abstract. The greedy algorithm is a strong local search algorithm. The genetic
algorithm is generally applied to global optimization problems. In this paper, we
combine the greedy idea with the genetic algorithm to propose the greedy genetic algorithm,
which incorporates the global exploration ability of the genetic algorithm and the local
convergence ability of the greedy algorithm. Experimental results show that the greedy genetic
algorithm gives much better results than the classical genetic algorithm.
Key words. Genetic algorithm, greedy algorithm, greedy genetic algorithm, global optimization.
1 Introduction
Genetic algorithms (GAs) have been successfully applied to optimization problems such as routing, adaptive control, game playing, cognitive modelling, transportation, travelling salesman
problems, optimal control problems, etc.[1,2] Many strategies have been proposed to improve the
performance of the genetic algorithm. The modified genetic algorithm[3], the contractive mapping genetic algorithm[4], and the genetic algorithms with varying population size[5,6] all improved
the performance of the genetic algorithm to some extent. The elitist strategy[4], the (µ, λ) and
(µ + λ) selection[7,8], and the Boltzmann tournament selection[9,10] are all relevant strategies[2].
Ahuja et al. applied a greedy genetic algorithm to the Quadratic Assignment Problem[11].
The ideas they incorporated in their greedy genetic algorithm include: (i) generating the initial
population using a randomized construction heuristic; (ii) new crossover schemes, including
path crossover[12] and optimized crossover[13]; (iii) a special-purpose immigration scheme that
promotes diversity; (iv) periodic local optimization of a subset of the population; (v) tournamenting among different populations; (vi) an overall design that attempts to strike a balance
between diversity and bias towards fitter individuals. In the crossover, the greedy genetic algorithm of [11] produces one “offspring” from two parents. Their replacement rule is: if the
child is fitter than both parents, it replaces the parent to which it is more similar;
otherwise, the child replaces the worse parent. It is also reported in [11] that keeping the best
individuals among the two parents and the one child leads to premature convergence.
In this paper, we bring another greedy idea into the genetic algorithm. The Greedy Genetic
Algorithm (gGA) always chooses the best chromosomes during the crossover and mutation processes. In the crossover process, two parents are chosen to produce two offspring by the
traditional two-point crossover. However, the gGA does not directly accept the two offspring as
the classical genetic algorithm (cGA) does, nor does it choose two survivors from three (two
parents and one child) by the replacement rule of [11]. Instead, the two parents and the two
offspring compete with each other, and gGA chooses the two best of the four chromosomes as
“offspring”. Similarly, in the mutation process the chromosome chosen for mutation and the
altered chromosome compete with each other, and gGA chooses the better one as the “offspring”.
Experimental results show that with a small population this strategy sometimes does cause
faster premature convergence than cGA; in most cases, however, it gives more satisfactory
results than cGA even when the population is small. The greedy genetic algorithm (gGA)
effectively combines the global exploration ability of the genetic algorithm with the local
convergence ability of the greedy algorithm[14]. Theoretically speaking, gGA should therefore
perform better than both cGA and the pure greedy algorithm.
We implement the gGA and compare it with the cGA on several examples. We introduce
two new concepts, “the probability of excellence” and “the probability of unluckiness”, to
compare the convergence properties of different optimization algorithms. Experimental results
show that the gGA converges to the global maxima much faster than the cGA without much
slowdown in running speed.
2 The Greedy Genetic Algorithm
2.1 The Classical Genetic Algorithm
To find the maximal value of a function on a given domain, the basic idea of the classical
genetic algorithm (cGA) is to represent the problem domain by binary strings and to search for
the maximal value of the function over a fixed number of sample points, i.e., a population of
chromosomes (vectors). During iteration t, the genetic algorithm maintains a population of
potential solutions P(t) = {v_1^t, v_2^t, ..., v_n^t}. Each solution v_i^t is evaluated to give some
measure of its “fitness”. Then a new population is formed by selecting the fitter individuals.
Some members of this new population undergo alterations by means of crossover and mutation,
to form new solutions[15] . The crossover combines the features of two parent chromosomes to
form two similar offspring by swapping corresponding segments of the parents. The mutation
arbitrarily alters one or more genes of a selected chromosome, by a random change with a
probability equal to the mutation rate. After a new population is generated, we may repeat
the above procedure to find better maximal values. This procedure may be described briefly as
follows[1] .
PROCEDURE cGA.
Determine a population size popsize.
begin
    t := 0;
    initialize P(t);
    evaluate P(t);
    while (not termination-condition) do
        t := t + 1;
        select P(t) from P(t − 1);
        crossover P(t);
        mutate P(t);
        evaluate P(t);
    end while
end
end proc
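For concreteness, the following is a minimal, self-contained Python sketch of this procedure applied to a toy one-dimensional fitness function; the toy fitness, the one-point crossover, and all identifiers are our own illustrative choices, not the paper's exact operators (the paper uses two-point crossover, described in Section 2.2.3).

    import random

    def evaluate(bits, a=-1.0, b=2.0):
        # Decode the bit string to x in [a, b] and return a positive toy fitness.
        x = a + int("".join(map(str, bits)), 2) * (b - a) / (2 ** len(bits) - 1)
        return x * x + 1.0

    def roulette_select(pop, fits):
        # Fitness-proportional (roulette wheel) selection of one chromosome.
        r = random.uniform(0, sum(fits))
        acc = 0.0
        for v, f in zip(pop, fits):
            acc += f
            if r <= acc:
                return v[:]
        return pop[-1][:]

    def cga(popsize=20, m=16, mp=100, pc=0.65, pm=0.1):
        pop = [[random.randint(0, 1) for _ in range(m)] for _ in range(popsize)]
        best = max(pop, key=evaluate)[:]
        for _ in range(mp):
            fits = [evaluate(v) for v in pop]
            pop = [roulette_select(pop, fits) for _ in range(popsize)]   # select
            for i in range(0, popsize - 1, 2):                           # crossover
                if random.random() < pc:
                    p = random.randint(1, m - 1)
                    pop[i][p:], pop[i + 1][p:] = pop[i + 1][p:], pop[i][p:]
            for v in pop:                                                # mutate
                for j in range(m):
                    if random.random() < pm:
                        v[j] = 1 - v[j]
            best = max(pop + [best], key=evaluate)[:]                    # trace the best so far
        return best, evaluate(best)

    print(cga())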
2.2 Greedy Genetic Algorithm for Global Optimization
The greedy genetic algorithm (gGA) is similar to the cGA except for the way offspring are
chosen in the crossover and mutation. The cGA always accepts the newly produced
individuals as offspring. The greedy genetic algorithm, on the other hand, always chooses the
best chromosomes during the crossover and mutation processes. This idea comes from the greedy
algorithm, which always takes the largest immediate “benefit” whenever it has a choice. In the
crossover process, two parents are chosen to exchange some genes and to produce two offspring.
The two parents and the two newly produced chromosomes compete with each other, and gGA
chooses the two best of the four chromosomes as “offspring”. Similarly, the chromosome chosen
for mutation and the newly altered chromosome compete with each other, and the gGA chooses
the better one as the “offspring” in the mutation.
In what follows, we show how to find the global maximum of an unconstrained function of
k variables,

f(x_1, x_2, ..., x_k) : R^k → R.

We further suppose that each variable x_i takes values in a domain D_i = [a_i, b_i] ⊆ R
and that f(x_1, x_2, ..., x_k) > 0 for all x_i ∈ D_i; otherwise, a scaling mechanism is used to
adjust the function[16].
The algorithm depends on the following parameters: population size, popsize; maximal
generation, mp; probability of crossover, pc; probability of mutation, pm.
2.2.1 Encoding methods
We wish to optimize the function f with some required precision: suppose six decimal
places are required for each variable value. To achieve such a precision, each domain
D_i should be divided into (b_i − a_i) · 10^6 equal-size ranges. Let m_i denote the smallest
integer such that (b_i − a_i) · 10^6 ≤ 2^{m_i} − 1. Then a representation in which each variable
x_i is coded as a binary string of length m_i clearly satisfies the precision requirement. The
following formula interprets each such string:

x_i = a_i + decimal((1001···001)_2) · (b_i − a_i) / (2^{m_i} − 1),

where decimal((1001···001)_2) represents the decimal value of that binary string.
Now, each chromosome (as a potential solution) is represented by a binary string of length
m = m_1 + m_2 + ··· + m_k; the first m_1 bits map into a value from the range [a_1, b_1], the next
group of m_2 bits maps into a value from the range [a_2, b_2], and so on; the last group of m_k
bits maps into a value from the range [a_k, b_k].
To initialize a population, we can simply generate popsize chromosomes randomly, bit by bit.
However, if we do have some knowledge about the distribution of potential optima, we may use
such information in arranging the set of initial (potential) solutions.
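A small Python sketch of this encoding and initialization, assuming the domains are given as a list of (a_i, b_i) pairs (all function names are illustrative):

    import math
    import random

    def chromosome_layout(domains, precision=6):
        # For each [a_i, b_i], the smallest m_i with (b_i - a_i) * 10^precision <= 2^m_i - 1.
        return [math.ceil(math.log2((b - a) * 10 ** precision + 1)) for a, b in domains]

    def decode(chrom, domains, lengths):
        # Map each group of m_i bits to a real value in [a_i, b_i]
        # via x_i = a_i + decimal(bits) * (b_i - a_i) / (2^m_i - 1).
        xs, pos = [], 0
        for (a, b), m in zip(domains, lengths):
            bits = chrom[pos:pos + m]
            pos += m
            xs.append(a + int("".join(map(str, bits)), 2) * (b - a) / (2 ** m - 1))
        return xs

    def random_chromosome(total_length):
        # Bitwise random initialization of one chromosome.
        return [random.randint(0, 1) for _ in range(total_length)]

    domains = [(-3.0, 12.1), (4.1, 5.8)]   # the domains of function G1 below
    lengths = chromosome_layout(domains)   # [24, 21] bits for six decimal places
    chrom = random_chromosome(sum(lengths))
    print(decode(chrom, domains, lengths))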
2.2.2 Selection process
A roulette wheel with slots sized in proportion to fitness is used for the selection process. We
construct such a roulette wheel as follows, assuming that the fitness values are positive.
• Calculate the fitness value eval(v_i) = f(v_i) for each chromosome v_i (i = 1, 2, ..., popsize).
• Compute the total fitness of the population:

F = Σ_{i=1}^{popsize} eval(v_i).
• Calculate the selection probability p_i for each chromosome v_i (i = 1, 2, ..., popsize):

p_i = eval(v_i) / F.

• Calculate the cumulative probability q_i for each chromosome v_i (i = 1, 2, ..., popsize):

q_i = Σ_{j=1}^{i} p_j.
The selection process is based on spinning the roulette wheel popsize times; each time we
select a single chromosome for the new population in the following way (a Python sketch is
given after the list):
• Generate a random (floating-point) number r in the range [0, 1];
• If r < q_1, select the first chromosome v_1; otherwise select the i-th chromosome v_i
(2 ≤ i ≤ popsize) such that q_{i−1} < r ≤ q_i.
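A Python sketch of this roulette wheel, assuming positive fitness values (identifiers are illustrative):

    import bisect
    import random

    def roulette_wheel(population, fitness_values):
        # Cumulative probabilities q_1 <= q_2 <= ... <= q_popsize = 1.
        F = sum(fitness_values)
        q, acc = [], 0.0
        for f in fitness_values:
            acc += f / F
            q.append(acc)
        selected = []
        for _ in range(len(population)):
            r = random.random()                     # spin the wheel: r in [0, 1)
            i = bisect.bisect_left(q, r)            # first i with q_i >= r
            selected.append(population[min(i, len(population) - 1)][:])
        return selected

The binary search in bisect_left finds the first index with q_i ≥ r, which is equivalent to the scan for q_{i−1} < r ≤ q_i described above, but takes O(log popsize) per spin.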
2.2.3 Crossover operator
Two-point crossover is adopted. The probability of crossover pc = 0.65.
• Generate a random (floating-point) number r in the range [0, 1];
• If r < pc, select the given chromosome for crossover.
Now we mate the selected chromosomes randomly: for each pair of coupled chromosomes[15],
we generate two random numbers pos_1, pos_2 in the interval [1, m − 1] (m is the total length,
i.e., the number of bits in a chromosome); these mark the crossing points. From the two
chromosomes

v_1 = (b_1, ..., b_{pos_1}, b_{pos_1+1}, ..., b_{pos_2}, ..., b_m),
v_2 = (c_1, ..., c_{pos_1}, c_{pos_1+1}, ..., c_{pos_2}, ..., c_m),

two new chromosomes are generated by exchanging the corresponding bits between positions
pos_1 and pos_2:

v_1' = (b_1, ..., c_{pos_1}, c_{pos_1+1}, ..., c_{pos_2}, ..., b_m),
v_2' = (c_1, ..., b_{pos_1}, b_{pos_1+1}, ..., b_{pos_2}, ..., c_m).

The gGA does not directly accept the two offspring v_1' and v_2' as the cGA does: we compute
the fitness of all of {v_1, v_2, v_1', v_2'} and choose the two best of these four chromosomes as
the “offspring” according to their fitness values.
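A Python sketch of this greedy two-point crossover; fitness is any evaluation function on chromosomes, and the identifiers are illustrative:

    import random

    def greedy_two_point_crossover(v1, v2, fitness):
        # Choose two distinct crossing points in [1, m - 1] and swap the middle segment.
        m = len(v1)
        pos1, pos2 = sorted(random.sample(range(1, m), 2))
        c1 = v1[:pos1] + v2[pos1:pos2] + v1[pos2:]
        c2 = v2[:pos1] + v1[pos1:pos2] + v2[pos2:]
        # Greedy step: keep the two fittest of the four chromosomes as "offspring".
        best_two = sorted([v1, v2, c1, c2], key=fitness, reverse=True)[:2]
        return best_two[0], best_two[1]

Note that when neither child is fitter than its parents, both parents survive unchanged; this is the source of gGA's stronger selection pressure.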
2.2.4 Mutation operator
Mutation is performed on a bit-by-bit basis, with probability of mutation pm = 0.1. For
each chromosome in the current (i.e., after-crossover) population and for each bit within the
chromosome:
• For each integer i in [1, m], generate a random floating-point number r_i in the range [0, 1];
• If r_i < pm, mutate the i-th bit of

v = (b_1, ..., b_i, ..., b_m)

to generate a new chromosome

v' = (b_1, ..., 1 − b_i, ..., b_m).

We then compute the fitness of v and v' and choose the better chromosome as the “offspring”.
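A Python sketch of this greedy mutation; we read the comparison as happening per flipped bit, though one could also flip all selected bits first and compare once (identifiers are illustrative):

    import random

    def greedy_mutation(v, fitness, pm=0.1):
        # v is a list of bits; work on a copy.
        v = v[:]
        for i in range(len(v)):
            if random.random() < pm:
                v_alt = v[:]
                v_alt[i] = 1 - v_alt[i]      # flip the i-th bit
                # Greedy step: keep the better of the current and altered chromosome.
                if fitness(v_alt) > fitness(v):
                    v = v_alt
        return v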
106
Vol. 18
ZHAO XINCHAO
3 Experimental Results and Analysis
3.1 Testing Examples
The greedy genetic algorithm was tested against the classical genetic algorithm. The
following functions are chosen, and we search for their global maxima to compare the
performance of cGA and gGA.
• G1: 21.5 + x sin(4πx) + y sin(20πy)[1], where −3 ≤ x ≤ 12.1, 4.1 ≤ y ≤ 5.8.
This is a multimodal function with many local maxima (Figure 1). A maximal value of
38.82755 is given in [1].
Figure 1. Diagram for function G1.    Figure 2. Diagram for function G2.
• G2: Shubert function[1]:

−(Σ_{i=1}^{5} i cos[(i + 1)x_1 + i]) × (Σ_{i=1}^{5} i cos[(i + 1)x_2 + i]) + 225,
where −10 ≤ x_1 ≤ 10, −10 ≤ x_2 ≤ 10.
Function G2 is a variant of the Shubert function. It has 760 local maxima, 18 of which
are global maxima with maximal value 411.73.
• G3: Shekel SQRN5 function[1]:

S_5(x_1, x_2, x_3, x_4) = Σ_{j=1}^{5} 1 / ( Σ_{i=1}^{4} (x_i − a_{ij})^2 + c_j ),

where the parameters are given in Table 1.
Table 1. Parameters of the Shekel SQRN5 function

j    a_1j   a_2j   a_3j   a_4j   c_j
1    4.0    4.0    4.0    4.0    0.1
2    1.0    1.0    1.0    1.0    0.2
3    8.0    8.0    8.0    8.0    0.2
4    6.0    6.0    6.0    6.0    0.4
5    3.0    7.0    3.0    7.0    0.6
• G3 is chosen from the family {Shekel SQRN3, SQRN4, SQRN5} of 4-dimensional
functions. SQRN5 has the optimal value 10.15320 given in [1].
• G4: Rosenbrock function[1]:

100 · (x_1^2 − x_2)^2 + (1 − x_1)^2, where −2.048 ≤ x_i ≤ 2.048 (i = 1, 2).

This function has the global maximal value 3905.9262. Although unimodal, it is
ill-conditioned and represents what is recognized as a “deceptive problem”[18]: while
maximizing such functions, two directions of growth can easily be recognized, but the
boundaries are chosen in such a way that only one of them leads to the global maximum.
A global search is therefore very hard.
Figure 3. Diagram for function G4.    Figure 4. Diagram for function G5.
• G5: Schaffer function F6[17]:

0.5 − (sin^2(√(x_1^2 + x_2^2)) − 0.5) / (1.0 + 0.001(x_1^2 + x_2^2))^2,
where −100 ≤ x_i ≤ 100 (i = 1, 2).

This function has infinitely many local maxima, which form rings enclosing the global
maximum, 1; the nearest ring lies at a distance of about 3.14 from it. The strong
oscillation, and the fact that the global maximum is enclosed by local maxima, make the
global maximum very hard to find. (A Python transcription of the five functions is given
below.)
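For reference, the five test functions transcribe directly into Python as follows (the function names are ours):

    import math

    def g1(x, y):    # -3 <= x <= 12.1, 4.1 <= y <= 5.8; known maximum ~ 38.82755
        return 21.5 + x * math.sin(4 * math.pi * x) + y * math.sin(20 * math.pi * y)

    def g2(x1, x2):  # Shubert variant, -10 <= xi <= 10; maximum ~ 411.73
        s1 = sum(i * math.cos((i + 1) * x1 + i) for i in range(1, 6))
        s2 = sum(i * math.cos((i + 1) * x2 + i) for i in range(1, 6))
        return -s1 * s2 + 225

    def g3(x):       # Shekel SQRN5, x = (x1, x2, x3, x4); maximum ~ 10.15320
        a = [[4.0] * 4, [1.0] * 4, [8.0] * 4, [6.0] * 4, [3.0, 7.0, 3.0, 7.0]]
        c = [0.1, 0.2, 0.2, 0.4, 0.6]
        return sum(1.0 / (sum((x[i] - a[j][i]) ** 2 for i in range(4)) + c[j])
                   for j in range(5))

    def g4(x1, x2):  # Rosenbrock, -2.048 <= xi <= 2.048; maximum ~ 3905.9262
        return 100 * (x1 ** 2 - x2) ** 2 + (1 - x1) ** 2

    def g5(x1, x2):  # Schaffer F6 (maximization form), -100 <= xi <= 100; maximum 1
        r2 = x1 ** 2 + x2 ** 2
        return 0.5 - (math.sin(math.sqrt(r2)) ** 2 - 0.5) / (1.0 + 0.001 * r2) ** 2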
3.2 Experimental Results
We choose the following parameters for the algorithm: popsize = 100 or 20; mp = 100;
pc = 0.65; pm = 0.1. The program terminates when the result has not improved for 30
generations or when the generation count exceeds the maximal generation. All reported data
are not the optima of the last generation but the best values traced over the whole run. The
computations in this paper were performed using Maple(TM).
3.2.1 Large population (popsize = 100)
We run the two algorithms 10 times independently with identical parameter settings in order
to compare them impartially. The results are given in Table 2 and Table 3.
Table 2. Comparing cGA and gGA on functions G1, G2, G3

No.     G1 cGA        G1 gGA        G2 cGA       G2 gGA       G3 cGA    G3 gGA
1       38.20357320   38.85029448   397.2010864  411.7309089  9.93533   6.15597
2       37.61534276   38.38609936   411.3667615  411.7309089  2.42379   6.01839
3       38.22050674   38.85029448   404.2913240  411.7298648  9.01328   5.10069
4       38.14305656   38.85029448   410.8357650  411.7308291  9.88134   5.05474
5       38.04605539   38.85029448   410.7306030  411.7309088  9.08357   5.07691
6       37.18009798   38.85029448   409.7986833  411.7309089  1.74033   10.1527
7       38.07232843   38.40235483   410.0819853  411.7309088  9.66298   10.1278
8       37.76916366   38.85029448   401.1260982  411.7309046  1.56367   6.29919
9       38.08886578   38.85029448   409.4547476  411.7309089  4.66570   9.77555
10      37.39413171   38.85029448   407.3481682  411.7309089  2.57869   9.77807
Known   38.82755                    411.73                    10.15320
Table 3. Comparing cGA and gGA on functions G4, G5

No.     G4 cGA       G4 gGA       G5 cGA      G5 gGA
1       3902.36785   3905.92622   0.99250594  0.98423062
2       3853.73747   3905.92622   0.98890023  1.00000000
3       3880.67561   3905.92622   0.93712036  1.00000000
4       3894.67718   3905.92622   0.98490037  1.00000000
5       3875.69595   3905.92622   0.98390812  1.00000000
6       3896.49379   3905.92622   0.98998497  0.99028409
7       3869.84415   3897.73422   0.96013291  1.00000000
8       3840.24449   3897.73422   0.94574543  1.00000000
9       3888.62918   3905.92622   0.92174783  0.98423062
10      3891.17849   3897.73422   0.94562655  0.98309912
Known   3905.9262                 1
Generally speaking, we may say that gGA greatly improves the performance of the genetic
algorithm; of course, even better results can be expected if other good strategies are combined
with our greedy idea. For the multimodal functions G1, G2, and G5, gGA shows a stronger
ability to escape from local maxima, as Tables 2 and 3 indicate. For the “deceptive function”
G4, gGA displays a stronger ability to jump out of the “deceiving trap”. For the 4-dimensional
function G3, gGA is more robust and has stronger global exploration ability than cGA, although
the binary encoding method always faces the “dimension crisis”[1]. Moreover, the extensive
computing experiments suggest a further property of the Shekel SQRN5 function: it might have
no local maxima with values in the interval [7, 9), since not a single local maximum observed
in our runs lies in this interval.
3.2.2 Small population (popsize = 20)
From the last section, we see that our algorithm reaches the global maxima with a high
percentage when the population is large (popsize = 100). To show that most of the results
are satisfactory even when the population is small, we set popsize = 20 and keep the other
parameters the same. We run cGA and gGA on the functions G1, G2, G4, G5 100 times each,
independently. For G3, which is a 4-dimensional function, popsize = 20 is too small to find the
global maximal value, but the results of gGA are still more competitive than those of cGA.
If we know the global maxima of the testing examples, we can introduce some criteria to
estimate their convergence. We call a run globally convergent if the largest fitness value found
is larger than the threshold Value_1 given below, “excellent” if the result is larger than Value_2,
and “unlucky” if the result is less than Value_3 (Table 4). Of course, the thresholds are set
according to the characteristics of the problems and our experience.
Table 4. Thresholds for the testing functions

THRESHOLD                        G1         G2       G4        G5
Known global maxima              38.82755   411.73   3905.92   1
Convergence threshold Value_1    38.82      411.73   3905.92   0.99
Excellence threshold Value_2     38         410      3904.00   0.95
Unluckiness threshold Value_3    36         400      3895      0.88
Table 5 compares the two algorithms. The results are obtained from 100 successive runs on
each of the four functions. The minimal result denotes the worst result over the 100 runs, and
the time denotes the average running time per run.

Probability of convergence = (number of runs with convergent results) / (all runs, 100) × 100%,
Probability of excellence = (number of runs with excellent results) / (all runs, 100) × 100%,
Probability of unluckiness = (number of runs with unlucky results) / (all runs, 100) × 100%.
The PC used in the experiments has two XEON 2.00 GHz CPUs and 3.00 GB of memory.
Note that the running times are collected with Maple, an interpreted language; a C
implementation is expected to be much faster.
Table 5. Performance comparison of the two algorithms (popsize = 20)

Function  Algorithm  Convergence  Excellence  Unluckiness  Minimal result  Time (sec)
G1        cGA        1%           18%         7%           34.84           4.2
G1        gGA        13%          47%         9%           33.35           8.2
G2        cGA        0%           8%          59%          317.4           4.93
G2        gGA        31%          63%         31%          274.0           10.7
G4        cGA        0%           0%          100%         3575.6          3.9
G4        gGA        51%          53%         0%           3897.7          7.9
G5        cGA        0%           19%         36%          0.71            4.6
G5        gGA        57%          74%         12%          0.65            10.2
As Table 5 shows: (1) the probability of convergence and the probability of excellence of
gGA are much higher than those of cGA; (2) the probability of unluckiness of gGA is lower
than that of cGA except for function G1; (3) our strategy occasionally causes faster premature
convergence, as the table indicates, so the minimal results of gGA are worse than those of cGA
except for function G4; even so, gGA still obtains better results than cGA at most sample
points; (4) since gGA must make more comparisons and choices and compute more function
values than cGA in each generation, gGA consumes more time than cGA: it is roughly twice
as slow, as Table 5 indicates.
4 Conclusion
The gGA effectively combines the global exploration ability of the genetic algorithm with the
local convergence ability of the greedy algorithm to improve convergence without much cost
in speed. The experimental comparison shows that gGA performs much better than cGA. In
this paper we only intend to show that our greedy strategy is effective; of course, to achieve
even better results, other effective strategies should be incorporated as well.
References
[1] Z. Michalewicz, Genetic Algorithms + Data Structures = Evolution Programs, Third Edition,
Springer, 1996.
[2] Z. Pan, L. Kang and Y. Chen, Evolutionary Computation(Chinese edition), Tsinghua University
Press, GuangXi Science & Technology Press, 1998.
[3] T. Bäck and F. Hoffmeister, Extended selection mechanisms in genetic algorithms, in Proceedings
of the Fourth International Conference on Genetic Algorithms, Morgan Kaufmann Publishers, San
Mateo, CA, 1991.
[4] A. Szalas and Z. Michalewicz, Contractive Mapping Genetic Algorithms and Their Convergence,
University of North Carolina at Charlotte, Technical Report 006-1993.
[5] R.E. Smith, Adaptively resizing populations: an algorithm and analysis, in Proceedings of the Fifth
International Conference on Genetic Algorithms, Morgan Kaufmann, San Mateo, CA, 1993, 653.
[6] J. Arabas, Z. Michalewicz and J. Mulawka, GAVaPS - a genetic algorithm with varying population
size, in Proceedings of the 1st IEEE International Conference on Evolutionary Computation (ICEC),
Orlando, Florida, USA, IEEE Press, 1994.
[7] Y. Davidor and H. P. Schwefel, An introduction to adaptive optimization algorithms based on
principles of natural evolution, in Dynamic, Genetic and Chaotic Programming, John Wiley & Sons,
1992, 138–202.
[8] H. P. Schwefel, Numerical Optimization of Computer Models, John Wiley, Chichester, UK, 1981.
[9] D. E. Goldberg, A note on Boltzmann tournament selection for genetic algorithms and population-oriented simulated annealing, Complex Systems, 1990, 4(4): 445–460.
[10] L. Kang, Y. Xie, S. You and Z. Luo, Non-Numerical Parallel Algorithms(1st Volume): Simulated
Annealing Algorithm(Chinese edition), Science Press, Beijing, 1994.
[11] R. K. Ahuja, J. B. Orlin and A. Tiwari, A Greedy Genetic Algorithm for the Quadratic Assignment
Problem, Preprint, 1997.
[12] F. Glover, Genetic Algorithms and Scatter Search: Unsuspected Potentials, Statistics and Computing, 1994, 4: 131–140.
[13] C. C. Aggarwal, J. B. Orlin and R. P. Tai, Optimized crossover for the independent set problem,
Research Report, Operations Research Center, MIT, Cambridge, MA, 1994.
[14] H. R. Lewis and L. Denenberg, Data Structures & Their Algorithms, HarperCollins Publishers, 1991.
[15] J. H. Holland, Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann
Arbor, 1975.
[16] K. A. De Jong, On using genetic algorithms to search program spaces, in Proceedings of the Second
International Conference on Genetic Algorithms, Lawrence Erlbaum Associates, Hillsdale, NJ,
1987, 210–216.
[17] J. D. Schaffer, R. A. Caruana, L. J. Eshelman and R. Das, A study of control parameters affecting online performance of genetic algorithms for function optimization, in Proceedings of the 3rd
International Conference on Genetic Algorithms, Morgan Kaufmann, Los Altos, 1989, 51–60.
[18] K. A. De Jong, On using genetic algorithms to search program spaces, in Proceedings of the Second
International Conference on Genetic Algorithms, Lawrence Erlbaum Associates, Hillsdale, NJ,
1987, 210–216.
[19] J. D. Schaffer, R. A. Caruana, L. J. Eshelman and R. Das, A study of control parameters affecting online performance of genetic algorithms for function optimization, in Proceedings of the 3rd
International Conference on Genetic Algorithms, Morgan Kaufmann, Los Altos, 1989, 51–60.
[20] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, Reading, MA, 1989.