Genetic Algorithms

Estimation of Distribution
Algorithms (EDA)
Siddhartha K. Shakya
School of Computing.
The Robert Gordon University
Aberdeen, UK
[email protected]
EDAs
• A novel paradigm in Evolutionary Algorithms
• Also known as Probabilistic Model Building Genetic Algorithms (PMBGA) or Iterated Density Estimation Algorithms (IDEA)
• A probabilistic-model-based heuristic
• Motivated by GA evolution
• Makes the evolution more explicit than in a GA
Basic Concept of Solution and Fitness
Graph Colouring Problem: An Example
[Figure: an example graph with six nodes a, b, c, d, e, f]
Given a set of colours, the GCP is to assign a colour to each node in such a way that neighbouring nodes do not have the same colour.
Basic Concept of a Solution and Fitness
• Given 2 colours: Black = 0, White = 1
• A solution is represented as a chromosome over the nodes a b c d e f
[Figure: two example colourings of the graph and the corresponding chromosomes]
Chromosome (a b c d e f)    fitness
1 0 0 1 1 1                 1
1 0 1 0 1 0                 6
Chromosome and Fitness in GCP
• Chromosome: the set of colours assigned to the nodes of the graph. (There are other ways of representing the GCP in a GA, such as an order-based representation.)
• Fitness: the number of correctly coloured nodes.
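As an illustration, here is a minimal Python sketch of this fitness function. The edge list of the six-node example graph is not given in the slides, so a hypothetical one is assumed purely for illustration; a node counts as correctly coloured only if its colour differs from all of its neighbours.

```python
# Hypothetical edge list for the six-node example graph (assumed for illustration).
EDGES = [("a", "b"), ("a", "d"), ("b", "c"), ("b", "e"), ("c", "f"), ("d", "e"), ("e", "f")]
NODES = ["a", "b", "c", "d", "e", "f"]

def fitness(chromosome):
    """Number of correctly coloured nodes: a node is correct if its colour
    differs from the colour of every neighbouring node."""
    colour = dict(zip(NODES, chromosome))
    correct = 0
    for node in NODES:
        neighbours = [v for u, v in EDGES if u == node] + [u for u, v in EDGES if v == node]
        if all(colour[node] != colour[n] for n in neighbours):
            correct += 1
    return correct

print(fitness([1, 0, 1, 0, 1, 0]))  # alternating colouring; 6 for this assumed edge list
```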
GA Iteration
1. Initialisation of a “parent” population
2. Evaluation
3. Crossover
4. Mutation
5. Replace the parent with the “child” population and go to step 2 until the termination criterion is satisfied
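A minimal sketch of this loop in Python. Binary tournament selection, one-point crossover and bit-flip mutation are illustrative choices, not operators prescribed by the slides; the fitness function is passed in as a parameter.

```python
import random

def ga(fitness, n_bits, pop_size=20, generations=50, p_mut=0.05):
    """Simple generational GA sketch following the five steps on this slide."""
    # 1. Initialise a "parent" population of random bit strings
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        # 2. Evaluation
        scored = [(fitness(x), x) for x in pop]
        # Parents for crossover are picked by binary tournament (an illustrative choice)
        def select():
            a, b = random.sample(scored, 2)
            return (a if a[0] >= b[0] else b)[1]
        children = []
        while len(children) < pop_size:
            p1, p2 = select(), select()
            # 3. Crossover: one-point
            cut = random.randint(1, n_bits - 1)
            child = p1[:cut] + p2[cut:]
            # 4. Mutation: independent bit flips with probability p_mut
            child = [1 - bit if random.random() < p_mut else bit for bit in child]
            children.append(child)
        # 5. Replace the parent population with the child population
        pop = children
    return max(pop, key=fitness)
```

With the GCP fitness sketch above, a call like ga(fitness, n_bits=6) would evolve colourings of the six-node example graph.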
GA Iteration
[Figure: one GA iteration on the six-node graph colouring example, given 2 colours (0, 1). The parent population is initialised and evaluated, solutions are selected, and crossover and mutation produce a child population (after mutation: 0 1 1 0 1 1 with fitness 1, 1 0 0 0 1 0 with fitness 2, 0 1 0 1 0 1 with fitness 6, 1 0 1 0 1 1 with fitness 4) that replaces the parents; the iteration is then repeated.]
GA evolution
• Selection drives evolution towards better solutions by applying higher selection pressure to high-quality solutions.
• Crossover and mutation (the variation operators) together ensure exploration of the space of promising solutions and maintain variation in the population.
Variation in GA Evolution
• Has its limitations
• Can recombine fit solutions to produce fitter solutions
• Can also disrupt good solutions and cause convergence to a local optimum
Estimation of Distribution Algorithm
(EDA)
• To overcome the negative effects of the crossover-and-mutation approach to variation, a probabilistic approach to variation has been proposed.
• Algorithms using this approach are known as EDAs (or PMBGAs).
GA to EDA
Simple GA framework: Initial Population → Evaluation → Selection → Crossover → Mutation
EDA framework: Initial Population → Evaluation → Selection → Probabilistic Model Building → Sampling of Child Population
General Notation
• An EDA represents a solution as a set of values taken by a set of random variables.

A chromosome $x = (x_1, x_2, \ldots, x_n)$ is a set of values taken by the set of random variables $X = (X_1, X_2, \ldots, X_n)$ (where each $x_i \in \{0, 1\}$ for a bit-string representation).

Example solutions over $X = (X_1, X_2, X_3, X_4, X_5, X_6)$:
x = 1 0 1 1 0 1
x = 0 1 0 0 1 1

$p(X_i = x_i)$, or simply $p(x_i)$, is a univariate marginal distribution.
$p(X_i = x_i \mid X_j = x_j)$, or simply $p(x_i \mid x_j)$, is a conditional distribution.
$p(X = x)$, or simply $p(x)$, is the joint probability distribution.
Estimation of Probability Distribution

Example solutions over $X = (X_1, X_2, X_3, X_4, X_5, X_6)$:
x = 1 0 1 1 0 1
x = 0 1 0 0 1 1

$p(x_i) = \sum_{x:\, X_i = x_i} p(x)$

$p(x_i \mid x_j) = \frac{p(x_i, x_j)}{p(x_j)}$

$p(x) = p(x_1 \mid x_2, \ldots, x_n)\, p(x_2 \mid x_3, \ldots, x_n) \cdots p(x_{n-1} \mid x_n)\, p(x_n)$

Usually it is not possible to calculate the joint probability distribution exactly, so it is estimated. For example, assuming all $x_i$ are independent of each other, the joint probability distribution becomes the product of simple univariate marginal distributions:

$p(x) = \prod_{i=1}^{n} p(x_i)$
Simple Univariate Estimation of Distribution Algorithm

Initial Population → Evaluation → Selection → Calculate the univariate marginal probabilities and sample the child population

Example population over $X = (X_1, X_2, X_3, X_4, X_5, X_6)$:
x = 1 0 1 1 0 1
x = 0 1 0 0 1 1

$p(X_i = 1)$: 1/2, 1/2, 1/2, 1/2, 1/2, 2/2
$p(X_i = 0)$: 1/2, 1/2, 1/2, 1/2, 1/2, 0/2
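Continuing the sketch above, sampling a child population from these univariate marginals could look like the following; this is again an illustration rather than the exact procedure on the slide.

```python
import random

def sample_population(p_one, size):
    """Sample bit strings where bit i is 1 with probability p(X_i = 1)."""
    return [[1 if random.random() < p else 0 for p in p_one] for _ in range(size)]

# Marginals estimated from the two example solutions on the slide
p_one = [0.5, 0.5, 0.5, 0.5, 0.5, 1.0]
children = sample_population(p_one, size=4)  # every sampled child has X_6 = 1
```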
Simple univariate EDA (UMDA)
[Figure: one UMDA iteration on the six-node graph colouring example, given 2 colours (0, 1).]

Parent population (fitness):
1 0 1 1 0 1 (2)
0 0 1 0 1 1 (2)
1 0 1 0 1 1 (4)
0 1 0 0 1 1 (3)

Selected solutions:
1 0 1 0 1 1
0 1 0 0 1 1
0 1 0 0 1 1
1 0 1 1 0 1

Build the model (estimation of distribution):
$p(X_i = 1)$: 2/4, 2/4, 2/4, 1/4, 3/4, 4/4
$p(X_i = 0)$: 2/4, 2/4, 2/4, 3/4, 1/4, 0/4
$p(x) = \prod_{i=1}^{n} p(x_i)$

Sampling: a child population is sampled from this distribution, replaces the parents, and the iteration is repeated.
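A minimal, self-contained UMDA sketch in Python following this loop. Truncation selection and the population sizes are illustrative choices, not values prescribed by the slides.

```python
import random

def umda(fitness, n_bits, pop_size=20, n_selected=10, generations=50):
    """Univariate Marginal Distribution Algorithm sketch:
    select good solutions, estimate p(X_i = 1) by frequency, sample a new population."""
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluation and truncation selection of the best solutions
        selected = sorted(pop, key=fitness, reverse=True)[:n_selected]
        # Model building: univariate marginal probabilities (no smoothing, for simplicity)
        p_one = [sum(x[i] for x in selected) / n_selected for i in range(n_bits)]
        # Sampling: the child population replaces the parents
        pop = [[1 if random.random() < p else 0 for p in p_one] for _ in range(pop_size)]
    return max(pop, key=fitness)
```

With the GCP fitness sketch from earlier, umda(fitness, n_bits=6) would attempt to evolve a valid 2-colouring, although, as the next slide notes, the univariate model ignores the dependencies between neighbouring nodes.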
Note
• It is not guaranteed that the above algorithm will find the optimum solution for the graph colouring problem.
• The reason is obvious:
– The chromosome representation of the GCP has dependencies, i.e. whether node 1 can take the black colour depends upon the colour of node 2.
– But univariate EDAs do not model any dependency, so they may fail.
• However, one could try.
Complex Models
• To tackle problems where there are dependencies between variables, we need to consider more complex models.
• An extra model-building step is added to the univariate EDA.
• Different algorithms have been proposed using different models.
• They are categorised into three groups:
– Univariate EDAs
– Bivariate EDAs
– Multivariate EDAs
Univariate EDA Model
[Figure: probability model with nodes x1, x2, x3, x4, x5, x6, x7 and no edges]
Graphical representation of a probability model assuming no dependency among the variables (UMDA, PBIL, cGA).

$p(x) = \prod_{i=1}^{n} p(x_i)$
Bivariate EDA Model
a. Chain model (MIMIC)
b. Tree model (COMIT)
c. Forest model (BMDA)

[Figure: chain, tree and forest probability models]
Graphical representation of probability models assuming dependencies of order two among the variables.

$p(x) = \prod_{i=1}^{n} p(x_i \mid x_j)$, e.g. for the chain model:
$p(x) = p(x_{i_1} \mid x_{i_2})\, p(x_{i_2} \mid x_{i_3}) \cdots p(x_{i_{n-1}} \mid x_{i_n})\, p(x_{i_n})$
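As a hedged illustration of the chain model, the conditional probabilities can be estimated from the selected population and a new solution sampled along the chain. A fixed variable ordering is assumed here; MIMIC itself chooses the ordering with an entropy-based heuristic, which is omitted for brevity, and the fallback value of 0.5 is an illustrative smoothing choice.

```python
import random

def estimate_chain(selected, order):
    """Estimate p(x_last) and p(x_i = 1 | parent value) along a fixed chain ordering.
    order[k] is conditioned on order[k + 1]; the last variable keeps a plain marginal."""
    last = order[-1]
    p_last = sum(x[last] for x in selected) / len(selected)
    conditionals = {}
    for k in range(len(order) - 1):
        child, parent = order[k], order[k + 1]
        for v in (0, 1):
            subset = [x for x in selected if x[parent] == v]
            # Fall back to 0.5 when no selected solution has this parent value
            conditionals[(child, v)] = (sum(x[child] for x in subset) / len(subset)) if subset else 0.5
    return p_last, conditionals

def sample_chain(order, p_last, conditionals):
    """Sample a solution from the chain: last variable first, then each child given its parent."""
    x = [0] * len(order)
    x[order[-1]] = 1 if random.random() < p_last else 0
    for k in range(len(order) - 2, -1, -1):
        child, parent = order[k], order[k + 1]
        x[child] = 1 if random.random() < conditionals[(child, x[parent])] else 0
    return x

selected = [[1, 0, 1, 0, 1, 1], [0, 1, 0, 0, 1, 1], [0, 1, 0, 0, 1, 1], [1, 0, 1, 1, 0, 1]]
p_last, cond = estimate_chain(selected, order=list(range(6)))
child = sample_chain(list(range(6)), p_last, cond)
```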
Multivariate EDA Model
a. Marginal product model (ECGA)
b. Triangular model (FDA)
c. Bayesian network model (BOA, EBNA)

[Figure: multivariate probability models]
Graphical representation of probability models considering multivariate dependencies among the variables.
Finding a probabilistic model
• The task of finding a good probabilistic model (finding the relationships between the variables) is an optimisation problem in itself.
• Most of the algorithms use a Bayesian network to represent the probabilistic relationships.
• Two metrics measure the goodness of a Bayesian network:
– Bayesian Information Criterion (BIC) metric
– Bayesian-Dirichlet (BD) metric
• A greedy heuristic is used to find a good model.
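As a hedged sketch of the BIC idea: a network over binary variables is scored by the data log-likelihood minus a complexity penalty of (number of free parameters / 2) * log N. This simplified version is for illustration only; the exact metrics used by specific EDAs such as BOA and EBNA differ in detail.

```python
import math
from itertools import product

def bic_score(data, parents):
    """Simplified BIC for a Bayesian network over binary variables:
    data log-likelihood minus (free parameters / 2) * log(N)."""
    n_vars, N = len(parents), len(data)
    log_lik, n_params = 0.0, 0
    for i in range(n_vars):
        pa = parents[i]
        n_params += 2 ** len(pa)  # one free parameter per parent configuration
        for config in product([0, 1], repeat=len(pa)):
            rows = [x for x in data if tuple(x[j] for j in pa) == config]
            if not rows:
                continue
            ones = sum(x[i] for x in rows)
            for count, p in ((ones, ones / len(rows)), (len(rows) - ones, 1 - ones / len(rows))):
                if count:
                    log_lik += count * math.log(p)
    return log_lik - (n_params / 2) * math.log(N)

data = [[1, 0, 1, 0, 1, 1], [0, 1, 0, 0, 1, 1], [0, 1, 0, 0, 1, 1], [1, 0, 1, 1, 0, 1]]
empty = {i: [] for i in range(6)}                           # univariate model: no edges
chain = {i: ([i + 1] if i < 5 else []) for i in range(6)}   # chain model
print(bic_score(data, empty), bic_score(data, chain))       # a greedy search compares such scores
```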
Summary
• EDAs are an active area of research in the GA community.
• EDAs are reported to solve GA-hard problems, as well as hard optimisation problems such as MAX-SAT.
• The success or failure of an EDA depends upon the accuracy of the probabilistic model used.
Links
• http://cswww.essex.ac.uk/staff/zhang/MoldeBasedWeb/RGroup.htm (research groups working on EDAs)
• http://www.sc.ehu.es/ccwbayes/main.html (EDA homepage maintained by the Intelligent Systems Group)
Books
• Larrañaga, P. and Lozano, J. A. (2001). Estimation of Distribution Algorithms: A New Tool for Evolutionary Computation. Kluwer Academic Publishers.
• Pelikan, M. (2002). Bayesian Optimization Algorithm: From Single Level to Hierarchy. Ph.D. thesis, University of Illinois at Urbana-Champaign, Urbana, IL. Also IlliGAL Report No. 2002023.