Evolutionary Algorithms
Stochastic Iterative Search / Heuristic Search / Metaheuristics
Joshua Knowles
School of Computer Science
The University of Manchester
COMP60342 - Week 4 2.15, 24 April 2015
In This Lecture
• Simulated Evolution: Overview and Applications
• Evolutionary Algorithms for Optimization (in Detail)
• Tuning and Testing EAs (Basics)
• Other Stochastic Search Algorithms: Hillclimbing and Simulated Annealing
Image from http://www.truthtree.com/
Evolutionary Algorithms
2
2.15, 24 April 2015
Simulated Evolution: Overview and
Applications
Evolutionary Algorithms
Evolutionary algorithm (EA) is the collective name for a number of different types of
algorithmic simulation of the processes of Darwinian evolution by natural selection.
The main types of EA are:
• genetic algorithms
• evolution strategies
• evolutionary programming
• genetic programming
• learning classifier systems.
(These different types originated separately, but the distinctions between them are
now largely obsolete.)
The computer science discipline studying evolutionary algorithms is called
Evolutionary Computation (EC).
Evolutionary algorithms can be used to: simulate aspects of evolution to help
understand evolutionary dynamics and processes; provide a mechanism for the
creation of artificial life-forms; and solve optimization problems.
Natural Evolution
The question that led, in the 19th century, to the theory was:
How do we explain the diversity of life?
Natural Evolution — Origins
Even before Charles Darwin and Alfred Russel Wallace, several different theories of
Evolution attempted to explain the origin of the variety of biota on Earth. Chambers
(1844) popularised transmutation - the idea that one species could change into
another.
Chambers also believed that there was an inbuilt direction to this change: from
primitive to more complex (or advanced).
Lamarck (1809) had proposed that species adapt to their environment during their
lifetime and then can pass on these acquired adaptations. (This mechanism was
later refuted).
So, Darwin and Wallace didn’t invent evolution, but contributed the explanation of the
mechanism, ‘natural selection’, that drives it. This, together with several other related
theories explained how all species come to exist (and depart).
Natural Selection
Figure: modified from One Long Argument by Ernst Mayr (1991)
http://www.christs.cam.ac.uk/darwin200/pages/index.php?page id=d3
Competition between individuals occurs because more offspring are created than
needed to replace parents (superfecundity) and these cannot all be supported by the
environment. Variation in the individuals and the inheritance of traits plus differential
survivability then leads to Evolution.
Artificial Selection: Breeding
Humans had artificially evolved “good” breeds in animals and plants long before
(thousands of years before) Darwin’s theory. In fact, Darwin used the idea to
strengthen parts of his arguments.

Humans achieved this through the selection of individuals (animals or plants)
that they observed had preferable traits: selective breeding.

Today: artificial selection is still used. But we can also manipulate the genes via
genetic engineering or genetic modification. There is potential for the latter to be
faster, but it is still a science in its infancy.
Can we use similar processes to solve (optimization) problems too?
What an EA is

1. Initialize randomly a population of individuals (genotypes)
2. Evaluate the initial population
3. Select from the current population in a biased way, favouring fitter individuals
4. Reproduce selected individuals by sexual or asexual means
5. Evaluate the new individuals
6. Replace the parent population with the new individuals (selection may be used
here again)
7. Stop? If NO, return to step 3; if YES, OUTPUT: final population of solutions

• An EA consists of several stochastic (i.e. random) processes
• But it is not entirely random!
• Heritability of good traits and biased selection make the difference.
• These make the process one of trial and error
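As an illustrative sketch (not from the slides), the loop above might look like this in Python. The ONEMAX fitness, binary-tournament selection and all parameter values are placeholder assumptions:

```python
import random

# Illustrative sketch of the generational EA loop from the flowchart.
# The ONEMAX fitness and all parameter values here are placeholders.
L = 10            # genotype length
POP_SIZE = 20
GENERATIONS = 50

def fitness(genotype):
    # Toy objective (ONEMAX): count the 1-bits; substitute a real problem here.
    return sum(genotype)

def select(pop, fits):
    # Biased selection favouring fitter individuals (binary tournament).
    i, j = random.randrange(len(pop)), random.randrange(len(pop))
    return pop[i] if fits[i] >= fits[j] else pop[j]

def mutate(genotype, rate):
    # Flip each bit independently with probability `rate` (asexual variation).
    return [1 - g if random.random() < rate else g for g in genotype]

def run_ea():
    # 1. Initialize randomly a population of genotypes.
    pop = [[random.randint(0, 1) for _ in range(L)] for _ in range(POP_SIZE)]
    for _ in range(GENERATIONS):
        fits = [fitness(g) for g in pop]           # 2/5. evaluate
        pop = [mutate(select(pop, fits), 1.0 / L)  # 3/4/6. select, vary, replace
               for _ in range(POP_SIZE)]
    return pop                                     # 7. output final population
```

Recombination is omitted here for brevity; it would be applied between selection and mutation.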
Some Evolution Vocabulary I
• an organism is defined by its GENOTYPE, a collection of GENES
• organisms exist in POPULATIONS
• new organisms arise after MATING
• new genotypes arise from RECOMBINATION a.k.a. CROSSOVER (shuffling up
of genes from parents’ genotypes)
• new genotypes suffer from MUTATION
• the FITNESS of an individual is both the quality of the TRAITS it possesses, and
its lifetime fecundity (the number of offspring it has)
• populations exist in NICHES, sets of conditions to which the population adapts
and EVOLVES
Some Evolution Vocabulary II
Equivalent terms (more or less):

Natural Evolution    Evolutionary Algorithms    Optimization
population           population                 set of solutions
organism             individual                 solution and its utility
gene                 gene                       variable
locus                locus                      variable index
allele               allele                     value a variable takes
genotype             genotype                   solution vector
phenotype            phenotype                  solution, e.g. a graph
fitness              fitness                    utility/cost/objective value
What does fitness mean in a biological entity? In most EC research, it equates to a
well-defined objective function.
EA Models vs Natural Evolution
Some of the main differences are:
In EAs, fitness is objective and static; in Nature, fitness is relative and changing
EAs use a fixed population size; in Nature, populations fluctuate in size and go extinct
EA populations assume perfect mixing (panmixia); natural populations may be
separated by geographic isolation, inter-species mating is forbidden, and inbreeding
is forbidden in some populations
EAs use single chromosomes; Natural systems have multiple chromosomes enabling
robustness to mutational damage or environmental fluctuations
EAs stop and produce something “final”; Natural evolution has not stopped yet.
NB: The above applies to ‘standard’ EAs. Many ideas from Natural evolution
have been tried in EAs
An Example: Bin Packing Problem
INSTANCE: K bins, e.g. lorries; set of items of different weights
PROBLEM: Pack the items into the bins so that the packed weight of the bins is as
close to equal as possible
item number:   1   2   3   4   5   6   7   8   9  10
weight:       17   4  61   8   2  13  22   9  18  49

REPRESENTATION: genotype of 10 genes. Each gene has K alleles, representing
which bin to pack item into, e.g. 2232213221 for K = 3
INITIALIZATION: Random allele (value) assigned to each gene independently
MUTATION: choose a gene at random and set its value randomly
COST: Difference between lightest and heaviest bin. Here: = 25.
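To make the cost rule concrete, here is a sketch of this fitness (cost) function; the weights and the cost definition are taken from the slide:

```python
# Cost function for the balanced bin-packing example above
# (weights and the cost rule are taken from the slide).
WEIGHTS = [17, 4, 61, 8, 2, 13, 22, 9, 18, 49]
K = 3  # number of bins

def cost(genotype):
    # genotype[i] in 1..K gives the bin for item i+1; the cost is the
    # difference between the heaviest and lightest bin.
    loads = [0] * K
    for item, gene in enumerate(genotype):
        loads[gene - 1] += WEIGHTS[item]
    return max(loads) - min(loads)

print(cost([2, 2, 3, 2, 2, 1, 3, 2, 2, 1]))  # → 25, as on the slide
```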
An Example: Bin Packing Problem

initial       f    gen 1         f    gen 2         f
2232213221    25   2232213121    34   2233212222   116
1123121333    39   2232113221    33   2231212221    11
3311111213   115   1233321222    54   2231213221    33
1233321222    54   2232212221    19   2232212223    97

gen 3         f    gen 4         f    gen 5         f
2331212221     5   2331312221     4   2331312221     4
2232212222   116   2311312221   125   2331312321    19
2331233221    54   2231312221     7   2331312221     4
1332212223    84   2331312321    19   2233312221     9
An illustration only. Best fitness improved quite rapidly. But did we reach an
optimum? How do we know when to stop?
The Power of Random Mutation + Biased Selection
Roger Alsing’s “Evolisa”.
The fitness function has a definite target. Is this really evolution? ... It demonstrates
the power of selection, even when mutations are entirely random
The Power of Random Mutation + Biased Selection
Random mutation alone:
The Monkey Shakespeare Simulator took a simulated total of 42,162,500,000 billion
billion monkey-years until one of the “monkeys” typed,
VALENTINE. Cease toIdor:eFLP0FRjWK78aXzVOwm)-;8.t
in which the first 19 letters appear in The Two Gentlemen of Verona.
Compare that with Richard Dawkins’s demonstration that
“METHINKS IT IS LIKE A WEASEL”
from Hamlet could be evolved using selection + random mutation in about 40
generations.
Necessary Ingredients for an EA to Work
• A way to represent solutions (phenotypes) as strings of symbols (genotypes)
• A (fitness) function that maps genotypes to phenotypes and maps phenotypes to
a measure of ‘fitness’
• Operators to reproduce and vary individual genotypes in such a way that
inheritance of traits occurs
The EA does not need to know details of the fitness function (although knowing
them may help). The fitness function can be a ‘black box’.
Evolution and Fitness Landscapes
Hillclimbing, the simplest EA with a
single asexually reproducing individual,
may work well on a simple single-peak
landscape
On a multi-peaked landscape, a
population of individuals should be an
advantage
Evolution and Fitness Landscapes
A multi-peaked fitness landscape
Evolution and Fitness Landscapes
Individuals are distributed at random. There is initial diversity
Evolution and Fitness Landscapes
Selection and variation begins to change the fitness distribution and the distribution
of alleles
Evolution and Fitness Landscapes
Further selection and variation drives diversity out of the population and drives up
fitness
Evolution and Fitness Landscapes
The population supports only limited diversity — it is distributed over just two peaks
Evolution and Fitness Landscapes
Convergence to a single fitness peak may occur. Variation still operates but diversity
is very limited. Further evolution from here is difficult
Searching a Multimodal Fitness Landscape
A multimodal fitness landscape has many peaks. Perhaps just one of those peaks is
optimal.
[Figure: snapshots of fitness over time, comparing how a good GA, an ordinary GA
and hillclimbing distribute individuals across the peaks. Adapted from a slide by
David Corne]
Evolution and Gene Frequencies
We can also view evolutionary processes in terms of what happens to the genes.
Random genes   fitness   Strong selection   Weak selection
01001010100    4         10110101110        10110101110
01010110101    6         10110101110        01001010100
01000101010    4         01010110101        01000101010
10110101110    7         10110101110        01010110101
Genes (or alleles) that appear in fit individuals increase in frequency in a population.
Holland (1975) showed that short, highly fit schemata (chunks of genetic code)
increase in frequency during the run of a genetic algorithm.
Initial genetic diversity is not sustained if the population is small and if selection
strongly favours fitter individuals. Continuing “progress” depends on an
intermediate selection pressure.
Classification of an EA
An evolutionary algorithm may be called any of these:
• A global optimizer
• A ‘black-box’ optimizer
• A stochastic, iterative search method
• A heuristic search method
• A metaheuristic
• A nature-inspired method
Usually its optimization performance carries no formal guarantees (not even a
guarantee of approximating optimal solutions).
However, EAs can be applied very generally, and there is much accumulated
evidence that their performance is often good if certain design principles are followed.
There are several other methods sharing some common features with EAs:
simulated annealing, tabu search, particle swarm optimization.
First EAs
For the really early history see this
interesting book.
Butler (1863) imagined machines evolving in an article “Darwin among the
machines”. Later on, Nils Barricelli (1953), Box (1957), Friedman (1959),
Bledsoe (1961) and Bremermann (1962) all independently ran experiments with
computers and simulated evolution.
Early German Work on EAs
Rechenberg, Schwefel and Bienert in the 1960s and 1970s.
Figures from a talk by Rechenberg
Evolving Jet Nozzle shapes
Interactive Evolution
Photofit methods based on EAs have been developed. The fitness function is a
human-in-the-loop. This is known as interactive evolution.
One of the main issues is human fatigue.
Try David Corne’s interactive evolution demo:
http://www.macs.hw.ac.uk/~dwcorne/Teaching/iea.html
Interactive Evolution II
Evolving chocolate? The cocoa bean roasting process affects the aroma, flavour
and flavonoid content of chocolate. An interactive EA has been used to derive new
roasting temperature curves.
Difficulty: the final taste of the chocolate is affected by more than just the roast,
so the feedback signal is weak.
EAs in Art, Design and Music
Due to their random nature, EAs can come up with ‘surprising’ patterns, designs or
solutions. This has led to interest in them for supporting creative processes
(architecture / design / music) or even being an autonomous creative agent.
Recent work by Kenneth Stanley evolves musical accompaniments.
Evolutionary Robotics
Developing controllers for robots is a difficult engineering task. The difficulty is even
greater when we wish to obtain robots that can co-operate to perform tasks robustly.
The evolution of neural network controllers has been successful in developing robots
capable of processing complex input sensor data to achieve coordinated motion and
conglomeration.
Hardware Evolution
Electronic circuits and other hardware devices have been evolved by Adrian
Thompson, John Koza and others.
EAs come up with different circuit design solutions than humans, who tend to use
formal design principles. EA solutions can exploit secondary electronic effects not
known (or ignored) by human designers. This can have both positive and negative
effects.
Program Evolution (GP)
The evolution of computer programs takes different forms, but is commonly known as
genetic programming or GP.
One application of genetic programming is symbolic regression. Lipson (2009) has
evolved some physical laws of motion from physical observations.
Another notable success is the evolution of pseudo-random number generators
which score very highly on statistical benchmark tests of apparent randomness.
Evolutionary Algorithms in Experimental Optimization
Pioneering work in Germany by Rechenberg, Schwefel and
co-workers hooked “evolution strategies” up to physical experiments.
Airfoil shapes, and jet nozzles were optimized using physical
experiments for the fitness function.
Today, some physical / chemical / biochemical systems are still
difficult to model. They may still be optimized by evolution.
Using EAs for Optimization: Health Warnings
It is very important to remember
• EAs are heuristics; their effectiveness depends upon many factors
• EAs give an approximate solution only
• EAs are stochastic: different runs might give different results
• Usually EAs offer no performance guarantee — not even a guarantee that some
level of approximation will be reached
Wide Applicability, Small Development Time
Nevertheless, EAs can still operate when problems feature
• nonlinear, nonconvex, nondifferentiable and/or discontinuous cost
surfaces (fitness landscapes)
• noisy or uncertain estimation of costs
• multiple nonlinear constraints
EAs only require that solutions can be represented by some symbolic
coding, and there is a way to evaluate proposed solutions.
This makes them widely applicable.
Since it is possible to develop a (basic) EA for a problem without
understanding anything about the problem’s structure, development
times can be very fast. Often a fitness function can be just plugged in.
Think About Alternative Solution Methods
Since EAs are widely applicable, it is tempting to use them for every
problem.
In reality, it is only sensible to use EAs when an efficient alternative
solution method is not known. (E.g. you wouldn’t use an EA for the
minimum spanning tree problem. Why?)
If development time is an issue this may also be a reason to use an
EA. Some efficient techniques are difficult and time-consuming to
implement; an EA may be developed much more quickly.
If a basic problem (such as minimum spanning tree) is augmented with additional
constraints, then it may become much harder. An EA may then be sensible,
provided exact optimal solutions are not required.
EAs in Detail
Representation: Genotype to Phenotype Mapping
With evolutionary algorithms we are FREE to choose the representation: the
way the genotype codes for phenotype.
The phenotype (in EA-speak) is the solution to a problem. E.g. a complete timetable
for the LONDON2012 games.
The genotype is just the coding for how we build it.
genotype (e.g. 101011102A71G) −→ DECODER −→ phenotype −→ fitness
The fitness value is a function of the phenotype. But often we map directly from
genotype to fitness.
The choice of representation is important for EA performance.
Choice of Representation
Example:
BINPACKING: Pack items with given sizes into
smallest number of bins possible, where bins
have certain capacity
REPRESENTATION 1: For each item, give a bin number it should go in.
REPRESENTATION 2: Genotype is a permutation of the items. Phenotype (the
actual packing) is created by using a heuristic called FIRST-FIT. FIRST-FIT puts the
next item in the first bin in which it will fit.
The first representation is more direct. A possible problem with it is that many
infeasible solutions are represented, and many very poor ones too.
The second representation is indirect. Possible problems with indirect
representations are:
• The optimal solution may not be representable!
• The heuristic decoder may be computationally expensive
• Many changes to the genotype make no difference to the phenotype
But it solves the problems of the direct coding. So, you make your choice...
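A sketch of REPRESENTATION 2’s decoder; the instance below (item sizes and capacity) is hypothetical, for illustration only:

```python
# Sketch of the indirect representation: a permutation genotype is decoded
# into an actual packing by FIRST-FIT. The instance below (sizes, capacity)
# is hypothetical, for illustration only.
def first_fit(permutation, sizes, capacity):
    # Place each item, in genotype order, into the first bin with room;
    # open a new bin when no existing bin fits.
    bins = []  # each bin: [remaining capacity, list of items]
    for item in permutation:
        for b in bins:
            if b[0] >= sizes[item]:
                b[0] -= sizes[item]
                b[1].append(item)
                break
        else:
            bins.append([capacity - sizes[item], [item]])
    return [b[1] for b in bins]

sizes = [5, 7, 3, 6, 4]
print(first_fit([1, 3, 0, 2, 4], sizes, capacity=10))  # → [[1, 2], [3, 4], [0]]
```

Mutating the permutation changes the order in which items are offered to FIRST-FIT, and so can change the packing.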
Other Properties of Representations
Representation determines how close different solutions (phenotypes) are to each
other. So representation choice influences fitness landscape shape.
Generally speaking, we would like small changes in genotype to give (mostly)
small changes in phenotype, which will in turn give small changes in fitness,
yielding a smooth rather than a rugged landscape.
Then we can “hill-climb” up a smooth path to the optimum.
Choosing a Population Size

Evolutionary algorithms use (usually) a constant population size |P |.
Setting it to a moderate value (10s or 100s) is usually advisable because:
• Too large a population size may not leave enough time (generations) for
evolution to get very far
• Too small a population has insufficient diversity and may converge prematurely
Initialization
Random (unbiased) initialization of the population of genotypes is the default.
Although it is tempting to build in something more intelligent, the risk is that a fit
seed individual can take over the population and cause premature convergence.
And it can be difficult to generate many different, good solutions.
Ideally, we want diverse and good solutions in the initial population. But it is better
to sacrifice quality than diversity.
Steady State vs Generational Reproduction
[Diagram: parent population P(t) produces offspring population P’(t), which
becomes P(t+1)]
Generational replacement scheme: the offspring population entirely replaces the
parent population. There is no competition between offspring and parents.
A pure generational scheme has non-overlapping parent and offspring populations.
[Diagram: parents P(t) produce one or two offspring, which are inserted into
P(t+1)]
A steady-state reproduction scheme: only 1 or 2 offspring are created. These
replace the weakest individuals in the parent population (if and only if they are
fitter). The steady-state scheme has significant overlap between generations. We
must choose which solutions to replace.
Generational Reproduction with Elitism
[Diagram: offspring population P’(t) replaces parent population P(t), except that
the elite individuals are carried over into P(t+1)]
Generational replacement with elitism: the offspring population replaces the parent
population save for the fittest k individual(s).
For overlapping populations, the fraction of the parent population replaced per
generation is known as the generation gap. Elitism with k elite individuals has a
generation gap of (|P | − k)/|P |. A steady state EA producing two offspring from
recombination has a generation gap of 2/|P |.
Intergenerational Competition
Alternative ways of replacing individuals from previous generations:
• Random replacement (independent of fitness)
• Replace the least fit
• Hold a tournament that tends to select poorer, but not necessarily the
worst, solutions (see tournament selection in following slides)
• Offspring replace their parents
• Offspring replace their parents only if they are fitter
These give slightly different evolutionary dynamics, and which one to
choose may depend upon how mating selection is being done, and
the selection pressure being applied.
Reproduction Schemes in Evolution Strategies
Evolution strategies are the evolutionary algorithms that originated in Germany.
There are two main reproduction schemes.
(µ + λ) selection. There are µ parents, and λ offspring are generated from them (by
cloning, mutation and recombination). The whole µ + λ of them compete to be the µ
parents of the next generation. Usually truncation selection is used: the fittest µ are
taken. λ may be smaller or larger than µ. E.g. A (10+1) scheme is like a steady state
EA.
(µ, λ) selection. With this “comma” selection strategy the offspring do not compete
with their parents. The λ offspring compete amongst themselves and the fittest µ of
them become the parents of the next generation. In this scheme λ must be greater
than or equal to µ.
Mutation
Mutation is a very important operator in EAs. Recombination and selection alone are
generally poor.
Mutation rates should nevertheless be small. A large mutation rate disrupts the
heritability of good traits.
Two common schemes to do point mutation(s) on a binary string
1. Select a random locus (gene), and change its allele value
2. Consider every locus (gene) independently and with probability pm change its
allele value
With the first scheme exactly one gene’s allele value is changed per chromosome
mutation event.
With the second scheme, when the mutation rate pm is set to 1/L, where L is the
chromosome length, then the expected number of mutations is one too. But there is
a probability of ((L − 1)/L)^L ≈ 0.37 that no mutations occur. And the probability
that exactly M mutations occur is C(L, M) · (1/L)^M · ((L − 1)/L)^(L−M). So for
L = 100 and M = 3, this occurs with probability ≈ 0.061. This allows occasional
longer jumps.
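These figures can be checked directly: under the second scheme the number of mutated genes is Binomial(L, 1/L). A quick sketch:

```python
from math import comb

# Under the second scheme the number of mutated genes is Binomial(L, 1/L);
# this checks the probabilities quoted above.
def p_exact(L, M):
    # P(exactly M of the L genes mutate) = C(L, M) (1/L)^M ((L-1)/L)^(L-M)
    return comb(L, M) * (1 / L) ** M * ((L - 1) / L) ** (L - M)

L = 100
print(round(p_exact(L, 0), 3))  # → 0.366 (no mutations; close to 1/e)
print(round(p_exact(L, 3), 3))  # → 0.061 (exactly three mutations)
```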
Mutation with Other Representations
If one is using a genotypic representation other than binary strings, then mutation
must also be defined appropriately. (See the bin-packing example above for an
integer mutation).
With continuous gene values, perturbing the gene’s value by drawing from a
Gaussian distribution centred on the current gene’s value is one technique:
Let x ∈ [a, b] be a real variable. Then the Gaussian mutation operator MG
changes x to MG (x) := min(max(N (x, σ), a), b), where N (x, σ) is a
normally distributed random variable with mean x and standard deviation
σ. The value of σ may depend on the length r := b − a of the interval and is
typically set to σ/r = 1/10.
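A minimal sketch of this operator; the σ = r/10 default follows the definition above:

```python
import random

# Minimal sketch of the Gaussian mutation operator MG defined above.
def gaussian_mutate(x, a, b, sigma=None):
    # Draw from N(x, sigma) and clamp the result back into [a, b];
    # sigma defaults to r/10 where r = b - a, as suggested above.
    if sigma is None:
        sigma = (b - a) / 10
    return min(max(random.gauss(x, sigma), a), b)
```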
For permutations, still other mutations are possible...
Mutation with Other Representations
Often an EA is used to optimize a problem where the natural representation is a
permutation of the numbers from 1 to N . Examples are TSP, some matching
problems, some scheduling problems, knapsack problems when an indirect coding is
being used.
2-swap mutation
Choose any pair of genes at random. Swap them.
A B C D E F G H −→ A F C D E B G H
2-opt mutation (good for TSP)
Choose a contiguous chunk of the chromosome at random. Reverse its order.
A B C D E F G H −→ E D C B A F G H
shift mutation (good for scheduling problems)
Choose a contiguous chunk of the chromosome at random. Shift it to the right or
left by S spaces for some random S.
A B C D E F G H −→ A E F B C D G H
It is easy to design your own operator for a problem. You can also use more than one
mutation operator in your evolutionary algorithm.
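As sketches, the three operators above could be implemented like this (each is one reasonable reading of the slide, returning a new permutation):

```python
import random

# Sketches of the three permutation mutation operators described above.
# Each returns a new permutation and leaves its argument untouched.
def two_swap(perm):
    p = perm[:]
    i, j = random.sample(range(len(p)), 2)   # two distinct positions
    p[i], p[j] = p[j], p[i]
    return p

def two_opt(perm):
    p = perm[:]
    i, j = sorted(random.sample(range(len(p) + 1), 2))
    p[i:j] = reversed(p[i:j])                # reverse a contiguous chunk
    return p

def shift(perm):
    p = perm[:]
    i, j = sorted(random.sample(range(len(p) + 1), 2))
    chunk, rest = p[i:j], p[:i] + p[j:]
    k = random.randint(0, len(rest))         # random reinsertion point
    return rest[:k] + chunk + rest[k:]
```

All three preserve the multiset of genes, so the result is always a valid permutation.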
Recombination
Most evolutionary algorithms use recombination in addition to mutation in order to
obtain variation
The role of recombination (or sexual reproduction in general) is disputed in
Natural evolution and EC.
It seems to allow two or more positive traits that have been separately evolved to be
combined. This could speed up evolution.
It can also act to repair the damage of a poor mutation (has a stabilizing effect).
Usually, recombination of two parents is used to produce 1 or 2 offspring individuals.
But other models are possible.
Recombination: One Point Crossover
Early EAs used one-point crossover. Here is an example.
Parent 1 chromosome:  a b c d e f g h
Parent 2 chromosome:  A B C D E F G H
Crossover point:            ↑
Offspring 1:          a b c D E F G H
Offspring 2:          A B C d e f g h
A random point along the string (between two genes) is chosen. Genes to the left are
copied from Parent 1. Those to the right are from Parent 2. (This is then reversed for
the second offspring if two offspring are produced, as above)
A problem with one-point is that certain combinations of genes in the two parents
cannot be passed on to the offspring.
Recombination: Uniform Crossover
Syswerda (1989) invented uniform crossover, which is commonly used today.
Parent 1 chromosome:  a b c d e f g h
Parent 2 chromosome:  A B C D E F G H
Crossover mask:       0 1 1 1 0 0 1 0
Offspring 1:          a B C D e f G h
Offspring 2:          A b c d E F g H
Uniform crossover uses a mask to dictate which parent to take each gene from. Any
combination of genes from either parent can be reached.
Note that alleles common in both parents are preserved in the offspring. The
remainder of the offspring inherits its genetic material from one parent or the other.
When we design recombination operators for other representations (e.g.
permutations) we still try to keep these properties. It can be difficult to achieve,
however.
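A sketch of uniform crossover for list genotypes (mask 0 meaning offspring 1 copies parent 1, as in the example above):

```python
import random

# Sketch of uniform crossover: a random binary mask chooses, per gene,
# which parent each offspring copies (mask 0 = parent 1 for offspring 1).
def uniform_crossover(p1, p2):
    mask = [random.randint(0, 1) for _ in p1]
    child1 = [a if m == 0 else b for a, b, m in zip(p1, p2, mask)]
    child2 = [b if m == 0 else a for a, b, m in zip(p1, p2, mask)]
    return child1, child2
```

Note that alleles common to both parents are preserved automatically: whichever way the mask falls, the same value is copied.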
Mating Selection Schemes: Fitness Proportionate
Selection
The probability that an individual i with fitness fi is selected under one call to fitness
proportionate selection is
p(select i) = fi / Σ_{j ∈ P} fj

where P is the set of indexes of the population. Here f is assumed to be positive,
and we are maximizing fitness.
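As a sketch (assuming positive fitnesses and maximization, as stated above), this proportional rule can be implemented by spinning a cumulative-sum “wheel”:

```python
import random

# Sketch of fitness proportionate (roulette-wheel) selection, assuming
# positive fitnesses and maximization.
def roulette_select(population, fitnesses):
    spin = random.uniform(0, sum(fitnesses))  # spin the wheel
    acc = 0.0
    for individual, f in zip(population, fitnesses):
        acc += f
        if spin <= acc:
            return individual
    return population[-1]  # guard against floating-point round-off
```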
Fitness proportionate selection is also known commonly as roulette-wheel selection:
the size of the slot on the roulette wheel is proportional to fitness. The wheel is
spun to select an individual.
[Figure: roulette wheel with slot sizes proportional to fitness]
Mating Selection Schemes: Fitness Proportionate
Selection
Clearly, fitness proportionate selection is sensitive to the scaling of fitness values. If
we have fitnesses {1, 2, 10, 20} in a population, and we change our fitness function
by adding a constant = 100 to it, then the selection probabilities change radically.
[Figure: roulette wheels before and after adding 100 to every fitness]
Selective opportunity has been “flattened”.
Problems with fitness proportionate selection:
Small differences in fitness (a competitive advantage) may not be rewarded much.
⇒ May prevent evolution from getting off the ground, or retard progress later on.
Large differences in fitness may be rewarded too much. ⇒ Take-over of the
population occurs where all diversity is lost suddenly.
Fitness-proportionate selection was popular in the early days of genetic algorithms
research, following Holland’s book in 1975. Much less so now.
Mating Selection Schemes: Rank-Based Selection
In Rank-Based Selection, we use the same method of selecting using a
roulette-wheel, but raw fitness is replaced by the solution’s rank.
p(select i) = Ri / Σ_{j ∈ P} Rj

where Ri is the rank of individual i in the population P , with the worst individual
having a rank of 1 and the best a rank of |P |.
Drawbacks:
Although rank-based selection overcomes some difficulties with fitness-proportionate
selection, it adds computational complexity because we need to sort the individuals.
Also: what about tied fitness values? What happens?
Mating Selection Schemes: Rank-Based Selection

We may not like the balance of selection pressure given by the ranks. To control the
pressure, the ranks can be raised to a power in the calculation of probabilities.

p(select i) = Ri^B / Σ_{j ∈ P} Rj^B

where B is the bias. A bias of zero would give equal probability to all population
members.
[Figure: selection probabilities under rank-based selection, comparing plain
rank-based selection with rank-based selection using R^(1/2)]
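A sketch of rank-based selection with a bias exponent B. Note that tied fitnesses here simply receive arbitrary distinct ranks, which is one possible answer to the tie-handling question raised earlier:

```python
import random

# Sketch of rank-based selection with a bias exponent B (B = 1 is plain
# rank-based selection; B = 0 selects uniformly). Ties receive arbitrary
# distinct ranks here.
def rank_select(population, fitnesses, B=1.0):
    order = sorted(range(len(population)), key=lambda i: fitnesses[i])
    ranks = [0] * len(population)
    for rank, i in enumerate(order, start=1):  # worst rank 1 .. best |P|
        ranks[i] = rank
    weights = [r ** B for r in ranks]
    return random.choices(population, weights=weights, k=1)[0]
```

The sort is the extra computational cost mentioned above; in a real EA it would be done once per generation, not per selection.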
Mating Selection Schemes: Tournament Selection
Another selection scheme that is invariant to rescaling of fitness values (i.e. depends
only on the relative ranks of the individuals in the population) is Tournament
Selection.
Tournament Selection( population P ) {
  best = contestant[1] = randomly select from( P )
  for (j = 2 to T_size) {
    contestant[ j ] = randomly select from( P )
    if ( f(contestant[ j ]) > f(best) )
      best = contestant[ j ]
  }
  return best
}
The random selections are done with uniform probability and with replacement.
The tournament size T_size controls the bias or selection pressure. A tournament
size of 1 gives no bias to fitter individuals.
Advantages: Tournament selection gives similar invariance to fitness re-scaling as
rank-based selection but is simpler to implement and computationally more efficient
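A runnable sketch of the pseudocode (uniform random picks with replacement, returning the fittest contestant):

```python
import random

# Runnable sketch of the tournament-selection pseudocode above:
# uniform random picks, with replacement, returning the fittest contestant.
def tournament_select(population, f, t_size=2):
    best = random.choice(population)
    for _ in range(t_size - 1):
        contestant = random.choice(population)
        if f(contestant) > f(best):
            best = contestant
    return best
```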
Tournament Selection: Expected Numbers of Offspring
T_size values of 2 or 3 are generally very good in practice. The value 2 is known as
binary tournament selection.
Q. In a generational EA, with |P | selections for mating being made, what is the
expected number of copies made of the fittest individual?
A. Assume binary tournaments (T_size = 2). The fittest individual wins any tournament it appears in. Its probability of appearing in any one tournament is approximately 2/|P| (two being the number of individuals contesting the tournament), so over the |P| selections it appears in about two tournaments in expectation (almost certainly two different ones). Overall, therefore, two copies of the fittest individual are made in expectation.
(The above assumed that there is only one fittest individual.)
What is the expected number for the least fit? And the second-to-least fit?
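The expectation derived above can be checked empirically; a quick simulation sketch, assuming binary tournaments, fitness equal to population index, and a unique fittest individual (the same harness can be adapted to answer the questions about the less-fit individuals):

```python
import random

def copies_of_fittest(pop_size, trials=4000):
    """Average number of times the fittest individual (index pop_size-1)
    wins one of pop_size binary tournaments, over many generations."""
    total = 0
    for _ in range(trials):
        for _ in range(pop_size):              # |P| selections per generation
            a = random.randrange(pop_size)
            b = random.randrange(pop_size)     # drawn with replacement
            if max(a, b) == pop_size - 1:      # fittest wins if it appears
                total += 1
    return total / trials
```

The exact expectation is |P|(1 − ((|P|−1)/|P|)²) = 2 − 1/|P|, i.e. 1.98 for |P| = 50, which the simulation should approach.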
Stopping Criteria for Evolutionary Algorithms
Standard stopping criteria are
• A fixed number of generations
• Monitor the fitness and stop when it plateaus
• Monitor the genetic diversity and stop when it falls below some threshold
The first is common but not really advisable for best results. It is a guess and is likely
to lead to stopping too early (not yet converged) or too late (wasting resources and
doing nothing useful).
Monitoring fitness is OK, but the fitness may plateau for some time before progress
again occurs. This is the case for complex problems.
Monitoring diversity may be better. An EA usually loses most of its power to progress
once diversity is lost. How should we measure this?
Combining diversity and fitness criteria is also possible.
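On the question of how to measure diversity: for bitstring genotypes one simple measure, sketched below with illustrative names, is the average pairwise Hamming distance.

```python
def mean_pairwise_hamming(population):
    """Average Hamming distance over all distinct pairs of equal-length
    bitstrings; 0 means the population has fully converged."""
    n = len(population)
    total = sum(
        sum(a != b for a, b in zip(x, y))
        for i, x in enumerate(population)
        for y in population[i + 1:]
    )
    return total / (n * (n - 1) / 2)
```

A stopping rule could then trigger once this value falls below some threshold.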
Tuning Parameters in an Evolutionary Algorithm
EAs have many parameters. What are good values for them (generally) and how
should we go about setting them for a specific problem?
This is a recurring and difficult question.
Here are some broad practical guidelines
parameter          first choice    3 alternatives
Population size    100             1‡, 20, 200
Mutation rate*     1/L             0.5/L, 2/L, 4/L
Crossover rate     0.7             0, 0.2, 0.95
Tournament size    3               2, 5, 10

* Note: per-gene mutation rates in inverse proportion to the length L of the genotype seem to work well
‡Note: Population size=1 gives a kind of hillclimber
Also: crossover rate is least important to tune, usually
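The per-gene mutation rate of 1/L from the table can be sketched as follows, so that one gene is changed per genotype in expectation:

```python
import random

def mutate(genotype, rate=None):
    """Flip each bit of a bit-vector genotype independently.
    The default per-gene rate is 1/L, the rule of thumb above,
    giving one flipped bit per genotype in expectation."""
    L = len(genotype)
    if rate is None:
        rate = 1.0 / L
    return [bit ^ 1 if random.random() < rate else bit for bit in genotype]
```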
Comparing Performance of Stochastic Algorithms
First level: descriptive statistics. It is a good idea to plot the fitness evolution (or objective-function cost) over the generations.
These are means over several runs. The error bars indicate the standard error of
the mean.
(Clearly, the problem shown requires minimization of the objective function)
Comparing Performance of Stochastic Algorithms
Second level: inferential statistics or hypothesis testing.
• Collect data from several runs for a pair of algorithms.
• Test the hypothesis that one algorithm is better than the other. Using a t-test is OK, but the distributions may not be normal.
• An alternative is to use a non-parametric test, such as the Mann-Whitney U test.
• A p-value is defined as the probability of seeing differences as extreme as those observed, given that the samples are from the same population. It does not prove that differences are real, or give the probability that they are real.
Testing multiple algorithms is more difficult. (You need to correct the statistical significance values for multiple comparisons.)
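The Mann-Whitney U statistic itself is easy to compute directly; a minimal sketch (in practice a library routine such as scipy.stats.mannwhitneyu would also supply the p-value):

```python
def mann_whitney_u(a, b):
    """U statistic for sample a versus sample b: the number of pairs
    (x, y), x from a and y from b, with x > y, counting ties as 1/2."""
    return sum((x > y) + 0.5 * (x == y) for x in a for y in b)
```

U near len(a)*len(b) or near 0 indicates the samples are well separated; U near len(a)*len(b)/2 indicates no evidence of a difference.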
Constraint-Handling Schemes
Many problems are constrained. How can we handle constraints in EAs?
Three alternatives are:
1. Give a fitness penalty to solutions that violate constraints. Make this
penalty large if the constraint is hard. Make the penalty larger the more
the constraints are violated.
2. Do not let individuals that violate constraints survive (the death penalty
approach)
3. Repair individuals that violate constraints before breeding from them.
The penalty method is generally the best and most often used.
Repairing can be very effective, but it can also be difficult in some circumstances to
know how to repair an infeasible solution (this may require a long search in itself).
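The penalty method can be sketched for a single inequality constraint g(x) ≤ 0; the weight value here is illustrative and would normally need tuning to the problem:

```python
def penalized_fitness(f, g, x, weight=100.0):
    """Fitness to maximize: the raw objective minus a penalty that grows
    with the amount of constraint violation (g(x) > 0 means infeasible)."""
    violation = max(0.0, g(x))
    return f(x) - weight * violation
```

Feasible solutions are scored unchanged; infeasible ones are penalised in proportion to how badly they violate the constraint.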
Maintaining Diversity
There are numerous ways EAs can be improved beyond the basic model. One of the
most effective improvements is to actively try to maintain diversity for longer. This is
generally called ‘niching’, as it encourages different niches to be populated.
Three ways this can be achieved are:
1. structured populations with (partial) isolation
2. preventing stagnation by restarting evolution or injecting random solutions
3. fitness sharing
Maintaining Diversity: Structured Populations
This figure shows the fitness of individuals in a diffusion model GA:
Mating occurs locally on this grid. Offspring are also placed close to their parents,
e.g. by doing a short random walk from one parent and replacing the least fit
individual on the walk.
Island Model GAs are another similar type: populations evolve separately on islands,
with only very limited migration between islands.
Note: these EAs are also more highly parallel than standard EAs, so they are a good choice for distributed architectures
Maintaining Diversity: Restart Methods
This is an EA called the micro-GA: typically the population is smaller than in a standard EA, but when convergence is detected it is restarted (so it can run for a long time relative to population size).
Maintaining Diversity: Fitness Sharing
The idea of fitness sharing is to reduce the effective fitness of individuals if they are
too similar to others (if they occupy the same ‘niche’)
[Figure: a fitness landscape with individuals marked; the niche radius and the reduced (shared) fitness values are indicated.]
The three individuals atop the high fitness peak fall within a certain radius (called the
niche radius) of each other. In a simple scheme, their fitness would be divided
(shared) by 3 (thus reducing it, as indicated by the magenta points).
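The simple sharing scheme just described can be sketched as follows (names are illustrative; real schemes often also weight the niche count by distance rather than using a hard radius):

```python
def shared_fitness(population, fitness, distance, niche_radius):
    """Divide each individual's fitness by the number of population
    members (including itself) within the niche radius of it."""
    shared = []
    for x in population:
        niche_count = sum(1 for y in population if distance(x, y) < niche_radius)
        shared.append(fitness(x) / niche_count)
    return shared
```

Crowded peaks thus become less attractive, pushing selection toward under-populated niches.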
Summary for EAs
EAs are computational models of adaptation that follow many principles of the modern evolutionary synthesis: Mendel's genetics + Darwin's natural selection.

EAs have many applications, mostly in optimization. They are particularly useful when the thing to optimize is a 'black box' (i.e., complex or unknowable) function, or when many constraints make it difficult to develop or apply other methods.

EAs are stochastic and heuristic. No formal performance guarantees are usually possible. Nevertheless, much theory DOES exist. We can now give theoretically justified performance predictions for several classes of problem.

In practice, EAs usually need some careful tuning. For better performance, choose a good representation, work on the variation operators, and add diversity control.
Local Search Heuristics: Hillclimbing and
Simulated Annealing
HillClimber
A hillclimber is an example of a local search heuristic.
It is like the simplest form of an EA:
Hillclimber
population size                 1
crossover                       no
mutation/neighbourhood move     yes
initialization                  random
replacement                     accept non-deteriorating moves
The big advantages of hillclimbing are its simplicity and generality. And with a
good choice of neighbourhood move it can be quite effective.
The big drawback is that it gets stuck at local optima. There is no way to escape.
Possible solution: use a big neighbourhood. Big neighbourhoods are inefficient. (But
one advanced method is to vary the neighbourhood size. ⇒ Variable Neighbourhood
Search.)
Simple Hillclimber Pseudocode
First-Improvement Hillclimber()
  c = random solution
  Ec = evaluate(c)
  while not stop
    m = neighbour(c)
    Em = evaluate(m)
    if Em ≤ Ec    {comment: minimization}
      c = m
      Ec = Em
  endwhile
  Output c, Ec
Just start somewhere, and walk uphill (or downhill for minimization) until you can walk no more!
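The pseudocode translates directly; a runnable sketch for minimization, with a caller-supplied neighbourhood move and a single-bit-flip move as an example:

```python
import random

def hillclimb(evaluate, init, neighbour, iters=10000):
    """First-improvement hillclimber (minimization): accept any
    non-deteriorating neighbour as the new current solution."""
    c = init()
    ec = evaluate(c)
    for _ in range(iters):
        m = neighbour(c)
        em = evaluate(m)
        if em <= ec:
            c, ec = m, em
    return c, ec

def flip_one_bit(bits):
    """Single-bit-flip neighbourhood move on a bit list."""
    i = random.randrange(len(bits))
    return bits[:i] + [bits[i] ^ 1] + bits[i + 1:]
```

On a toy problem, minimizing the number of zero bits, this reliably walks to the all-ones string.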
Hillclimber: Neighbourhoods and How to Use
Them
The neighbourhood of a solution is the set of solutions that can be reached by
applying a neighbourhood move (aka mutation) operator to it.
A local optimum is a solution that has no fitter neighbours.
Aside: What is the neighbourhood when applying the standard EA mutation:
mutating each bit with an independent positive probability?
Hillclimbers can be first-improvement or best-improvement. The first-improvement
one is more like an EA. As soon as an improving move is found, it becomes the
current solution. In best-improvement the whole neighbourhood is systematically
checked first, and the best neighbour becomes the new current solution if it is not
worse than the current solution.
Best-improvement has a definite stopping criterion. But it is more common to use
first-improvement.
It is often found that the performance of the first-improvement hillclimber is better if a
move is accepted if it is better or equal in fitness to the current solution.
Simulated Annealing: The Physics of Escaping
Local Optima
Simulated annealing is a search method that derives from physics theory explaining
how metals reach lower energy (more ordered) states if they are heated up and
cooled sufficiently slowly.
What has cooling a metal (annealing) got to do with optimization?
The following analogies/connections are made:

Physics                     Optimization
Physical state of metal     Solution
Energy/entropy of metal     Fitness / Cost
Temperature                 Acceptance probability for moving to worse solutions
A metal at a high temperature can move from a low energy state to a higher one
freely. If it is cooled, it can become “frozen” in a state. There is no energy to allow it
to change.
Hillclimbing is like being frozen: no moves to worse solutions are ever accepted.
Simulated Annealing: The Physics of Escaping
Local Optima
Boltzmann says that the probability that a move from a state c to a proposed state m is accepted is related to the temperature T as follows:

p(accept move) = 1                      if Em ≤ Ec
               = exp((Ec − Em)/(kT))    otherwise        (1)

where k is a constant.
When a metal is cooled slowly it still makes some upward energy moves, but increasingly rarely, so it does not get frozen in a glassy (weak) state. It ends in a low-energy state.
We can use this idea to let hillclimbing escape local optima !
Simulated Annealing: The Algorithm
SA was first proposed by Kirkpatrick, Gelatt and Vecchi in 1983. It is based on earlier work by Metropolis et al. (1953).
Simulated Annealing()
  T = Tstart
  c = random solution
  Ec = evaluate(c)
  while not stop
    m = neighbour(c)
    Em = evaluate(m)
    if Em ≤ Ec or randomvar() < exp((Ec − Em)/T)
      c = m
      Ec = Em
    cool(T)
  endwhile
  Output c, Ec
This is just hillclimbing with three lines of code changed! We add a variable T and use it to accept some deteriorating moves, and we must also have some way to reduce T (cool it).
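A runnable sketch of the same algorithm for minimization, with geometric cooling folded into the loop (the parameter values are illustrative):

```python
import math
import random

def simulated_annealing(evaluate, init, neighbour, t_start=10.0,
                        g=0.999, iters=20000):
    """Hillclimbing plus a temperature T: worsening moves are accepted
    with probability exp((Ec - Em)/T), and T is cooled geometrically."""
    c = init()
    ec = evaluate(c)
    t = t_start
    for _ in range(iters):
        m = neighbour(c)
        em = evaluate(m)
        if em <= ec or random.random() < math.exp((ec - em) / t):
            c, ec = m, em
        t *= g                      # geometric cooling
    return c, ec
```

As T shrinks, the acceptance probability for worsening moves vanishes and the algorithm reduces to hillclimbing.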
The Cooling Schedule
A cooling schedule is a specification of the initial temperature Tstart and how it will be
reduced over the course of the algorithm.
Much work has gone into the theory and practice of good cooling schedules for
simulated annealing.
A good basic technique for setting a cooling schedule is:
• Set Tstart so that roughly half of worsening moves are accepted. If most worsening moves have a cost difference of W, then Tstart = −W/log(0.5).
• Every iteration, reduce T by setting T = g·T, where g is a parameter with value less than 1. This is called geometric cooling.
• A generally good value for g is one where the final temperature reached after MAXITERATIONS is one where even the smallest possible increase in cost is very unlikely to be accepted.
Let us say that M = MAXITERATIONS. Then g is given by

g = exp((1/M) · log(Tend/Tstart)),  i.e.  g = (Tend/Tstart)^(1/M)
More information is here: http://www.btluke.com/featur05.html
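The recipe above can be sketched in a few lines (the function name is illustrative):

```python
import math

def geometric_cooling_params(w, t_end, max_iters):
    """Tstart so that a typical worsening move of size w is accepted with
    probability 0.5, and g so that Tstart * g**max_iters == t_end."""
    t_start = -w / math.log(0.5)                       # exp(-w/Tstart) = 0.5
    g = math.exp(math.log(t_end / t_start) / max_iters)
    return t_start, g
```

t_end should be chosen so that even the smallest cost increase is very unlikely to be accepted at the end of the run.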
Neighbourhood Move Assumptions/Rules
It can be shown that simulated annealing converges to an optimal solution, given
certain assumptions about the move operator (neighbourhood) and the cooling
schedule.
The cooling schedule may have to be exponentially slow, since otherwise simulated annealing could be used to solve NP-hard problems in polynomial time. In practice, however, reasonably fast cooling still gives good results.
The simulated annealing process must be ergodic. This means it must be possible
to reach every state from every other one. So the move operator must have this
property. It must not be possible to move to a solution from which other solutions
become unreachable.
If the move operator is symmetric (any move is reversible) then it is usually the case
that the whole search space remains reachable.
Summary
Hillclimbing and local search are simple but very effective methods for many problems.

Simulated annealing (SA) extends hillclimbing so that it can jump out of local optima.
The cooling schedule and the neighbourhood (mutation or move)
operator are important to get right but there are some good guiding
principles.
SA is often competitive with evolutionary algorithms and other
advanced stochastic search methods.
Other search methods based on local search also exist: tabu search,
variable neighbourhood search, reactive search.