GENETIC ALGORITHMS (“GA”)

CSc-180 (Gordon)
Week 12 notes
Optimization inspired by biological evolution.
• there is a population of individuals
• individuals are defined by chromosomes
• individuals with higher fitness tend to survive and reproduce
• parents produce offspring:
 selection
 crossover
 mutation
To cast a problem for a GA:
• Define candidate solutions as chromosomes
(typical encodings are binary)
• Define a fitness function
• Define operations for:
 selection
 crossover
 mutation
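The loop these pieces plug into can be sketched as follows. This is a minimal generational sketch, not part of the notes themselves: the function parameters (`init_population`, `select`, etc.) are placeholders for whatever encoding, fitness function, and operators the problem defines.

```python
def run_ga(init_population, fitness, select, crossover, mutate, generations):
    """Generic generational GA loop. The problem-specific pieces
    (fitness, selection, crossover, mutation) are supplied as functions."""
    population = init_population()
    best = max(population, key=fitness)            # best-ever solution so far
    for _ in range(generations):
        next_pop = []
        while len(next_pop) < len(population):
            p1 = select(population, fitness)
            p2 = select(population, fitness)
            c1, c2 = crossover(p1, p2)
            next_pop += [mutate(c1), mutate(c2)]
        population = next_pop                      # new pop replaces the old
        best = max(population + [best], key=fitness)
    return best
```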
“SIMPLE GENETIC ALGORITHM” (SGA) example:
Maximize f(x,y) = x² + y² where x ∈ [0,31] and y ∈ [0,31].
• Define the encoding as a 10-bit binary string: the first 5 bits are X, and the last 5 bits are Y.
• Define the fitness function as f(s) = (integer decoding of the first 5 bits)² + (integer decoding of the last 5 bits)²
• Define selection as Probability(string s is selected) = f(s) / Σ f (all strings in the population)
• Define crossover by picking a random point on two selected parents, and swapping the LHSs of the two strings
• Define mutation by flipping each bit in the population with a small probability (such as .05)
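For the 10-bit encoding and fitness definition above, a sketch in Python (the function names are illustrative, not from the notes):

```python
def decode(chrom):
    """Split a 10-bit string into (x, y): first 5 bits are x, last 5 are y."""
    assert len(chrom) == 10
    x = int(chrom[:5], 2)   # 5 bits -> integer in [0, 31]
    y = int(chrom[5:], 2)
    return x, y

def fitness(chrom):
    """f(x, y) = x^2 + y^2, the quantity being maximized."""
    x, y = decode(chrom)
    return x * x + y * y
```

For example, fitness("0110000011") decodes to (12, 3) and evaluates to 12² + 3² = 153, matching string #1 below.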
Now, let’s choose a population size of 6 individuals, and randomly generate an initial population:
string #1 – 0110000011
string #2 – 1100101001
string #3 – 0001100110
string #4 – 0010111001
string #5 – 1001100000
string #6 – 0000111000
Then calculate each string’s fitness, the total fitness of the population, and each string’s probability of selection:
fitness(string #1) = 12² + 3² = 153
fitness(string #2) = 25² + 9² = 706
fitness(string #3) = 3² + 6² = 45
fitness(string #4) = 5² + 25² = 650
fitness(string #5) = 19² + 0² = 361
fitness(string #6) = 1² + 24² = 577
(total = 2492)
probability(selecting string #1) = 153/2492 = .06
probability(selecting string #2) = 706/2492 = .28
probability(selecting string #3) = 45/2492 = .02
probability(selecting string #4) = 650/2492 = .26
probability(selecting string #5) = 361/2492 = .14
probability(selecting string #6) = 577/2492 = .24
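A sketch of fitness-proportional selection in Python: `random.choices` with weights performs exactly a weighted roulette-wheel spin.

```python
import random

def roulette_select(population, fitness, k=1):
    """Spin the roulette wheel k times: each string's slice of the
    wheel is proportional to its fitness."""
    weights = [fitness(s) for s in population]
    return random.choices(population, weights=weights, k=k)
```

With the six strings above, string #2 (probability ≈ .28) is drawn most often, but any string can come up.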
These selection probabilities define a "roulette wheel" with fitness-proportional
selection (originally shown as a pie chart).
We spin it 6 times to generate the 6 selected parents. Note:
• the same parent may be selected multiple times,
• the best strings are likely to be selected, but not guaranteed,
• the worst strings are unlikely to be selected, but may be.
Suppose that we spin the wheel 6 times, resulting in the following random (but probabilistically biased) selections:
string #5, string #2, string #4, string #2, string #6, string #5
The six selected strings form an "intermediate" population. Consecutive pairs of strings then become pairs of parents:
pair 1 (parents): string #5 – 1001100000, string #2 – 1100101001
pair 2 (parents): string #4 – 0010111001, string #2 – 1100101001
pair 3 (parents): string #6 – 0000111000, string #5 – 1001100000
Each pair of parents produces two offspring, using the "one-point" crossover operator described earlier.
A random crossover point is selected for each pair of strings, and the LHSs of the two strings are swapped:
pair 1: 1001100000, 1100101001 → 1100100000, 1001101001
pair 2: 0010111001, 1100101001 → 1100111001, 0010101001
pair 3: 0000111000, 1001100000 → 1001100000, 0000111000
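One-point crossover can be sketched in a few lines (the cut point is drawn uniformly from the interior positions):

```python
import random

def one_point_crossover(p1, p2):
    """Pick a random cut point and swap the left-hand sides of the strings."""
    point = random.randint(1, len(p1) - 1)   # cut between bits, never at the ends
    return p2[:point] + p1[point:], p1[:point] + p2[point:]
```

With a cut after bit 5, parents 1001100000 and 1100101001 yield 1100100000 and 1001101001, matching the first pair above.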
The intermediate population then undergoes mutation. Bits are randomly selected at a low probability, and flipped.
For example, in our running example, the following bits might be flipped:
1100100000 → 1100100000 (no change)
1001101001 → 1000101001 (bit 4 flipped)
1100111001 → 1100111001 (no change)
0010101001 → 0010101001 (no change)
1001100000 → 1001100100 (bit 8 flipped)
0000111000 → 0000101000 (bit 6 flipped)
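Bitwise mutation at a small rate is a one-liner per bit (a sketch; the default rate of .05 matches the example above):

```python
import random

def mutate(chrom, rate=0.05):
    """Flip each bit independently with probability `rate`."""
    return ''.join(
        ('1' if bit == '0' else '0') if random.random() < rate else bit
        for bit in chrom
    )
```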
This completes one generation, and produces the new population. The new population replaces the initial population.
We then repeat the process, computing the fitnesses of each string in the new population:
string #1 – 1100100000  fitness = 25² + 0² = 625
string #2 – 1000101001  fitness = 17² + 9² = 370
string #3 – 1100111001  fitness = 25² + 25² = 1250
string #4 – 0010101001  fitness = 5² + 9² = 106
string #5 – 1001100100  fitness = 19² + 4² = 377
string #6 – 0000101000  fitness = 1² + 8² = 65
(total = 2793)
And the process repeats over and over, generation to generation.
Note that the average fitness of the population has improved slightly.
It tends to improve over time due to the “selective pressure” in the selection operation.
Genetic diversity is maintained through:
• the mutation operator, and
• the periodic selection of weaker strings
It is common to save the best single solution (string) found over all generations.
There are different ways of measuring the performance of a GA, depending on the application:
• number of generations to find the optimum (assuming the optimum can be recognized)
• quality of the answer after some specific number of generations
When assessing a GA’s performance, it is necessary to consider total work done, not just the number of generations:
• number of evaluations = number of generations × population size
• when comparing results for different population sizes, normalize so they can be compared fairly
ENCODING
• binary
• real-valued
• gray-code
• diploid
• incest prevention
Graycoding (example):
0 = 000
1 = 001
2 = 011
3 = 010
etc.
avoids "Hamming cliffs": under Gray coding, adjacent integers always differ
in exactly one bit, whereas in standard binary, adjacent values can differ
in many bits (e.g., 3 = 011 and 4 = 100 differ in all three).
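Conversion to and from Gray code is a standard XOR trick, sketched here:

```python
def to_gray(n):
    """Convert a non-negative integer to its Gray-code value."""
    return n ^ (n >> 1)

def from_gray(g):
    """Invert the Gray coding by XOR-folding successively shifted copies."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n
```

For instance, to_gray(2) gives 3 (binary 011) and to_gray(3) gives 2 (binary 010), matching the table above.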
SELECTION
• fitness-proportional
• rank-based
• tournament
• elitism (copy best string in pop)
Rank-Based selection:
- prob of selection based on rank
- can still use roulette wheel, but with fixed regions
CROSSOVER
• one-point
• two-point
• multi-point
• reduced-surrogate
“incest prevention”:
- if parents are identical, children simply duplicate the parents (wasted effort)
- only select parents that are different from each other
Real-valued encodings:
- need to define crossover/mutation
Tournament selection:
- pick a few strings at random (e.g., 3 or 4)
- best two of those are selected as parents
- effect is same as rank-based
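Tournament selection as described above, sketched in Python (the default tournament size of 3 is an illustrative choice):

```python
import random

def tournament_select(population, fitness, size=3):
    """Pick `size` strings at random; the best two become the parents."""
    contestants = random.sample(population, size)
    contestants.sort(key=fitness, reverse=True)   # fittest first
    return contestants[0], contestants[1]
```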
1-point crossover performs poorly when there are relationships between distant
sections of the string (1-point crossover always breaks them apart).
2-point = randomly generate two crossover points, swap middle section.
(reduces linkage effects)
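Two-point crossover, sketched: draw two distinct interior cut points and swap the middle section.

```python
import random

def two_point_crossover(p1, p2):
    """Swap the segment between two random cut points."""
    i, j = sorted(random.sample(range(1, len(p1)), 2))
    return (p1[:i] + p2[i:j] + p1[j:],
            p2[:i] + p1[i:j] + p2[j:])
```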
reduced surrogate = only perform crossover in the portion of the strings
where the parents differ. Decreases the likelihood that children will be the
same as the parents (see the third pair of parents in the crossover example,
where the offspring simply reproduced the parents).
MUTATION
• standard bitwise
• adaptive
ALGORITHMS
• SGA (“generational”)
• Genitor (“steady state”)
PARALLEL MODELS
• Island
• CGA
• TBGA
“Adaptive” mutation (various forms):
• gradually decrease mutation over time (like annealing)
• increase mutation when population stops improving
• increase mutation when population becomes homogenous
“Steady State” GAs don’t have “generations”.
Instead, two parents are selected, and their offspring immediately replaces
some other string in the population – such as a weak string.
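One steady-state step might look like this (a sketch; replacing the weakest string is one common policy, and the operator functions are placeholders):

```python
def steady_state_step(population, fitness, select, crossover, mutate):
    """Select two parents, create one offspring, and replace the weakest
    string in the population with it (no generational sweep)."""
    p1 = select(population, fitness)
    p2 = select(population, fitness)
    child = mutate(crossover(p1, p2)[0])   # keep one of the two offspring
    worst = min(range(len(population)), key=lambda i: fitness(population[i]))
    population[worst] = child
    return population
```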
Island model: several smaller subpopulations, each running its own GA,
with periodic migration of strings between them.
Cellular GA: individuals only mate with nearby strings.
 Parallel models mimic “isolation by distance".
 They often perform better, even when run serially.
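A migration step for the island model might be sketched as follows (the ring topology and best-replaces-worst policy are illustrative assumptions, not prescribed by the notes):

```python
def migrate(islands, fitness):
    """Ring migration: each island sends a copy of its best string to the
    next island, replacing that island's worst string."""
    bests = [max(isl, key=fitness) for isl in islands]   # snapshot before replacing
    for i, isl in enumerate(islands):
        incoming = bests[i - 1]                          # best of the previous island
        worst = min(range(len(isl)), key=lambda j: fitness(isl[j]))
        isl[worst] = incoming
    return islands
```

Between migrations, each island simply runs its own GA on its subpopulation.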