Reporting on Results Obtained by Genetic Programming Systems

GP Applications

Two main areas of research



Genetic Programming
Testing genetic programming in areas to which
other techniques have already been applied.
Applying genetic programming to problems that have
not been previously solved.
Examples of applications





Data mining
Image processing
Computer graphics
Natural language processing
Board games
GP Parameters

Standard parameters








Population size (M)
Number of generations
Tournament size
Application rates for each of the genetic operators
Maximum tree depth
Bound
Maximum offspring size
These parameters are varied in an attempt to
find a solution.
Reporting on GP Results

GP reports must specify:











A description of the objective.
The terminal set used.
The function set used.
The fitness cases used.
The raw fitness measure.
Hits criterion.
The population size.
The number of generations.
The success predicate used.
The method used to create the initial population.
The seeds of the random number generator used on successful
runs together with the corresponding solution found on each of
these runs.
Describing the Performance of
a GP System






Hits histogram
Standardized fitness histogram
Structural complexity histogram
Variety histogram
Number of runs that must be performed to
find a solution.
Calculating the computational effort
needed to find a solution.
Hits Histogram Example
[Figure: Hits Histogram for Generation N. Horizontal axis: Hits; vertical axis: Frequency.]
Creating a Hits Histogram




A hits histogram is created for each
generation.
Count the number of individuals that have
n hits.
n usually ranges from 0 to the number of
fitness cases.
Plot the number of individuals with n hits
against each n value.
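
The counting step above can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name `hits_histogram` and the input format (one hit count per individual) are assumptions made for the example.

```python
from collections import Counter


def hits_histogram(hits_per_individual, num_fitness_cases):
    """Return a list whose entry n is the number of individuals with n hits.

    hits_per_individual: one integer per individual in the generation
    (hypothetical input format chosen for this sketch).
    """
    counts = Counter(hits_per_individual)
    # n ranges from 0 up to and including the number of fitness cases.
    return [counts[n] for n in range(num_fitness_cases + 1)]


# Seven individuals evaluated on four fitness cases.
frequencies = hits_histogram([0, 1, 1, 3, 4, 4, 4], num_fitness_cases=4)
print(frequencies)  # [1, 2, 0, 1, 3]
```

Plotting `frequencies` against the indices 0..4 gives the hits histogram for that generation.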
Standardized Fitness
Histogram Example
[Figure: Standardized Fitness Histogram. Horizontal axis: Generations; vertical axis: Standardized Fitness.]
Creating a Standardized
Fitness Histogram



Illustrates the standardized fitness over an
entire run.
The standardized fitness is averaged for
each generation.
Plot the average standardized fitness
against each generation.
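
The averaging step can be sketched as follows. The input layout (a list of generations, each a list of standardized fitness values) and the function name are assumptions made for this illustration.

```python
def average_fitness_per_generation(fitness_by_generation):
    """Return the mean standardized fitness of each generation.

    fitness_by_generation: list of lists, one inner list of standardized
    fitness values per generation (hypothetical layout for this sketch).
    """
    return [sum(gen) / len(gen) for gen in fitness_by_generation]


averages = average_fitness_per_generation([[10.0, 6.0], [4.0, 2.0]])
print(averages)  # [8.0, 3.0]
```

The resulting list is what would be plotted against the generation number.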
Structural Complexity
Histogram Example
[Figure: Structural Complexity Histogram. Horizontal axis: Generations; vertical axis: Structural Complexity.]
Creating a Structural
Complexity Histogram




Illustrates the structural complexity over
an entire run.
Calculate the size, i.e. the number of
nodes, of each individual in a generation.
Calculate the average tree size for each
generation.
Plot the average tree size for each
generation.
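
A minimal sketch of the size calculation, assuming each individual is stored as a nested tuple of the form `(function, child, child, ...)` with bare values as terminals. That representation and the function names are choices made for this example only.

```python
def tree_size(tree):
    """Count the nodes in a tree given as a nested tuple.

    A tuple is a function node: (name, child1, child2, ...).
    Anything else is a terminal and counts as one node.
    """
    if not isinstance(tree, tuple):
        return 1
    return 1 + sum(tree_size(child) for child in tree[1:])


def average_tree_size(population):
    """Mean number of nodes per individual in one generation."""
    return sum(tree_size(ind) for ind in population) / len(population)


population = [
    ("+", "a", ("*", "b", "c")),  # 5 nodes
    ("-", "c", "d"),              # 3 nodes
    "a",                          # 1 node
]
print(average_tree_size(population))  # 3.0
```

Calling `average_tree_size` once per generation yields the values plotted in the structural complexity histogram.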
Variety
[Figure: two example populations of expression trees. Population 1 has a variety of 60%; Population 2 has a variety of 100%.]
Variety Histogram Example
[Figure: Variety Histogram. Horizontal axis: Generations; vertical axis: Variety %.]
Creating a Variety
Histogram



Illustrates the variety over an entire run.
Calculate the variety percentage for each
generation.
Plot the variety percentage against each
generation.
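
The slides do not define variety explicitly. The sketch below assumes the common definition: the percentage of individuals in the population that are structurally distinct. It also assumes individuals are stored as nested tuples so that identical trees compare equal; both are assumptions of this example.

```python
def variety_percentage(population):
    """Percentage of structurally distinct individuals in a population.

    Assumes individuals are hashable (e.g. nested tuples), so duplicates
    collapse when placed in a set.
    """
    return 100 * len(set(population)) / len(population)


# Five individuals, three of them distinct.
population = [
    ("+", "c", "c"),
    ("+", "c", "c"),
    ("*", "b", "c"),
    ("*", "b", "c"),
    ("-", "a", "d"),
]
print(variety_percentage(population))  # 60.0
```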
Calculating the Number of
Runs Needed

The probability, x, that a solution to a problem will be found in
R independent runs of the GP algorithm is:

$$x = 1 - (1 - P(M, i))^{R}$$
where P(M, i) is the cumulative probability of success by generation
i, using population size M.
P(M, i) is estimated by counting the runs that succeeded on or
before generation i and dividing this count by the total number of
runs conducted.
The number of runs needed is given by:

$$R(x, M, i) = \left\lceil \frac{\log(1 - x)}{\log(1 - P(M, i))} \right\rceil$$
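
The formula for the number of runs can be sketched in Python. The function name `runs_needed` and the handling of the boundary cases P = 0 and P = 1 are choices made for this example, not something the slides specify.

```python
import math


def runs_needed(x, p):
    """Number of independent runs R needed to reach success probability x.

    x: required probability of finding a solution, 0 < x < 1.
    p: cumulative probability of success P(M, i), 0 < p <= 1.
    """
    if not 0 < x < 1:
        raise ValueError("x must lie strictly between 0 and 1")
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]; p = 0 means no run can succeed")
    if p == 1:
        return 1  # every run succeeds, so a single run is enough
    # Ceiling of log(1 - x) / log(1 - p), as in the formula above.
    return math.ceil(math.log(1 - x) / math.log(1 - p))


print(runs_needed(0.99, 0.5))  # 7
```

With P(M, i) = 0.5, six runs give a success probability of about 98.4% and seven give about 99.2%, which is why the ceiling yields 7.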
Calculating
Computational Effort

The computational effort is the number of
individuals that must be examined to find a
solution. Under the generational control model
it is calculated using:

$$f(x, M, i) = R(x, M, i) \cdot M \cdot i$$

The following equation is used to calculate the
number of individuals that will be examined in a
steady-state system with a fixed population size:

$$f(x, M, i) = R(x, M, i) \cdot i$$
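
Both effort formulas translate directly into code. The function names are hypothetical, and the population size of 500 in the usage lines is an illustrative value, not one taken from the slides.

```python
def effort_generational(runs, pop_size, generation):
    """Individuals examined under the generational model: R * M * i."""
    return runs * pop_size * generation


def effort_steady_state(runs, generation):
    """Individuals examined in a steady-state system: R * i."""
    return runs * generation


# Illustrative values only: R = 4 runs, M = 500, i = 15.
print(effort_generational(4, 500, 15))  # 30000
print(effort_steady_state(4, 15))       # 60
```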
Example
Run   Generation on which a solution was found
1     45
2     7
3     13
4     22
5     11
6     3
7     9
8     34
9     10
10    6
How many runs are needed
to find a solution with a 99%
probability by generation 15?
What is the computational
effort needed?
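
One way to work through these questions is sketched below, using the ten runs listed in the example. The population size M is not given, so the effort is reported as a multiple of M; the variable names are chosen for this illustration.

```python
import math

# Generation on which each of the ten example runs found a solution.
solution_generation = {1: 45, 2: 7, 3: 13, 4: 22, 5: 11,
                       6: 3, 7: 9, 8: 34, 9: 10, 10: 6}

target_generation = 15  # i
x = 0.99                # required probability of success

# P(M, i): fraction of runs that succeeded on or before generation i.
successes = sum(1 for g in solution_generation.values()
                if g <= target_generation)
p = successes / len(solution_generation)

# R(x, M, i): ceiling of log(1 - x) / log(1 - P(M, i)).
runs = math.ceil(math.log(1 - x) / math.log(1 - p))

# Generational effort is R * M * i; M is unknown here, so report R * i.
effort_per_unit_population = runs * target_generation

print(successes, p)                 # 7 0.7
print(runs)                         # 4
print(effort_per_unit_population)   # 60
```

Seven of the ten runs succeed by generation 15, so P(M, 15) = 0.7 and four runs are needed. Under the generational formula the effort is then 60 times M individuals.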