pEvoSAT: A Novel Permutation Based Genetic Algorithm
for Solving the Boolean Satisfiability Problem
Kay C. Wiese
Simon Fraser University
8888 University Drive
Burnaby, BC, Canada
[email protected]

Boris Shabash
Simon Fraser University
8888 University Drive
Burnaby, BC, Canada
[email protected]
ABSTRACT
In this paper we introduce pEvoSAT, a permutation based Genetic Algorithm (GA) designed to solve the Boolean satisfiability (SAT) problem when it is presented in conjunctive normal form (CNF). The permutation based representation allows the algorithm to take advantage of domain-specific knowledge such as unit propagation and pruning. In this paper we explore and characterize the behavior of our algorithm, and compare pEvoSAT to GASAT, a leading implementation of GAs for the solving of CNF instances.

Categories and Subject Descriptors
I.2 [Artificial Intelligence]: Miscellaneous

General Terms
Algorithms, Design, Performance

Keywords
SAT; permutation-based representation; genetic algorithms; DPLL; unit propagation

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
GECCO'13, July 6–10, 2013, Amsterdam, The Netherlands.
Copyright 2013 ACM 978-1-4503-1963-8/13/07 ...$15.00.

1. INTRODUCTION

1.1 The Boolean Satisfiability Problem
The satisfiability problem (abbreviated as SAT) is the first problem shown to be NP-Complete. An instance of a SAT problem is composed of n Boolean variables, labeled x1 through xn, and logical operators that connect them, such as conjunction (∧), disjunction (∨), negation (x̄), implication (→), exclusive disjunction, etc. Every instance of SAT can be represented in Conjunctive Normal Form (CNF), where a CNF instance is a conjunction of clauses, each clause C is a disjunction of literals, and each literal l is an affirmation (xi) or a negation (x̄i) of a variable xi ∈ {x1, ..., xn}. Finding a solution for a CNF instance requires finding an assignment for each variable xi, true (1) or false (0), such that the conjunction described by the CNF is satisfied. For example, suppose the following CNF needs to be satisfied:

(x1 ∨ x2) ∧ (x̄2 ∨ x3) ∧ (x̄2 ∨ x4) ∧ (x̄4 ∨ x̄1)

A satisfying assignment would be x1 = 1, x2 = 0, x3 = 1, x4 = 0. A single CNF may have many, one, or no satisfying assignments.

If one wishes to find a solution to a CNF using exhaustive search, each variable has to be considered as being 0 or 1. As a result, for an n-variable CNF, 2^n potential solutions exist. Problems in the NP-Complete set typically admit an O(2^n) algorithm for finding a solution, where n is the size of the input. However, such algorithms perform very poorly even for fairly small inputs; a CNF composed of 250 variables has 2^250 ≈ 1.8 × 10^75 potential solutions. Assuming a computer could verify 1,000,000,000 solutions each second, it might have to run for 5.7 × 10^58 years to find the correct solution. Many CNFs of interest today contain thousands, and even millions, of variables, so exhaustive search is not a practical approach. Over the years, several advancements were made in solving SAT instances.

1.2 Unit Propagation and the DPLL Algorithm
The first significant improvement came in 1962, from the following observation: suppose one has a clause with three literals, (l1 ∨ l2 ∨ l3), and suppose both l1 and l2 have been falsified (i.e., l1 and l2 are both false). Then there is no point in considering any assignment that sets l3 false as well, since such an assignment would falsify the clause and, hence, the entire CNF. This notion is called Unit Propagation (UP): if any clause has all its literals set to false but one, that literal is automatically set to true, and the variable it represents is set accordingly. In our example, if l3 were xi then xi = 1, and if l3 were x̄i then xi = 0.

Though a relatively straightforward observation, each variable that is unit propagated has a two-pronged effect:

1. The search space is cut in half for each variable that is unit propagated. If 3 variables have been unit propagated, the search space decreases almost by an order of magnitude.

2. A literal that has been unit propagated may falsify another literal somewhere else in the CNF, potentially causing another UP to take place.
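The cascading effect described above can be illustrated with a short sketch. This is our own illustration, not code from the paper; we adopt the common convention of encoding a literal as a signed integer (k for xk, −k for x̄k):

```python
def unit_propagate(clauses, assignment):
    """Repeatedly apply UP: if a clause has all its literals false but one,
    the remaining literal is forced to true. Forcing one literal may falsify
    literals elsewhere, triggering further propagation (the second prong of
    the effect). Returns the extended assignment, or None if a clause is
    fully falsified."""
    assignment = set(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(l in assignment for l in clause):
                continue                      # clause already satisfied
            free = [l for l in clause if -l not in assignment]
            if not free:
                return None                   # clause fully falsified
            if len(free) == 1:                # unit clause: force the literal
                assignment.add(free[0])
                changed = True
    return assignment

# The example CNF from Section 1.1: (x1∨x2)(x̄2∨x3)(x̄2∨x4)(x̄4∨x̄1)
cnf = [[1, 2], [-2, 3], [-2, 4], [-4, -1]]
# Deciding x2 = true forces x3 and x4, and then x̄1: three variables
# assigned from a single decision, shrinking the search space 2^3-fold.
print(sorted(unit_propagate(cnf, {2})))   # [-1, 2, 3, 4]
```

Note how forcing x4 in turn falsifies x̄4 in the last clause, which is exactly the chain reaction described in effect 2.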
This notion gave rise to the Davis-Putnam-Logemann-Loveland (DPLL) [1] algorithm. The algorithm used two techniques to prune the search space: UP and "pure" literal elimination. A pure variable is one that is only affirmed, or only negated, but never both, in the formula. If a variable is only affirmed, there is no reason to explore any assignment that sets it to false, and vice versa. This algorithm set the first milestone in pruning the search space, but for large CNFs of hundreds and thousands of variables it was still too slow. The biggest breakthrough for CNF solving came 34 years later, in 1996, with the introduction of clause "learning" during the search for solutions. Whenever a candidate solution proved inappropriate, a new clause would be derived. This new clause implicitly details why the most recent assignment failed, so future searches can use it to prevent the same bad searches from occurring again. The details of the algorithm, known as Conflict-Driven Clause Learning (CDCL), are omitted here for brevity; interested readers can refer to [15, 9].
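The DPLL skeleton described at the start of this section can be sketched as a recursive procedure. This is our own simplification (signed-integer literal encoding; real solvers add watched literals and the clause learning mentioned above):

```python
def dpll(clauses):
    """Return a set of literals satisfying all clauses, or None if
    unsatisfiable. Literals are signed ints: k for x_k, -k for its negation."""
    clauses = [list(c) for c in clauses]
    assignment = set()
    # Unit propagation: a one-literal clause forces that literal.
    while True:
        units = [c[0] for c in clauses if len(c) == 1]
        if not units:
            break
        lit = units[0]
        assignment.add(lit)
        # Satisfied clauses disappear; the falsified literal is removed.
        clauses = [[l for l in c if l != -lit] for c in clauses if lit not in c]
        if any(len(c) == 0 for c in clauses):
            return None          # an empty clause means a falsified clause
    # Pure literal elimination: a variable with a single polarity can
    # safely be set to that polarity.
    literals = {l for c in clauses for l in c}
    for lit in [l for l in literals if -l not in literals]:
        assignment.add(lit)
        clauses = [c for c in clauses if lit not in c]
    if not clauses:
        return assignment        # every clause satisfied
    # Branch on some remaining variable.
    lit = next(iter({l for c in clauses for l in c}))
    for choice in (lit, -lit):
        sub = dpll(clauses + [[choice]])
        if sub is not None:
            return assignment | sub
    return None

result = dpll([[1, 2], [-2, 3], [-2, 4], [-4, -1]])
assert result is not None   # the Section 1.1 example is satisfiable
```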
1.3 Previous GA Approaches
Previous approaches to the SAT problem mostly revolved around the classical representation of individuals. Each potential solution was represented as a bit string of length n of 0s and 1s, corresponding to an assignment of the n variables. The fitness function most often used for instances of SAT is the maxSAT fitness function. The function takes an assignment, applies it, and then checks how many clauses it satisfies. Mathematically it can be described as

Φ(I) = Σ_{j=0}^{n} Cj(I)    (1)

where Φ(I) is the fitness of an individual I, and Cj(I) is the value of the j-th clause under the assignment individual I represents. More satisfied clauses correspond to more fit individuals. Mutations and crossovers are also straightforward with the bit-string representation: a mutation simply flips a bit in the string from 0 to 1 or vice versa, and a crossover simply swaps two substrings between two individuals.

Like all GAs, this method is prone to getting stuck at local optima. Several methods were attempted to aid the algorithm in escaping local optima, such as the weighted maxSAT fitness function [6, 2], the use of a high mutation rate combined with hill climbing [8], and the use of more sophisticated recombination operators [7, 5]. Of all these methods, GASAT is, to the authors' knowledge, the most promising and best implementation of a GA for solving the SAT problem.

An issue not addressed by these methods is that some poor assignments are explored and no pruning of the search space is done. The formulation of UP, and of permutation based GAs, paved the way to using UP in a genetic algorithm to prune the search space whenever possible.

2. pEvoSAT
pEvoSAT is an evolutionary SAT solver. pEvoSAT also uses a GA to represent and solve the problem, but uses a permutation based representation for the individuals, as well as specialized mutation operators. In addition, a type of "restart" is integrated if the algorithm appears to have become stuck in a local optimum.

2.1 Representation of the Problem
Unlike classical GA based methods, permutation based GAs represent individuals as an ordered set of elements relating to the problem [17, 16, 19, 13, 18]. The classical representation of SAT individuals, as mentioned in Section 1.3, is a string of length n of 0s and 1s that corresponds to an assignment of the n variables. pEvoSAT, however, takes a permutation based approach to representing the individuals.

Each individual is a sequence of n elements from the alphabet ℵ = {1, 2, ..., (n − 1), n}, combined with a positive (true) or negative (false) sign for each element, where an individual I contains the symbol k (1 ≤ |k| ≤ n) if, and only if, it does not contain the symbol −k. If a gene gj in individual I is k (positive), this is interpreted as 'set xk in the CNF to "true"'; if it is −k (negative), it is interpreted as 'set xk to "false"'. An example of an individual in pEvoSAT could be:

I = (1, -2, -4, 6, -5, 3, 8, -10, 7, 9)

which is interpreted as "set x1 to true, set x2 to false, set x4 to false, ...".

In pEvoSAT, the value at each gene of the individual is applied to the CNF, and all literals which correspond to the variable this gene represents receive an assignment. Then UP takes place, and assignments are given to variables that are propagated through UP. This may give an assignment to some variables xi ∈ {x1, ..., xn} that conflicts with the assignment that individual I suggests, but in such a case the UP assignment is taken. The process is repeated until a clause becomes entirely falsified, and the fitness of the individual is then calculated. Section 2.4 discusses the details of the fitness evaluation and the selection methods. pEvoSAT works to evolve an individual that can produce a satisfying assignment, but such an individual's genotype would not necessarily correspond to the satisfying assignment. Rather, its phenotype will correspond to a satisfying assignment, since the phenotype also takes into account the values from UP.

2.2 Mutation Operators
The two mutation operators that can be used are classical mutations (negation of a variable assignment from xj to x̄j) and translocations.

Classical mutations, or simply mutations in pEvoSAT, are controlled by a parameter denoted Pm. Pm can assume any real value in the range [0.0, 1.0]. When applied to a gene gj, a classical mutation operator negates the value of gj (e.g., changes it from 3 to -3, or from -16 to 16). Mutations only change the sign of the gene, never its absolute value (since that would cause some variables to be accounted for more than once, and some not at all).

The other type of modification used in pEvoSAT is the translocation. In a translocation, two random genes, gj and gk, are chosen and their values are swapped. In essence, this action changes the sequence of decisions, and causes a decision that may have been ignored due to UP to be executed.

Translocations are also controlled by a parameter, Pt. Like Pm, Pt is a real value in the range [0.0, 1.0]. However, it does not directly correspond to the number of
translocations done. Rather, Pt × size(I) translocation operations are performed. Consider the extreme case where both chosen locations are always the same: in that case, only one or no effective translocations would actually happen. The number of effective translocations can therefore vary from 0 to size(I) × Pt. In that sense, Pt does not represent the number of translocations that will happen, but the degree to which an individual will be rearranged: a higher Pt value produces more shuffling of the genes at every generation.
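Under our reading of Section 2.2, the two operators can be sketched as follows. This is a hypothetical illustration; the function names and the direct per-gene application of Pm are our assumptions, not the paper's code:

```python
import random

def mutate(individual, p_m):
    """Classical mutation: with probability p_m, negate a gene's sign.
    The absolute value is preserved so the permutation stays valid."""
    return [-g if random.random() < p_m else g for g in individual]

def translocate(individual, p_t):
    """Attempt p_t * len(individual) random position swaps. Swapping a
    position with itself is allowed, so the effective number of
    rearrangements varies between 0 and p_t * len(individual)."""
    genes = list(individual)
    for _ in range(int(p_t * len(genes))):
        j, k = random.randrange(len(genes)), random.randrange(len(genes))
        genes[j], genes[k] = genes[k], genes[j]
    return genes

I = [1, -2, -4, 6, -5, 3, 8, -10, 7, 9]
mutated = mutate(I, 0.1)
shuffled = translocate(I, 0.5)
# Both operators preserve the permutation property: every variable
# still appears exactly once, up to sign.
assert sorted(abs(g) for g in mutated) == sorted(abs(g) for g in I)
assert sorted(abs(g) for g in shuffled) == sorted(abs(g) for g in I)
```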
2.3 Recombination Operators
Another important feature of GAs is the use of crossover to combine elements from different individuals into one individual. The crossover operation is controlled by the parameter Pc. Pc can take on any value in [0.0, 1.0] and dictates what percentage of the children of the next generation will be the result of crossovers. If the population size of a parent generation is μ and the offspring size is λ, then Pc × λ individuals will be produced through crossover to make the offspring generation.

When binary strings are used as the representation, crossover is simply performed by splitting the strings (parents) at random points and recombining the pieces (into offspring). In a permutation based representation, however, recombining information from both parents becomes a much harder task. Information from both parents needs to be integrated into the offspring, but in such a way as to produce viable offspring; in the case of pEvoSAT, values for all variables of the SAT instance are required.

There are several crossover operators available for permutation based representations. The most prominent are: Ordered Crossover (OX#2) [16], which maintains the relative order of elements from both parents; Cycle Crossover (CX) [12], which maintains the absolute position of elements from both parents; and Partially Mapped Crossover (PMX) [4], which uses information from the parents to shuffle elements in each parent. In our experiments, CX was used as the crossover method of choice.

2.3.1 Cycle Crossover (CX)
The CX operation maintains the absolute location of the different values. Some of the values' locations are taken from one parent (P1) and some from the other (P2). In CX, a random position i is chosen in P1. The value of P1 at i is copied into O1 at i, i.e., O1[i] = P1[i]. Then a position j is selected such that the value of P2 at j is the value just copied, i.e., P2[j] = O1[i]. The value j is used for the next copying action, such that the value at j in P1 is copied into O1, i.e., O1[j] = P1[j]. The next location in P2 is chosen such that its value is the one just copied, and the cycle continues until the original location i is re-encountered. At this point, all remaining entries are simply copied in order from P2. Another individual can be produced in the same way, or by applying the same algorithm with P1 and P2 switched. The following example illustrates the process of CX. Suppose P1 and P2 are the following individuals:

P1 = (1, -2, -4, 6, -5, 3, 8, -10, 7, 9)
P2 = (-2, 5, 8, 10, -9, 3, -1, 4, -7, 6)

To produce O1, the location 6 (with 0-based indexing) is randomly chosen. P1[6] = 8, so O1[6] = 8, which produces

O1 = (-, -, -, -, -, -, 8, -, -, -)

Then the operation finds that P2[2] = 8, so it now searches location 2 to find P1[2] = −4, hence O1[2] = −4, resulting in

O1 = (-, -, -4, -, -, -, 8, -, -, -)

The operation then finds that P2[7] = 4 (notice that the absolute value of −4 was searched for), and that P1[7] = −10, hence O1[7] = −10. The operation continues to find that P2[3] = 10 and P1[3] = 6, hence O1[3] = 6. Then P2[9] = 6 and P1[9] = 9, hence O1[9] = 9. Then P2[4] = −9 and P1[4] = −5, hence O1[4] = −5. Then P2[1] = 5 and P1[1] = −2, hence O1[1] = −2. Then P2[0] = −2 and P1[0] = 1, so O1[0] = 1. Finally, P2[6] = −1, which brings the operation back to O1[6]. The remaining values are copied from P2[5] = 3 and P2[8] = −7 to give

O1 = (1, -2, -4, 6, -5, 3, 8, -10, -7, 9)

As mentioned before, CX retains the absolute location and order of the values in the parents, a property with which OX#2 and PMX do not comply.

2.3.2 Keep Best Reproduction (KBR)
In addition to the creation of crossover offspring, a selection method for the crossover offspring was required. Traditionally, both offspring are chosen to replace both parents. However, it was shown in [20] that there is merit in keeping the most fit parent and the most fit offspring.

2.4 The Fitness Function and Selection Method
The fitness function used to evaluate each individual is the one mentioned in Section 1.3 and is given in Equation 2:

Φ(I) = Σ_{j=0}^{n} Cj(I)    (2)

Here, Cj(I) is the value of the j-th clause under the assignment individual I represents.

The selection method used in pEvoSAT is tournament selection. In tournament selection of size t, t individuals are chosen at random from the population, and the winner of the tournament is the individual with the highest fitness. This individual is selected for the next generation, and the process continues for as many individuals as are required. Larger tournament sizes correspond to a higher selection pressure. The selection mechanism also uses 1-elitism. As a final resort against local optima, pEvoSAT uses a particular kind of restart: inversions.

2.5 Inversions
Many algorithms that aim to solve SAT instances use restarts. The observation that the few initial decisions performed could dictate the rest of the search led to the idea that if a search from one set of initial conditions takes too long, perhaps a different set of initial conditions would do better. Usually this is done by a fresh restart, free from the search results seen so far. However, pEvoSAT uses a different kind of restart: an inversion. In an inversion of individual I, I is inverted by having all of its decisions negated: every gene gj becomes −gj, and vice versa. For example, the individual

I = (10, -5, 2, 3, 4, -1, 8, -9, -7, 6)

becomes

Iinverted = (-10, 5, -2, -3, -4, 1, -8, 9, 7, -6)

In several early iterations of pEvoSAT, solutions that scored very high in fitness had some variables set to the opposite of what they should be for a satisfying assignment. As a result, the inversion operator was integrated.

Inversions occur at regular intervals, regulated by the RateInversion parameter: an inversion occurs RateInversion generations after the most recent improvement. That is, if during generation g a new optimum is found (one better than the best optimum observed so far), the inversion will occur RateInversion generations after g. The inversion operation is applied to all individuals, including the elites chosen by the elitism in pEvoSAT. The reason is that the elite individuals are a strong force in the stall of the population over a particular region of the search space, and they have to be inverted for new solution space to be explored.
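The inversion restart amounts to negating every gene. A one-line sketch (our own illustration), which also shows that applying an inversion twice restores the original individual:

```python
def invert(individual):
    """Negate every decision in the individual (the Section 2.5 restart)."""
    return [-g for g in individual]

I = [10, -5, 2, 3, 4, -1, 8, -9, -7, 6]
print(invert(I))                 # [-10, 5, -2, -3, -4, 1, -8, 9, 7, -6]
assert invert(invert(I)) == I    # inversion is an involution
```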
#   CNF Instance                                 Variables  Clauses  Clause/Variable Ratio
1   random ksat 01v100c650.cnf¹                  100        650      6.50
2   ais8.cnf                                     113        1520     13.45
3   ais10.cnf                                    181        3151     17.41
4   random ksat 02v200c1300.cnf                  200        1300     6.50
5   driverlog1 k99i.renamed-as.sat05-3951.cnf    207        588      2.84
6   bart17.shuffled.cnf                          231        1166     5.05
7   ais12.cnf                                    265        5666     21.38
8   3blocks.cnf                                  283        9690     34.24
9   random ksat 03v300c1950.cnf                  300        1950     6.50
10  qg1-07.cnf                                   343        68083    198.49
11  random ksat 04v400c2600.cnf                  400        2600     6.50
12  4blocksb.cnf                                 410        24758    60.39
13  bw large.a.cnf                               459        4675     10.19
14  random ksat 05v500c3250.cnf                  500        3250     6.50
15  4blocks.cnf                                  758        47820    63.09

Table 1: The 15 SAT instances used for the experiments in this project.

¹ Instances of the form random_ksat_nvVcC.cnf were produced using the Tough SAT Project, available at http://toughsat.appspot.com/ [21].

3. EXPERIMENTAL RESULTS
The experimental setup for this project sought to meet two objectives: (1) characterize pEvoSAT's performance over the given SAT instances, and (2) compare its performance against GASAT. All experiments described use the collection of 15 SAT instances shown in Table 1, and were performed on a MacBook Pro with a 2.4 GHz Intel Core 2 Duo processor, running Mac OS X 10.5.8. The satisfiability of the SAT instances was verified using zChaff [14, 10].

3.1 Characterizing pEvoSAT
pEvoSAT was run over the test SAT instances, and the average fitness of the population, as well as the best fitness, were recorded. The parameter values for the first set of experiments can be seen in Table 2.

Variable                              Value
Number of generations (G)             100,000
Population size (μ)                   1.6 × v
Offspring population size (λ)         1.6 × v
Probability of mutation (Pm)          1.5 × v/c
Probability of translocation (Pt)     1.5 × v/c
Probability of crossover (Pc)         0.8
Crossover method                      CX
Crossover selection strategy          KBR
Tournament size (t)                   5
RateInversion                         1000
Elitism                               1

Table 2: The evolutionary parameter values for the first set of experiments. v is the number of variables in the SAT instance, and c is the number of clauses in the same SAT instance.

Figures 1-3 show the recorded results of 20 runs of pEvoSAT over a few selected instances; similar behavior was observed for all instances. The overall behaviour seen was that of quick convergence in the first several generations, followed by a slow, localized search within the region around which convergence was achieved.

The observed premature convergence could suggest a high selection pressure due to a tournament size of 5. However, a second set of experiments with a tournament size of 2 displayed similar results. Interestingly, the algorithm shows a distinctly different convergence for high clause to variable
ratio instances (7, 8, 10, 12 and 15) and low clause to variable ratio instances (1-6, 9, 11, and 13-14). High clause to
variable ratio instances display a steady convergence around
97% satisfiability. On the other hand, low clause to variable
ratio instances show a noisy convergence around 80% satisfiability.
Such results are to be expected given the nature of the pEvoSAT algorithm. High clause to variable ratio instances contain more information about the proper assignment of the variables in the sheer number of clauses. In other words, in these instances implication by UP is invoked more often, guiding the search to very high fitness individuals. In the initial population, then, one ends up either with very low fitness individuals that falsify many clauses due to UP, or with some very high fitness individuals. Since the selection method used is tournament selection, the high fitness individuals will quickly outbreed the lower fitness ones, and then resort to competing amongst themselves.
Conversely, instances with low clause to variable ratios
would offer less feedback through UP, and would rely more
on the genotype of the individual. Since the search would
be less directed then, one would expect to find a vast array of individuals with average fitnesses. Furthermore, one
would expect the fitness to fluctuate more in these instances
since mutations and crossovers, which are both stochastic
events, would play a greater role in determining any given
individual’s fitness.
The results indicate that the algorithm behaves very differently under the two conditions. As the clause to variable ratio increases, the algorithm's behaviour becomes more predictable. This can be seen in Figures 1 and 2, where the different runs overlay each other much more closely than in Figure 3. However, there is a natural tendency for the algorithm to get stuck in local optima when the clause to variable ratio is high, since local optima, as well as extreme minima, are very common in those instances. Moreover, low clause to variable ratio instances have potentially more solutions than high clause to variable ratio ones. Therefore, searching for the global optimum in high clause to variable ratio instances becomes a much harder and more challenging task, as the run times in Table 3 (Section 3.2) will demonstrate.
The results in this section demonstrate that the pEvoSAT
algorithm takes advantage of any information offered by the
SAT instance. pEvoSAT uses the fitness of the individuals
to guide the search process like all other GAs. However,
it also uses the clauses in the SAT instance itself to direct
the search such that an individual need not have a genotype
which exactly matches the satisfying assignment to satisfy
the SAT instance.
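The decoding just described — apply genes in order, let UP override conflicting genotype decisions, stop once a clause is falsified, then count satisfied clauses — can be sketched as follows. This is our own reconstruction of Sections 2.1 and 2.4; pEvoSAT's actual implementation is not given in the paper:

```python
def decode(individual, clauses):
    """Turn a pEvoSAT genotype into a phenotype assignment.
    Genes and clause literals are signed ints (k = x_k true, -k = false).
    UP-forced assignments take precedence over the genotype's suggestion,
    and decoding stops once a clause is fully falsified."""
    assignment = {}                      # variable index -> bool

    def propagate():
        changed = True
        while changed:
            changed = False
            for clause in clauses:
                if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                    continue             # clause already satisfied
                free = [l for l in clause if abs(l) not in assignment]
                if not free:
                    return False         # clause fully falsified
                if len(free) == 1:       # unit clause: force the literal
                    assignment[abs(free[0])] = free[0] > 0
                    changed = True
        return True

    for gene in individual:
        if abs(gene) not in assignment:  # UP may already have decided it
            assignment[abs(gene)] = gene > 0
        if not propagate():
            break
    return assignment

def fitness(individual, clauses):
    """maxSAT fitness: the number of clauses satisfied by the phenotype."""
    a = decode(individual, clauses)
    return sum(
        any(a.get(abs(l)) == (l > 0) for l in clause) for clause in clauses
    )

clauses = [[1, 2], [-2, 3], [-2, 4], [-4, -1]]
print(fitness([1, -2, 3, -4], clauses))  # 4: all clauses satisfied
```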
The next section tests pEvoSAT's performance against GASAT on the test instances in Table 1. We will demonstrate that for some SAT instances pEvoSAT can outperform GASAT, and that for many others it produces competitive results.
Figure 1: The average fitness (blue) and best fitness (green) of pEvoSAT when solving qg1-07.cnf. The graph represents 20 runs overlaid on top of each other.

Figure 2: The average fitness (blue) and best fitness (green) of pEvoSAT when solving 4blocksb.cnf. The graph represents 20 runs overlaid on top of each other.

Figure 3: The average fitness (blue) and best fitness (green) of pEvoSAT when solving random ksat 05v500c3250.cnf. The graph represents 20 runs overlaid on top of each other.

3.2 Comparison with GASAT
The performance of pEvoSAT was tested against GASAT (obtained from http://www.info.univ-angers.fr/~lardeux/), the leading GA based SAT solver we are aware of. Unlike pEvoSAT, GASAT uses the classical binary string representation of the SAT problem, but employs problem specific
recombination operations to preserve maximum satisfiability between generations, as well as other optimization techniques. The experiments were run with the parameter values given in Table 2 for pEvoSAT, except for a tournament size of 2, and with GASAT's default parameters, with its population size changed to also be 1.6 × v. This change guarantees that over a single generation each algorithm considers the same number of individuals, making the measurements comparable. The results of the comparison can be seen in Table 3 and Table 4.
SAT  Run Time (seconds) ± SD                       p-value
     GASAT                  pEvoSAT
1    0.014 ± 0.015          0.031 ± 0.031          p < 0.1
2    0.15 ± 0.14            0.25 ± 0.29            p < 0.5
3    0.64 ± 0.76            3.03 ± 2.09            p < 0.05
4    0.024 ± 0.016          0.59 ± 0.61            p < 0.05
5    0.0062 ± 0.0071        0.61 ± 0.54            p < 0.05
6    0.003 ± 8.9E-19        2.90 ± 2.02            p < 0.05
7    135.33 ± 127.45        76.25 ± 88.23          p < 0.2
8    12190.33 ± 1050.81     14.07 ± 14.56          p < 0.05
9    0.081 ± 0.055          7.86 ± 6.53            p < 0.05
10   2049.22 ± 2848.36      880.35 ± 560.68        p < 0.05
11   0.24 ± 0.33            56.92 ± 33.70          p < 0.05
12   11992.63 ± 797.27      131.58 ± 341.18        p < 0.05
13   0.202 ± 0.21           0.71 ± 0.63            p < 0.05
14   1.52 ± 1.04            569.55 ± 535.53        p < 0.05
15   11363.03 ± 441.38      2957.4 ± 3006.1        p < 0.05
Total  8                    4

Table 3: The running times (until a satisfying solution is found³) for pEvoSAT compared to those of GASAT. The times are expressed as the averages of 20 runs ± standard deviation. The shortest running time in each row is labeled in blue, and the number of rows in which each algorithm was fastest is given in the 'Total' row. The statistical significance, as determined by the Wilcoxon signed-rank test, of one algorithm outperforming its competitor is given in the 'p-value' column.
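As a sanity check on the 'Total' row, which counts only wins that reached significance (p < 0.05), the mean run times can be tallied directly (values transcribed by hand from Table 3):

```python
# Mean run times (seconds) from Table 3, instances 1-15 in order.
gasat   = [0.014, 0.15, 0.64, 0.024, 0.0062, 0.003, 135.33, 12190.33,
           0.081, 2049.22, 0.24, 11992.63, 0.202, 1.52, 11363.03]
pevosat = [0.031, 0.25, 3.03, 0.59, 0.61, 2.90, 76.25, 14.07,
           7.86, 880.35, 56.92, 131.58, 0.71, 569.55, 2957.4]
# Rows whose p-value label is p < 0.05 (rows 1, 2 and 7 are excluded).
signif  = [False, False, True, True, True, True, False, True,
           True, True, True, True, True, True, True]

gasat_wins = sum(g < p and s for g, p, s in zip(gasat, pevosat, signif))
pevosat_wins = sum(p < g and s for g, p, s in zip(gasat, pevosat, signif))
print(gasat_wins, pevosat_wins)  # 8 4, matching the 'Total' row
```

Without the significance filter the raw counts would be 10 and 5; the 'Total' row discounts the three rows where the difference was not significant.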
pEvoSAT outperforms GASAT overall with regard to the number of evaluations (Table 4), performing fewer evaluations on 9 of the 15 instances, often by an order of magnitude. However, when comparing run-time performance, GASAT solves most SAT instances faster. Though a comparison of the number of evaluations is useful for seeing how the different algorithms operate, the measurement of interest when solving a SAT instance is run time.

With regard to run time (Table 3), GASAT outperforms pEvoSAT on 8 of the 15 instances tested, while 3 instances did not show either algorithm significantly outperforming the other. However, for some instances pEvoSAT proved to be faster than GASAT. Interestingly, GASAT seems to cope with growth in the problem space much better than pEvoSAT: when the problem size increases from 100 variables (random ksat 01v100c650.cnf) to 500 (random ksat 05v500c3250.cnf), the run time of GASAT goes from 0.014 ± 0.015 seconds to 1.52 ± 1.04 seconds, while pEvoSAT's goes from 0.031 ± 0.031 to 569.55 ± 535.53 seconds.
³ For instances 8, 12 and 15, GASAT was set to time out after 13,000 seconds even if it did not find a satisfying solution.
However, for some SAT instances, pEvoSAT has a clear advantage over GASAT. The main difference between the two algorithms is their feedback from the SAT instance itself. While GASAT only uses the SAT instance for fitness calculation and for the calculation of optimal recombination operations, pEvoSAT uses the instance to complement its assignments through UP. It is therefore logical to assume that the instances where pEvoSAT outperforms GASAT are instances where unit propagation provides considerably more feedback. In general, these are instances with a very high clause to variable ratio, such that there is a very limited set of satisfying assignments. The clause to variable ratios of instances 8, 10, 12 and 15 are 34.24, 198.49, 60.39 and 63.09, respectively. These numbers are higher than the majority of ratios in the test set, and we see that pEvoSAT outperforms GASAT on these instances by up to orders of magnitude. At the same time, for some instances of low clause to variable ratio, pEvoSAT solves the instances in times of the same order as GASAT. On the other hand, when the clause to variable ratio remains low but the number of variables increases (as it does for instances 1, 4, 9, 11, and 14), GASAT presents better and better performance relative to pEvoSAT. Therefore, it is suggested that there is a particular set of SAT problems that are better approached by pEvoSAT and the use of permutation based representation: problems where the clause to variable ratio is high, making for very few satisfying assignments. Figure 4 shows the run time ratio (tGASAT / tpEvoSAT) versus the clause to variable ratio (#Clauses / #Variables). In this figure, a value of 2 would suggest that pEvoSAT performs twice as fast as GASAT, while a value of 0.1 suggests that pEvoSAT performs 10 times slower than GASAT.
These results show that pEvoSAT outperforms GASAT when the clause to variable ratio is high, and is predominantly outperformed by GASAT when the clause to variable ratio is low. It might be expected that as the clause to variable ratio increases, the run time difference would increase as well. That is partially true if one looks at instances 8 (ratio = 34.24), 12 (ratio = 60.39), and 15 (ratio = 63.09). However, instance 10 (ratio = 198.49) violates that expectation by demonstrating a run time ratio of only about 3.1. One must keep in mind that the nature of the SAT instance also plays a role in the performance of the different algorithms. Instance 10, unlike instances 8, 12 and 15, is a SAT translation of a quasigroup instance, so its nature may favour GASAT more despite the large clause to variable ratio.
An important caveat is that these are the first results we have gathered with pEvoSAT. There are further directions and attributes that can be explored, and those will be the subject of future work. However, we emphasize that our current results show competitiveness with a leading GA based SAT solver.

In addition, while CDCL based algorithms currently outperform GAs, they have a theoretical limit imposed by the fact that they are executed serially. GAs, on the other hand, hold the potential of scalability through the use of multiple cores and parallelization. GAs are inherently easy to parallelize, since the fitness calculation, crossover, and mutation of each individual are completely independent of other individuals (in terms of data writing; parallel reads from the same block of data may be necessary). As a result, though currently lagging behind, GAs have the potential to outperform CDCL based approaches when applied in parallel.
                 Average number of evaluations ± SD
SAT      GASAT                   pEvoSAT                  p-value
1        3.33E+03 ± 4.5E+03      1.41E+02 ± 2.05E+02      p<0.05
2        2.79E+04 ± 2.9E+04      1.49E+03 ± 1.53E+03      p<0.05
3        6.25E+04 ± 6.3E+04      4.46E+03 ± 4.39E+03      p<0.05
4        1.90E+03 ± 1.2E+03      1.27E+03 ± 1.64E+03      p<0.2
5        1.00E+03 ± 1.3E+03      4.99E+03 ± 2.93E+03      p<0.05
6        1.21E+02 ± 1.57E+02     6.91E+03 ± 4.18E+03      p<0.05
7        5.98E+06 ± 6.31E+06     1.32E+05 ± 9.16E+04      p<0.05
8        8.45E+08 ± 1.02E+08     9.91E+03 ± 1.08E+03      p<0.05
9        4.36E+03 ± 3.58E+03     2.25E+04 ± 1.78E+04      p<0.05
10       2.49E+07 ± 1.21E+07     1.54E+05 ± 7.67E+04      p<0.05
11       2.45E+04 ± 2.47E+04     2.42E+05 ± 2.15E+05      p<0.05
12       4.86E+08 ± 3.11E+06     5.84E+04 ± 2.79E+04      p<0.05
13       2.22E+04 ± 1.82E+04     5.46E+02 ± 4.44E+02      p<0.05
14       8.81E+04 ± 8.52E+04     2.23E+06 ± 1.06E+06      p<0.05
15       2.22E+08 ± 7.87E+06     9.69E+05 ± 7.12E+05      p<0.05
Total    5                       9

Table 4: The number of evaluations (until a satisfying solution is found) for pEvoSAT compared to those of GASAT. The numbers are expressed as the averages of 20 runs ± standard deviation. The smaller of the two averages in each row marks the faster algorithm, and the number of instances each algorithm scored as fastest is given in the 'Total' row. The statistical significance of one algorithm outperforming its competitor, as determined by the Wilcoxon signed rank test, is given in the 'p-value' column.
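For reference, the 'Total' row is a simple per-row tally of which average is smaller; a minimal sketch of that tally, using the averages of the first three instances as read above:

```python
# Tally which algorithm needed fewer evaluations on average, per instance.
# The averages below are the first three rows of Table 4.
gasat = [3.33e3, 2.79e4, 6.25e4]    # GASAT, instances 1-3
pevosat = [1.41e2, 1.49e3, 4.46e3]  # pEvoSAT, instances 1-3

gasat_wins = sum(g < p for g, p in zip(gasat, pevosat))
pevosat_wins = sum(p < g for g, p in zip(gasat, pevosat))
print(gasat_wins, pevosat_wins)  # prints 0 3
```

Note that the tally alone says nothing about significance; that is what the per-row Wilcoxon signed rank test in the 'p-value' column provides.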
A very promising route in this endeavor is to use NVIDIA's Compute Unified Device Architecture (CUDA) [11] API, which allows programmers to easily assign tasks to GPU cores. There are usually more GPU cores than CPU cores in a given computer, and the use of GPU programming in the context of evolutionary methods has already demonstrated a notable level of success [3].

Figure 4: The ratio of GASAT's running time to pEvoSAT's running time versus the clause/variable ratio of the SAT instances, plotted on a logarithmic scale. A value of 10 indicates that pEvoSAT ran 10 times faster than GASAT, while a value of 0.1 indicates that pEvoSAT ran 10 times slower than GASAT. 95% confidence intervals are used as error bars.

4. CONCLUSION
This paper presents the novel idea of using permutation based GAs for solving boolean satisfiability instances. Experiments show that this particular use of permutation based GAs can outperform traditional GA approaches on some instances. As noted before, there are future directions of research that could be explored to improve the results even further. Nevertheless, the results shown in this paper demonstrate the potential of this method to develop into a novel approach to the SAT problem.
In addition, this paper has discussed the scalability of GAs compared to CDCL based methods. Though CDCL methods currently lead in performance, the return gained with each improvement introduced in these methods is increasingly small, while the use of GAs for the SAT problem stands to benefit from improved scaling as multi-core machines and GPU programming become more and more popular.
The approach presented here demonstrates the merits of permutation based GAs and of GA based approaches to the SAT problem in general. This method shows potential to grow, scale up, and perhaps outperform CDCL based approaches.

5. REFERENCES
[1] M. Davis, G. Logemann, and D. Loveland. A machine program for theorem-proving. Communications of the ACM, 5(7), 1962.
[2] A. E. Eiben and J. K. van der Hauw. Solving 3-SAT by GAs adapting constraint weights. In Evolutionary Computation, 1997. IEEE International Conference on, pages 81–86, April 1997.
[3] M. A. Franco, N. Krasnogor, and J. Bacardit. Speeding Up the Evolution of Evolutionary Learning Systems using GPGPUs. In Proceedings of the 12th Annual Conference on Genetic and Evolutionary Computation - GECCO 2010, pages 1039–1046. ACM Press, 2010.
[4] D. E. Goldberg and R. Lingle, Jr. Alleles, loci, and the
traveling salesman problem. In J. J. Grefenstette, editor, Proceedings of the First International Conference on Genetic Algorithms and Their Applications. Lawrence Erlbaum Associates, Publishers, 1985.
[5] A. Gorbenko and V. Popov. A Genetic Algorithm with Expansion and Exploration Operators for the Maximum Satisfiability Problem. Applied Mathematical Sciences, 7(24):1183–1190, 2013.
[6] J. Gottlieb, E. Marchiori, and C. Rossi. Evolutionary Algorithms for the Satisfiability Problem. Evolutionary Computation, 10(1):35–50, 2002.
[7] F. Lardeux, F. Saubion, and J.-K. Hao. GASAT: A Genetic Local Search Algorithm for the Satisfiability Problem. Evolutionary Computation, 14(2):223–253, July 2006.
[8] E. Marchiori and C. Rossi. A Flipping Genetic Algorithm for Hard 3-SAT Problems. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-99), pages 393–400. Morgan Kaufmann, 1999.
[9] D. G. Mitchell. A SAT Solver Primer. EATCS Bulletin (The Logic in Computer Science Column), 85:112–133, 2005.
[10] M. W. Moskewicz, C. F. Madigan, Y. Zhao, L. Zhang, and S. Malik. Chaff: Engineering an Efficient SAT Solver. In Annual ACM/IEEE Design Automation Conference, pages 530–535. ACM, 2001.
[11] NVIDIA. CUDA Parallel Programming Made Easy. http://www.nvidia.com/object/cuda_home_new.html, 2011.
[12] I. M. Oliver, D. J. Smith, and J. R. C. Holland. A study of permutation crossover operators on the traveling salesman problem. In Proceedings of the Second International Conference on Genetic Algorithms and their application, pages 224–230, Hillsdale, NJ, USA, 1987. L. Erlbaum Associates Inc.
[13] B. B. Rad, M. Masrom, and S. Ibrahim. Camouflage in Malware: from Encryption to Metamorphism. International Journal of Computer Science and Network Security, 12(8):74–83, August 2012.
[14] SAT research group, Princeton University. zChaff. http://www.princeton.edu/~chaff/zchaff.html, 2004.
[15] J. P. M. Silva and K. A. Sakallah. GRASP - A New Search Algorithm for Satisfiability. In IEEE/ACM International Conference on Computer-Aided Design, November 1996.
[16] G. Syswerda. Schedule Optimization Using Genetic Algorithms. In L. Davis, editor, Handbook of Genetic Algorithms, pages 332–349. Van Nostrand Reinhold, 1991.
[17] D. Whitley, T. Starkweather, and D. Shaner. The Traveling Salesman and Sequence Scheduling: Quality Solutions Using Genetic Edge Recombination. In L. Davis, editor, Handbook of Genetic Algorithms, chapter 22. International Thomson Computer Press, 1991.
[18] K. C. Wiese, A. A. Deschenes, and A. G. Hendriks. RnaPredict - An Evolutionary Algorithm for RNA Secondary Structure Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 5(1):25–41, 2008.
[19] K. C. Wiese and E. Glen. A permutation-based genetic algorithm for the RNA folding problem: a critical look at selection strategies, crossover operators and representation issues. BioSystems, 72:29–41, 2003.
[20] K. C. Wiese and S. D. Goodwin. Keep-Best Reproduction: A Local Family Competition Selection Strategy and the Environment it Flourishes in. Constraints, 6:399–422, 2001.
[21] H. Yuen and J. Bebel. Tough SAT Project. http://toughsat.appspot.com/, 2011.