pEvoSAT: A Novel Permutation Based Genetic Algorithm for Solving the Boolean Satisfiability Problem

Kay C. Wiese, Simon Fraser University, 8888 University Drive, Burnaby, BC, Canada, [email protected]
Boris Shabash, Simon Fraser University, 8888 University Drive, Burnaby, BC, Canada, [email protected]

ABSTRACT
In this paper we introduce pEvoSAT, a permutation based Genetic Algorithm (GA) designed to solve the boolean satisfiability (SAT) problem when it is presented in conjunctive normal form (CNF). The permutation based representation allows the algorithm to take advantage of domain specific knowledge such as unit propagation and pruning. We explore and characterize the behavior of our algorithm, and compare pEvoSAT to GASAT, a leading implementation of GAs for the solving of CNF instances.

Categories and Subject Descriptors
I.2 [Artificial Intelligence]: Miscellaneous

General Terms
Algorithms, Design, Performance

Keywords
SAT; permutation-based representation; genetic algorithms; DPLL; unit-propagation

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. GECCO'13, July 6–10, 2013, Amsterdam, The Netherlands. Copyright 2013 ACM 978-1-4503-1963-8/13/07 ...$15.00.

1. INTRODUCTION

1.1 The Boolean Satisfiability Problem
The satisfiability problem (abbreviated as SAT) is the first NP-Complete problem to be described. An instance of a SAT problem is composed of n boolean variables labeled x1 through xn and logical operators that connect them, such as conjunctions (∧), disjunctions (∨), negations (x̄), implications (→), exclusive disjunctions, etc. Every instance of SAT can be represented in Conjunctive Normal Form (CNF), where a CNF instance is a conjunction of clauses, each clause C is a disjunction of literals, and each literal l is an affirmation (xi) or a negation (x̄i) of a variable xi ∈ {x1, ..., xn}. Finding a solution for a CNF instance requires finding an assignment for each variable xi, true (1) or false (0), such that the conjunction described by the CNF is satisfied. For example, suppose the following CNF needs to be satisfied:

(x1 ∨ x2) ∧ (x̄2 ∨ x3) ∧ (x̄2 ∨ x4) ∧ (x̄4 ∨ x̄1)

A satisfying assignment would be x1 = 1, x2 = 0, x3 = 1, x4 = 0. A single CNF may have many, one, or no satisfying assignments.

If one wishes to find a solution to a CNF using exhaustive search, each variable has to be considered as being 0 or 1; as a result, for an n-variable CNF, 2^n potential solutions exist. Problems in the NP-Complete set usually admit an O(2^n) algorithm for finding a solution, where n is the size of the input. However, these algorithms perform very poorly even for fairly small inputs: a CNF composed of 250 variables has 2^250 ≈ 1.8 × 10^75 potential solutions. Assuming a computer could verify 1,000,000,000 solutions each second, it might have to run for 5.7 × 10^58 years to find the correct solution. Many CNFs of interest today contain thousands, and even millions, of variables, so exhaustive search is not a practical solution. Over the years, several advancements were made in solving SAT instances.

1.2 Unit Propagation and the DPLL Algorithm
The first significant improvement came in 1962 from the following observation: suppose one has a clause with three literals, (l1 ∨ l2 ∨ l3), and suppose both l1 and l2 have been falsified (i.e. l1 and l2 are both false). Then there is no point in considering any assignment that sets l3 false as well, since such an assignment would falsify the clause, and hence the entire CNF. This notion is called Unit Propagation (UP): if any clause has all its literals but one set to false, that literal is automatically set to true and the variable it represents is set accordingly. In our example, if l3 was xi then xi = 1, and if l3 was x̄i then xi = 0. Though a relatively straightforward observation, each variable that is unit propagated brings forth a two-pronged effect:

1. The search space is cut in half for each variable that is unit propagated. If 3 variables have been unit propagated, the search space decreases almost by an order of magnitude.

2. A literal that has been unit propagated may falsify another literal somewhere else in the CNF, potentially causing another UP to take place.

This notion gave rise to the Davis-Putnam-Logemann-Loveland (DPLL) [1] algorithm. The algorithm used two techniques to prune the search space: UP and "pure" literal elimination. A pure variable is one that is only affirmed or only negated, but not both, in the formula. If a variable is only affirmed, there is no reason to explore any assignment that sets it to false, and vice versa. This algorithm set the first milestone in pruning the search space, but for large CNFs of hundreds and thousands of variables it was still too slow.

The biggest breakthrough for CNF assignment came 34 years later, in 1996, with the introduction of clause "learning" throughout the search of possible solutions. Whenever a candidate solution proves inappropriate, a new clause is derived. This new clause implicitly details why the most recent assignment failed, so future searches can use it to prevent similar bad searches from occurring again. The details of the algorithm, known as Conflict-Driven Clause Learning (CDCL), are omitted here for brevity; interested readers can refer to [15, 9].

1.3 Previous GA Approaches
Previous GA approaches to the SAT problem mostly revolved around the classic representation of individuals: each potential solution is represented as a bit string of length n of 0s and 1s, corresponding to an assignment of the n variables. The fitness function most often used for instances of SAT is the maxSAT fitness function. The function takes an assignment, applies it, and checks how many clauses it satisfies. Mathematically it can be described as

Φ(I) = Σ_{j=0}^{n} Cj(I)        (1)

where Φ(I) is the fitness of an individual I, and Cj(I) is the value of the j-th clause under the assignment individual I represents. More satisfied clauses correspond to more fit individuals.

Mutations and crossovers are also straightforward with the bit-string representation: a mutation simply flips a bit in the string from 0 to 1 or vice versa, and a crossover swaps two substrings between two individuals. Like all GAs, this method is prone to getting stuck at local optima. Several methods were attempted to help the algorithm escape local optima, such as the weighted maxSAT fitness function [6, 2], a high mutation rate combined with hill climbing [8], and more sophisticated recombination operators [7, 5]. Of all these methods, GASAT is, to the authors' knowledge, the most promising and best implementation of a GA for solving the SAT problem. An issue not addressed by these methods is that some poor assignments are explored and no pruning of the search space is done. The formulation of UP, and of permutation based GAs, paved the way to use UP in a genetic algorithm to prune the search space whenever possible.

2. pEvoSAT
pEvoSAT is an evolutionary SAT solver. It also uses a GA to represent and solve the problem, but with a permutation based representation for the individuals, as well as specialized mutation operators. In addition, a type of "restart" is integrated if the algorithm appears to have become stuck in a local optimum.

2.1 Representation of the Problem
Unlike classical GA based methods, permutation based GAs represent individuals as an ordered set of elements relating to the problem [17, 16, 19, 13, 18]. The classical representation of SAT individuals, as mentioned in Section 1.3, is a string of length n of 0s and 1s corresponding to an assignment of the n variables. pEvoSAT instead takes a permutation based approach. Each individual is a sequence of n elements from the alphabet ℵ = {1, 2, ..., (n−1), n}, combined with a positive (True) or negative (False) sign for each element, where an individual I contains the symbol k (1 ≤ |k| ≤ n) if, and only if, it does not contain the symbol −k. If a gene gj in individual I is k (positive), this is interpreted as 'set xk in the CNF to "true"'; if it is −k (negative), it is interpreted as 'set xk to "false"'. An example of an individual in pEvoSAT could be:

I = (1, -2, -4, 6, -5, 3, 8, -10, 7, 9)

which is interpreted as "set x1 to true, set x2 to false, set x4 to false, ...". In pEvoSAT, the value at each gene of the individual is applied to the CNF, and all literals which correspond to the variable this gene represents receive an assignment. Then UP takes place, and assignments are given to variables that are propagated through UP. This may give an assignment to some variables xi ∈ {x1, ..., xn} that conflicts with the assignment individual I suggests, but in such a case the UP assignment is taken. The process is repeated until a clause becomes entirely falsified, and the fitness of the individual is then calculated. Section 2.4 discusses the details of the fitness evaluation and the selection methods. pEvoSAT works to develop an individual that can produce a satisfying assignment, but such an individual's genotype would not necessarily correspond to the satisfying assignment. Rather, its phenotype will correspond to a satisfying assignment, since the phenotype also takes into account the values from UP.

2.2 Mutation Operators
The two mutation operators used are classical mutations (negation of a variable assignment from xj to x̄j) and translocations. Classical mutations, or simply mutations in pEvoSAT, are controlled by a parameter denoted Pm, which can assume any real value in the range [0.0, 1.0]. When applied to a gene gj, a classical mutation negates the value of gj (e.g. changes it from 3 to -3, or from -16 to 16). Mutations only change the sign of the gene, not its absolute value (since changing the absolute value would cause some variables to be accounted for more than once, and some not even once).

The other type of modification used in pEvoSAT is the translocation. In a translocation, two random genes, gj and gk, are chosen and their values are swapped. In essence, this action changes a sequence of decisions, and causes a decision that may have been ignored due to UP to be executed. Translocations are also controlled by a parameter, Pt, a real value in the range [0.0, 1.0]. However, Pt does not directly correspond to the number of translocations done: Pt × size(I) translocation operations are performed, but consider the extreme case where both locations chosen are always the same; in that case, only one or no translocations would actually happen. The number of translocations can thus vary from 0 to size(I) × Pt. In that sense, Pt does not represent the number of translocations that will happen, but the degree to which an individual is rearranged; a higher Pt value produces more shuffling of the genes at every generation.
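As an illustration, the sign-flip mutation and translocation described above can be sketched as follows. This is a minimal sketch under our own assumptions (the paper does not publish code; the function names and the per-gene interpretation of Pm are ours):

```python
import random

def mutate(individual, p_m):
    """Classical mutation: flip the sign of each gene with probability p_m.
    Only the sign changes; the absolute value (the variable index) is kept,
    so the genome remains a signed permutation of 1..n."""
    return [-g if random.random() < p_m else g for g in individual]

def translocate(individual, p_t):
    """Perform p_t * len(individual) swaps of two randomly chosen genes.
    Swapping reorders the decision sequence, so a decision previously
    overridden by unit propagation may now actually be executed."""
    result = list(individual)
    for _ in range(int(p_t * len(result))):
        j = random.randrange(len(result))
        k = random.randrange(len(result))
        result[j], result[k] = result[k], result[j]  # no-op when j == k
    return result

# The example individual from Section 2.1:
I = [1, -2, -4, 6, -5, 3, 8, -10, 7, 9]
```

Note that, as described above, neither operator can duplicate or drop a variable: mutation touches only signs, and translocation only permutes positions.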
2.3 Recombination Operators
Another important feature of GAs is the use of crossover to combine elements from different individuals into one individual. The crossover operation is controlled by the parameter Pc, which can take on any value in [0.0, 1.0] and dictates what percentage of the children of the next generation will be the result of crossovers. If the parent population size is μ and the offspring size is λ, then Pc × λ individuals are produced through crossover to make the offspring generation.

When binary strings are used as the representation, crossover is performed simply by splitting the strings (parents) at random positions and recombining the pieces (into offspring). In a permutation based representation, however, recombining information from both parents becomes a much harder task: information from both parents needs to be integrated into the offspring in such a way as to produce viable offspring, and in the case of pEvoSAT, values for all variables of the SAT instance are required. There are several crossover operators available for permutation based representations. The most prominent are: Ordered Crossover (OX#2) [16], which maintains the relative order of elements from both parents; Cycle Crossover (CX) [12], which maintains the absolute position of elements from both parents; and Partially Mapped Crossover (PMX) [4], which uses information from the parents to shuffle elements in each parent. In our experiments, CX was used as the crossover method of choice.

2.3.1 Cycle Crossover (CX)
The CX operation maintains the absolute location of the different values: some of the values' locations are taken from one parent (P1) and some from the other (P2). In CX, a random position i is chosen in P1. The value of P1 at i is copied into O1 at i, i.e. O1[i] = P1[i]. Then, a position j is selected such that the value of P2 at j is the value just copied, i.e. P2[j] = O1[i]. The value j is used for the next copying action, such that the value at j in P1 is copied into O1, i.e. O1[j] = P1[j]. The next location in P2 is chosen such that its value is the one just copied, and the cycle continues until the original location i is re-encountered. At this point, all remaining entries are simply copied in order from P2. Another individual can be produced in the same way, or by applying the same algorithm with P1 and P2 switched.

The following example illustrates the process of CX. Suppose P1 and P2 are the following individuals:

P1 = (1, -2, -4, 6, -5, 3, 8, -10, 7, 9)
P2 = (-2, 5, 8, 10, -9, 3, -1, 4, -7, 6)

To produce O1, location 6 (with 0-based indexing) is randomly chosen. P1[6] = 8, so O1[6] = 8, which produces

O1 = (-, -, -, -, -, -, 8, -, -, -)

Then, the operation finds that P2[2] = 8, so it now searches location 2 to find P1[2] = −4, hence O1[2] = −4, resulting in

O1 = (-, -, -4, -, -, -, 8, -, -, -)

The operation then finds that P2[7] = 4 (notice the absolute value of −4 was searched for), and that P1[7] = −10, hence O1[7] = −10. The operation continues to find that P2[3] = 10, and P1[3] = 6, hence O1[3] = 6. Then, P2[9] = 6, and P1[9] = 9, hence O1[9] = 9. Then P2[4] = −9, and P1[4] = −5, hence O1[4] = −5. Then P2[1] = 5, and P1[1] = −2, hence O1[1] = −2. Then P2[0] = −2, and P1[0] = 1, so O1[0] = 1. Finally, P2[6] = −1, which brings the operation back to O1[6]. The remaining values are copied from P2[5] = 3 and P2[8] = −7 to give

O1 = (1, -2, -4, 6, -5, 3, 8, -10, -7, 9)

As mentioned before, CX retains the absolute location and order of the values from the parents, a property which OX#2 and PMX do not share.

2.3.2 Keep Best Reproduction (KBR)
In addition to the creation of crossover offspring, a selection method for the crossover offspring is required. Traditionally, both offspring are chosen to replace both parents. However, it was shown in [20] that there is merit in keeping the most fit parent and the most fit offspring.

2.4 The Fitness Function and Selection Method
The fitness function used to evaluate each individual is the one mentioned in Section 1.3, given in Equation 2:

Φ(I) = Σ_{j=0}^{n} Cj(I)        (2)

Here, Cj(I) is the value of the j-th clause under the assignment individual I represents.

The selection method used in pEvoSAT is tournament selection. In tournament selection of size t, t individuals are chosen at random from the population, and the winner of the tournament is the individual with the highest fitness. This individual is selected for the next generation, and the process continues for as many individuals as are required. Larger tournament sizes correspond to higher selection pressure. The selection mechanism also uses 1-elitism. As a final resort against local optima, pEvoSAT uses a particular kind of restart: inversions.

2.5 Inversions
Many algorithms that aim to solve SAT instances use restarts. The observation that the few initial decisions can dictate the rest of the search led to the idea that if a search from one set of initial conditions takes too long, perhaps a different set of initial conditions would do better. Usually this is done by a fresh restart, free from the search results seen so far. pEvoSAT, however, uses a different kind of restart: an inversion. In an inversion of individual I, all of I's decisions are negated; every gj becomes −gj, and vice versa. For example, the individual

I = (10, -5, 2, 3, 4, -1, 8, -9, -7, 6)

becomes

Iinverted = (-10, 5, -2, -3, -4, 1, -8, 9, 7, -6)

In several early iterations of pEvoSAT, solutions that scored very high in fitness had some variables set to the opposite of their value in a satisfying assignment. As a result, the inversion operator was integrated. Inversions occur at regular intervals, regulated by the RateInversion parameter: an inversion occurs RateInversion generations after the most recent improvement. That is, if during generation g a new optimum is found (one better than the best optimum observed so far), an inversion will occur RateInversion generations after g. The inversion operation is applied to all individuals, including the elites chosen by the elitism in pEvoSAT. The reason is that the elite individuals are a strong force in stalling the population over a particular region of the search space, and they must be inverted for new solution space to be explored.

3. EXPERIMENTAL RESULTS
The experimental setup for this project sought to achieve two objectives: (1) characterize pEvoSAT's performance over the given SAT instances, and (2) compare its performance against GASAT. All experiments described use the collection of 15 SAT instances shown in Table 1, and were performed on a MacBook Pro with a 2.4 GHz Intel Core 2 Duo processor, running Mac OS X 10.5.8. The SAT instances had their satisfiability verified using zChaff [14, 10].

#  | CNF Instance                               | Variables | Clauses | Clause/Variable Ratio
1  | random ksat 01v100c650.cnf¹                | 100       | 650     | 6.50
2  | ais8.cnf                                   | 113       | 1520    | 13.45
3  | ais10.cnf                                  | 181       | 3151    | 17.41
4  | random ksat 02v200c1300.cnf                | 200       | 1300    | 6.50
5  | driverlog1 k99i.renamed-as.sat05-3951.cnf  | 207       | 588     | 2.84
6  | bart17.shuffled.cnf                        | 231       | 1166    | 5.05
7  | ais12.cnf                                  | 265       | 5666    | 21.38
8  | 3blocks.cnf                                | 283       | 9690    | 34.24
9  | random ksat 03v300c1950.cnf                | 300       | 1950    | 6.50
10 | qg1-07.cnf                                 | 343       | 68083   | 198.49
11 | random ksat 04v400c2600.cnf                | 400       | 2600    | 6.50
12 | 4blocksb.cnf                               | 410       | 24758   | 60.39
13 | bw large.a.cnf                             | 459       | 4675    | 10.19
14 | random ksat 05v500c3250.cnf                | 500       | 3250    | 6.50
15 | 4blocks.cnf                                | 758       | 47820   | 63.09

Table 1: The 15 SAT instances used for the experiments in this project

3.1 Characterizing pEvoSAT
pEvoSAT was run over the test SAT instances, and the average fitness of the population, as well as the best fitness, were recorded. The experimental parameter values for the first set of experiments can be seen in Table 2.

Variable                            | Value
Number of generations (G)           | 100,000
Population size (μ)                 | 1.6 × v
Offspring population size (λ)       | 1.6 × v
Probability of mutation (Pm)        | 1.5 × (v/c)
Probability of translocation (Pt)   | 1.5 × (v/c)
Probability of crossover (Pc)       | 0.8
Crossover method                    | CX
Crossover selection strategy        | KBR
Tournament size (t)                 | 5
RateInversion                       | 1000
elitism                             | 1

Table 2: The evolutionary parameter values for the first set of experiments. v is the number of variables in the SAT instance, and c is the number of clauses in the same SAT instance

Figure 1 to Figure 3 show the recorded results of 20 runs of pEvoSAT over a few selected instances. Similar behavior was observed for all instances. The overall behaviour was quick convergence in the first several generations, followed by a slow, localized search within the region around which convergence was achieved.

The observed premature convergence could suggest a high selection pressure due to a tournament size of 5. However, a second set of experiments with a tournament size of 2 displayed similar results.
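The CX walk-through in Section 2.3.1 can be sketched in code. This is a minimal sketch under our own assumptions (the function name is ours); values are matched by absolute value, since a variable may carry opposite signs in the two parents:

```python
def cycle_crossover(p1, p2, start):
    """Cycle Crossover for signed permutations: positions along the cycle
    beginning at `start` keep p1's values; once the cycle closes, every
    remaining position is copied from p2."""
    offspring = [None] * len(p1)
    i = start
    while offspring[i] is None:
        value = p1[i]
        offspring[i] = value  # keep p1's value at this position
        # the position where p2 holds this variable is visited next
        i = next(j for j, g in enumerate(p2) if abs(g) == abs(value))
    # cycle closed: fill the untouched positions from p2
    return [p2[j] if v is None else v for j, v in enumerate(offspring)]

# The parents from the worked example in Section 2.3.1:
p1 = [1, -2, -4, 6, -5, 3, 8, -10, 7, 9]
p2 = [-2, 5, 8, 10, -9, 3, -1, 4, -7, 6]
o1 = cycle_crossover(p1, p2, 6)  # → [1, -2, -4, 6, -5, 3, 8, -10, -7, 9]
```

Starting from position 6, this reproduces the paper's O1 step by step: the cycle visits positions 6, 2, 7, 3, 9, 4, 1, 0 and takes p1's values there, while positions 5 and 8 fall outside the cycle and take p2's values (3 and −7).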
Interestingly, the algorithm shows distinctly different convergence for high clause to variable ratio instances (7, 8, 10, 12 and 15) and low clause to variable ratio instances (1-6, 9, 11, and 13-14). High clause to variable ratio instances display a steady convergence around 97% satisfiability, while low clause to variable ratio instances show a noisy convergence around 80% satisfiability.

Such results are to be expected from the nature of the pEvoSAT algorithm. High clause to variable ratio instances carry more information about the proper assignment of the variables in the sheer number of clauses. In other words, in these instances, implication by UP is invoked more often and guides the search toward very high fitness individuals. The initial population then contains either very low fitness individuals that falsify many clauses due to UP, or some very high fitness individuals. Since the selection method used is tournament selection, the high fitness individuals quickly outbreed the lower fitness ones, and then resort to competing amongst themselves. Conversely, instances with low clause to variable ratios offer less feedback through UP and rely more on the genotype of the individual. Since the search is less directed, one would expect to find a vast array of individuals with average fitness. Furthermore, one would expect the fitness to fluctuate more in these instances, since mutations and crossovers, which are both stochastic events, play a greater role in determining any given individual's fitness.

The results indicate that the algorithm behaves very differently under the two conditions. As the clause to variable ratio increases, the algorithm's behaviour becomes more predictable. This can be seen in Figure 1 and Figure 2, where the different runs overlay each other much more closely than in Figure 3. However, the algorithm has a natural tendency to get stuck in local optima when the clause to variable ratio is high, since local optima, as well as extreme minima, are very common in those instances. Moreover, low clause to variable ratio instances have potentially more solutions than high clause to variable ratio ones. Therefore, searching for the global optimum in high clause to variable ratio instances is a much harder and more challenging task, as the run times in Section 3.2, Table 3, will demonstrate.

The results in this section demonstrate that the pEvoSAT algorithm takes advantage of any information offered by the SAT instance. Like all other GAs, pEvoSAT uses the fitness of the individuals to guide the search process, but it also uses the clauses of the SAT instance itself to direct the search, such that an individual need not have a genotype which exactly matches the satisfying assignment in order to satisfy the SAT instance. The next section tests pEvoSAT's performance against GASAT on the test instances in Table 1. We will demonstrate that for some SAT instances pEvoSAT can outperform GASAT, and for many it produces competitive results.

Figure 1: The average fitness (blue) and best fitness (green) of pEvoSAT when solving qg1-07.cnf. The graph represents 20 runs overlaid on top of each other.

Figure 2: The average fitness (blue) and best fitness (green) of pEvoSAT when solving 4blocksb.cnf. The graph represents 20 runs overlaid on top of each other.

Figure 3: The average fitness (blue) and best fitness (green) of pEvoSAT when solving random ksat 05v500c3250.cnf. The graph represents 20 runs overlaid on top of each other.

¹ Instances of the form random_ksat_nvVcC.cnf were produced using the Tough SAT Project, available at http://toughsat.appspot.com/ [21]

3.2 Comparison with GASAT
The performance of pEvoSAT was tested against GASAT², the leading GA based SAT solver we are aware of. Unlike pEvoSAT, GASAT uses the classical binary string representation of the SAT problem, but employs problem specific recombination operations to preserve maximum satisfiability between generations, as well as other optimization techniques. The experiments were run with the parameter values given in Table 2 for pEvoSAT, except with a tournament size of 2, and GASAT's default parameters with its population size changed to also be 1.6 × v. This change guarantees that over a single generation each algorithm examines the same number of individuals, making the measurements comparable. The results of the comparison can be seen in Table 3 and Table 4.

² Obtained from http://www.info.univ-angers.fr/~lardeux/

SAT   | GASAT Run Time (s) ± SD | pEvoSAT Run Time (s) ± SD | p-value
1     | 0.014 ± 0.015           | 0.031 ± 0.031             | p<0.1
2     | 0.15 ± 0.14             | 0.25 ± 0.29               | p<0.5
3     | 0.64 ± 0.76             | 3.03 ± 2.09               | p<0.05
4     | 0.024 ± 0.016           | 0.59 ± 0.61               | p<0.05
5     | 0.0062 ± 0.0071         | 0.61 ± 0.54               | p<0.05
6     | 0.003 ± 8.9E-19         | 2.90 ± 2.02               | p<0.05
7     | 135.33 ± 127.45         | 76.25 ± 88.23             | p<0.2
8     | 12190.33 ± 1050.81      | 14.07 ± 14.56             | p<0.05
9     | 0.081 ± 0.055           | 7.86 ± 6.53               | p<0.05
10    | 2049.22 ± 2848.36       | 880.35 ± 560.68           | p<0.05
11    | 0.24 ± 0.33             | 56.92 ± 33.70             | p<0.05
12    | 11992.63 ± 797.27       | 131.58 ± 341.18           | p<0.05
13    | 0.202 ± 0.21            | 0.71 ± 0.63               | p<0.05
14    | 1.52 ± 1.04             | 569.55 ± 535.53           | p<0.05
15    | 11363.03 ± 441.38       | 2957.4 ± 3006.1           | p<0.05
Total | 8                       | 4                         |

Table 3: The running times (until a satisfiable solution is found³) for pEvoSAT compared to those of GASAT. The times are expressed as the averages of 20 runs ± standard deviation. The shortest running time for each row is labeled in blue, and the total number of instances on which each algorithm was fastest is given in the 'Total' row. The statistical significance of one algorithm outperforming its competitor, as determined by the Wilcoxon signed rank test, is given in the 'p-value' column.

pEvoSAT outperforms GASAT overall with regards to the number of evaluations (Table 4), performing fewer evaluations on 9 out of the 15 instances, often by an order of magnitude.
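The UP-driven evaluation that underlies these evaluation counts (Sections 2.1 and 3.1) can be made concrete with a minimal sketch. This is our own illustration, not the authors' implementation: clauses are lists of signed integers (DIMACS-style), genes are applied in order, each assignment is followed by exhaustive unit propagation, and decoding stops once a clause is falsified.

```python
def evaluate(individual, clauses):
    """Decode a signed-permutation individual against a CNF and return the
    number of satisfied clauses (the maxSAT fitness of the phenotype).
    Genes whose variable was already fixed by UP are skipped: the UP
    assignment takes precedence, as described in Section 2.1."""
    assignment = {}  # variable index -> True/False

    def value(lit):
        var = abs(lit)
        if var not in assignment:
            return None  # variable not yet assigned
        return assignment[var] if lit > 0 else not assignment[var]

    def propagate():
        changed = True
        while changed:
            changed = False
            for clause in clauses:
                if any(value(l) is True for l in clause):
                    continue  # clause already satisfied
                unknown = [l for l in clause if value(l) is None]
                if len(unknown) == 1:  # all other literals false: force it
                    lit = unknown[0]
                    assignment[abs(lit)] = lit > 0
                    changed = True

    for gene in individual:
        if abs(gene) not in assignment:  # UP assignments win over the genotype
            assignment[abs(gene)] = gene > 0
            propagate()
        if any(all(value(l) is False for l in clause) for clause in clauses):
            break  # a clause is entirely falsified: stop decoding
    return sum(any(value(l) is True for l in clause) for clause in clauses)

# The example CNF from Section 1.1: (x1∨x2)∧(¬x2∨x3)∧(¬x2∨x4)∧(¬x4∨¬x1)
cnf = [[1, 2], [-2, 3], [-2, 4], [-4, -1]]
```

On this CNF, assigning x1 = True alone already propagates x4 = False (from ¬x4∨¬x1) and then x2 = False (from ¬x2∨x4), illustrating how a high clause to variable ratio lets UP, rather than the genotype, fix most of the assignment.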
However, when comparing run-time performance, GASAT seems to solve most SAT instances faster. Though a comparison of the number of evaluations is useful to see how the different algorithms operate, the measurement of interest when solving a SAT instance is run time. With regards to run time (Table 3), GASAT outperforms pEvoSAT on 8 of the 15 instances tested, while 3 instances did not show either algorithm significantly outperforming the other. However, for some instances, pEvoSAT proved to be faster than GASAT.

Interestingly enough, GASAT seems to cope with increasing problem size much better than pEvoSAT. When the problem size increases from 100 variables (random ksat 01v100c650.cnf) to 500 (random ksat 05v500c3250.cnf), the run time of GASAT goes from 0.014 ± 0.015 seconds to 1.52 ± 1.04 seconds, while pEvoSAT's goes from 0.031 ± 0.031 to 569.55 ± 535.53 seconds.

³ For instances 8, 12 and 15, GASAT was set to time out after 13,000 seconds even if it did not find a satisfying solution.

However, for some SAT instances, pEvoSAT has a clear advantage over GASAT. The main difference between the two algorithms is their feedback from the SAT instance itself. While GASAT only uses the SAT instance for fitness calculation and for calculating optimal recombination operations, pEvoSAT uses the instance to complement its assignments through UP. It is therefore logical to assume that the instances where pEvoSAT outperforms GASAT are those where unit propagation provides considerably more feedback. In general, these are instances with a very high clause to variable ratio, such that there is a very limited set of satisfying assignments. The clause to variable ratios of instances 8, 10, 12 and 15 are 34.24, 198.49, 60.39 and 63.09, respectively. These are higher than the majority of ratios in the test set, and pEvoSAT outperforms GASAT on these instances by up to orders of magnitude. At the same time, for some instances of low clause to variable ratio, pEvoSAT solves the instance in the same order of time as GASAT. On the other hand, when the clause to variable ratio remains low but the number of variables increases (as it does for instances 1, 4, 9, 11, and 14), GASAT performs better and better relative to pEvoSAT. Therefore, we suggest that there is a particular set of SAT problems that are better approached by pEvoSAT and the use of permutation based representation: problems where the clause to variable ratio is high, leaving very few satisfying assignments.

Figure 4 shows the run time ratio (tGASAT / tpEvoSAT) versus the clause to variable ratio (#Clauses / #Variables). In this figure, a value of 2 indicates that pEvoSAT performed twice as fast as GASAT, while a value of 0.1 indicates that pEvoSAT performed 10 times slower than GASAT. These results show that pEvoSAT outperforms GASAT when the clause to variable ratio is high and is predominantly outperformed by GASAT when the clause to variable ratio is low. It may be expected that as the clause to variable ratio increases, the run time difference would increase as well. That is partially true for instances 8 (ratio = 34.24), 12 (ratio = 60.39), and 15 (ratio = 63.09). However, instance 10 (ratio = 198.49) violates that expectation by demonstrating a run time ratio of about 3.1. One must keep in mind that the nature of the SAT instance also plays a role in the performance of the different algorithms: instance 10, unlike instances 8, 12 and 15, is a SAT translation of a QuasiGroup instance, and its nature may favour GASAT more despite the large clause to variable ratio.

Figure 4: The ratio of GASAT's running time to pEvoSAT's running time versus the clause/variable ratio of the SAT instances, plotted on a logarithmic scale. A value of 10 indicates that pEvoSAT ran 10 times faster than GASAT, while a value of 0.1 indicates pEvoSAT ran 10 times slower than GASAT. 95% confidence intervals are used as error bars.

SAT   | GASAT Avg. evaluations ± SD | pEvoSAT Avg. evaluations ± SD | p-value
1     | 3.33E+03 ± 4.5E+03          | 1.41E+02 ± 2.05E+02           | p<0.05
2     | 2.79E+04 ± 2.9E+04          | 1.49E+03 ± 1.53E+03           | p<0.05
3     | 6.25E+04 ± 6.3E+04          | 4.46E+03 ± 4.39E+03           | p<0.05
4     | 1.90E+03 ± 1.2E+03          | 1.27E+03 ± 1.64E+03           | p<0.2
5     | 1.00E+03 ± 1.3E+03          | 4.99E+03 ± 2.93E+03           | p<0.05
6     | 1.21E+02 ± 1.57E+02         | 6.91E+03 ± 4.18E+03           | p<0.05
7     | 5.98E+06 ± 6.31E+06         | 1.32E+05 ± 9.16E+04           | p<0.05
8     | 8.45E+08 ± 1.02E+08         | 9.91E+03 ± 1.08E+03           | p<0.05
9     | 4.36E+03 ± 3.58E+03         | 2.25E+04 ± 1.78E+04           | p<0.05
10    | 2.49E+07 ± 1.21E+07         | 1.54E+05 ± 7.67E+04           | p<0.05
11    | 2.45E+04 ± 2.47E+04         | 2.42E+05 ± 2.15E+05           | p<0.05
12    | 4.86E+08 ± 3.11E+06         | 5.84E+04 ± 2.79E+04           | p<0.05
13    | 2.22E+04 ± 1.82E+04         | 5.46E+02 ± 4.44E+02           | p<0.05
14    | 8.81E+04 ± 8.52E+04         | 2.23E+06 ± 1.06E+06           | p<0.05
15    | 2.22E+08 ± 7.87E+06         | 9.69E+05 ± 7.12E+05           | p<0.05
Total | 5                           | 9                             |

Table 4: The number of evaluations (until a satisfiable solution is found) for pEvoSAT compared to those of GASAT. The numbers are expressed as the averages of 20 runs ± standard deviation. The smallest number of evaluations for each row is labeled in blue, and the total number of instances on which each algorithm performed fewest evaluations is given in the 'Total' row. The statistical significance of one algorithm outperforming its competitor, as determined by the Wilcoxon signed rank test, is given in the 'p-value' column.

An important observation we acknowledge is that these are the first results we have gathered for pEvoSAT. There are further directions and attributes that can be explored, and those will be the subject of future work. However, we do emphasize that our current results show competitiveness with a leading GA based SAT solver. In addition, while CDCL based algorithms currently outperform GAs, they have a theoretical limit enforced by the fact that they are executed serially. GAs, on the other hand, hold the potential of scaling through the use of multiple cores and parallelization. GAs are inherently easy to parallelize, since the fitness calculation, crossover, and mutation of each individual are completely independent from other individuals (in terms of data writing; parallel reads from the same block of data may be necessary). As a result, though currently lagging behind, GAs have the potential to outperform CDCL based approaches when applied in multi-core systems. A very promising potential route in this endeavor is to use NVIDIA's Compute Unified Device Architecture (CUDA) [11] API, which allows programmers to easily assign tasks to GPU cores. There are usually more GPU cores than CPU cores in a given computer, and the use of GPU programming in the context of evolutionary methods has demonstrated a notable level of success [3].

4. CONCLUSION
This paper presents the novel idea of using permutation based GAs for the purpose of solving boolean satisfiability instances. Experiments show that this particular use of permutation based GAs can outperform traditional GA approaches on some instances. As noted before, there are future directions of research that could be explored to improve the results even further. However, the results shown in this paper show potential for development into a novel approach to dealing with SAT problems. In addition, some ideas have been discussed in this paper regarding the scalability of GAs compared to CDCL based methods. Though CDCL methods are in the lead at the moment, performance wise, the return gained with each improvement introduced in these methods is increasingly smaller, while the use of GAs for the SAT problem stands to benefit from improvements in scaling as multi-core machines and GPU programming become more and more popular. The approach presented here demonstrates the merits of permutation based GAs and the use of GA based approaches for the SAT problem. This method shows potential to grow, scale up, and perhaps outperform CDCL based approaches.

5. REFERENCES
[1] M. Davis, G. Logemann, and D. Loveland. A machine program for theorem-proving. Communications of the ACM, 5(7), 1962.
[2] A. E. Eiben and J. K. van der Hauw. Solving 3-SAT by GAs adapting constraint weights. In Evolutionary Computation, 1997, IEEE International Conference on, pages 81–86, April 1997.
[3] M. A. Franco, N. Krasnogor, and J. Bacardit. Speeding Up the Evolution of Evolutionary Learning Systems using GPGPUs. In Proceedings of the 12th Annual Conference on Genetic and Evolutionary Computation - GECCO 2010, pages 1039–1046. ACM Press, 2010.
[4] D. E. Goldberg and R. Lingle. Alleles, loci, and the traveling salesman problem. In J. J. Grefenstette, editor, Proceedings of the First International Conference on Genetic Algorithms and Their Applications. Lawrence Erlbaum Associates, Publishers, 1985.
[5] A. Gorbenko and V. Popov. A Genetic Algorithm with Expansion and Exploration Operators for the Maximum Satisfiability Problem. Applied Mathematical Sciences, 7(24):1183–1190, 2013.
[6] J. Gottlieb, E. Marchiori, and C. Rossi. Evolutionary Algorithms for the Satisfiability Problem. Evolutionary Computation, 10(1):35–50, 2002.
[7] F. Lardeux, F. Saubion, and J.-K. Hao. GASAT: A Genetic Local Search Algorithm for the Satisfiability Problem. Evolutionary Computation, 14(2):223–253, July 2006.
[8] E. Marchiori and C. Rossi. A Flipping Genetic Algorithm for Hard 3-SAT Problems. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-99), pages 393–400. Morgan Kaufmann, 1999.
[9] D. G. Mitchell. A SAT Solver Primer. EATCS Bulletin (The Logic in Computer Science Column), 85:112–133, 2005.
[10] M. W. Moskewicz, C. F. Madigan, Y. Zhao, L. Zhang, and S. Malik. Chaff: Engineering an Efficient SAT Solver. In Annual ACM IEEE Design Automation Conference, pages 530–535. ACM, 2001.
[11] NVIDIA. CUDA: Parallel Programming Made Easy. http://www.nvidia.com/object/cuda_home_new.html, 2011.
[12] I. M. Oliver, D. J. Smith, and J. R. C. Holland. A study of permutation crossover operators on the traveling salesman problem. In Proceedings of the Second International Conference on Genetic Algorithms and their application, pages 224–230, Hillsdale, NJ, USA, 1987. L. Erlbaum Associates Inc.
[13] B. B. Rad, M. Masrom, and S. Ibrahim. Camouflage in Malware: from Encryption to Metamorphism. International Journal of Computer Science and Network Security, 12(8):74–83, August 2012.
[14] SAT research group, Princeton University. zChaff. http://www.princeton.edu/~chaff/zchaff.html, 2004.
[15] J. P. M. Silva and K. A. Sakallah. GRASP - A New Search Algorithm for Satisfiability. In IEEE/ACM International Conference on Computer-Aided Design, November 1996.
[16] G. Syswerda. Schedule Optimization Using Genetic Algorithms. In L. Davis, editor, Handbook of Genetic Algorithms, pages 332–349. Van Nostrand Reinhold, 1991.
[17] D. Whitley, T. Starkweather, and D. Shaner. The Traveling Salesman and Sequence Scheduling: Quality Solutions Using Genetic Edge Recombination. In L. Davis, editor, Handbook of Genetic Algorithms, chapter 22. International Thomson Computer Press, 1991.
[18] K. C. Wiese, A. A. Deschenes, and A. G. Hendriks. RnaPredict - An Evolutionary Algorithm for RNA Secondary Structure Prediction. ACM Transactions on Computational Biology and Bioinformatics, 5(1):25–41, 2008.
[19] K. C. Wiese and E. Glen. A permutation-based genetic algorithm for the RNA folding problem: a critical look at selection strategies, crossover operators and representation issues. BioSystems, 7:29–41, 2003.
[20] K. C. Wiese and S. D. Goodwin.
Keep-Best Reproduction: A Local Family Competition Selection Strategy and the Environment it Flourishes in. Constraints, 6:399–422, 2001. H. Yuen and J. Bebel. Tough SAT Project. http://toughsat.appspot.com/, 2011.