Backdoor Sets in SAT Instances
Ryan Williams, Carnegie Mellon University
Joint work in IJCAI03 with Carla Gomes and Bart Selman, Cornell University

Significant progress in complete search methods!
(Complete = always returns SAT or unSAT)
Software and hardware verification: complete methods are critical, e.g. for verifying the correctness of chip design, using SAT encodings.
Current methods can automatically verify the correctness of a large fraction of a Pentium IV.

A “real world” example
(Thanks to: Oliver Kullmann)

Bounded Model Checking instance:
i.e. (( x1) or x7) and ((x1) or x6) and … etc.

10 pages later:
… (x177 or x169 or x161 or x153 … or x17 or x9 or x1 or (x185))
clauses / constraints are getting more interesting…

4000 pages later:
?!! a 59-cnf clause… …

Finally, 15,000 pages later:
Note that: … !!!
The MiniSat solver (Eén & Sörensson) solves this instance in 2 seconds.

Gap between Theory and Practice
The good scaling behavior of state-of-the-art SAT solvers seems to defy our complexity-theoretic intuition that SAT is NP-complete!
How can we explain this gap between theory and practice? What makes this possible?
Our answer: Hidden tractable substructure in real-world problems.
Can we make this more precise?
Proposal: We consider structures we call backdoor sets.
The idea came out of the study of heavy-tailed phenomena in runtime distributions for some SAT solvers.

Backdoor Sets – Initial Motivation
Heavy-tailed distributions and randomization.
Certain problems, when solved by randomized backtracking, yield a runtime distribution that is heavy-tailed:
Pr[solution found in time t] ~ 1/t^c, 0 < c < 2
• Explains why restarting a solver often is an effective strategy
• Implies a wide range of possible solution times, often including short runs
How to explain short runs?
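
The link between heavy tails and restarts can be illustrated with a small simulation. This is only a sketch under stated assumptions: a Pareto distribution (Python's random.paretovariate) stands in for the runtime of a randomized backtracking solver, and the tail exponent 0.5 and the cutoff 4.0 are arbitrary illustrative values, not figures from the talk. The function names are hypothetical.

```python
import random


def run_once(rng, alpha):
    """One simulated solver run; its cost is Pareto(alpha), a heavy-tailed stand-in."""
    return rng.paretovariate(alpha)


def run_with_restarts(rng, alpha, cutoff):
    """Abort any run that exceeds `cutoff` and start a fresh, independent run."""
    spent = 0.0
    while True:
        t = run_once(rng, alpha)
        if t <= cutoff:
            return spent + t  # this run finished within the cutoff
        spent += cutoff       # pay for the aborted run, then restart


def mean_cost(sample, trials):
    """Average cost of `trials` independent samples."""
    return sum(sample() for _ in range(trials)) / trials


if __name__ == "__main__":
    rng = random.Random(0)
    plain = mean_cost(lambda: run_once(rng, 0.5), 10_000)
    restarted = mean_cost(lambda: run_with_restarts(rng, 0.5, 4.0), 10_000)
    print(f"mean cost without restarts: {plain:.1f}")
    print(f"mean cost with restarts:    {restarted:.1f}")
```

In this toy model the occasional extremely long run dominates the average without restarts, while the restart strategy bounds the cost of each attempt and relies on the frequent short runs.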

Explaining short runs: Backdoors to tractability
Informally: A backdoor set to a given problem instance is a subset of its variables such that, once assigned values, the remaining instance simplifies to a tractable class.
Formally, we define the notions of:
• a “sub-solver” (handles tractable substructure of the problem instance)
• a backdoor set and a strong backdoor set

Defining a sub-solver
The definition is general enough to encompass many polynomial-time propagation methods. (Also those for which we do not know a clean characterization of the tractable subclass.)
Valid for other encoding languages besides SAT: e.g., Mixed Integer Programming and Constraint Satisfaction Problems

Defining backdoors
Backdoor set (for satisfiable instances):
Strong backdoor set (applies to satisfiable or inconsistent instances):

Backdoors can be surprisingly small:
Backdoors help explain how a solver can get “lucky” on certain runs: backdoor sets are identified early on in backtracking search.
Most recent: Other combinatorial domains. E.g. Graphplan planning, near constant size backdoors (2 or 3 variables) in certain domains. (Hoffman, Gomes, Selman ’03)
Backdoors capture critical problem resources (bottlenecks).

Constraint Satisfaction Problem
The Constraint Satisfaction Problem (CSP):
• A finite set of n variables is given, and with each variable is associated a non-empty finite domain.
• A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk) ⊆ D1 x … x Dk.
• A solution to a CSP is an assignment of values to all the variables, satisfying all the constraints.
• (Satisfaction of a constraint = the relation holds)
(Dechter 86, Freuder 82, Mackworth 77, Tsang 93, van Beek and Dechter 97)

Explicit Algorithms for Finding/Exploiting Backdoor Sets
We cover three kinds of strategies for dealing with instances with small backdoor sets:
• A deterministic algorithm
• A randomized algorithm
– Provably better worst-case performance than the deterministic one
• A heuristic randomized algorithm
– Assumes existence of a good heuristic for choosing variables to branch on
– We believe this is close to what happens in practice

Deterministic Generalized Iterative Deepening

Randomized Generalized Iterative Deepening
Assumption: There exists a backdoor whose size is bounded by a function of n (call it B(n))
Idea: Repeatedly choose random subsets of variables that are slightly larger than B(n), searching these subsets for the backdoor

Deterministic Versus Randomized
Suppose variables have 2 possible values (e.g. SAT)
For B(n) = n/k, algorithm runtime is c^n
[Table: value of c by k for the deterministic algorithm and the randomized algorithm; entries not recovered]

Complete Randomized Depth First Search with Heuristic
Assume we have the following.
DFS, a generic randomized depth-first backtrack search solver with:
• (polytime) sub-solver A
• Heuristic H that (randomly) chooses variables to branch on, in polynomial time
H has probability 1/h of choosing a backdoor variable (h is a fixed constant)
Call this ensemble (DFS, H, A)

Polytime Restart Strategy for (DFS, H, A)
Essentially: If there is a small backdoor, then (DFS, H, A) has a restart strategy that runs in polytime.

Runtime Table for Algorithms
[Table of runtimes, including (DFS, H, A); entries not recovered]
B(n) = upper bound on the size of a backdoor, given n variables
When the backdoor is a constant fraction of n, there is an exponential improvement between the randomized and deterministic algorithm

Summary
Introduced notion of a “backdoor set” of variables.
1) More closely captures combinatorics of a problem instance, as dealt with in practice.
2) Provides insight into restart strategies.
3) Backdoors can be surprisingly small in practice.
4) Search heuristics + randomization can be used to find them, provably efficiently.
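
To make the backdoor notions concrete, here is a minimal sketch. The transcript does not preserve the formal definitions, so the code assumes a common reading of the informal one: a set S is a backdoor if some assignment to S lets the sub-solver find a satisfying assignment, and a strong backdoor if every assignment to S lets the sub-solver decide the simplified instance. Unit propagation is used as one possible sub-solver. The brute-force search over subsets of increasing size is only in the spirit of iterative deepening and is not claimed to be the deterministic algorithm from the talk. All function names are hypothetical; clauses are lists of nonzero integers in DIMACS style.

```python
from itertools import combinations, product


def unit_propagate(clauses, assignment):
    """Assumed sub-solver: returns "SAT", "UNSAT", or None when it cannot decide."""
    assignment = dict(assignment)
    while True:
        unit, all_satisfied = None, True
        for clause in clauses:
            if any(assignment.get(abs(lit)) == (lit > 0) for lit in clause):
                continue  # clause already satisfied
            free = [lit for lit in clause if abs(lit) not in assignment]
            if not free:
                return "UNSAT"  # every literal of this clause is false
            all_satisfied = False
            if len(free) == 1 and unit is None:
                unit = free[0]  # remember one forced literal
        if all_satisfied:
            return "SAT"
        if unit is None:
            return None  # no unit clause left: the sub-solver rejects
        assignment[abs(unit)] = unit > 0


def outcomes(clauses, subset):
    """Sub-solver result for every assignment to the variables in `subset`."""
    for values in product([False, True], repeat=len(subset)):
        yield unit_propagate(clauses, dict(zip(subset, values)))


def is_backdoor(clauses, subset):
    """Some assignment to `subset` lets the sub-solver find a solution."""
    return any(result == "SAT" for result in outcomes(clauses, subset))


def is_strong_backdoor(clauses, subset):
    """Every assignment to `subset` lets the sub-solver decide the instance."""
    return all(result is not None for result in outcomes(clauses, subset))


def find_smallest_strong_backdoor(clauses):
    """Brute force: try all variable subsets in order of increasing size."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    for size in range(len(variables) + 1):
        for subset in combinations(variables, size):
            if is_strong_backdoor(clauses, subset):
                return subset
    return None


if __name__ == "__main__":
    # (x1 or x2) and (not x1 or x3) and (not x2 or x3) and (not x3 or x4)
    example = [[1, 2], [-1, 3], [-2, 3], [-3, 4]]
    print(find_smallest_strong_backdoor(example))
```

On this four-variable example, fixing x1 alone lets unit propagation settle the rest of the formula, so {x1} is a strong backdoor of size 1, whereas fixing x4 alone does not suffice. The enumeration is exponential in the backdoor size, which is why small backdoors matter.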