Phase transitions in NP-complete problems:
a challenge for probability, combinatorics, and
computer science
Cristopher Moore
University of New Mexico
and the Santa Fe Institute
Tuesday, October 12, 2010
Satisfiability
• n Boolean variables, x1, x2, ... xn
• m constraints, or “clauses”: e.g.
(x1 ∨ x4 ∨ x5 ) ∧ (x4 ∨ x7 ∨ x19 ) ∧ · · ·
• Can we satisfy the entire formula, i.e., all the clauses at once?
• There are 2^n possible truth assignments—exhaustive search takes exponential time.
• Can we do better? Is there an algorithm that works in poly(n) time, i.e., O(n^c) time for some constant c?
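A minimal brute-force sketch of this search (my own illustration; the ±integer literal encoding, +i for x_i and −i for its negation, is a common convention, not something defined in the slides):

```python
from itertools import product

def satisfiable(n, clauses):
    """Exhaustive search: try all 2^n truth assignments.
    A clause is a list of literals; +i means x_i, -i means NOT x_i."""
    for assignment in product([False, True], repeat=n):
        # literal l is satisfied if the sign of l matches the value of x_|l|
        if all(any(assignment[abs(l) - 1] == (l > 0) for l in clause)
               for clause in clauses):
            return True
    return False

# (x1 OR x4) AND (NOT x1 OR x2) AND (NOT x2 OR NOT x4)
print(satisfiable(4, [[1, 4], [-1, 2], [-2, -4]]))  # True
```

The loop is exactly the exponential search the slide describes: nothing smarter than trying every assignment.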
P vs. NP
• P: we can find the solution (or tell if there is one) in poly(n) time
• NP: we can check a solution in poly(n) time, e.g. 3-SAT
• If P=NP, we can find tours, tilings, proofs...
• Surely finding is harder than checking! Can we prove P≠NP?
Now suppose that P = NP. You can take your favorite unsolved mathematical problem—the Riemann
Hypothesis, the Goldbach Conjecture, you name it—and use your polynomial-time algorithm for SHORT
PROOF to search for proofs of less than, say, a billion lines. For that matter, if P = NP you can quickly search
for solutions to most of the other Millennium Problems as well. The point is that no proof constructed by a
human will be longer than a billion lines anyway, even when we go through the tedious process of writing
it out axiomatically. So, if no such proof exists, we have no hope of finding one.
This point was raised long before the classes P and NP were defined in their modern forms. In 1956,
the logician Kurt Gödel wrote a letter to John von Neumann. Turing had shown that the Entscheidungsproblem,
the problem of whether a proof exists at all for a given statement, is undecidable—that is, as we will discuss in
Chapter 7, it cannot be solved in finite time by any program. In his letter Gödel considered the bounded
version of this problem, where we ask whether there is a proof of length n or less. He defined ϕ(n) as the
time it takes the best possible algorithm to decide this, and wrote:
The question is, how fast does ϕ(n) grow for an optimal machine. One can show that ϕ(n) ≥ Kn. If
there actually were a machine with ϕ(n) ∼ Kn (or even only ϕ(n) ∼ Kn²), this would have
consequences of the greatest magnitude. That is to say, it would clearly indicate that, despite the
unsolvability of the Entscheidungsproblem, the mental effort of the mathematician in the case of
yes-or-no questions could be completely replaced by machines (footnote: apart from the postulation
of axioms). One would simply have to select an n large enough that, if the machine yields no result,
there would then be no reason to think further about the problem.
Gödel goes on to point out that we can find proofs in polynomial time—in our terms, that P = NP—precisely
if we can do much better than exhaustive search (dem blossen Probieren, meaning “simply trying”) among
N alternatives, where N is exponentially large in n:

ϕ ∼ Kn (or ∼ Kn²) means, simply, that the number of steps vis-à-vis exhaustive search can be
reduced from N to log N (or (log N)²).

Since there are an exponential number of possible proofs of length n, even excluding those which are
obviously nonsense, this problem seems very hard. As mathematicians, we like to believe that we need
to use all the tools at our disposal—drawing analogies with previous problems, visualizing and doodling, …
• In the modern age, proving that P≠NP is one of the outstanding open questions in mathematics.
NP-completeness
• A problem A is NP-complete if any problem in NP can be translated (or
“reduced”) to A in poly(n) time
• Problems are NP-complete because we can
build computers out of them.
• Imagine a circuit that checks a proposed solution:
• We can express the claim that this circuit works,
and that a solution exists, as a 3-SAT formula:
(x1 ∨ ¬y1 ) ∧ (x2 ∨ ¬y1 ) ∧ (¬x1 ∨ ¬x2 ∨ y1 ) ∧ ··· ∧ z
• If any NP-complete problem is in P, then P=NP.
[Figure: a small circuit with inputs x1, x2 feeding AND, OR, NOT gates with internal wires y1, y2, y3 and output z.]
NP-completeness
• If A is NP-complete, and we can translate A to B, then B is NP-complete too.
Since 1972, the tree has grown...
[Figure: the tree of reductions, rooted at Witness Existence and Circuit SAT, including 3-SAT, NAE-3-SAT, Planar SAT, Graph 3-Coloring, Independent Set, Vertex Cover, Max Clique, Max Cut, Max 2-SAT, Hamiltonian Path, 3-Matching, Subset Sum, Integer Partitioning, Exact Spanning Tree, Quadratic Diophantine Equation, Cosine Integral, Tiling, CA Predecessor.]
Random formulas
• Instead of having an adversary choose the formula, generate it randomly.
• Choose m clauses uniformly and independently: pick variables xi, xj, xk at random, and negate
each one with probability 1/2
• Denote these formulas F3(n,m), analogous to random graphs G(n,m).
• Sparse case: m=αn for some constant α
• Intuitively, the larger α is, the harder it is to satisfy all the clauses...
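The model F3(n,m) is easy to sample; a minimal sketch (using the same hypothetical ±integer literal encoding as before: +i for xi, −i for its negation):

```python
import random

def random_3sat(n, alpha, rng=random):
    """Draw a random formula F3(n, m = alpha*n): each clause picks three
    distinct variables uniformly and negates each with probability 1/2."""
    m = int(alpha * n)
    return [[v if rng.random() < 0.5 else -v
             for v in rng.sample(range(1, n + 1), 3)]
            for _ in range(m)]

f = random_3sat(20, 4.2)
print(len(f), len(f[0]))  # 84 3
```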
A phase transition
[Figure: P = Pr[F3(n, αn) is satisfiable] vs. α ∈ [3, 7], for n = 10, 15, 20, 30, 50, 100; the transition sharpens as n grows.]
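Curves like these can be reproduced in miniature; a Monte Carlo sketch for small n (self-contained, with illustrative parameters of my own choosing):

```python
import random
from itertools import product

def random_3sat(n, m, rng=random):
    """Each clause: three distinct variables, each negated with prob. 1/2."""
    return [[v if rng.random() < 0.5 else -v
             for v in rng.sample(range(1, n + 1), 3)]
            for _ in range(m)]

def satisfiable(n, clauses):
    """Brute force over all 2^n assignments (small n only)."""
    return any(all(any(a[abs(l) - 1] == (l > 0) for l in c) for c in clauses)
               for a in product([False, True], repeat=n))

def p_sat(n, alpha, trials=100, seed=0):
    """Monte Carlo estimate of Pr[F3(n, m = alpha*n) is satisfiable]."""
    rng = random.Random(seed)
    m = int(alpha * n)
    return sum(satisfiable(n, random_3sat(n, m, rng))
               for _ in range(trials)) / trials

print(p_sat(10, 1.0), p_sat(10, 8.0))  # near 1 at low density, near 0 at high
```

Sweeping α between 3 and 7 for a few values of n reproduces the crossing curves in the figure.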
Search times
[Figure: number of DPLL calls (log scale, 10¹ to 10⁶) vs. α ∈ [2, 8]; the search times peak near the transition.]
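A minimal DPLL sketch (unit propagation plus branching; this simplified variant is my own illustration, not the exact solver behind the plot):

```python
def dpll(clauses, assignment=None):
    """Tiny DPLL: clauses use the +/- integer literal encoding.
    Returns a satisfying assignment as a dict, or None."""
    if assignment is None:
        assignment = {}
    # simplify: drop satisfied clauses, prune falsified literals
    simplified = []
    for c in clauses:
        if any(assignment.get(abs(l)) == (l > 0) for l in c):
            continue
        c = [l for l in c if abs(l) not in assignment]
        if not c:
            return None          # empty clause: contradiction, backtrack
        simplified.append(c)
    if not simplified:
        return assignment        # every clause satisfied
    # unit propagation: a one-literal clause forces its variable
    unit = next((c[0] for c in simplified if len(c) == 1), None)
    lit = unit if unit is not None else simplified[0][0]
    for value in ([lit > 0] if unit is not None else [True, False]):
        result = dpll(simplified, {**assignment, abs(lit): value})
        if result is not None:
            return result
    return None

print(dpll([[1, 2], [-1, 2], [-2, 3]]))  # {1: True, 2: True, 3: True}
```

Counting the recursive calls of a solver like this, on formulas of increasing α, is how plots of this kind are produced.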
A threshold conjecture
• k-SAT formulas have k variables per clause
• Define random formulas Fk(n,m) similarly.
• Conjecture: for each k ≥ 2, there is a constant αc such that
lim(n→∞) Pr[Fk(n, m = αn) is satisfiable] = 1 if α < αc, and 0 if α > αc
• Proving this seems quite hard, but we know something almost as good...
A threshold theorem, almost
• [Friedgut] For each k ≥ 2, there is a function αc(n) such that for any ε>0,
lim(n→∞) Pr[Fk(n, m = αn) is satisfiable] = 1 if α < (1 − ε) αc(n), and 0 if α > (1 + ε) αc(n)
• Except for k=2, we don’t know that αc(n) converges to a constant.
• But we do have many theorems placing upper and lower bounds on αc
assuming that it exists, i.e., showing that Fk(n,m=αn) is w.h.p. satisfiable for
α<α*, or w.h.p. unsatisfiable for α>α*.
An easy upper bound
• Let Z be the number of satisfying assignments of F3(n,m)
• We can prove that random formulas are w.h.p. unsatisfiable using
Pr[Z > 0] ≤ E[Z]
• For each assignment σ, let 1σ be 1 if σ satisfies the formula and 0 otherwise
• Since the clauses are independent and σ satisfies each one with probability
7/8,
E[Z] = Σ_{σ ∈ {0,1}^n} E[1σ] = Σ_σ (7/8)^m = 2^n (7/8)^m
An easy upper bound
• Thus the expected number of satisfying assignments is
E[Z] = 2^n (7/8)^m = [2 (7/8)^α]^n
• Exponentially small when 2 (7/8)^α < 1, so αc ≤ log_{8/7} 2 ≈ 5.191
• For general k, the same argument gives αc ≤ 2^k ln 2
• Hint: σ satisfies a random clause with probability 1 − 2^{−k}
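The bound can be checked numerically; a sketch solving 2 (1 − 2^{−k})^α = 1 for α:

```python
import math

def first_moment_bound(k):
    """First-moment threshold: the α where 2 * (1 - 2**-k)**α = 1,
    i.e. α = ln 2 / (-ln(1 - 2**-k)); above it E[Z] -> 0,
    so w.h.p. there are no satisfying assignments."""
    return math.log(2) / -math.log(1 - 2.0**-k)

print(round(first_moment_bound(3), 3))  # 5.191
print(first_moment_bound(4))            # a bit below 2^4 ln 2 ≈ 11.09
```

For large k the exact value approaches the slide's 2^k ln 2.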
Why isn’t this the right answer?
• The real threshold is αc ≈ 4.267
• In the range αc < α < 5.191, the expected number of solutions is
exponentially large—even though most of the time there aren’t any.
• Z is not concentrated around its expectation—
• E[Z] is dominated by exponentially rare events where the formula has
exponentially many solutions.
• For a long time, the best lower bound was O(2^k/k), far below 2^k ln 2.
• Maybe we can control the variance?
The second moment method
• If Z is a nonnegative random variable,
Pr[Z > 0] ≥ E[Z]^2 / E[Z^2]
• Now we have to think about pairs of solutions, and their correlations:
E[Z^2] = E[(Σ_σ 1σ)(Σ_τ 1τ)] = Σ_{σ,τ} E[1σ 1τ] = Σ_{σ,τ} Pr[σ, τ both satisfying]
When correlations are weak
• In many cases the events associated with σ≠τ are independent. Then
E[Z^2] = Σ_{σ,τ} Pr[σ ∧ τ] = Σ_σ Pr[σ] + Σ_{τ ≠ σ} Pr[σ] Pr[τ] ≈ E[Z] + E[Z]^2
so Var Z ≈ E[Z], like Poisson...
if E[Z] is large then Pr[Z > 0] is close to 1
Entropy and correlations
• If σ and τ have overlap z=ζ n, they both satisfy a random clause with probability
q(ζ) = 1 − 2·2^{−k} + ζ^k·2^{−k}
• Note that q(1/2) = (1 − 2^{−k})^2, just as if they were independent.
E[Z^2] = 2^n Σ_{z=0}^{n} C(n, z) q(z/n)^m
• Using C(n, ζn) ≈ e^{n h(ζ)} where h(ζ) = −ζ ln ζ − (1 − ζ) ln(1 − ζ), this gives
E[Z^2] ≈ 2^n ∫₀¹ e^{n g(ζ)} dζ
• where g(ζ) = h(ζ) + α ln q(ζ)
Asymptotic integrals
• When n is large, integrals of the form I = ∫₀¹ e^{n g(ζ)} dζ are dominated by the ζ that maximizes g(ζ)
• Up to polynomial terms, I = e^{n g_max}
• In our case, if g(ζ) is maximized at ζ = 1/2, then E[Z^2] ≈ C·E[Z]^2
• The second moment method then gives Pr[Z > 0] ≥ E[Z]^2/E[Z^2] ≈ 1/C, and Friedgut raises this probability to 1
• But if g(ζ) > g(1/2) for some ζ, then E[Z^2] is exponentially larger than E[Z]^2 and the second moment method fails
• So, where is g(ζ) maximized?
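One can answer this numerically; a sketch evaluating g on a grid (k = 3 and α = 4.5 are illustrative choices of mine, not values from the talk):

```python
import math

def g(zeta, k=3, alpha=4.5):
    """g(ζ) = h(ζ) + α ln q(ζ) from the slides, for random k-SAT."""
    h = -zeta * math.log(zeta) - (1 - zeta) * math.log(1 - zeta)
    q = 1 - 2 * 2.0**-k + zeta**k * 2.0**-k
    return h + alpha * math.log(q)

zs = [i / 1000 for i in range(1, 1000)]
best = max(zs, key=g)
print(best > 0.5, g(best) > g(0.5))  # True True: the maximum is not at ζ = 1/2
```

The grid search lands well above ζ = 1/2, which is exactly the failure mode described on the next slide.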
Fatal attraction
• Correlations between σ and τ increase monotonically with the overlap ζ
• Attraction → correlation → large variance → second moment method fails
[Figure: g(ζ) for ζ ∈ [0.40, 0.60], vertical scale −0.020 to 0.010; the maximum lies above ζ = 1/2.]
Forcing symmetry
• Require that both σ and its complement are satisfying: “Not-All-Equal k-SAT”
[Figure: the corresponding curve for NAE-k-SAT on ζ ∈ [0, 1], vertical scale −0.015 to 0.010; it is symmetric about ζ = 1/2.]
The unbearable lightness of being satisfying
• Let Z be the sum over all satisfying assignments of η^(# of true literals), for some η < 1
[Figure: the weighted-count curve for ζ ∈ [0.4, 1.0], vertical scale −0.05 to 0.02.]
Tight lower bounds
• Recall the first-moment upper bound:
αc ≤ 2^k ln 2
• Second moment with symmetry gives
αc ≥ 2^{k−1} ln 2 − O(1)   [Achlioptas and Moore]
• Second moment with weighting gives
αc ≥ 2^k ln 2 − O(k)   [Achlioptas and Peres]
The geometry of solutions: first hints
• Many pairs of solutions with large overlap, others almost independent
[Figure: curve on ζ ∈ [0.4, 1.0], vertical scale −0.06 to 0.00, annotated “same clusters” at large overlap and “different clusters” at smaller overlap.]
WHEN FORMULAS FREEZE
Clustering, condensation, and criticality
αclust , αcond , αc
Figure 14.26: Phase transitions in random k-SAT. Below the clustering transition at αclust, the set of
satisfying assignments is essentially a single connected set. At αclust it breaks apart into exponentially
many clusters, each of which has exponential size, and which are widely separated from each other.
At the condensation transition αcond, a subexponential number of clusters dominates the set. Finally,
at αc the clusters disappear and there are no solutions at all.
• The set of solutions breaks up into clusters
• These condense into a few large clusters that dominate the set
• Finally, all the clusters shrink and disappear at αc
[Krzakała, Montanari, Ricci-Tersenghi, Semerjian, Zdeborová]
Belief Propagation fails at a certain density because long-range correlations appear between the variables
of a random formula. These correlations no longer decay with distance, so the messages a clause or variable
receives from its neighbors in the factor graph can no longer be considered independent, even though the
roundabout paths that connect these neighbors to each other have length Ω(log n).
What’s a cluster?
• The Ising model of magnetism: for T < Tc, the state space breaks up into two
“macrostates,” mostly up and mostly down:
[Figure: two snapshots from an Ising-model applet (T = 2.26, L = 160, free boundaries), one initialized all +1 and one all −1.]
What’s a cluster? The discrete math view
• Each cluster of satisfying assignments is connected under moves that flip
one variable at a time
• Clusters have Hamming distance cn between them for some c > 0
• Clusters have diameter bn for some b < c
• “Energy barriers”: to get from one cluster to another, have to dissatisfy hn
clauses for some h > 0
• N.B.: each of these might not be literally true in the physics picture
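For tiny formulas this cluster structure can be computed directly; a brute-force sketch (single-flip connectivity, as in the first bullet; encoding is the usual ±integer literals):

```python
from itertools import product

def clusters(n, clauses):
    """Group satisfying assignments into clusters connected by
    single-variable flips. Brute force, small n only."""
    sat = lambda a: all(any(a[abs(l) - 1] == (l > 0) for l in c)
                        for c in clauses)
    solutions = {a for a in product([False, True], repeat=n) if sat(a)}
    groups = []
    while solutions:
        frontier = [solutions.pop()]
        group = set(frontier)
        while frontier:
            a = frontier.pop()
            for i in range(n):  # flip one variable at a time
                b = a[:i] + (not a[i],) + a[i + 1:]
                if b in solutions:
                    solutions.remove(b)
                    group.add(b)
                    frontier.append(b)
        groups.append(group)
    return groups

# x1 = x2 = x3: two solutions (all-true, all-false), hence two clusters
f = [[-1, 2], [-2, 1], [-2, 3], [-3, 2]]
print(len(clusters(3, f)))  # 2
```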
What’s a cluster? Mixing times
• If we want to sample a uniformly random satisfying assignment, clustering
seems to make this hard
• Energy barriers → bottlenecks in state space → exponential time to get from
one cluster to another, as in the Ising model for T < Tc.
• But that doesn’t mean it’s hard to find a single satisfying assignment
• For 3-Coloring, there are algorithms that work beyond the clustering transition
[Achlioptas and Moore]
What’s a cluster? Fixed point of belief propagation
• Sparse random formulas, like sparse random graphs, are locally treelike
• For most vertices, the shortest cycle they lie on has length Ω(log n)
• If correlations decay with distance, then neighbors are roughly independent
[Figure 14.21: Removing a vertex from a tree breaks it into independent subtrees. Here we remove a clause a, leaving subtrees Ti, Tj, Tk rooted at its variables i, j, k.]
What’s a cluster? Fixed point of belief propagation
• Each vertex (clause or variable) sends “messages” to its neighbors
• Marginal distribution based on its other neighbors
• “My other variables can’t satisfy me,” “I must be true to my other clauses”
[Figure 14.22: The flow of messages Z_{i→a} and Z_{a→i} from the leaves towards the root of the factor graph.]
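This message-passing scheme can be illustrated on a case where it is exact, the Ising model on a tree (a minimal sketch of my own; the field h is an added parameter, included so that the marginal is nontrivial):

```python
import math
from itertools import product

def ising_marginal(n, edges, h, beta, root):
    """Belief propagation for the Ising model on a tree: vertex u sends v
    m_{u->v}(s_v) = sum_{s_u} exp(beta*s_u*s_v + h[u]*s_u) * prod of the
    messages u received from its other neighbors. On a tree the resulting
    marginal Pr[s_root = +1] is exact."""
    nbrs = {u: [] for u in range(n)}
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)

    def msg(u, v):
        return {sv: sum(math.exp(beta * su * sv + h[u] * su)
                        * math.prod(msg(t, u)[su] for t in nbrs[u] if t != v)
                        for su in (+1, -1))
                for sv in (+1, -1)}

    w = {s: math.exp(h[root] * s)
            * math.prod(msg(t, root)[s] for t in nbrs[root])
         for s in (+1, -1)}
    return w[+1] / (w[+1] + w[-1])

# Path 0-1-2 with a field pinning spin 0; compare BP to exhaustive enumeration
n, edges, h, beta = 3, [(0, 1), (1, 2)], [1.0, 0.0, 0.0], 0.8
bp = ising_marginal(n, edges, h, beta, root=2)
weight = {+1: 0.0, -1: 0.0}
for s in product((+1, -1), repeat=n):
    weight[s[2]] += math.exp(beta * sum(s[u] * s[v] for u, v in edges)
                             + sum(hu * su for hu, su in zip(h, s)))
exact = weight[+1] / (weight[+1] + weight[-1])
print(abs(bp - exact) < 1e-9, bp > 0.5)  # True True
```

On a graph with loops the same update is only a heuristic, which is exactly where the clustering story above becomes relevant.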
What’s a cluster? Fixed point of belief propagation
• In the Ising model, if most of your neighbors are up, you probably are too
• For T < Tc, distant spins are probably the same
• But within each cluster, correlations decay
[Figure: the two Ising applet snapshots again (T = 2.26, L = 160, init +1 and −1), annotated “long-range correlations”.]
What’s a cluster? Reconstruction from boundaries
• Choose a random state, hide all but its boundary, and reconstruct its interior
• How much information can we recover? Is the new state uniformly random
deep inside, or is it correlated with the initial state?
• Note: typical boundary conditions, not worst-case
[Figure: Ising applet snapshots again (T = 2.26, L = 160).]
See talks by Sly and Bhatnagar this afternoon!
Freezing and hardness
FROZEN VARIABLES AND HARDNESS
αclust , αcond , αrigid , αc
Figure 14.30: A refined phase diagram of random k-SAT. Gray blobs represent frozen clusters, i.e., those
where Θ(n) variables take fixed values. Above αrigid almost all clusters are frozen, and we believe this is
responsible for the average-case hardness of random k-SAT.
• A cluster is frozen if there are qn variables, for some q > 0, that take the same
value in every assignment in that cluster
• Search algorithms are doomed if they set any of these variables wrong
• Almost all clusters are frozen when (1 + ε)(2^k/k) ln k < α < (1 − ε) 2^k ln 2   [Achlioptas and Coja-Oghlan]
• But there is an algorithm that works for α < (1 − ε)(2^k/k) ln k   [Coja-Oghlan]
But even if there are no variables that are frozen in all solutions, we could certainly have variables frozen
within a cluster. Let’s say that a variable x_i is frozen in a cluster C if x_i takes the same value in every
solution in C, and call a cluster frozen if it has κn frozen variables for some constant κ > 0. Intuitively,
these frozen clusters spell doom for local algorithms. Imagine a DPLL algorithm descending into its search
tree. With every variable it sets, it contradicts any cluster in which this variable is frozen with the opposite
value. If every cluster is frozen, then it contradicts a constant fraction of them at each step, until it has
excluded every cluster from the branch ahead. This forces it to backtrack, taking exponential time.
On the other hand...
• Random XORSAT: each clause sets a parity, xi ⊕ xj ⊕ xk = 0 or 1
• Has clusters with frozen variables, just like k-SAT
[Cocco, Dubois, Mandler, Monasson]
• But we can solve it in polynomial time... it’s just a system of linear equations
• Gaussian elimination is not a local algorithm—it globally redefines the
variables until the problem becomes trivial
• We believe that there is no analogous approach to k-SAT...
• ...but we don’t know that P ≠ NP.
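A sketch of that global approach, Gaussian elimination over GF(2) (the instance at the bottom is hypothetical; variables are numbered from 1, the returned solution is 0-indexed):

```python
def solve_xorsat(n, equations):
    """Gaussian elimination over GF(2) for XORSAT: each equation is
    (vars, parity), meaning the XOR of those variables equals parity.
    Returns one solution as a list of 0/1 bits, or None if inconsistent."""
    # each row: (bitmask of variables, parity bit)
    rows = [(sum(1 << (v - 1) for v in vs), b) for vs, b in equations]
    pivot_rows = []
    for col in range(n):
        # find a remaining row with a 1 in this column
        pr = next((r for r in range(len(rows)) if rows[r][0] >> col & 1), None)
        if pr is None:
            continue
        pm, pb = rows.pop(pr)
        # eliminate this column from all other rows
        rows = [(m ^ pm, b ^ pb) if m >> col & 1 else (m, b) for m, b in rows]
        pivot_rows.append((col, pm, pb))
    if any(m == 0 and b == 1 for m, b in rows):
        return None              # 0 = 1: unsatisfiable
    x = [0] * n                  # free variables default to 0
    for col, pm, pb in reversed(pivot_rows):
        rhs = pb
        for j in range(col + 1, n):
            if pm >> j & 1:
                rhs ^= x[j]
        x[col] = rhs
    return x

# x1+x2+x3 = 1, x2+x3+x4 = 0, x1+x4 = 1 (mod 2)
print(solve_xorsat(4, [([1, 2, 3], 1), ([2, 3, 4], 0), ([1, 4], 1)]))
# [1, 0, 0, 0]
```

The elimination redefines variables globally, row by row, which is exactly what a local single-flip search cannot do.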
Colorings with permutations
• A 3-coloring of a graph G=(V,E) is a function c:V→{red,blue,green} such that
c(u)≠c(v) for any (u,v) ∈ E
• A threshold conjecture for 3-colorability:
lim(n→∞) Pr[G(n, p = d/n) is 3-colorable] = 1 if d < dc, and 0 if d > dc
• Conjecture: the threshold dc stays the same even if we put a random
permutation π∈S3 on each edge, and demand that c(u)≠π(c(v))
• If this is true, second moment calculations are a lot easier, and we can bound
dc within a constant
[Moore and Olson]
• Justification: correlation decay
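Checking a coloring under such edge permutations is straightforward; a sketch (the encoding and helper name are my own, not from the talk):

```python
import random

def valid_permuted_coloring(colors, edges, perms):
    """Edge (u, v) with permutation π is satisfied iff colors[u] != π(colors[v]).
    Colors are 0, 1, 2; perms[e] is a tuple giving (π(0), π(1), π(2))."""
    return all(colors[u] != perms[e][colors[v]]
               for e, (u, v) in enumerate(edges))

# Triangle with identity permutations = ordinary 3-coloring
edges = [(0, 1), (1, 2), (2, 0)]
identity = (0, 1, 2)
print(valid_permuted_coloring([0, 1, 2], edges, [identity] * 3))  # True

# The conjectured model puts an independent random permutation on each edge
rng = random.Random(0)
perms = [tuple(rng.sample(range(3), 3)) for _ in edges]
```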
Shameless Plug
THE NATURE of COMPUTATION
Cristopher Moore and Stephan Mertens
Oxford University Press, 2011
“This book rocks! You somehow manage to combine the fun of a popular book with the intellectual heft of a textbook.” — Scott Aaronson
“A treasure trove of information on algorithms and complexity, presented in the most delightful way.” — Vijay Vazirani
“A creative, insightful, and accessible introduction to the theory of computing, written with a keen eye toward the frontiers of the field and a vivid enthusiasm for the subject matter.” — Jon Kleinberg
Acknowledgements