Fall 2011
The Chinese University of Hong Kong
CSCI 3130: Formal languages and automata theory
NP and NP-completeness
Andrej Bogdanov
http://www.cse.cuhk.edu.hk/~andrejb/csc3130
Some more problems
A clique is a subset of vertices
that are all interconnected
2
1
4
3
Graph G
{1, 4}, {2, 3, 4}, {1} are cliques
An independent set is a subset of
vertices so that no pair is connected
{1, 2}, {1, 3}, {4} are independent sets
there is no independent set of size 3
A vertex cover is a set of vertices
that touches (covers) all edges
{2, 4}, {3, 4}, {1, 2, 3} are vertex covers
How to solve them
CLIQUE = {〈G, k〉: G is a graph with a clique of k vertices}
M: On input 〈G, k〉:
For all subsets S of vertices of size k:
If for every pair u, v in S
(u, v) is an edge in G, accept
Otherwise, reject.
1
2
input: 〈G, 3〉
subsets:
{123} {124} {134} {234}
all edges in?
3
4
NO
NO
NO
YES
Running time analysis
CLIQUE = {〈G, k〉: G is a graph with a clique of k vertices}
M: On input 〈G, k〉:
For all subsets S of vertices of size k:
If for every pair u, v in S
(u, v) is an edge in G, accept
Otherwise, reject.
( nk ) subsets
≈ n2 pairs
≈ n2 ( nk )
≈ 2n when k = n/2
Status of these problems
CLIQUE = {〈G, k〉: G is a graph with a clique of k vertices}
IS = {〈G, k〉: G is a graph with an independent set of k vertices}
VC = {〈G, k〉: G is a graph with a vertex cover of k vertices}
problem
CLIQUE IS
VC
running time
≈ 2n
≈ 2n
of best-known algorithm
≈ 2n
What do these problems have in common?
Checking solutions efficiently
• We don’t know how to solve them efficiently
• But if someone told us the solution, we would be able
to verify it very quickly
Example:
Is (G, 5) in CLIQUE?
12
9
1,5,9,12,14
1
13
2
4
3
14
5
6
15
7
11
10
8
The class NP
• A verifier for L is a TM V such that
x∈L
V accepts 〈x, s〉 for some s
– s is a potential solution for x
– We say V runs in polynomial time if on every input x, it
runs in time polynomial in |x| (for every s)
NP is the class of all languages that have
polynomial-time verifiers
Example
CLIQUE is in NP:
V := On input 〈G, k〉 and a set of vertices C,
If C has size k and all edges between vertices of
C are present in G, accept, otherwise reject.
running time ≈ O(k2) ✔
P versus NP
P is contained in NP
because the verifier can
ignore the solution
decidable
NP (efficiently checkable)
IS
SAT
CLIQUE
Conceptually, finding
solutions can only be
harder than checking
them
VC
P (efficient)
PATH
L01
Millenium prize problems
• Recall how in 1900, Hilbert gave 23 problems that
guided mathematics in the 20th century
• In 2000, the Clay Mathematical Institute gave 7
problems for the 21st century
1 P versus NP
computer science
2 The Hodge conjecture
3 The Poincaré conjecture Perelman 2006
4 The Riemann hypothesis
Hilbert’s 8th problem
5 Yang–Mills existence and mass gap
$1,000,000
6 Navier–Stokes existence and smoothness
7 The Birch and Swinnerton-Dyer conjecture
P versus NP
• The answer to the question
$1,000,000
Is P equal to
NP?
is not known. But one reason it is believed to be
negative is because, intuitively, searching is harder
than verifying
• For example, solving homework problems (searching
for solutions) is harder than grading (verifying the
solution is correct)
Searching versus verifying
Mathematician:
Given a mathematical claim, come up with a proof for it.
Scientist:
Given a collection of data on some phenomena, find a theory
explaining it.
Engineer:
Given a set of constraints (on cost, physical laws, etc.) come up
with a design (of an engine, bridge, etc.) which meets them.
Detective:
Given the crime scene, find “who’s done it”.
P and NP
P = languages that can be decided on a TM
with polynomial running time
(problems that admit efficient algorithms)
NP = languages whose solutions can be verified
on a TM with polynomial running time
(solutions can be checked efficiently)
We believe that NP is bigger than P,
but we are not 100% sure
decidable
NP
P
Evidence that NP is bigger than P
CLIQUE = {〈G, k〉: G is a graph with a clique of k vertices}
IS = {〈G, k〉: G is a graph with an independent set of k vertices}
VC = {〈G, k〉: G is a graph with a vertex cover of k vertices}
• These (and many others) are in NP
– Their solutions, once found, are easy to verify
• But no efficient solutions are known for any of them
– The fastest known programs take time ≈2n
Equivalence of certain NP languages
• We strongly suspect that problems like CLIQUE,
SAT, etc. require time ≈2n to solve
• We do not know how to prove this, but what we can
prove is that
If any one of them can be solved efficiently,
then all of them can be solved efficiently
Equivalence of some NP languages
• All these problems are as hard as one another
• Moreover, they are at the “frontier” of NP
– They are at least as hard as any problem in NP
NP
P
Polynomial-time reductions
• What do we mean when we say, for example,
“IS is at least as hard as CLIQUE”
• We mean that
If CLIQUE has no polynomial-time TM, then
neither does IS
• Or, we can convert any polynomial-time TM for IS
into one for CLIQUE
Polynomial-time reductions
IS = {〈G, k〉: G is a graph with an independent set
of k vertices}
CLIQUE = {〈G, k〉: G is a graph with a clique of k vertices}
• Theorem
If IS has a
polynomial-time TM,
so does CLIQUE
1
2
3
4
{1, 4}, {2, 3, 4}, {1} are
independent sets
{1, 2}, {1, 3}, {4} are cliques
Polynomial-time reductions
If IS has a polynomial-time TM, so does CLIQUE
• Proof: Suppose IS has an poly-time TM A
〈G, k〉
〈G’, k’ 〉
A
for IS
accept
accept
G has
clique
if G’ifhas
IS of
size kof size k
reject
reject
if not
if not
• We want to use it to solve CLIQUE
Reducing CLIQUE to IS
• We look for a polynomial-time TM R that turns the
question:
“Does G have a clique of size k?”
G, k
into:
G’, k’
R
“Does G’ have an IS of size k’?”
1
2
G
3
flip all edges
4
cliques of size k
k’ = k
1
2
3
4
G’
ISs of size k’
Reducing CLIQUE to IS
On input 〈G, k〉
Construct G’ by flipping all edges of G
Set k’ = k
Output 〈G’, k’〉
cliques in G
If G’ has an IS of size k,
then G has a clique of size k
G, k
R
independent sets in G’
✓
If G’ does not have an IS of size k,
then G has no clique of size k
✓
G’, k’
Reduction recap
• We showed that
If IS has a polynomial-time TM, so does CLIQUE
by converting an imaginary TM for IS into one for
CLIQUE
• To do this, we came up with a reduction that
transforms instances of CLIQUE into ones of IS
Polynomial-time reductions
• Language L polynomial-time reduces to L’ if
there exists a polynomial-time TM R that takes an
instance x of L into instance y of L’ s.t.
x ∈ L if and only if y ∈ L’
L (CLIQUE)
x = 〈G, k〉
L’ (IS)
R
y = 〈G’, k’〉
x∈L
y ∈ L’
(G has clique of size k)
(G’ has IS of size k’)
The meaning of reductions
• Saying L reduces to L’ means L is no harder than L’
– In other words, if we can solve L’, then we can also solve L
• Therefore
If L reduces to L’ and L’ ∈ P, then L ∈ P
x
x∈L
R
y
y ∈ L’
poly-time TM for L’
TM accepts
acc
rej
The direction of reductions
• The direction of the reduction is very important
– Saying “A is easier than B” and “B is easier than A” mean
different things
• However, it is possible that L reduces to L’ and L’
reduces to L
– This means that L and L’ are as hard as one another
– For example, IS and CLIQUE reduce to one another
Boolean formula satisfiability
• A boolean formula is an expression made up of
variables, ands, ors, and negations, like
(x1∨x2 ) ∧ (x2 ∨x3 ∨x4) ∧ (x1)
• The formula is satisfiable if one can assign values to
the variables so the expression evaluates to true
Above formula is satisfiable because this assignment
makes it true: x1 = F x2 = F x3 = T x4 = T
3SAT
SAT = {〈f〉: f is a satisfiable Boolean formula}
3SAT = {〈f〉: f is a satisfiable Boolean formula in
conjunctive normal form with
3 literals per clause}
literal: xi or xi
CNF: AND of ORs of literals
(x1∨x2∨x2 ) ∧ (x2∨x3∨x4)
literals
(conjunctive normal form)
3CNF: CNF with 3 literals per clause
(repetitions are allowed)
clause
3SAT and NP
f = (x1∨x2 ) ∧ (x2 ∨x3 ∨x4) ∧ (x1)
Finding a solution:
Verifying a solution:
Try all possible assignments
FFTT
FFFF
FFFT
FFTF
FFTT
substitute
x1 = F x2 = F x3 = T x4 = T
FTFF
FTFT
FTTF
FTTT
TFFF
TFFT
TFTF
TFTT
TTFF
TTFT
TTTF
TTTT
For n variables, there are 2n
possible assignments
Takes exponential time
evaluate formula
f = (F ∨T ) ∧ (F ∨ T ∨ F) ∧ (T)
can be done in linear time
The Cook-Levin Theorem
Every L ∈ NP reduces to SAT
SAT = {f: f is a satisfiable Boolean formula}
E.g. (x1∨x2 ) ∧ (x2 ∨x3 ∨x4) ∧ (x1)
• So every problem in NP is no harder than SAT
• But SAT itself is in NP, so SAT must
be the “hardest problem” in NP:
If SAT ∈ P, then P = NP
SAT
NP
P
NP-completeness
• A language C is NP-complete if:
1. C is in NP, and
2. For every L in NP, L reduces to C.
• Cook-Levin Theorem:
C
NP
SAT is NP-complete
P
Our picture of NP
A
B
A reduces to B
SAT
IS
CLIQUE
P
contextfree
PATH
0n1n
NP
More NP-complete problems
A
B
A reduces to B
3SAT
IS
CLIQUE
P
NP
PATH
0n1n
In practice, most of the NP-problems are
either in P (easy) or NP-complete (probably hard)
Interpretation of Cook-Levin Theorem
• Optimistic view:
If we manage to solve SAT, then we can also solve
CLIQUE and many other things
• Pessimistic view:
Since we do not believe P = NP, it is unlikely that
we will ever have a fast algorithm for SAT
The ubiquity of NP-complete problems
• We saw a few examples of NP-complete problems,
but there are many more
• A surprising fact of life is that most CS problems are
either in P or NP-complete
• A 1979 book by Garey and Johnson
lists 100+ NP-complete problems
© Copyright 2026 Paperzz