P versus NP
Statement of the Problem
Analysis of Algorithms
NP Completeness
Additional NP Complete Problems
Statement of Problem
● P is the class of problems solvable in deterministic polynomial time, i.e., a computer program will yield results in “reasonable time.”
● EXP is the class of problems solvable by brute force techniques in deterministic exponential time, i.e., months, years, decades, …
● And between these two is NP:
  P ⊆ NP ⊆ EXP
Statement of Problem
● What does NP more closely resemble: P or EXP? To date, every known deterministic algorithm for the hardest problems in NP runs in exponential time.
● But that does not rule out the possibility that someone might someday find one that runs in polynomial time.
Statement of Problem
● Chapter Topics:
  – Analysis of Algorithms
    ● Just because an algorithm is in P does not mean it is “fast.” Using Landau (big-O) notation we have a means to rank efficiency.
  – NP Completeness
    ● We look for the hardest problems that belong to the class NP. If any one of these problems has a deterministic polynomial time algorithm, then NP collapses to P!
Statement of Problem
● As we have progressed through the various topics in this text, we have gradually simplified our description of Turing Machines.
  – We have described standard descriptors (SDs) and numeric descriptors (NDs), which serve as the assembly language and machine language for TMs.
  – We have talked in generalities about how a TM could recognize / decide certain problems.
  – Now we will describe our algorithms using a high level pseudocode similar to Java.
Analysis of Algorithms
● In simplest terms, algorithms in P are essentially O(n^k) for some natural number k.
● The smaller the value of k, the more efficient the algorithm.
● Even replacing a single factor of n with √n or with log₂(n) is an improvement in efficiency.
● What follows are several illustrative examples for calculating the efficiency of algorithms.
Analysis of Algorithms
● SEQUENTIAL SEARCH
  – Given an array A (n items) and a value t
  – Find the location of t within A
  – algorithm overview:
    ● consecutively check each location
    ● if A[i] = t, then FOUND
  – O(n)
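The overview above can be sketched in Python as follows. This is only an illustration: the function name `sequential_search` and the convention of returning -1 when t is absent are assumptions, and Python indexes from 0 whereas the slides index from 1.

```python
def sequential_search(a, t):
    """Return the 0-based index of the first item equal to t, or -1 if absent."""
    for i, item in enumerate(a):  # at most n comparisons, hence O(n)
        if item == t:
            return i              # FOUND
    return -1                     # every location was checked without success


print(sequential_search([7, 3, 9, 4], 9))  # prints 2
```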
Analysis of Algorithms
● BINARY SEARCH
  – Given an array A (n items) and a value t with A[1] ≤ A[2] ≤ … ≤ A[n]
  – Find the location of t within A
  – algorithm overview:
    ● look at the middle item A[mid], where mid = ⌊(lb+ub)/2⌋
    ● if A[mid] = t, then FOUND
    ● if A[mid] > t, then ub = mid-1 else lb = mid+1
  – O(log₂(n))
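A minimal Python sketch of the halving step described above, assuming the input list is already sorted in increasing order. The name `binary_search` and the -1 return value for a missing item are illustrative choices, and indices are 0-based.

```python
def binary_search(a, t):
    """Return a 0-based index of t in the sorted list a, or -1 if absent."""
    lb, ub = 0, len(a) - 1
    while lb <= ub:               # the search range halves on every pass
        mid = (lb + ub) // 2
        if a[mid] == t:
            return mid            # FOUND
        if a[mid] > t:
            ub = mid - 1          # t can only be in the lower half
        else:
            lb = mid + 1          # t can only be in the upper half
    return -1
```

Because the range between `lb` and `ub` shrinks by about half each time, the loop body runs roughly log₂(n) times.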
Analysis of Algorithms
● SELECTION SORT
  – Given an array A (n items)
  – Sort in increasing order: A[1] ≤ A[2] ≤ … ≤ A[n]
  – algorithm overview:
    ● find smallest item and put it first
    ● find next smallest item and put it second
    ● etc.
  – O(n^2)
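As an illustration of where the quadratic bound comes from, here is a small Python sketch. The function name `selection_sort` is an assumption, and the sketch returns a sorted copy rather than sorting in place.

```python
def selection_sort(a):
    """Return a sorted copy of a using selection sort."""
    a = list(a)                       # work on a copy
    n = len(a)
    for i in range(n - 1):            # position to fill next
        smallest = i
        for j in range(i + 1, n):     # scan the unsorted remainder
            if a[j] < a[smallest]:
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]
    return a
```

The inner scan performs (n-1) + (n-2) + … + 1 = n(n-1)/2 comparisons, which is O(n^2).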
Analysis of Algorithms
● MATRIX MULTIPLICATION
  – Given three arrays A, B, and C (n × n)
  – Calculate the matrix product A × B and store it in C
  – algorithm overview:
    ● for each row r in A
        for each col c in B
          generate the sum Σ_k A[r,k]·B[k,c] and store it in C[r,c]
  – O(n^3)
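The triple loop behind the cubic bound can be sketched as follows, assuming square matrices stored as lists of rows. The name `matrix_multiply` is illustrative, and the sketch returns C instead of filling a preallocated array.

```python
def matrix_multiply(a, b):
    """Return the product of two n x n matrices given as lists of rows."""
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for r in range(n):                # each row of A
        for col in range(n):          # each column of B
            s = 0
            for k in range(n):        # n multiplications per entry
                s += a[r][k] * b[k][col]
            c[r][col] = s
    return c
```

There are n^2 entries in C and each needs n multiplications, giving O(n^3) in total.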
Analysis of Algorithms
● PATH
  – G is a directed graph with n nodes
  – G is represented by adjacency matrix A (n × n)
    ● A[i,j] = 1 iff ∃ arc from node i to node j
  – node s is the start node
  – node t is the terminal node
  – Does there exist a path in G from s to t?
Analysis of Algorithms
  – algorithm overview:
    ● all nodes are designated as unmarked
    ● start node s is marked
    ● repeat until no additional nodes are marked
        for i = 1 to n do
          if node i is marked
            then for j = 1 to n do
              if A[i,j] = 1
                then node j is marked
    ● if node t is marked, then a path exists from s to t
  – O(n^3)
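A possible Python rendering of the marking algorithm, assuming 0-based node numbers and an adjacency matrix given as a list of rows. The name `path_exists` and the `changed` flag used to detect "no additional nodes marked" are illustrative choices.

```python
def path_exists(adj, s, t):
    """Return True if the directed graph adj has a path from node s to node t."""
    n = len(adj)
    marked = [False] * n
    marked[s] = True
    changed = True
    while changed:                    # at most n passes: each pass marks a new node
        changed = False
        for i in range(n):
            if marked[i]:
                for j in range(n):
                    if adj[i][j] == 1 and not marked[j]:
                        marked[j] = True
                        changed = True
    return marked[t]
```

Each pass costs O(n^2) and at most n passes can mark a new node, which is the source of the O(n^3) bound.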
Analysis of Algorithms
● SAT (traditional)
  – We have previously discussed SAT: Is a boolean expression ϕ satisfiable?
  – Boolean expressions are composed of variables, operators (and, or, not), and parentheses
Analysis of Algorithms
● Def: A literal is either a simple variable x or the negation of a simple variable ¬x.
● Def: A clause is the or/disjunction of a finite number of literals.
● Def: A boolean expression is said to be in conjunctive normal form (CNF) if it is the and/conjunction of a finite number of clauses.
● Theorem: For every boolean expression ϕ of the propositional logic, there is an equivalent boolean expression ψ in conjunctive normal form.
Analysis of Algorithms
  – algorithm overview:
    ● ϕ is a boolean expression in CNF having k variables
        for x1 = 0,1 do
          for x2 = 0,1 do
            …
              for xk = 0,1 do
                plug x1, x2, …, xk into ϕ
                if ϕ is TRUE, then satisfiable
  – SAT ∈ DTIME(2^n) ⊆ EXP
  – SAT ∈ NTIME(n) ⊆ NP
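The nested loops above amount to trying all 2^k truth assignments. A minimal sketch follows, under an assumed encoding that is not part of the slides: a CNF expression is a list of clauses, each clause is a list of nonzero integers, and the integer j stands for xj while -j stands for ¬xj. The function names are hypothetical.

```python
from itertools import product


def literal_value(lit, values):
    """Evaluate literal lit (j means xj, -j means not xj) under the assignment values."""
    v = values[abs(lit) - 1]
    return v if lit > 0 else not v


def brute_force_sat(cnf, k):
    """Return a satisfying assignment (tuple of k booleans) or None if none exists."""
    for values in product([False, True], repeat=k):   # 2^k assignments
        if all(any(literal_value(lit, values) for lit in clause) for clause in cnf):
            return values
    return None
```

For example, `brute_force_sat([[1, -2], [-1, 2], [1, 2]], 2)` examines the four assignments in order and returns the only satisfying one, with both variables true.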
NP Completeness
● So the question remains:
  – Is P ⊂ NP? … or … Is P = NP?
● If we are serious about answering this question, then we should seek the hardest problem in NP. If we can show that this problem is an element of P, then all the rest of NP comes along for free!
● But how do we recognize such a problem?
NP Completeness
● To move forward, we move backward, to the topic of reducibility.
● Def: We say that f : Σ* → Σ* is a computable function provided there exists a Turing Machine M such that
  – M halts on all input w
  – the resulting output is f(w)
NP Completeness
● Def: We say that a language A is mapping reducible to a language B, denoted A ≤ B, provided there exists a computable function f : Σ* → Σ* such that
    w ∈ A if and only if f(w) ∈ B.
  The computable function f is called a reduction from A to B.
● For decidability, determinism versus nondeterminism is irrelevant; not so for efficiency!
NP Completeness
● Def: We say that a language A is polynomial time reducible to a language B, denoted A ≤P B, provided there exists a function f : Σ* → Σ* computable in polynomial time such that
    w ∈ A if and only if f(w) ∈ B.
● Lemma:
  – If A ≤P B and B ≤P C, then A ≤P C
  – If A ≤P B and B ∈ P, then A ∈ P
  – If A ≤P B and A ∉ P, then B ∉ P
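The second part of the lemma can be justified with a short running-time estimate. The constants a, b, c, d below are hypothetical: assume f is computed in at most c·n^a steps on inputs of length n, and that B is decided in at most d·m^b steps on inputs of length m. Since a machine cannot write more symbols than it takes steps, the output f(w) has length at most c·n^a.

```latex
|f(w)| \le c\,n^{a} \quad (n = |w|), \qquad
T_A(n) \;\le\; \underbrace{c\,n^{a}}_{\text{compute } f(w)}
\;+\; \underbrace{d\,\bigl(c\,n^{a}\bigr)^{b}}_{\text{decide } f(w) \in B}
\;=\; O\!\left(n^{ab}\right)
```

So deciding A by first computing f(w) and then deciding whether f(w) ∈ B takes polynomial time. The third part of the lemma is the contrapositive of the second.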
NP Completeness
● Def: B is called NP complete provided
  – B ∈ NP
  – ∀ A ∈ NP, A ≤P B
● Every other problem in NP can be reduced to B! Such a problem would truly be the hardest to solve.
● Theorem:
  – If B is NP complete and B ∈ P, then P = NP.
  – If B is NP complete, C ∈ NP, and B ≤P C, then C is NP complete.
NP Completeness
● All the above theory would be meaningless unless we can prove that at least one problem is NP complete!
● Theorem (Cook-Levin): SAT is NP complete.
● Proof:
  – We already know that SAT ∈ NP.
  – All we need to show is that if A ∈ NP then A ≤P SAT.
NP Completeness
● Since A ∈ NP there is a nondeterministic Turing Machine M that decides A in polynomial time, i.e., O(n^k) for k sufficiently large. M also decides A in polynomial space, i.e., O(n^k).
● Recall that when we previously discussed reducibility, we presented computational histories:
    #C1#C2#C3# … #Cm#
● The number of configurations Ci is O(n^k) and the length of each Ci is O(n^k).
NP Completeness
● So we may represent an accepting computational history in an n^k × n^k matrix T
NP Completeness
● We will now build a boolean expression ϕ which represents the matrix T above! Our boolean variables will be x_{i,j,s}, where x_{i,j,s} = 1 if and only if T[i,j] = s.
NP Completeness
● Each of the four components (ϕstart, ϕaccept, ϕcell, ϕstep) is in CNF!!
● Furthermore, the conversion from computational history to boolean expression requires:
  – ϕstart: O(n^k)
  – ϕaccept: O(n^(2k))
  – ϕcell: O(n^(4k))
  – ϕstep: O(n^(2k))
● Hence the reduction A ≤P SAT is carried out in polynomial time.
Additional NP Complete Problems
● SAT is NP Complete!! Let’s close the book, find a nice little pub, and celebrate!!
● Sorry … not just yet.
● Let’s see if we can find some more NP Complete problems. WHY??
  ● Because the more problems we have to consider, the more likely we will be able to resolve the question: P = NP?
Additional NP Complete Problems
● 3_SAT
  – 3_SAT is SAT with one additional restriction:
    ● each clause must be a 3-clause, i.e., each clause must contain exactly three literals
  – each clause is replaced as follows (x and y are new variables):

    original clause       replacement 3-clauses
    a                     a ∨ a ∨ a
    a ∨ b                 a ∨ b ∨ a
    a ∨ b ∨ c             a ∨ b ∨ c
    a ∨ b ∨ c ∨ d         (a ∨ b ∨ x) ∧ (¬x ∨ c ∨ d)
    a ∨ b ∨ c ∨ d ∨ e     (a ∨ b ∨ x) ∧ (¬x ∨ c ∨ y) ∧ (¬y ∨ d ∨ e)
Additional NP Complete Problems
● 3_SAT
  – 3_SAT ∈ NP (like SAT, a brute force search tries 2^n truth assignments)
  – the above substitutions are straightforward and can be done in polynomial time.
  – if ϕ is a boolean expression in SAT then the substitutions generate a boolean expression ϕ’ in 3_SAT
  – lastly, ϕ is satisfiable if and only if ϕ’ is satisfiable
  – therefore SAT ≤P 3_SAT
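The clause substitutions can be sketched in Python under an assumed integer encoding that does not come from the slides: a clause is a non-empty list of nonzero integers, where j stands for xj and -j for ¬xj. New variables receive the next unused numbers. The name `to_3sat` is hypothetical, and the chaining of clauses longer than five literals is the natural extension of the pattern shown for four and five literals.

```python
def to_3sat(cnf, num_vars):
    """Convert a CNF (list of non-empty integer clauses) into 3-clauses.

    Returns (cnf3, total_vars), where total_vars counts the new variables too.
    """
    result = []
    next_var = num_vars
    for clause in cnf:
        c = list(clause)
        if len(c) == 1:
            result.append([c[0], c[0], c[0]])      # a      ->  a v a v a
        elif len(c) == 2:
            result.append([c[0], c[1], c[0]])      # a v b  ->  a v b v a
        elif len(c) == 3:
            result.append(c)
        else:
            next_var += 1                          # first new variable x
            result.append([c[0], c[1], next_var])
            for lit in c[2:-2]:                    # middle literals, one link each
                prev = next_var
                next_var += 1
                result.append([-prev, lit, next_var])
            result.append([-next_var, c[-2], c[-1]])
    return result, next_var
```

Each clause of length m produces at most m - 2 new clauses, so the output is linear in the size of the input and the conversion runs in polynomial time.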
Additional NP Complete Problems
● CLIQUE
  – Suppose G is an undirected (bidirectional) graph, i.e., G is a collection of nodes (vertices) {n1, n2, …, np}, together with a collection of edges (arcs) {eij connecting ni to nj}. Since G is undirected, if eij = 1 then eji = 1.
  – Def: A k-clique in the graph G is a subset of nodes C containing exactly k nodes such that for any two distinct nodes ni and nj in C there is an edge joining the two.
Additional NP Complete Problems
● CLIQUE = {(<G>,k) | G is an undirected graph, k is an integer, and G contains a k-clique}
  – CLIQUE ∈ NP
    ● if the collection of nodes has size p, then the number of subsets to consider is 2^p
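To make the contrast concrete, the sketch below separates the cheap step (checking one proposed subset) from the expensive step (searching over subsets). It assumes a symmetric 0/1 adjacency matrix with 0-based node numbers; the function names are illustrative.

```python
from itertools import combinations


def is_clique(adj, nodes):
    """Polynomial-time check: is every pair of the given nodes joined by an edge?"""
    return all(adj[u][v] == 1 for u, v in combinations(nodes, 2))


def has_k_clique(adj, k):
    """Brute force search over all subsets of exactly k nodes."""
    return any(is_clique(adj, c) for c in combinations(range(len(adj)), k))
```

`is_clique` inspects at most k(k-1)/2 pairs, which is the kind of fast verification that places CLIQUE in NP, while `has_k_clique` may examine a number of subsets that grows exponentially with p.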
Additional NP Complete Problems
● We will now show that 3_SAT ≤P CLIQUE
  – consider a boolean expression having k 3-clauses. From this boolean expression we construct a problem in CLIQUE
  – for each literal in each of the k 3-clauses create a node in a graph G
  – for every pair of nodes create a bidirectional edge between them except
    ● no edge between nodes in the same clause
    ● no edge between a literal and its negation
Additional NP Complete Problems
Example:
Additional NP Complete Problems
● We created a mapping f : 3_SAT → CLIQUE which takes polynomial time.
● Furthermore, the boolean expression ϕ is satisfiable (belongs to 3_SAT) if and only if the graph f(ϕ) has a k-clique.
● Hence, 3_SAT ≤P CLIQUE.
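The construction can be sketched as follows, again under an assumed integer encoding of literals (j for xj, -j for ¬xj) that is not part of the slides. Each node is a hypothetical triple (clause index, position, literal) so that repeated literals within a clause remain separate nodes. The brute force `has_k_clique` is included only to check small examples; it is not part of the reduction.

```python
from itertools import combinations


def three_sat_to_clique(cnf):
    """Build (nodes, edges, k) from a list of 3-clauses."""
    nodes = [(ci, pos, lit)
             for ci, clause in enumerate(cnf)
             for pos, lit in enumerate(clause)]
    edges = {(u, v) for u, v in combinations(nodes, 2)
             if u[0] != v[0]          # not in the same clause
             and u[2] != -v[2]}       # not a literal and its negation
    return nodes, edges, len(cnf)


def has_k_clique(nodes, edges, k):
    """Exponential-time check, used here only to test the reduction."""
    def joined(u, v):
        return (u, v) in edges or (v, u) in edges
    return any(all(joined(u, v) for u, v in combinations(group, 2))
               for group in combinations(nodes, k))
```

The reduction itself examines each pair of the 3k nodes once, so it runs in O(k^2) time; only the check on the resulting graph is exponential.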
Additional NP Complete Problems
● VERTEX_COVER
  – Suppose G is an undirected (bidirectional) graph, i.e., G is a collection of nodes (vertices) {n1, n2, …, np}, together with a collection of edges (arcs) {eij connecting ni to nj}. Since G is undirected, if eij = 1 then eji = 1.
  – Def: A vertex cover of the graph G is a subset of nodes C such that if the nodes in C are removed together with all of their incident edges, no edges remain in the graph.
Additional NP Complete Problems
● VERTEX_COVER = {(<G>,k) | G is an undirected graph, k is an integer, and G contains a vertex cover having at most k nodes}
  – VERTEX_COVER ∈ NP
    ● if the collection of nodes has size p, then the number of subsets to consider is 2^p
Additional NP Complete Problems
● We will now show that 3_SAT ≤P VERTEX_COVER
  – consider a boolean expression having p variables and q 3-clauses. From this boolean expression we construct a problem in VERTEX_COVER
  – for each of the p variables (xj and ¬xj) create two nodes and an edge between them
  – for each of the q clauses create a triangle of nodes with 3 edges
Additional NP Complete Problems
  – for each node in the triangle create an edge to the node of the corresponding literal
  – set k = p + 2q
Example:
Additional NP Complete Problems
● We created a mapping f : 3_SAT → VERTEX_COVER which takes polynomial time.
● Furthermore, the boolean expression ϕ is satisfiable (belongs to 3_SAT) if and only if the graph f(ϕ) has a vertex cover having k nodes.
● Hence, 3_SAT ≤P VERTEX_COVER.
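A sketch of the construction, assuming the same integer encoding of literals as an illustrative convention (j for xj, -j for ¬xj). The node labels `("lit", …)` and `("cl", …)` are hypothetical names. The brute force `has_vertex_cover` exists only to test small instances.

```python
from itertools import combinations


def three_sat_to_vertex_cover(cnf, p):
    """Build (nodes, edges, k) from q 3-clauses over variables 1..p."""
    nodes, edges = [], []
    for j in range(1, p + 1):                     # one edge per variable
        nodes += [("lit", j), ("lit", -j)]
        edges.append((("lit", j), ("lit", -j)))
    for i, clause in enumerate(cnf):              # one triangle per clause
        tri = [("cl", i, pos) for pos in range(3)]
        nodes += tri
        edges += list(combinations(tri, 2))
        for pos, lit in enumerate(clause):        # link each corner to its literal
            edges.append((tri[pos], ("lit", lit)))
    return nodes, edges, p + 2 * len(cnf)


def has_vertex_cover(nodes, edges, k):
    """Exponential-time check over all subsets of exactly k nodes.

    Testing exactly k suffices when k <= len(nodes), because adding nodes
    to a cover keeps it a cover.
    """
    return any(all(u in cover or v in cover for u, v in edges)
               for cover in map(set, combinations(nodes, k)))
```

The graph has 2p + 3q nodes and p + 6q edges, so building it takes time linear in the size of the expression.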
Additional NP Complete Problems
● SUBSET_SUM
  – Given an integer target value t and a collection S of integer values {x1, x2, …, xk}, is it possible to choose a subset of the collection whose sum is the value t?
  – The collection S is not a set but rather a multiset in which duplicate values are permitted but considered separate items.
  – We generalize the problem to integer tuples {x1, x2, …, xk} and an integer tuple target:
      xi = (xi1, …, xip) and t = (t1, …, tp)
Additional NP Complete Problems
● SUBSET_SUM = {(<S>,t) | ∃ T ⊆ S such that Σ_{xj ∈ T} xj = t}
  – SUBSET_SUM ∈ NP
    ● if the collection S contains k tuples, then the number of subsets T to consider is 2^k
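The exhaustive search over the 2^k subsets can be sketched as below for the tuple version of the problem, with sums taken component by component. The name `subset_sum` is illustrative; subsets are chosen by index so that duplicate tuples count as separate items, matching the multiset convention.

```python
from itertools import combinations


def subset_sum(items, target):
    """Return a list of tuples from items summing component-wise to target, or None."""
    width = len(target)
    for size in range(len(items) + 1):
        for idxs in combinations(range(len(items)), size):    # 2^k subsets overall
            total = tuple(sum(items[i][d] for i in idxs) for d in range(width))
            if total == tuple(target):
                return [items[i] for i in idxs]
    return None
```

Checking one candidate subset takes only k·p additions, which is why the problem is in NP even though the search is exponential.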
Additional NP Complete Problems
● We will now show that 3_SAT ≤P SUBSET_SUM
  – consider a boolean expression having p variables and q 3-clauses
  – from this boolean expression we construct a problem in SUBSET_SUM represented by a [2(p+q) + 1] × (p+q) grid as illustrated on the next page
    ● upper left: variable is either true or false
    ● upper right: variables used in that 3-clause
    ● bottom right: variables in a clause need not be distinct
Additional NP Complete Problems
● Example:
Additional NP Complete Problems
● We created a mapping f : 3_SAT → SUBSET_SUM which takes polynomial time.
● Furthermore, the boolean expression ϕ is satisfiable (belongs to 3_SAT) if and only if f(ϕ) has a solution in SUBSET_SUM.
● Hence, 3_SAT ≤P SUBSET_SUM.
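Since the grid illustration is not reproduced here, the sketch below follows the standard textbook construction, which is assumed (not confirmed by the slides) to match the grid described earlier: two rows per variable, two slack rows per clause, and one target row, each with p + q columns. Literals use an assumed integer encoding (j for xj, -j for ¬xj), and all function names are hypothetical.

```python
from itertools import combinations


def three_sat_to_subset_sum(cnf, p):
    """Build (rows, target): 2(p+q) tuples of width p+q plus a target tuple."""
    q = len(cnf)
    rows = []
    for j in range(1, p + 1):
        for lit in (j, -j):                        # one row for xj, one for not xj
            var_part = [1 if v == j else 0 for v in range(1, p + 1)]
            clause_part = [1 if lit in clause else 0 for clause in cnf]
            rows.append(tuple(var_part + clause_part))
    for c in range(q):                             # two slack rows per clause
        slack = tuple([0] * p + [1 if d == c else 0 for d in range(q)])
        rows += [slack, slack]
    target = tuple([1] * p + [3] * q)              # one literal per variable, 3 per clause
    return rows, target


def has_subset_sum(rows, target):
    """Exponential-time check, used here only to test the reduction."""
    width = len(target)
    for size in range(len(rows) + 1):
        for idxs in combinations(range(len(rows)), size):
            if all(sum(rows[i][d] for i in idxs) == target[d] for d in range(width)):
                return True
    return False
```

The variable columns force exactly one of the two rows for each variable to be chosen. A clause column can reach 3 only if at least one chosen literal row contributes to it, because the two slack rows supply at most 2.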