Lecture 7: The Subgraph Isomorphism Problem

CSC2429, MAT1304: Circuit Complexity
October 25, 2016
Lecture 7: The Subgraph Isomorphism Problem
Instructor: Benjamin Rossman
1
The Problem SUB(G)
Convention 1 (Graphs).
Graphs are finite simple graphs with no isolated vertices. Formally,
a graph G consists of a finite
S
|E(G)|
vertex set V (G) and an edge set E(G) ⊆ V (G)
such
that
e∈E(G) e = V (G). There are 2
2
subgraphs of G. (We may thus identify subgraphs G with elements of the Boolean hypercube
{0, 1}E(G) .)
For k ≥ 2: Kk denotes the complete graph of order k, and Pk denotes the path of order k.
Definition 2 (The Blow-Up G↑n of a Graph G).
For a graph G and n ∈ N, let G↑n denote the graph
V (G↑n ) = {v (i) : v ∈ V (G), i ∈ [n]},
E(G↑n ) = {v (i) , w(j) } : {v, w} ∈ E(G), i, j ∈ [n] .
For α ∈ [n]V (G) , let G(α) denote the graph
V (G(α) ) = v (αv ) : v ∈ V (G) ,
E(G(α) ) = {v (αv ) , w(αw ) } : {v, w} ∈ E(G) .
Each G(α) is an isomorphic copy of G that sits inside G↑n .
We have a similar notation for subgraphs of G and their copies inside G↑n . For H ⊆ G and
β ∈ [n]V (H) , let H (β) denote the graph
V (H (β) ) = v (βv ) : v ∈ V (H) ,
E(H (β) ) = {v (βv ) , w(βw ) } : {v, w} ∈ E(H) .
For X ⊆ G↑n and H ⊆ G, let
SubH (X) = H (β) : β ∈ [n]V (H) such that H (β) ⊆ X ,
subH (X) = |SubH (X)|.
We refer to elements of SubH (X) as H-subgraphs of X.
Definition 3 (The Colored G-Subgraph Isomorphism Problem).
For a graph G and n ∈ N, let SUB(G, n) be the problem, given a subgraph X ⊆ G↑n , of determining
whether or not there exists α ∈ [n]V (G) such that G(α) ⊆ X. For complexity purposes, we regard
2
SUB(G, n) as a Boolean function {0, 1}|E(G)|·n → {0, 1} with variables {Xe }e∈E(G↑n ) . We refer to
SUB(G) = {SUB(G, n)}n∈N as the colorful G-subgraph isomorphism problem.
1
Note the brute-force upper bound: SUB(G) is solvable by monotone depth-2 (OR ◦ AND)
formulas of size O(|E(G)| · n|V (G)| ):
_
^
X{v(αv ) ,w(αw ) } .
α∈[n]V (G) {v,w}∈E(G)
As we will see later on, there is a better upper bound (for monotone circuits of O(|V (G)|) depth) in
terms of the tree-width of G. In the next few lectures, we will show nearly matching lower bounds
for bounded-depth circuits, as well as monotone circuits.
Two important special cases of the problem SUB(G) are when G = Kk (the “k-clique problem”)
and G = Pk (the “distance-k connectivity problem”).
2
Relationship to SUBuncolored (G)
We remark on the relationship between SUB(G) and its uncolored version, denoted SUBuncolored (G).
This is the problem: given a graph X ⊆ Kn (i.e. a graph with V (X) ⊆ {1, . . . , n}), determine
whether or not X contains a subgraph isomorphic to G. (Note that k-CLIQUE is precisely the
problem SUBuncolored (Kk ).)
There is a well-known reduction from SUBuncolored (G) to SUB(G).
Theorem 4 (“Color-Coding Technique”, Alon-Yuster-Zwick 1995). There is a quasi-linear size
AC0 reduction from SUBuncolored (G) to SUB(G).
Proof Sketch. We are given an uncolored graph X ⊆ Kn and wish to determine whether or not it
contains a subgraph isomorphic to G. For a function ϕ : [n] → V (G), let X ϕ ⊆ G↑n be the graph
with
E(X ϕ ) = {v (i) , w(j) } : {v, w} ∈ E(G) and {i, j} ∈ E(X) and ϕ(i) 6= ϕ(j) .
Suppose we have a family Φ of functions ϕ : [n] → V (G) with the property that, for every
U ⊆ [n] with |U | = |V (G)|, there exists ϕ ∈ Φ with ϕ(U ) = V (G). Such a family Φ is called a
k-perfect family of hash functions where k = |V (G)| and explicit constructions are known of size
Ok (log n) (the constant here is exponential in k).
The reduction from SUBuncolored (G) to SUB(G) works as follows: For each ϕ ∈ Φ, test whether
subG (X ϕ ) ≥ 1. If subG (X ϕ ) ≥ 1 for any ϕ ∈ Φ, then accept (i.e. conclude that X contain a
subgraph isomorphic to G); otherwise, reject.
To see that this reduction is correct, note that if X does not contain a subgraph isomorphic to
G, then clearly subG (X ϕ ) = 0 for every function ϕ : [n] → V (G). On the other hand, if X contains
a subgraph G0 isomorphic to G, then for any ϕ ∈ Φ with ϕ(V (G0 )) = V (G), we have subG (X ϕ ) ≥ 1.
The fact that this reduction can be implemented by AC0 circuits of size O(n log n) was noted
by Amano (2010).
In many cases, there is a trivial reduction in the opposite direction from SUB(G) to SUBuncolored (G).
Definition 5. A graph G is a core if every homomorphism G → G is one-to-one (and hence an
automorphism).
For example, the complete graph Kk is a core.
2
Lemma 6. If G is a core and X ⊆ G↑n , then X contains a subgraph isomorphic to G if and only
if subG (X) ≥ 1.
Proof. In one direction, if subG (X) ≥ 1 then clearly X contains a subgraph isomorphic to G. In
∼
=
the other direction, assume X contains a subgraph G0 isomorphic to G. Let ϕ : V (G0 ) −
→ V (G)
↑n
(i)
be an isomorphism. Let π : V (G ) → V (G) be the homomorphism v 7→ v. Then π ◦ ϕ−1 is a
homomorphism G → G and, therefore, one-to-one (since G is a core). It follows that there exists
an automorphism σ of G such that π ◦ ϕ−1 ◦ σ : V (G) → V (G) is the identity function. Then
G(α) ⊆ X where α ∈ [n]V (G) is given by αv = ϕ−1 (σ(v)).
Corollary 7. If G is a core, then there is a linear-size AC0 reduction (in fact, monotone projection)
from SUBuncolored (G) to SUB(G).
↑n
Proof. This reduction simply maps X regarded as an instance of SUB(G, n) (i.e. a subgraph G )
to X regarded as an instance of SUBuncolored (G, |V (G)|·n) (i.e. a subgraph of K|V (G)|·n ).
Theorem 4 and Corollary 7 show that SUB(G) and SUBuncolored (G) are equivalent problems (up
to quasi-linear AC0 reductions) for cores G such as Kk . Henceforth, we shall focus almost entirely
on the colored problem SUB(G), which appears to be more well-structured and better behaved
than the uncolored version.
3
Threshold Weightings
For the uncolored subgraph isomorphism problem SUBuncolored (G), there is a canonical choice of
input distribution (i.e. random graph) with respect to which the corresponding Boolean function
n
{0, 1}( 2 ) → {0, 1} is balanced (i.e. has expectation bounded away from 0 and 1). This is the
Erdos-Renyi random graph G(n, p) for an appropriate threshold value of p (see Example 15).
In the colored setting, we can identify a family of product distributions (what might be called
“G-colored Erdos-Renyi random graphs”) with respect to SUB(G) is balanced. This family is nicely
indexed by a convex polytope of edge-weightings E(G) → [0, 2].
Definition 8. A threshold weighting on G is a function θ : E(G) → R≥0 such that
P
1.
e∈E(H) θ(e) ≤ |V (H)| for all H ⊆ G,
P
2.
e∈E(G) θ(e) = |V (G)|.
For H ⊆ G, we define
∆θ (H) := |V (H)| −
X
θ(e).
e∈E(H)
Thus, condition (1) states that ∆θ (H) ≥ 0 for all H ⊆ G and condition (2) states that ∆θ (G) = 0.
We say that θ is strict if ∆θ (H) > 0 for all ∅ ⊂ H ⊂ G. (Note that θ is strict ⇒ G is connected.)
Example 9.
1. Here is a threshold weighting on a particular 4-vertex graph:
3
2. The set of threshold weightings on G is a polytope in RE(G) . In particular, if θ1 , θ2 are
threshold weightings, then so is every convex combination λθ1 + (1 − λ)θ2 for 0 < λ < 1.
3. If G is r-regular, then the constant function θ ≡ 2/r is a threshold weighting.
Exercise: If G is an r-regular expander, then ∆2/r (H) ≥ Ω(min{|V (H)|, |V (G)| − |V (H)|}).
4. More generally, the function θ({v, w}) :=
4
1
deg(v)
+
1
deg(w)
is a threshold weighting.
The Random Graph Xθ ⊆ G↑n
Definition 10. For every threshold weighting θ on G, we define a random subgraph Xθ,n ⊆ G↑n
(we write Xθ when n is understood from context) where, independently for all {v (i) , w(j) } ∈ E(G↑n ),
P {v (i) , w(j) } ∈ E(Xθ ) = n−θ({v,w}) .
We will be interested in the average-case complexity of SUB(G) on Xθ .
Definition 11. If f = (fn ) is a sequence of Boolean functions fn : {subgraphs of G↑n } → {0, 1},
we say that “f solves SUB(G) in the average-case on Xθ ” if
lim P fn (Xθ,n ) = 1 ⇐⇒ subG (Xθ,n ) ≥ 1 = 1.
n→∞
Clearly, the worst-case complexity of SUB(G) is lower-bounded by the maximum—over threshold weightings θ—of its average-case complexity with respect to Xθ . Our method for proving lower
bounds on SUB(G) will consist of two parts:
(i) For each threshold weighting θ, we characterize the average-case AC0 circuit size of SUB(G)
on Xθ in terms of a combinatorial parameter κθ (G) (defined in the next lecture): we show
a lower bound of nκθ (G)−o(1) , as well as an upper bound of n2κθ (G)+O(1) .
(ii) For every graph G, we show that there exists a threshold weighting θ such that κθ (G) =
Ω(tw(G)/ log tw(G)) where tw(G) is the tree-width of G. In special cases, such as G = Kk
or in general when G is an expander, we achieve optimal bound κθ (G) = Ω(|V (G)|).
Together (i) and (ii) imply nearly tight lower bounds on the worst-case (really: worst average-case)
AC0 circuit size of of SUB(G).
The upper and lower bounds in (i) rely on properties of the random graph Xθ . In particular,
we require a good understanding of the subgraph counts subH (Xθ ) for H ⊆ G.
Lemma 12. For all H ⊆ G, we have E subH (Xθ ) = n∆θ (H) . In particular, E subG (Xθ ) = 1.
Proof. Direct from definitions.
In the case where θ is strict, we can say a lot more.
Theorem 13. If θ is strict, then the following hold:
4
1. subG (Xθ ) is asymptotically Poisson(1). That is, for every k ∈ N,
1
lim P subG (Xθ ) = k =
.
n→∞
ek!
In particular, lim P[ subG (Xθ ) ≥ 1 ] = 1 −
n→∞
justifying terminology “threshold weighting”).
1
(i.e. SUB(G) is balanced with respect to Xθ ,
e
2. For all ∅ ⊂ H ⊂ G,
P subH (Xθ ) < 21 n∆θ (H) ≤ exp −Ω(nc ) ,
P subH (Xθ ) > 23 n∆θ (H) ≤ exp −Ω(nc ) .
for a constant c > 0 depending on θ and H.
Time permitting, we will see elements of the proof of Theorem 13 in a future lecture.
5
More Examples of Threshold Weightings
Example 14 (Threshold weightings from Markov chains). If M ∈ V (G) × V (G) → [0, 1] is the
transition matrix for any Markov chain on G, that is,
P
1.
w M (v, w) = 1 for every v ∈ V (G) and
2. M (v, w) > 0 =⇒ {v, w} ∈ E(G).
Then
θ({v, w}) := M (v, w) + M (w, v)
is a threshold weighting for G. In this case, we have
X
∆θ (H) =
M (v, w),
(v,w) : v∈V (H) and {v,w}∈E(G)\E(H)
that is, ∆θ (H) is equal to the amount of M -flow that leaves V (H) via edges in E(G) \ E(H). This
is derived as follows:
X
∆M (H) = |V (H)| −
M (v, w) + M (w, v)
{v,w}∈E(H)

=
X

X
1 −
w : {v,w}∈E(H)
v∈V (H)

=
X
=

X

v∈V (H)
M (v, w)
M (v, w)
w : {v,w}∈E(H)
/
X
(v,w) : v∈V (H) and {v,w}∈E(G)\E(H)
5
M (v, w).
Example 15. The threshold exponent of G is defined by
τ (G) := min
H⊆G
|V (H)|
.
|E(H)|
This constant characterizes the appearance of subgraphs isomorphic to G in the Erdos-Renyi random graph G(n, p): for p(n) = n−τ (G) , the probability that G(n, p) contains a subgraph isomorphic
to G is bounded away from 0 and 1 (while this probability tends to 0 or 1 for p n−τ (G) and
p n−τ (G) respectively). The Erdos-Renyi random graph G(n, n−τ (G) ) is a canonical input distribution for studying the average-case complexity of SUBuncolored (G).
The average-case analysis of SUBuncolored (G) on G(n, n−θ(G) ) is essentially equivalent to the
average-case analysis of SUB(G) on Xθ for the threshold weighting θ : E(G) → [0, 2] defined by

τ (G) if {v, w} ∈ E(H) for any H ⊆ G with τ (G) = |V (H)| ,
|E(H)|
θ({v, w}) :=

0
otherwise.
All of the lower/upper bounds we show for average-case SUB(G) on Xθ carry over to SUBuncolored (G)
on G(n, n−θ(G) ) for this particular setting of θ. (This is not by a formal reduction; I am only claiming that the calculations work out similarly.) We get more powerful results by working with the
colored version SUB(G), thanks to the freedom to choose optimal θ. For this reason, going forward
we will focus entirely on the average-case complexity of SUB(G) on Xθ and leave the uncolored
setting as a special case.
Remark 16. G is balanced if τ (G) = |V (G)|/|E(G)|, and G is strictly balanced if τ (G) < |V (H)|/|E(H)|
for all H ⊂ G. If G is balanced, then the above-defined θ is identically τ (G). If G is strictly balanced, then this θ is strict.
6