Kyushu University Intensive Lecture in Mathematical Sciences
Comparison, Analysis, and Control of
Biological Networks (7)
Partial k-Trees, Color Coding, and
Comparison of Graphs
Tatsuya Akutsu
Bioinformatics Center
Institute for Chemical Research
Kyoto University
Tree Decomposition and
Partial k-Tree
[Flum, Grohe: Parameterized Complexity Theory, Springer]
Tree Decomposition
Tree decomposition of G(V,E)
Pair of rooted tree and family of sets of vertices
$(T(V_T, E_T), (B_t)_{t \in V_T})$
For all $v \in V$, $B^{-1}(v) = \{ t \in V_T \mid v \in B_t \}$ is nonempty and connected in T
For all $\{u,v\} \in E$, $u, v \in B_t$ holds for some $t \in V_T$
Width
$\max_{t \in V_T} |B_t| - 1$
Treewidth
Minimum width over all possible tree decompositions
Examples
⇒ the treewidth of a tree (with at least one edge) is 1
⇒ the treewidth of a cycle is 2
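The two conditions of the definition and the width can be checked mechanically. The following is a minimal sketch, assuming the tree T is given as an edge list over the bag names and is already known to be a tree; the function names `is_tree_decomposition` and `width` are illustrative and not part of the lecture.

```python
from collections import deque

def is_tree_decomposition(graph_edges, vertices, tree_edges, bags):
    """Check the two conditions of a tree decomposition.

    bags: dict mapping each tree node t to its bag B_t (a set of vertices).
    tree_edges: edges of T over the keys of bags (assumed to form a tree).
    """
    adj = {t: set() for t in bags}
    for a, b in tree_edges:
        adj[a].add(b)
        adj[b].add(a)
    # Condition 1: B^{-1}(v) is nonempty and connected in T for every v.
    for v in vertices:
        nodes = {t for t, bag in bags.items() if v in bag}
        if not nodes:
            return False
        start = next(iter(nodes))
        seen = {start}
        queue = deque([start])
        while queue:
            t = queue.popleft()
            for s in adj[t]:
                if s in nodes and s not in seen:
                    seen.add(s)
                    queue.append(s)
        if seen != nodes:
            return False
    # Condition 2: every edge {u, v} is contained in some bag.
    for u, v in graph_edges:
        if not any(u in bag and v in bag for bag in bags.values()):
            return False
    return True

def width(bags):
    """Width of a decomposition: max_t |B_t| - 1."""
    return max(len(bag) for bag in bags.values()) - 1

# A cycle on 4 vertices with a width-2 decomposition.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
cycle_bags = {"a": {0, 1, 2}, "b": {0, 2, 3}}
print(is_tree_decomposition(cycle_edges, range(4), [("a", "b")], cycle_bags))
print(width(cycle_bags))
```

Putting one edge per bag along a path of tree nodes fails for the cycle, because vertex 0 then appears in the first and last bag only, which are not connected in T.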
Several Properties
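Prop. Let $t_1, \ldots, t_h$ be the children of node t in $T(V_T, E_T)$.
For all $i \ne j$, $(B_{t_i} \setminus B_t) \cap (B_{t_j} \setminus B_t) = \emptyset$
Prop. Let s be the parent and $t_1, \ldots, t_h$ be the children of node t.
For all j, $(B_s \setminus B_t) \cap (B_{t_j} \setminus B_t) = \emptyset$
⇒ Many optimization problems can be solved in a bottom-up manner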
Thm. A graph has treewidth at most k if and only if it is a partial k-tree
Thm. Computing the treewidth of a graph is NP-hard
Thm. For fixed k, a tree decomposition of a partial k-tree can
be computed in linear time
Definition of partial k-tree is omitted.
DP Algorithm for Partial k-Trees
For fixed k, many NP-hard problems can be solved in
polynomial time using dynamic programming
Ex. Vertex cover problem
Ch(t): set of children of node t in tree T
$W_t \sim W_s \iff W_t \cap (B_t \cap B_s) = W_s \cap (B_t \cap B_s)$
Dynamic programming algorithm
$\mathrm{OPT}_t(W_t) = |W_t| + \sum_{s \in Ch(t)} \min_{W_s : W_s \sim W_t} \left( \mathrm{OPT}_s(W_s) - |W_s \cap W_t| \right)$
$\mathrm{OPT} = \min_{W_r} \mathrm{OPT}_r(W_r)$
where $W_t$ is a vertex cover of the subgraph induced by $B_t$, and
r is the root of T.
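The recurrence can be sketched directly in code. This is a minimal, unoptimized illustration that assumes the rooted tree decomposition is already given as `bags` (node to bag) and `children` (node to list of child nodes); the names `subsets` and `min_vertex_cover_size` are illustrative and not from the lecture.

```python
from itertools import combinations

def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def min_vertex_cover_size(edges, bags, children, root):
    """Size of a minimum vertex cover via DP over a rooted tree decomposition."""
    def covers(W, bag):
        # W must cover every edge with both endpoints inside the bag.
        return all(u in W or v in W for u, v in edges if u in bag and v in bag)

    def solve(t):
        # Returns a table mapping each feasible W_t to OPT_t(W_t).
        child_tables = [(s, solve(s)) for s in children.get(t, [])]
        table = {}
        for W_t in subsets(bags[t]):
            if not covers(W_t, bags[t]):
                continue
            total = len(W_t)
            feasible = True
            for s, tab in child_tables:
                shared = bags[t] & bags[s]
                # Only W_s that agree with W_t on B_t ∩ B_s are compatible.
                best = min(
                    (cost - len(W_s & W_t) for W_s, cost in tab.items()
                     if W_s & shared == W_t & shared),
                    default=None,
                )
                if best is None:
                    feasible = False
                    break
                total += best
            if feasible:
                table[W_t] = total
        return table

    return min(solve(root).values())

# Path 0-1-2-3-4 with one bag per edge, rooted at bag "a".
path_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
path_bags = {"a": {0, 1}, "b": {1, 2}, "c": {2, 3}, "d": {3, 4}}
path_children = {"a": ["b"], "b": ["c"], "c": ["d"]}
print(min_vertex_cover_size(path_edges, path_bags, path_children, "a"))  # prints 2
```

Subtracting $|W_s \cap W_t|$ avoids counting the vertices shared by a bag and its child twice. Recovering the cover itself would additionally need a traceback over the tables, which is omitted here.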
Explanation of DP Algorithm
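$\mathrm{OPT}_t(W_t) = |W_t| + \sum_{s \in Ch(t)} \min_{W_s : W_s \sim W_t} \left( \mathrm{OPT}_s(W_s) - |W_s \cap W_t| \right)$
$\mathrm{OPT} = \min_{W_r} \mathrm{OPT}_r(W_r)$
$\mathrm{OPT}_t(W_t)$: size of a minimum vertex cover of G(t) under the
condition that $W_t$ is the cover of $B_t$
T(t): subtree of T induced by t and its descendants
G(t): subgraph of G induced by $\bigcup_{t' \in V(T(t))} B_{t'}$
[Figure: bags $B_t$, $B_s$, $B_{s'}$ with the corresponding covers $W_t$, $W_s$, $W_{s'}$]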
Analysis of Time Complexity
Let k be a constant.
Tree decomposition can be computed in linear time.
For each $t \in V_T$, at most $2^{k+1}$ sets $W_t$ are tested.
To compute the min inside Σ, $2^{k+1} \times 2^{k+1} = 4^{k+1}$ pairs are tested
per edge in T
Thus, the total complexity is $O(4^k \mathrm{poly}(n))$.
$\mathrm{OPT}_t(W_t) = |W_t| + \sum_{s \in Ch(t)} \min_{W_s : W_s \sim W_t} \left( \mathrm{OPT}_s(W_s) - |W_s \cap W_t| \right)$
$\mathrm{OPT} = \min_{W_r} \mathrm{OPT}_r(W_r)$
Applications to Bioinformatics
Graphs representing structures of proteins
and RNAs are considered to have small
treewidth
Examples
Protein threading
Protein side-chain packing
Protein structure alignment
Comparison of RNA secondary structures
Attractor detection in Boolean networks
Color Coding
[Alon et al.: J. ACM 1995]
k-Path Problem
Input: undirected graph G(V,E), integer k
Output: a simple path of G with k vertices (no vertex is visited twice)
NP-hard ⇐ Hamilton path problem if k=n(=|V|)
Naïve algorithm: For each vertex v, examine neighbors,
neighbors of neighbors, …
⇒ $O(n^k)$ time
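For comparison with the color-coding approach, the naïve search can be sketched as a depth-first extension of partial paths. This is an illustrative sketch, assuming the graph is an adjacency dict and that a k-path means a simple path with k vertices; `find_k_path_naive` is a hypothetical name.

```python
def find_k_path_naive(adj, k):
    """Return a simple path with k vertices as a list, or None if none exists."""
    def extend(path, visited):
        if len(path) == k:
            return path
        # Try every unvisited neighbor of the current endpoint.
        for u in adj[path[-1]]:
            if u not in visited:
                result = extend(path + [u], visited | {u})
                if result is not None:
                    return result
        return None

    for v in adj:  # try every start vertex
        result = extend([v], {v})
        if result is not None:
            return result
    return None

# A star with center 0 has simple paths with 3 vertices but none with 4.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(find_k_path_naive(star, 3) is not None)
print(find_k_path_naive(star, 4))
```

Each start vertex may explore up to about $n^{k-1}$ partial paths, which gives the $O(n^k)$ bound stated above.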
Idea
Partition V into k subsets (color vertices using k colors)
If lucky, all vertices of the k-path lie in different subsets
(analysis of such probability ⇒ randomized algorithm)
DP Algorithm
For each v , examine whether there exists k-path starting
from v
Path can be reconstructed by traceback
P(u,C): 1 if there exists a path from v to u using each color in C
exactly once, otherwise 0 (C is a subset of {1,2,…,k})
Initialization: P(v,{f(v)})←1, others be 0 (f(v) is color of v)
Recursion: (in the order of |C|=1 to |C|=k-1), for {u,w}∈E
$P(u, C \cup \{f(u)\}) = 1 \iff P(w, C) = 1 \text{ and } f(u) \notin C$
[Figure: example with colors R, Y, B, G on a path from v through w to u1 (u2 is another neighbor of w): P(v,{R})=1, P(w,{R,Y,B})=1, P(u1,{R,Y,B,G})=1]
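The DP for one fixed coloring can be sketched as follows. This is a minimal illustration, assuming an adjacency dict `adj` and a coloring `color` that maps each vertex to a color; it stores, for each vertex u, the collection of color sets C with P(u,C)=1. The name `has_colorful_path` is illustrative.

```python
def has_colorful_path(adj, color, k):
    """True if some simple path uses k distinct colors, one per vertex."""
    for v in adj:  # examine every start vertex v
        # reachable[u] holds every color set C with P(u, C) = 1.
        reachable = {u: set() for u in adj}
        reachable[v].add(frozenset([color[v]]))
        for size in range(1, k):  # in the order |C| = 1, ..., k-1
            for w in adj:
                for C in [c for c in reachable[w] if len(c) == size]:
                    for u in adj[w]:
                        # Extend the path by u only if its color is new.
                        if color[u] not in C:
                            reachable[u].add(C | {color[u]})
        if any(len(C) == k for sets in reachable.values() for C in sets):
            return True
    return False

# Path 0-1-2-3 whose vertices all received different colors.
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(has_colorful_path(line, {0: "R", 1: "Y", 2: "B", 3: "G"}, 4))  # prints True
```

Because a color set never contains the same color twice, a path found this way cannot revisit a vertex, so it is a genuine simple path. Traceback pointers would be needed to output the path itself.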
Analysis of Time Complexity
P(u,C): 1 if there exists a path from v to u using each color
in C exactly once, otherwise 0 (C is a subset of {1,2,…,k})
Initialization: P(v,{f(v)})←1, others be 0 (f(v) is color of v)
Recursion: (in the order of |C|=1 to |C|=k-1), for {u,w}∈E
$P(u, C \cup \{f(u)\}) = 1 \iff P(w, C) = 1 \text{ and } f(u) \notin C$
Lemma: The above algorithm works in $O(2^k \mathrm{poly}(n))$ time
Proof: The number of sets C is $2^k$. Thus, it is enough to examine
$2^k n$ values P(u,C).
This computation should be done for every initial vertex
v, which adds an O(n) factor
Analysis of Success Probability
Lemma: Let P be a k-path of G. Under a random coloring, the
probability that the k vertices of P have different colors is $\ge e^{-k}$
Proof: The number of colorings of P is $k^k$. On the other hand, the number of
successful colorings is $k!$. Therefore, by Stirling's formula, we have
$\frac{k!}{k^k} > \frac{\sqrt{2\pi k}\,(k/e)^k}{k^k} = \sqrt{2\pi k}\; e^{-k} > e^{-k}$
Theorem: By repeating the algorithm at least $e^k$ times, a
solution can be obtained (if any) with probability $\ge 1/2$
Proof: The probability that all trials fail is bounded by
$(1 - e^{-k})^{e^k} \le e^{-1} < \frac{1}{2}$
The algorithm never outputs a wrong solution
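The two bounds can be checked numerically for small k. The sketch below only evaluates the formulas from the lemma and the theorem; the function names are illustrative, and the number of trials is taken as $\lceil e^k \rceil$, which is an assumption about how "at least $e^k$ times" is rounded.

```python
import math

def colorful_probability(k):
    """Probability k!/k^k that k fixed vertices get k distinct colors."""
    return math.factorial(k) / k**k

def failure_bound(k):
    """Upper bound (1 - e^-k)^trials on the probability that all trials fail."""
    trials = math.ceil(math.e ** k)
    return (1 - math.exp(-k)) ** trials

for k in range(1, 9):
    print(k, colorful_probability(k) >= math.exp(-k), failure_bound(k) < 0.5)
```

Since a reported path is always verified by construction, only false negatives are possible, and repeating the whole procedure further reduces their probability geometrically.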
Derandomization
Idea: use of hash function families
k-perfect hash functions: Let F be a family of hash functions
from V={1,2,…,n} to {1,2,…,k}. F is called a family of k-perfect hash functions if, for any k-element subset S of V,
there exists a function f∊F that is one-to-one on S
Theorem: For any n and k, a family of k-perfect hash functions with
$2^{O(k)} \cdot \log n$ functions can be constructed in $2^{O(k)} \cdot n \cdot \log n$ time
⇒ In place of random coloring, it is enough to examine all f
given by this theorem
Corollary: The k-Path Problem can be solved in $2^{O(k)} \cdot \mathrm{poly}(n)$ time
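The defining property of a k-perfect family can be tested by brute force on small instances. The sketch below is only an illustration of the definition, not the construction behind the theorem: `is_k_perfect` checks all k-element subsets, and `bit_family` is a hypothetical toy family for k=2 that colors a vertex by one bit of its index.

```python
from itertools import combinations

def is_k_perfect(family, n, k):
    """True if every k-subset of {1..n} is mapped one-to-one by some f."""
    universe = range(1, n + 1)
    return all(
        any(len({f(x) for x in subset}) == k for f in family)
        for subset in combinations(universe, k)
    )

def bit_family(n):
    """Toy family for k = 2: f_b(x) = 1 + (b-th bit of x - 1)."""
    bits = max(1, (n - 1).bit_length())
    return [lambda x, b=b: 1 + (((x - 1) >> b) & 1) for b in range(bits)]

family = bit_family(8)
print(len(family))                   # prints 3
print(is_k_perfect(family, 8, 2))    # prints True
```

Two distinct indices differ in at least one bit, so some function in the toy family separates them, and about $\log_2 n$ functions suffice for k=2. Running the coloring DP once per function in such a family replaces the random repetitions.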
Applications of Color Coding
'Path' in color coding can be extended to small
trees and small subgraphs (network motifs)
⇒ Applications to bioinformatics
Network motif [Alon et al.: Bioinformatics , 2008]
Signal pathway analysis [Huffner et al.: Bioinformatics 2007 &
Algorithmica 2008]
Network marker [Dao et al.: Bioinformatics 2011]
Pathway search/alignment [Shlomi et al.: BMC Bioinformatics
2006]
Comparison of Chemical Graphs
Chemical Structures and Graphs
Tree: graph without cycles
Almost tree: tree + some edges (in each biconnected component)
Outerplanar graph: can be drawn with no crossing edges and no internal vertex
Partial k-tree: decomposed into a tree by identifying k+1 vertices as one node
Partial k-trees
Partial k-tree (treewidth ≤ k)
Decomposed into a tree by identifying k+1 vertices as one node
Outerplanar graphs are partial 2-trees
Chemical compounds in NCI database [Horvath & Ramon, TCS 2010]
treewidth    number of compounds
1 (tree)     21,950
2            221,675
3            6,548
≥4           65
If we can design efficient algorithms for partial 4-trees,
we can cover almost all chemical compounds
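The claim can be quantified from the counts above. The following short calculation is a sketch that uses only the four numbers in the table; since the table merges all treewidths of 4 or more into one row, only the share with treewidth at most 3 can be computed.

```python
def treewidth_coverage(counts):
    """Return (total, fraction of compounds with treewidth at most 3)."""
    total = sum(counts.values())
    at_most_3 = total - counts[">=4"]
    return total, at_most_3 / total

# Counts copied from the table [Horvath & Ramon, TCS 2010].
nci_counts = {"1": 21950, "2": 221675, "3": 6548, ">=4": 65}
total, fraction = treewidth_coverage(nci_counts)
print(total, f"{100 * fraction:.3f}%")
```

Out of 250,238 compounds, about 99.97% already have treewidth at most 3, so algorithms for partial 4-trees would leave at most 65 compounds uncovered.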
Three Matching Problems
Graph isomorphism
Are two graphs essentially the same?
Subgraph isomorphism
Is one graph a part of the other graph?
Maximum common subgraph
Largest (connected) common part between two given graphs
Complexity of Graph Comparison Problems
Graph isomorphism
Polynomial time for bounded degree graphs [Luks, JCSS, 1982]
However, not practical because the algorithm is too
complicated (based on group theory)
Subgraph isomorphism
Polynomial time for partial k-trees of bounded degree
[Matousek & Thomas, Disc. Math., 1992]
However, the algorithm is still too complicated
Maximum common subgraph
trees:polynomial time [Matula, Ann. Disc. Math, 1978]
almost trees: polynomial time [Akutsu, IEICE Trans., 1993]
outerplanar graphs: polynomial time [Akutsu & Tamura, Algorithms, 2013]
partial k-trees: NP-hard for k=11 [Akutsu & Tamura, Proc. ISAAC 2013], recently improved to k=4
partial k-trees with k=3: open problem
Algorithm for Outerplanar Graphs: Key Idea
Difficulty: need to find cut points ⇒ easily lead to
combinatorial explosion
Idea: introduction of the concept of blade
Lemma: The number of blades is $O(n^2)$. ⇒ polynomial time algorithm
Maximum Common Subgraph: Summary
Trees
polynomial time [Matula, Ann. Disc. Math, 1978]
Almost trees
polynomial time [Akutsu, IEICE Trans., 1993]
Outerplanar graphs of bounded degree
polynomial time [Akutsu & Tamura, Algorithms, 2013]
Partial k-trees of bounded degree
NP-hard [Akutsu & Tamura, Proc. ISAAC 2013]
⇔ (in contrast) polynomial time for subgraph isomorphism
[Matousek & Thomas, Disc. Math., 1992]
Summary
Tree Decomposition
For fixed k, many NP-hard problems can be solved in
polynomial time by DP algorithms
Applications to analysis of protein/RNA structures
Color Coding
Useful for finding small paths/subgraphs in networks
Applications to biological pathway analysis
Comparison of Chemical Graphs
The maximum common subgraph problem is NP-hard
even for partial k-trees for k=4, but is solvable in
polynomial time for outerplanar graphs