What comes next?
example for a hardness result:
cross×plain→cross, 'all operations' is Max-SNP-hard
(i.e. without the restriction wa = (wr + wb)/2).
S.Will, 18.417, Fall 2011
Max-Cut-3
• max cut, formal:
  • let G = (V, E) be a graph
  • a cut in G is a set of edges s.t. there is a partition V1 ⊎ V2 = V, where for every edge of the set one endpoint is in V1, the other in V2.
• Max-Cut-3: given graph G with degree ≤ 3, find cut with maximal cardinality.
[Figure: example graph with nodes v1, ..., v6]
Theorem: Max-Cut-3 is Max-SNP-hard
Remark: If an optimization problem is Max-SNP-hard, it does not have a PTAS (Polynomial Time Approximation Scheme) unless P = NP.
A PTAS is an algorithm that takes an instance of a maximization problem and a parameter ε > 0 and, in polynomial time, produces a solution that is within a factor 1 − ε of being maximal.
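A minimal brute-force sketch of the cut definition above. It tries all 2^n partitions, so it only illustrates the problem on tiny graphs. The function names and the example graph are assumptions for illustration, not taken from the slides.

```python
from itertools import product

def cut_size(edges, side):
    """Number of edges whose endpoints lie on different sides of the partition."""
    return sum(1 for u, v in edges if side[u] != side[v])

def max_cut_brute_force(nodes, edges):
    """Enumerate every partition V1 / V2 and keep one with the largest cut."""
    best_k, best_side = -1, None
    for bits in product((1, 2), repeat=len(nodes)):
        side = dict(zip(nodes, bits))      # node -> 1 (in V1) or 2 (in V2)
        k = cut_size(edges, side)
        if k > best_k:
            best_k, best_side = k, side
    return best_k, best_side

# Hypothetical example: a 6-cycle plus three chords, every node has degree 3.
nodes = ["v1", "v2", "v3", "v4", "v5", "v6"]
edges = [("v1", "v2"), ("v2", "v3"), ("v3", "v4"),
         ("v4", "v5"), ("v5", "v6"), ("v6", "v1"),
         ("v1", "v4"), ("v2", "v5"), ("v3", "v6")]
k, side = max_cut_brute_force(nodes, edges)
print(k, side)
```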
Reduction of Max-Cut-3 to cross×plain→cross
Reduction idea:
represent Max-Cut-3 problem as alignment problem
cross×plain→cross such that optimal alignment corresponds
to maximum cut.
→ if Max-Cut-3 can be solved using the alignment problem, the
alignment problem must also be Max-SNP-hard.
Plan
• show how to represent graph G as input of alignment problem
  (e.g. Sequences S1, S2 + structure P1 for S1)
• show how optimal alignment corresponds to maximum cut for G.
Representation of Graph G as Alignment Problem
(Example)
[Figure: example graph with nodes v1, ..., v6; S1 consists of six segments AAAUUU and S2 of six segments UUUAAA, one segment per node v1, ..., v6]
Representation of Graph G as Alignment Problem
(formally)
• given G = [Figure: graph with nodes v1, ..., v6]
• sequences
  • S1 = (AAAUUU(C)^c)^(n−1) AAAUUU, and
  • S2 = (UUUAAA(C)^c)^(n−1) UUUAAA.
• the segments AAAUUU in S1 and UUUAAA in S2 correspond to the nodes
• each edge (vi, vj) of G corresponds to two arcs in P1: one connecting an A of the i-th segment with a U of the j-th segment and one connecting a U of the i-th segment with an A of the j-th segment.
• Cs are used to avoid alignment of different segments, and their number c depends on the ratio min(wb, wa, wr) / wd (numerator: arc changes, denominator: base deletion)
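The construction above can be sketched as follows. The slides do not say which A and which U of a segment an edge uses; the rule below (the k-th edge at a node uses the k-th A and the k-th U of its segment) is an assumption, as are the function name, the 0-based positions, and the example value of c.

```python
def build_alignment_instance(n, edges, c):
    """Return (S1, S2, P1) for a graph on nodes 0..n-1 with degree <= 3.

    Assumption (not from the slides): the k-th edge incident to a node
    uses the k-th A and the k-th U of that node's segment.
    Positions in P1 are 0-based indices into S1.
    """
    seg1, seg2, spacer = "AAAUUU", "UUUAAA", "C" * c
    s1 = spacer.join([seg1] * n)     # (AAAUUU C^c)^(n-1) AAAUUU
    s2 = spacer.join([seg2] * n)     # (UUUAAA C^c)^(n-1) UUUAAA
    seg_len = len(seg1) + c          # distance between segment starts
    used = [0] * n                   # edges already attached to each node
    arcs = []
    for i, j in edges:
        ki, kj = used[i], used[j]
        if ki >= 3 or kj >= 3:
            raise ValueError("node degree exceeds 3")
        used[i] += 1
        used[j] += 1
        a_i, u_i = i * seg_len + ki, i * seg_len + 3 + ki
        a_j, u_j = j * seg_len + kj, j * seg_len + 3 + kj
        arcs.append(tuple(sorted((a_i, u_j))))   # A of segment i with U of segment j
        arcs.append(tuple(sorted((u_i, a_j))))   # U of segment i with A of segment j
    return s1, s2, arcs

# Hypothetical tiny instance: two nodes joined by one edge, c = 2.
s1, s2, p1 = build_alignment_instance(2, [(0, 1)], c=2)
print(s1, s2, p1)
```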
Correspondence of Optimal Alignment and Max Cut
Properties of Optimal Alignment
• we choose c such that every optimal alignment must match all C s
• we choose a scoring with wm > wd and 2wa > wb + wr .
• wm > wd implies no base mismatch
  [Figure: the alignment of AAAUUU and UUUAAA with mismatches costs more than the alignment with gaps]
• two alignment types for each node vi:
  • A-type (the As are matched):
      ---AAAUUU
      UUUAAA---
  • U-type (the Us are matched):
      AAAUUU---
      ---UUUAAA
• A-type :⇔ node in V1
  U-type :⇔ node in V2.
• cost for each edge of the cut (vi and vj have different type)
  [Figure: one arc has both ends matched (arc breaking), the other has both ends deleted (arc removing)]
  cost: wb + wr
Correspondence of Optimal Alignment and Max Cut
• cost for each edge that is not in the cut (vi and vj have same type)
  [Figure: both arcs have one end matched and one end deleted (arc altering, twice)]
  cost: 2wa
• total cost for alignment:
  • V1 = all A-type nodes
  • V2 = all U-type nodes
  • k := |cut(V1, V2)|
  • n nodes, each degree 3 ⇒ 3n/2 edges

$$
C = k\,(w_b + w_r) + \left(\tfrac{3n}{2} - k\right) 2w_a + n \cdot 3w_d
  = 3n\,(w_a + w_d) - k\,\underbrace{(2w_a - w_b - w_r)}_{>\,0}
$$

  (assumption: 2wa > wb + wr)
• ⇒ C minimal ≡ k maximal
• ⇒ maximal cut ≡ minimal edit distance.
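A small numeric check of the total cost above: both forms of C agree, and C strictly decreases as the cut size k grows. The weights are hypothetical values chosen only to satisfy 2wa > wb + wr; the function names are assumptions.

```python
def alignment_cost(n, k, wa, wb, wr, wd):
    """C = k(wb + wr) + (3n/2 - k) 2wa + n 3wd for n nodes of degree 3 (n even)."""
    num_edges = 3 * n // 2
    return k * (wb + wr) + (num_edges - k) * 2 * wa + n * 3 * wd

def alignment_cost_closed_form(n, k, wa, wb, wr, wd):
    """C = 3n(wa + wd) - k(2wa - wb - wr)."""
    return 3 * n * (wa + wd) - k * (2 * wa - wb - wr)

# Hypothetical weights with 2*wa > wb + wr.
weights = dict(wa=3, wb=2, wr=3, wd=1)
n = 6
costs = [alignment_cost(n, k, **weights) for k in range(3 * n // 2 + 1)]
print(costs)
```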
Approaches for Alignments of RNAs
[Figure: three plans for aligning two RNA sequences A and B; adapted from Gardner & Giegerich, BMC 2004]
• Plan A: ALIGN single sequences, then FOLD alignment (consensus structure)
• Plan B: simultaneously ALIGN and FOLD [Sankoff 85] (consensus structure)
• Plan C: FOLD single sequences, then ALIGN sequence AND structure (consensus structure)
Simultaneous Alignment and Folding: Sankoff (1985)
• What do we want? What does folding into a common structure mean?
• First idea: preserve “shape” ≡ branching structure
• Formally: let i1 < i2 < ... < iv in a and j1 < j2 < ... < jw in b be the positions in pairs that limit multiloops or are external (branching configuration)
  Then: structures equivalent (according to branching) iff v = w, and (if, ig) ∈ Pa if and only if (jf, jg) ∈ Pb
• finding good equivalent structures not sufficient:
• Hence: minimize edit distance + energies (of 2 equiv. structures)
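A minimal sketch of the equivalence test from the formal definition on this slide, assuming the branching positions of both structures are already given as sorted lists. Extracting them from a full structure is not shown. The function name and the example positions are hypothetical.

```python
def equivalent_by_branching(pos_a, pairs_a, pos_b, pairs_b):
    """Check v = w and (i_f, i_g) in Pa  <=>  (j_f, j_g) in Pb.

    pos_a = [i_1 < ... < i_v], pos_b = [j_1 < ... < j_w]: branching positions.
    pairs_a, pairs_b: base pairs (tuples) among those positions.
    """
    if len(pos_a) != len(pos_b):
        return False
    rank_a = {p: f for f, p in enumerate(pos_a)}
    rank_b = {p: f for f, p in enumerate(pos_b)}
    # Replace each position by its rank f, so only the branching shape remains.
    shape_a = {(rank_a[x], rank_a[y]) for x, y in pairs_a}
    shape_b = {(rank_b[x], rank_b[y]) for x, y in pairs_b}
    return shape_a == shape_b

# Hypothetical example: same nesting pattern at different positions.
same = equivalent_by_branching([2, 5, 8, 12], {(2, 12), (5, 8)},
                               [1, 3, 9, 10], {(1, 10), (3, 9)})
diff = equivalent_by_branching([2, 5, 8, 12], {(2, 12), (5, 8)},
                               [1, 3, 9, 10], {(1, 3), (9, 10)})
print(same, diff)
```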
Sankoff Problem Definition
• Idea: Sankoff = Zuker Folding + Needleman/Wunsch Alignment
• IN: two sequences a and b
• find two equivalent structures Pa and Pb
  and compatible alignment A of a and b
  such that Energy(a, Pa) + Energy(b, Pb) + EditDistance(A) minimal
• where: Energy yields (loop-based) Turner free energy,
  EditDistance is edit distance (base mismatch x, indel y)
• what does compatible mean?
  alignment must be “consistent” with branching structure
  formally: the base pairs (if, ig) ∈ Pa and (jf, jg) ∈ Pb (from Def. of equivalent) must be aligned to each other
Constraints
We want to find the optimal structures + alignment with the
following constraints:
constraints on the predicted structures:
• must be equivalent (intuitively: same kind of multiloops)
constraints on the alignment:
• multiloops must be aligned to their equivalent partner
• hairpin loops must be aligned to their equivalent partner
• each 2-loop (or stacking or bulge) must be aligned to exactly one other 2-loop or must be entirely aligned to a gap.
Edit distance of sub-sequences
• distance based score
  x = base mismatch
  y = base deletion/insertion
• D(i, j; h, k) minimum sequence alignment cost between sequences ai ... aj and bh ... bk.
• Recursion:

$$
D(i, j; h, k) = \min
\begin{cases}
D(i, j-1; h, k-1) + x & \text{if } a_j \neq b_k \\
D(i, j-1; h, k-1) & \text{if } a_j = b_k \\
D(i, j-1; h, k) + y \\
D(i, j; h, k-1) + y
\end{cases}
= \min
\begin{cases}
D(i+1, j; h+1, k) + x & \text{if } a_i \neq b_h \\
D(i+1, j; h+1, k) & \text{if } a_i = b_h \\
D(i+1, j; h, k) + y \\
D(i, j; h+1, k) + y
\end{cases}
$$

• Initialization:

$$
D(i, i; h, h) =
\begin{cases}
x & \text{if } a_i \neq b_h \\
0 & \text{else}
\end{cases}
$$
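A minimal memoized sketch of the first form of the recursion for D. Two points are assumptions rather than slide content: an empty subsequence (i > j or h > k) is handled by charging y per remaining base, and x ≤ 2y is assumed so that the result for single bases agrees with the initialization. Indices are 1-based and inclusive.

```python
from functools import lru_cache

def make_D(a, b, x=1, y=1):
    """Return D(i, j, h, k): minimum alignment cost of a_i..a_j and b_h..b_k."""
    @lru_cache(maxsize=None)
    def D(i, j, h, k):
        len_a, len_b = max(j - i + 1, 0), max(k - h + 1, 0)
        if len_a == 0 or len_b == 0:
            # Assumed base case: align the remaining bases to gaps.
            return (len_a + len_b) * y
        mismatch = 0 if a[j - 1] == b[k - 1] else x
        return min(D(i, j - 1, h, k - 1) + mismatch,   # a_j aligned to b_k
                   D(i, j - 1, h, k) + y,              # a_j aligned to a gap
                   D(i, j, h, k - 1) + y)              # b_k aligned to a gap
    return D

# Hypothetical example sequences.
D = make_D("GAUC", "GUC")
print(D(1, 4, 1, 3))
```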
Recall Zuker
• Energies: e(s), where s is k-loop (or s = ∅ for empty structure)
• F(i, j) “free”, minimum energy for subsequence ai ... aj
• C(i, j) “closed”, minimum energy for subsequence where (i, j) ∈ P
• Zuker Recursion:
• Problem: (6) requires time proportional to n^(2K),
  where K maximum k in k-loops
Usual Simplification
• e(s) for k-loops with k ≥ 3 (multiloops):
  e(s) = A + (k − 1)P + uQ
  (k − 1 inner base pairs, u unpaired bases in the loop)
• New matrix: G (i, j) for multiloops
• Recursion:
Simultaneous Alignment and Folding
• Extend definition of D(i1, j1; i2, j2)
  if i1 > j1, then cost for deleting bi2 ... bj2.
  if i2 > j2, then cost for deleting ai1 ... aj1.
• F (i1 , j1 ; i2 , j2 ) minimum cost (sum of alignment and free energy)
for ai1 . . . aj1 and bi2 . . . bj2 .
• C (i1 , j1 ; i2 , j2 ): minimum cost for ai1 +1 . . . aj1 −1 and bi2 +1 . . . bj2 −1
under condition (i1 , j1 ) ∈ Pa and (i2 , j2 ) ∈ Pb
Simultaneous Alignment and Folding: “Closed”
Simultaneous Alignment and Folding: Multiloop
• G(i1, j1; i2, j2): matrix for multiloop alignment
• Recursion for G

$$
G(i_1, j_1; i_2, j_2) = \min
\begin{cases}
C(i_1, j_1; i_2, j_2) + 2P + \overbrace{D(i_1, i_1; i_2, i_2)}^{\text{match } i_1 \text{ and } i_2} + \overbrace{D(j_1, j_1; j_2, j_2)}^{\text{match } j_1 \text{ and } j_2} \\[4pt]
\min\limits_{\substack{i_1 < h_1 < j_1 \\ i_2 < h_2 < j_2}}
\begin{cases}
G(i_1, h_1; i_2, h_2) + (j_1 - h_1 + j_2 - h_2)\,Q + D(h_1 + 1, j_1; h_2 + 1, j_2), \\
G(i_1, h_1; i_2, h_2) + G(h_1 + 1, j_1; h_2 + 1, j_2), \\
(h_1 - i_1 + 1 + h_2 - i_2 + 1)\,Q + D(i_1, h_1; i_2, h_2) + G(h_1 + 1, j_1; h_2 + 1, j_2)
\end{cases}
\end{cases}
$$
Simultaneous Alignment and Folding: “free”
• Recursion for F

$$
F(i_1, j_1; i_2, j_2) = \min
\begin{cases}
C(i_1, j_1; i_2, j_2) + D(i_1, i_1; i_2, i_2) + D(j_1, j_1; j_2, j_2) \\
\min\limits_{\substack{i_1 < h_1 < j_1 \\ i_2 < h_2 < j_2}} F(i_1, h_1; i_2, h_2) + F(h_1 + 1, j_1; h_2 + 1, j_2) \\
D(i_1, j_1; i_2, j_2)
\end{cases}
$$

• with initial conditions C(i1, i1; i2, i2) = ∞
  and G(i1, i1; i2, j2) = G(i1, j1; i2, i2) = ∞
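A minimal sketch of the recursion for F only. The “closed” matrix C is not implemented here; it is passed in as a hypothetical callable `closed_cost`, so this is not a full Sankoff implementation. The base case for empty subsequences in D and the example inputs are assumptions as well. With `closed_cost` always ∞ no base pair can form, and F reduces to the plain edit distance D.

```python
import math
from functools import lru_cache

def make_F(a, b, closed_cost, x=1, y=1):
    """Return (F, D) for sequences a, b; indices are 1-based and inclusive.

    closed_cost(i1, j1, i2, j2) stands in for C(i1, j1; i2, j2) (hypothetical).
    """
    @lru_cache(maxsize=None)
    def D(i1, j1, i2, j2):
        la, lb = max(j1 - i1 + 1, 0), max(j2 - i2 + 1, 0)
        if la == 0 or lb == 0:
            return (la + lb) * y               # assumed base case: only gaps
        mm = 0 if a[j1 - 1] == b[j2 - 1] else x
        return min(D(i1, j1 - 1, i2, j2 - 1) + mm,
                   D(i1, j1 - 1, i2, j2) + y,
                   D(i1, j1, i2, j2 - 1) + y)

    @lru_cache(maxsize=None)
    def F(i1, j1, i2, j2):
        # Case 1: (i1, j1) and (i2, j2) are base pairs aligned to each other.
        closed = (closed_cost(i1, j1, i2, j2)
                  + D(i1, i1, i2, i2) + D(j1, j1, j2, j2))
        # Case 2: split both subsequences at h1 and h2.
        split = min((F(i1, h1, i2, h2) + F(h1 + 1, j1, h2 + 1, j2)
                     for h1 in range(i1 + 1, j1)
                     for h2 in range(i2 + 1, j2)), default=math.inf)
        # Case 3: no structure at all, alignment cost only.
        return min(closed, split, D(i1, j1, i2, j2))

    return F, D

# With C = infinity everywhere, F falls back to the edit distance.
never_closed = lambda i1, j1, i2, j2: math.inf
F, D = make_F("GGGAAACCC", "GGAACC", never_closed)
print(F(1, 9, 1, 6), D(1, 9, 1, 6))
```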
Complexity
space complexity O(n^4)
• constant number of matrices (C, D, F, and G)
• each of them has O(n^4) entries
time complexity O(n^6)
• each entry of matrix D requires constant time
• each entry of F, C, and G requires O(n^2) time (minimize over all h1, h2)
• hence: n^4 · n^2 = n^6
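The counting argument above can be illustrated by enumerating the matrix entries and the split points (h1, h2) for two sequences of equal length n. This is only a sketch of the counting, with a hypothetical function name; it does not compute any alignment.

```python
def count_work(n):
    """Count entries (i1, j1, i2, j2) and split-point evaluations for length n."""
    entries = 0
    split_evals = 0
    for i1 in range(1, n + 1):
        for j1 in range(i1, n + 1):
            for i2 in range(1, n + 1):
                for j2 in range(i2, n + 1):
                    entries += 1
                    # number of (h1, h2) with i1 < h1 < j1 and i2 < h2 < j2
                    split_evals += max(j1 - i1 - 1, 0) * max(j2 - i2 - 1, 0)
    return entries, split_evals

for n in (4, 8, 16):
    print(n, count_work(n))
```

Doubling n multiplies the number of entries by roughly 2^4 and the number of split evaluations by roughly 2^6 once n is large enough.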