Alphabet SOUP: A Framework for Approximate Energy Minimization

Stephen Gould, Fernando Amat, Daphne Koller
• Alphabet SOUP provides a general purpose framework for
approximate energy minimization of arbitrary energies that
scales to accommodate available resources and problem
complexity.
• Finding the MAP solution is NP-hard, so people resort to
approximate techniques.
γ-Expansion Moves
[Figure: the restricted pairwise potential θγij(X̂i, X̂j) for current assignment xi = 1, xj = 3 and allowed sets Aγi = Aγj = {0, 1}; entries for configurations outside the allowed sets are set to ∞.]
*Assume θc(xc) ≥ 0 for all cliques c, with equality only if there exists some γ such that xc ∈ Aγc.
• α-expansion: set Ai = {α} for all i.
• α-β swap: set Ai = {α} for all xi = β, Ai = {β} for all xi = α, and empty otherwise.
• ICM: set Ai = dom(Xi) for i = k and empty otherwise.
• Fusion move: given two candidate solutions x0 and x1, set xi = x0i and Ai = {x1i}.
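These special cases can be written down directly as allowed-set constructions. The sketch below is illustrative only (helper names and data layout are assumptions, not the authors' code); each move is represented as a map from variables to allowed sets Aγi:

```python
# Sketch: each gamma-expansion move is fully specified by its allowed
# sets A_i. These helpers build the sets for the special cases above.

def alpha_expansion_move(domains, alpha):
    # A_i = {alpha} for all i
    return {i: {alpha} for i in domains}

def alpha_beta_swap_move(domains, x, alpha, beta):
    # A_i = {alpha} if x_i == beta, {beta} if x_i == alpha, else empty
    return {i: {alpha} if x[i] == beta else ({beta} if x[i] == alpha else set())
            for i in domains}

def icm_move(domains, k):
    # A_k = dom(X_k); all other variables are frozen (empty set)
    return {i: set(domains[i]) if i == k else set() for i in domains}

def fusion_move(x0, x1):
    # given two candidate solutions x0, x1: current value comes from x0,
    # the single alternative for each variable comes from x1
    return {i: {x1[i]} for i in x0}

domains = {0: [0, 1, 2], 1: [0, 1, 2]}
x = {0: 1, 1: 2}
print(alpha_expansion_move(domains, 0))   # every A_i = {0}
print(icm_move(domains, 1))               # only variable 1 is free
```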
where the LP-dual messages are given by δcs(xs) = minxc\s { (1/|S(c)|) θc(xc) + βcs(xc) }.
Choosing the γ-expansion Moves
• Our method allows the selection of γ-expansion moves to be
tailored to the problem at hand, e.g., (static) overlapping ordinal
ranges, grouped low-energy configurations, or (dynamic)
coordinate descent on vector-valued variables
[Figure: image completion results on two examples, comparing the input image, Komodakis & Tziritas (2006), a greedy baseline, and Alphabet SOUP. Per-pixel energies: E = 3.37, 4.97, 2.84 (first example) and E = 3.26, 3.36, 2.03 (second example).]
Cell Membrane Surface Reconstruction
Energy per pixel:

Problem    (i)     (ii)    (iii)
40x43      0.507   0.509   0.521
112x116    n/a     0.527   0.555
• Task: Reconstruct surface of bacteria from tomography scans.
• Model: Discretized mesh: 3D orientation and 1D radial offset.
Here the energy is regular, but α-expansion does not scale.
• γ-Expansion Moves: (θX, θY, θZ, R) coordinates.
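One way to realize these per-coordinate moves in code: each move frees a single coordinate of the tuple-valued label while holding the others at their current values. A minimal sketch (the tuple layout, helper name, and discretization are assumptions, not the authors' implementation):

```python
# Sketch: labels are 4-tuples (tx, ty, tz, r). A coordinate move allows
# a variable to change only one coordinate, keeping the rest fixed at
# the current value.

def coordinate_move(current_label, coord, coord_values):
    # Allowed set: vary coordinate `coord` over coord_values while the
    # remaining coordinates stay at their current values.
    allowed = set()
    for v in coord_values:
        label = list(current_label)
        label[coord] = v
        allowed.add(tuple(label))
    # the current value is always allowed via {x_i} union A_i anyway
    allowed.discard(current_label)
    return allowed

cur = (1, 0, 2, 3)
print(coordinate_move(cur, coord=3, coord_values=range(5)))
# varies only the radial offset R
```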
• Theorem (dynamic): By using the LP-dual message passing methods for the inner loop we can provide a global bound on the optimality gap. E.g., for the method of Globerson et al. (2007) the current assignment is within ∆ = E(x) − Σs minxs Σc∈N(s) δcs(xs) of the optimal.
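Given the messages δcs from the inner-loop solver, the gap ∆ is a direct computation. The sketch below uses assumed data layouts (not the authors' code): `deltas[c][s]` maps each label of variable s to the message value, and `energy` evaluates E(x):

```python
# Sketch: compute Delta = E(x) - sum_s min_{x_s} sum_{c in N(s)} delta_cs(x_s).
# N(s) is recovered as the set of cliques that send a message to s.

def optimality_gap(x, energy, deltas, domains):
    dual = 0.0
    for s, dom in domains.items():
        # messages into variable s from its neighboring cliques
        incoming = [msgs[s] for c, msgs in deltas.items() if s in msgs]
        dual += min(sum(m[xs] for m in incoming) for xs in dom)
    return energy(x) - dual

# toy: one variable s=0 with dom {0, 1}, one clique 'c0'
deltas = {'c0': {0: {0: 1.0, 1: 3.0}}}
gap = optimality_gap({0: 0}, lambda x: 2.0, deltas, {0: [0, 1]})
print(gap)  # 2.0 - min(1.0, 3.0) = 1.0
```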
• Static and dynamic theoretical guarantees when the inner loop of
our method is exact.
• We can find the γ-expansion move with minimum energy by minimizing Eγ(x̂; x) = Σc θγc(x̂c), where

  θγc(x̂c) = θc(x̂c)  if ∀Xi ∈ Xc : x̂i ∈ Aγi ∪ {xi}
             ∞        otherwise
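The restricted potentials θγc can be materialized directly from this definition. A sketch with an assumed table layout (a clique potential as a dict from label tuples to energies; names are illustrative):

```python
from itertools import product
import math

# Sketch: build theta_c^gamma from theta_c by assigning infinity to
# configurations outside the allowed sets. `scope` lists the clique's
# variables in tuple order; `x` is the current assignment.

def restricted_potential(theta_c, scope, x, A):
    # allowed labels per variable: A_i union {x_i}
    allowed = {i: set(A.get(i, set())) | {x[i]} for i in scope}
    theta_gamma = {}
    for labels, value in theta_c.items():
        ok = all(l in allowed[i] for i, l in zip(scope, labels))
        theta_gamma[labels] = value if ok else math.inf
    return theta_gamma

theta = {(a, b): abs(a - b) for a, b in product(range(3), repeat=2)}
tg = restricted_potential(theta, scope=(0, 1), x={0: 1, 1: 2},
                          A={0: {0}, 1: set()})
print(tg[(0, 2)])  # allowed: 0 in A_0, 2 == x_1  -> 2
print(tg[(2, 2)])  # 2 not in A_0 union {x_0}     -> inf
```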
• Theorem (static): Let γ1, …, γK be a covering set of moves.* If x
is a local optimum relative to γ1, …, γK, then E(x) is within a
constant factor of the optimal energy.
• Message passing algorithms (e.g., max-product or its convex
variants) are the only general purpose approach (for problems
with arbitrary energies and non-homogeneous domains).
• For each i, let Aγi be a subset of the domain for variable Xi. Then a γ-expansion move maps each Xi from its current value xi to a value in {xi} ∪ Aγi. This generalizes other approaches:
Theoretical Guarantees
• Definition (covering set): A set of moves γ1, …, γK is covering if for every xi ∈ dom(Xi), there exists a γk such that xi ∈ Aγki.
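Whether a proposed family of moves satisfies this definition is easy to check mechanically. A minimal sketch (moves represented as dicts of allowed sets, as elsewhere on this poster; names are illustrative):

```python
# Sketch: a set of moves is covering if every label of every variable
# appears in the allowed set of at least one move.

def is_covering(moves, domains):
    return all(
        any(xi in move.get(i, set()) for move in moves)
        for i, dom in domains.items()
        for xi in dom
    )

domains = {0: [0, 1], 1: [0, 1]}
expansion_0 = {0: {0}, 1: {0}}   # alpha-expansion with alpha = 0
expansion_1 = {0: {1}, 1: {1}}   # alpha-expansion with alpha = 1
print(is_covering([expansion_0, expansion_1], domains))  # True
print(is_covering([expansion_0], domains))               # False
```

Note that the family of all α-expansions is covering, which is one way to see why α-expansion enjoys a guarantee of this form.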
• Very efficient methods exist for problems with certain structure
(e.g., α-expansion (Veksler, 1999) for submodular energies).
Running time (s):

Problem    (i)   (ii)  (iii)
40x43      39    1     <1
112x116    ∞     23    5

(i) simultaneous optimization of all coordinates
(ii) Alphabet SOUP over coordinate pairs
(iii) Alphabet SOUP over individual coordinates
∞ indicates that the problem could not fit into memory

[Figure: energy per pixel (0.50–0.56) on benchmark problems (a) and (b) for methods (i)–(iii).]
Rosetta Protein Design
• Task: Find most stable configuration of amino-acids that give
rise to a given 3D structure.
• Model: Protein design dataset from Yanover et al. (2007).
• γ-Expansion Moves: Subsets of 50 values (modulo domain size).
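A straightforward way to form such moves is to chunk each variable's (large, non-homogeneous) domain and cycle through the chunks. The sketch below is one plausible reading of this scheme (chunking details are assumptions, not the authors' code):

```python
# Sketch: partition each variable's domain into chunks of at most 50
# values; move k allows chunk (k mod number-of-chunks) for each
# variable, so variables with small domains recycle their chunks.

def subset_moves(domains, chunk=50):
    num_moves = max((len(d) + chunk - 1) // chunk for d in domains.values())
    moves = []
    for k in range(num_moves):
        move = {}
        for i, dom in domains.items():
            n_chunks = (len(dom) + chunk - 1) // chunk
            j = k % n_chunks  # modulo domain size, as in the poster
            move[i] = set(dom[j * chunk:(j + 1) * chunk])
        moves.append(move)
    return moves

domains = {0: list(range(120)), 1: list(range(40))}
moves = subset_moves(domains)
print(len(moves))                          # 3 moves: ceil(120 / 50)
print(len(moves[0][0]), len(moves[0][1]))  # 50 and 40
```

By construction every label appears in some move, so the family is covering.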
[Figure: Rosetta protein design results: running time (s) and quality of solution (% of problems vs. % error).]
Object Detection and Outlining
Image Completion and In-painting
• Task: Fill in missing part of image using image patches.
• Model: Pairwise MRF of Komodakis and Tziritas (2006).
• γ-Expansion Moves: Non-overlapping sets of 250 patches.
Patches for adjacent variables taken from adjacent locations in
the image. This provides low energy pairs.
start with arbitrary assignment x
repeat
  for k = 1, …, K
    find x̂ one γk-expansion move away from x
    if E(x̂) < E(x) then x ← x̂
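When the allowed sets are small, the inner minimization can be done exhaustively, which gives a compact reference implementation of the loop above. This is a toy sketch: the poster's inner loop uses message passing, and the exhaustive search here only scales to tiny move spaces.

```python
from itertools import product

# Sketch of the Alphabet SOUP outer loop with an exhaustive inner
# minimization over each gamma-expansion move space.

def energy(x, unary, pairwise, edges):
    e = sum(unary[i][x[i]] for i in unary)
    return e + sum(pairwise(x[i], x[j]) for i, j in edges)

def best_move(x, move, unary, pairwise, edges):
    # search {x_i} union A_i for every variable (exhaustive inner loop)
    variables = sorted(unary)
    choices = [sorted({x[i]} | move.get(i, set())) for i in variables]
    best = dict(x)
    for labels in product(*choices):
        cand = dict(zip(variables, labels))
        if energy(cand, unary, pairwise, edges) < energy(best, unary, pairwise, edges):
            best = cand
    return best

def alphabet_soup(x, moves, unary, pairwise, edges, sweeps=10):
    for _ in range(sweeps):
        improved = False
        for move in moves:                  # for k = 1, ..., K
            x_hat = best_move(x, move, unary, pairwise, edges)
            if energy(x_hat, unary, pairwise, edges) < energy(x, unary, pairwise, edges):
                x, improved = x_hat, True
        if not improved:
            break
    return x

# toy chain: 3 variables, labels {0, 1, 2}, Potts pairwise terms
unary = {0: [0.0, 2.0, 2.0], 1: [2.0, 0.0, 2.0], 2: [2.0, 2.0, 0.0]}
edges = [(0, 1), (1, 2)]
potts = lambda a, b: 0.0 if a == b else 1.0
moves = [{i: {a} for i in unary} for a in range(3)]  # all alpha-expansions
x = alphabet_soup({0: 2, 1: 2, 2: 2}, moves, unary, potts, edges)
print(x, energy(x, unary, potts, edges))  # finds labeling (0, 1, 2), energy 2.0
```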
• Many problems in computer vision can be modeled using
conditional Markov random fields.
minimize E(x) = Σc θc(xc)
subject to xi ∈ dom(Xi) ∀Xi ∈ X
Alphabet SOUP Algorithm
Overview
[Plot legend: Max. Product, GEMPLP (Globerson et al.), Alphabet SOUP.]
• Task: Detect object and localize using landmark based outline.
• Model: LOOPS (Heitz et al., 2008).
• γ-Expansion Moves: Overlapping candidate landmark locations,
sorted by unary potential (highest to lowest).
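Sorting candidates by unary potential and sliding an overlapping window over them is one way to realize these moves. A sketch (window and stride sizes are illustrative assumptions, not the authors' settings):

```python
# Sketch: for each landmark, rank candidate locations by unary
# potential (best first), then form overlapping windows of candidates;
# move k allows window k for every landmark.

def overlapping_moves(candidates, unary, window=4, stride=2):
    ranked = {i: sorted(cands, key=lambda loc: unary[i][loc])
              for i, cands in candidates.items()}
    n = max(len(r) for r in ranked.values())
    moves = []
    for start in range(0, n, stride):
        moves.append({i: set(r[start:start + window])
                      for i, r in ranked.items()})
    return moves

candidates = {0: list(range(8))}
unary = {0: {loc: float(loc) for loc in range(8)}}  # lower loc = better
moves = overlapping_moves(candidates, unary)
print([sorted(m[0]) for m in moves])
# [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7]]
```

The overlap (stride smaller than window) lets adjacent moves share candidates, so a landmark is never forced to commit to a window boundary.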
[Figure: object detection results: energy for Alphabet SOUP vs. energy for the LOOPS method.]
Conclusion
• Alphabet SOUP is a flexible method for finding approximate solutions to the
MAP inference problem for arbitrary energy functions. Our method is faster
than standard max-product belief propagation, requires significantly less
memory, and often produces lower energy solutions.
• The most interesting direction for further study is providing more formal foundations for the choice of subsets used for the different variables.