HY590.71 Topics in Discrete Optimization

Discrete Optimization in
Computer Vision
Nikos Komodakis
Ecole des Ponts ParisTech, LIGM
Traitement de l’information et vision artificielle
Message passing algorithms for
energy minimization
Message-passing algorithms

 Central concept: messages
 These methods work by propagating messages across the MRF graph
 Widely used algorithms in many areas
Message-passing algorithms

 But how do messages relate to optimizing the energy?
 Let's look at a simple example first: we will examine the case where the MRF graph is a chain
Message-passing on chains
(MRF graph; corresponding lattice or trellis)
Message-passing on chains

 Global minimum in linear time
 Optimization proceeds in two passes:
 Forward pass (dynamic programming)
 Backward pass
Message-passing on chains
(example on board)
(algebraic derivation of messages)
Message-passing on chains
(chain: p → q → r → s)
Forward pass (dynamic programming)
Potentials: θ_p(x_p), θ_pq(x_p, x_q)
Message from p to q, for each label j:
M_pq(j) = min_i [ θ_p(i) + θ_pq(i, j) ]
Forward pass (dynamic programming)
Example message (4 labels), computed from θ_p(x_p) and θ_pq(x_p, x_q):
M_pq = (2.5, 1, 0.1, 1.5)^T
M_pq(j) = min_i [ θ_p(i) + θ_pq(i, j) ]
Forward pass (dynamic programming)
M_qr = (0.5, 2, 1.2, 2.0)^T
M_qr(k) = min_j [ (θ_q(j) + M_pq(j)) + θ_qr(j, k) ]
Forward pass (dynamic programming)
M_rs = (1.0, 4.0, 2.0, 1.0)^T   (computed from M_qr = (0.5, 2, 1.2, 2.0)^T)
M_rs(k) = min_j [ (θ_r(j) + M_qr(j)) + θ_rs(j, k) ]
Min-marginal for node s and label j:
min_{x : x_s = j} E(x)
Backward pass
Backtrack from s to p, assigning the optimal label at each node:
x_s = argmin_j [ θ_s(j) + M_rs(j) ]
x_r = argmin_j [ θ_r(j) + M_qr(j) + θ_rs(j, x_s) ]
x_q = argmin_j [ θ_q(j) + M_pq(j) + θ_qr(j, x_r) ]
x_p = argmin_j [ θ_p(j) + θ_pq(j, x_q) ]
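The two passes above can be sketched in a few lines of code. This is a minimal illustration, not the lecture's own code: the potentials are random placeholders, and the brute-force check at the end confirms that the forward/backward passes recover a global minimum of the chain energy.

```python
import itertools
import numpy as np

# Min-sum message passing on a chain of 4 nodes (p, q, r, s), L labels each.
# Unary and pairwise potentials are random placeholders.
rng = np.random.default_rng(0)
n, L = 4, 4
unary = [rng.random(L) for _ in range(n)]            # theta_p, ..., theta_s
pair = [rng.random((L, L)) for _ in range(n - 1)]    # theta_pq, theta_qr, theta_rs

# Forward pass: M[e](j) = min_i [ (unary[e](i) + M[e-1](i)) + pair[e](i, j) ]
M = []
prev = np.zeros(L)
for e in range(n - 1):
    M.append((unary[e][:, None] + prev[:, None] + pair[e]).min(axis=0))
    prev = M[-1]

# Backward pass: assign the optimal label at each node, from s back to p.
x = [0] * n
x[-1] = int(np.argmin(unary[-1] + M[-1]))
for e in range(n - 2, -1, -1):
    msg = M[e - 1] if e > 0 else np.zeros(L)
    x[e] = int(np.argmin(unary[e] + msg + pair[e][:, x[e + 1]]))

def energy(c):
    return (sum(unary[t][c[t]] for t in range(n))
            + sum(pair[e][c[e], c[e + 1]] for e in range(n - 1)))

# Exhaustive check: the two passes find a global minimum.
best = min(energy(c) for c in itertools.product(range(L), repeat=n))
assert abs(energy(x) - best) < 1e-9
```

The forward pass costs O(L^2) per edge, so the whole chain is solved in time linear in its length.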
Message-passing on chains
 How can I compute min-marginals for any node in the chain?
 How to compute min-marginals for all nodes efficiently?
 What is the running time of message-passing on chains?
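One way to explore these questions is a sketch with placeholder potentials: send messages in both directions along the chain; the min-marginal at any node then combines its unary term with the forward and backward quantities, at O(n·L^2) total cost for all nodes.

```python
import itertools
import numpy as np

# Min-marginals for every node of a chain via forward and backward messages.
# Hypothetical chain of 4 nodes, 3 labels each; potentials are placeholders.
rng = np.random.default_rng(3)
n, L = 4, 3
unary = [rng.random(L) for _ in range(n)]
pair = [rng.random((L, L)) for _ in range(n - 1)]

# F[t](j): best cost of nodes 0..t with node t at label j (forward sweep).
F = [unary[0]]
for t in range(1, n):
    F.append(unary[t] + (F[-1][:, None] + pair[t - 1]).min(axis=0))

# B[t](j): best cost of nodes t..n-1 with node t at label j (backward sweep).
B = [unary[-1]]
for t in range(n - 2, -1, -1):
    B.insert(0, unary[t] + (pair[t] + B[0][None, :]).min(axis=1))

# Min-marginal of node t: F[t] + B[t] - unary[t] (the unary is counted twice).
for t in range(n):
    mm = F[t] + B[t] - unary[t]
    for j in range(L):
        brute = min(
            sum(unary[s][c[s]] for s in range(n))
            + sum(pair[e][c[e], c[e + 1]] for e in range(n - 1))
            for c in itertools.product(range(L), repeat=n) if c[t] == j)
        assert abs(mm[j] - brute) < 1e-9
```

Note that minimizing any node's min-marginal vector gives the same value: the global minimum energy.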
Message-passing on trees

 We can apply the same idea to tree-structured graphs
 Slight generalization from chains
 Resulting algorithm called: belief propagation
(also known under many other names: e.g., max-product, min-sum, etc.)
(for chains, it is also often called the Viterbi algorithm)
Belief propagation (BP)
BP on a tree [Pearl '88]
(leaves p, q; root r)
 Dynamic programming: global minimum in linear time
 BP:
 Inward pass (dynamic programming)
 Outward pass
 Gives min-marginals
Inward pass (dynamic programming)
Potentials: θ_p(x_p), θ_pq(x_p, x_q)
M_pq(j) = min_i [ θ_p(i) + θ_pq(i, j) ]
Inward pass (dynamic programming)
Example message, computed from θ_p(x_p) and θ_pq(x_p, x_q):
M_pq = (2.5, 1, 0.1, 1.5)^T
M_pq(j) = min_i [ θ_p(i) + θ_pq(i, j) ]
Inward pass (dynamic programming)
M_pq = (0.5, 2, 1.2, 2.0)^T
M_qr(k) = min_j [ (θ_q(j) + M_pq(j)) + θ_qr(j, k) ]
Inward pass (dynamic programming)
(messages sent from the leaves toward the root)
Outward pass
(messages sent from the root back toward the leaves)
BP on a tree: min-marginals
Min-marginal for node q and label j:
min_{x : x_q = j} E(x) = θ_q(j) + M_pq(j) + M_rq(j)
Belief propagation: message-passing on trees
min-marginals = ???
sum of all incoming messages + unary potential
What is the running time of message-passing for trees?
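The rule "min-marginal = unary potential plus the sum of all incoming messages" can be checked on the smallest tree, a 3-node chain p - q - r, where q receives one message from each side. The potentials below are placeholders:

```python
import numpy as np

# Tiny tree p - q - r, 3 labels per node, random placeholder potentials.
rng = np.random.default_rng(1)
L = 3
th_p, th_q, th_r = (rng.random(L) for _ in range(3))
th_pq = rng.random((L, L))   # indexed [label of p, label of q]
th_qr = rng.random((L, L))   # indexed [label of q, label of r]

# Incoming messages at q, one per neighbor (p and r are leaves).
M_pq = (th_p[:, None] + th_pq).min(axis=0)   # M_pq(j) = min_i th_p(i) + th_pq(i, j)
M_rq = (th_r[None, :] + th_qr).min(axis=1)   # M_rq(j) = min_k th_r(k) + th_qr(j, k)

# Min-marginal of q = unary + sum of all incoming messages.
mm_q = th_q + M_pq + M_rq

# Verify against brute force: min over all x with x_q fixed to j.
for j in range(L):
    brute = min(th_p[i] + th_q[j] + th_r[k] + th_pq[i, j] + th_qr[j, k]
                for i in range(L) for k in range(L))
    assert abs(mm_q[j] - brute) < 1e-9
```

Each message costs O(L^2), and a tree with n nodes has n - 1 edges with one message per direction, which answers the running-time question for trees as well as chains.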
Message-passing on chains

 Essentially, message passing on chains is dynamic programming
 Dynamic programming means reuse of computations
Generalizing belief propagation

 Key property: min(a+b, a+c) = a + min(b, c)
 BP can be generalized to any pair of operators satisfying the above (distributive) property
 E.g., instead of (min, +), we could have:
 (max, *): resulting algorithm called max-product. What does it compute?
 (+, *): resulting algorithm called sum-product. What does it compute?
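A sketch of this operator swap, with placeholder potentials: the same one-pass routine is run once with (min, +) on the energies and once with (+, *) on exp(-θ). The first yields the minimum energy; the second yields the partition function Z = Σ_x exp(-E(x)), which is what sum-product computes.

```python
import itertools
import numpy as np

# One forward pass over a 3-node chain, parameterized by the two operators.
rng = np.random.default_rng(2)
L = 3
th = [rng.random(L) for _ in range(3)]        # unary potentials (placeholders)
pw = [rng.random((L, L)) for _ in range(2)]   # pairwise potentials

def pass_chain(combine, aggregate, unary, pairwise):
    """Forward pass: 'combine' merges terms, 'aggregate' eliminates a label."""
    msg = None
    for u, p in zip(unary[:-1], pairwise):
        term = combine(u[:, None], p)
        if msg is not None:
            term = combine(msg[:, None], term)
        msg = aggregate(term, axis=0)
    return combine(msg, unary[-1])

# (min, +): minimum energy of the chain.
min_energy = pass_chain(np.add, np.min, th, pw).min()

# (+, *) on exp(-theta): partition function Z = sum_x exp(-E(x)).
Z = pass_chain(np.multiply, np.sum,
               [np.exp(-u) for u in th], [np.exp(-p) for p in pw]).sum()

# Brute-force check of both quantities.
energies = [sum(th[t][c[t]] for t in range(3))
            + sum(pw[e][c[e], c[e + 1]] for e in range(2))
            for c in itertools.product(range(L), repeat=3)]
assert abs(min_energy - min(energies)) < 1e-9
assert abs(Z - sum(np.exp(-np.array(energies)))) < 1e-9
```

Only the pair of operators changes; the message-passing schedule and the reuse of partial results are identical in both runs.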
Belief propagation as a distributive algorithm

 BP works distributively (as a result, it can be parallelized)
 Essentially BP is a decentralized algorithm
 Global results through local exchange of information
 Simple example to illustrate this: counting soldiers
Counting soldiers in a line
(From David MacKay's book "Information Theory, Inference, and Learning Algorithms")

 Can you think of a distributive algorithm for the commander to count the soldiers?
Counting soldiers in a line
Counting soldiers in a tree

Can we do the same for this case?
Counting soldiers
in a tree
Counting soldiers

 Simple example to illustrate BP
 The same idea can be used in cases that are seemingly more complex:
 counting paths through a point in a grid
 probability of passing through a node in the grid
 In general, we have used the same idea for minimizing MRFs (a much more general problem)
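The soldier-counting protocol on a tree is itself a tiny message-passing algorithm: each soldier reports to one neighbor the size of the subtree behind them (1 plus the reports received from all other neighbors). A sketch on a made-up 5-soldier tree:

```python
# Hypothetical 5-soldier tree: soldier 0 (the commander) is adjacent to 1
# and 2; soldier 1 is also adjacent to 3 and 4.
tree = {0: [1, 2], 1: [0, 3, 4], 2: [0], 3: [1], 4: [1]}

def count_from(node, parent):
    """Message 'node' sends toward 'parent': soldiers in node's subtree."""
    return 1 + sum(count_from(nb, node) for nb in tree[node] if nb != parent)

# The commander adds itself to the messages from its neighbors.
total = 1 + sum(count_from(nb, 0) for nb in tree[0])
print(total)  # -> 5
```

Each soldier uses only local information, yet the commander obtains the global count; on a graph with loops the same protocol would overcount, which is exactly the issue raised on the next slide.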
Graphs with loops

How about counting these soldiers?

Hmmm…overcounting?