X - M. Pawan Kumar

Discrete Optimization
Lecture 2 – Part I
M. Pawan Kumar
[email protected]
Slides available online http://cvn.ecp.fr/personnel/pawan
Recap
Label l1
Label l0
Va
Vb
Vc
da
db
dc
D : Observed data (image)
V : Unobserved variables
L : Discrete, finite label set
Labeling f : V
L
Recap
Label l1
Label l0
Va
Vb
Vc
da
db
dc
D : Observed data (image)
V : Unobserved variables
L : Discrete, finite label set
Labeling f : V
L
Va is assigned lf(a)
Recap
Label l1
2
0
1
Label l0
a;f(a)
5
2
1
0
0
4
Va
2
Vb
da
db
6
3
1
3
Vc
bc;f(b)f(c)
dc
Q(f; ) = ∑a a;f(a) + ∑(a,b) ab;f(a)f(b)
Recap
Label l1
2
0
1
Label l0
5
2
1
0
0
4
Va
2
Vb
da
db
6
3
1
3
Vc
dc
f* = argminf Q(f; )
Q(f; ) = ∑a a;f(a) + ∑(a,b) ab;f(a)f(b)
Recap
Label l1
2
0
1
Label l0
5
Va
1
0
0
4
2
Vb
2
6
3
1
3
Vc
f* = argminf Q(f; )
Q(f; ) = ∑a a;f(a) + ∑(a,b) ab;f(a)f(b)
Outline
• Convex Optimization
• Integer Programming Formulation
• Convex Relaxations
• Comparison
• Generalization of Results
Mathematical Optimization
min g0(x)
s.t. gi(x) ≤ 0
hi(x) = 0
x is a feasible point
Objective function
Inequality constraints
Equality constraints
gi(x) ≤ 0, hi(x) = 0
x is a strictly feasible point
gi(x) < 0, hi(x) = 0
Feasible region - set of all feasible points
Convex Optimization
min g0(x)
s.t. gi(x) ≤ 0
hi(x) = 0
Objective function
Inequality constraints
Equality constraints
Feasible region is convex
Objective function is convex
Convex set? Convex function?
Convex Set
Line Segment
x1
x2
c x1 + (1 - c) x2
c  [0,1]
Endpoints
Convex Set
x1
x2
All points on the line segment lie within the set
For all line segments with endpoints in the set
Non-Convex Set
x1
x2
Examples of Convex Sets
x1
x2
Line Segment
Examples of Convex Sets
x1
x2
Line
Examples of Convex Sets
Hyperplane aTx - b = 0
Examples of Convex Sets
Halfspace aTx - b ≤ 0
Examples of Convex Sets
t
x2
x1
Second-order Cone ||x|| ≤ t
Examples of Convex Sets
Semidefinite Cone {X | X
0}
aTXa ≥ 0, for all a  Rn
All eigenvalues of X are non-negative
aTX1a
≥0
aTX2a ≥ 0
aT(cX1 + (1-c)X2)a ≥ 0
Operations that Preserve Convexity
Intersection
Polyhedron / Polytope
Operations that Preserve Convexity
Intersection
Operations that Preserve Convexity
Affine Transformation x
Ax + b
Convex Function
g(x)
x1
x2
x
Blue point always lies above red point
Convex Function
g(x)
x1
x2
x
g( c x1 + (1 - c) x2 ) ≤ c g(x1) + (1 - c) g(x2)
Domain of g(.) has to be convex
Convex Function
g(x)
x1
x2
x
g( c x1 + (1 - c) x2 ) ≤ c g(x1) + (1 - c) g(x2)
-g(.) is concave
Convex Function
Once-differentiable functions
g(y) + g(y)T (x - y) ≤ g(x)
g(x)
(y,g(y))
g(y) + g(y)T (x - y)
x
Twice-differentiable functions
2g(x)
0
Convex Function and Convex Sets
g(x)
x
Epigraph of a convex function is a convex set
Examples of Convex Functions
Linear function aTx
p-Norm functions (x1p + x2p + xnp)1/p, p ≥ 1
Quadratic functions xT Q x
Q
0
Operations that Preserve Convexity
Non-negative weighted sum
g1(x)
g2(x)
+ ….
+ w2
w1
x
x
xT Q x + aTx + b
Q
0
Operations that Preserve Convexity
Pointwise maximum
g1(x)
g2(x)
,
max
x
x
Pointwise minimum of concave
functions is concave
Convex Optimization
min g0(x)
s.t. gi(x) ≤ 0
hi(x) = 0
Objective function
Inequality constraints
Equality constraints
Feasible region is convex
Objective function is convex
Linear Programming
min g0(x)
s.t. gi(x) ≤ 0
hi(x) = 0
min g0Tx
Objective function
Inequality constraints
Equality constraints
Linear function
s.t. giTx ≤ 0
Linear constraints
hiTx = 0
Linear constraints
Quadratic Programming
min g0(x)
s.t. gi(x) ≤ 0
hi(x) = 0
Objective function
Inequality constraints
Equality constraints
min xTQx + aTx + b Quadratic function
s.t. giTx ≤ 0
Linear constraints
hiTx = 0
Linear constraints
Second-Order Cone Programming
min g0(x)
Objective function
s.t. gi(x) ≤ 0
Inequality constraints
hi(x) = 0
Equality constraints
min g0Tx
s.t.
xTQix
Linear function
+
hiTx = 0
aiTx
+ bi ≤ 0
Quadratic
constraints
Linear constraints
Semidefinite Programming
min g0(x)
s.t. gi(x) ≤ 0
hi(x) = 0
min Q  X
s.t. X
0
Ai  X = 0
Objective function
Inequality constraints
Equality constraints
Linear function
Semidefinite constraints
Linear constraints
Outline
• Convex Optimization
• Integer Programming Formulation
• Convex Relaxations
• Comparison
• Generalization of Results
Integer Programming Formulation
Unary Cost
2
Label ‘1’
0
1
Label ‘0’
5
Unary Cost Vector u = [ 5
3
0
V1
Labeling = {1 , 0}
2
; 2
Cost of
Cost
V1 of
= 0V1 = 1
4
4 ]
2
V2
Integer Programming Formulation
Unary Cost
2
Label ‘1’
0
1
Label ‘0’
5
4
3
0
V1
Labeling = {1 , 0}
2
V2
4 ]T
Unary Cost Vector u = [ 5
2
; 2
Label vector x = [ -1
1
; 1 -1 ]T
Recall that the aimVis1 toV0find
1 = 1the optimal x
Integer Programming Formulation
Unary Cost
2
Label ‘1’
0
1
Label ‘0’
5
4
3
0
V1
Labeling = {1 , 0}
4 ]T
Unary Cost Vector u = [ 5
2
; 2
Label vector x = [ -1
1
; 1 -1 ]T
Sum of Unary Costs =
1 ∑ u (1 + x )
i i
i
2
2
V2
Integer Programming Formulation
Pairwise Cost
Label ‘1’
2
0
1
Label ‘0’
Labeling = {1 , 0}
5
V1
4
3
0
2
V2
Pairwise Cost Matrix P
0
0
3
Cost of V1 = 0 and V1 = 0
0 0
1
0
Cost of V1 = 0 and V2 = 0
0 1
0 0
Cost of V1 = 0 and V2 = 1
3 0
0 0
0
Integer Programming Formulation
Pairwise Cost
Label ‘1’
2
0
1
Label ‘0’
Labeling = {1 , 0}
Pairwise Cost Matrix P
0
0
0
3
0 0
1
0
0 1
0 0
3 0
0 0
5
V1
4
3
0
2
V2
Sum of Pairwise Costs
1 ∑ P (1 + x )(1+x )
ij ij
i
j
4
Integer Programming Formulation
Pairwise Cost
Label ‘1’
2
0
1
Label ‘0’
Labeling = {1 , 0}
Pairwise Cost Matrix P
0
0
0
3
0 0
1
0
0 1
0 0
3 0
0 0
5
4
3
0
V1
2
V2
Sum of Pairwise Costs
1 ∑ P (1 + x +x + x x )
ij ij
i
j
i j
4
= 1 ∑ij Pij (1 + xi + xj + Xij)
4
X = x xT
Xij = xi xj
Integer Programming Formulation
Constraints
• Integer Constraints
xi {-1,1}
X = x xT
• Uniqueness Constraint
∑ xi = 2 - |L|
i  Va
Integer Programming Formulation
x* = argmin
1 ∑ u (1 + x ) + 1 ∑ P (1 + x + x + X )
ij
i
j
ij
i
i
4
2
∑ xi = 2 - |L|
i  Va
Convex
xi {-1,1}
X = x xT
Non-Convex
Outline
• Convex Optimization
• Integer Programming Formulation
• Convex Relaxations
– Linear Programming (LP-S)
– Semidefinite Programming (SDP-L)
– Second Order Cone Programming (SOCP-MS)
• Comparison
• Generalization of Results
LP-S
Schlesinger, 1976
Retain Convex Part
x* = argmin
1 ∑ u (1 + x ) + 1 ∑ P (1 + x + x + X )
ij
i
j
ij
i
i
4
2
∑ xi = 2 - |L|
i  Va
xi {-1,1}
X = x xT
Relax Non-Convex
Constraint
LP-S
Schlesinger, 1976
Retain Convex Part
x* = argmin
1 ∑ u (1 + x ) + 1 ∑ P (1 + x + x + X )
ij
i
j
ij
i
i
4
2
∑ xi = 2 - |L|
i  Va
xi [-1,1]
X = x xT
Relax Non-Convex
Constraint
LP-S
Schlesinger, 1976
X = x xT
Xij [-1,1]
1 + xi + xj + Xij ≥ 0
∑ Xij = (2 - |L|) xi
j  Vb
LP-S
Schlesinger, 1976
Retain Convex Part
x* = argmin
1 ∑ u (1 + x ) + 1 ∑ P (1 + x + x + X )
ij
i
j
ij
i
i
4
2
∑ xi = 2 - |L|
i  Va
xi [-1,1]
X = x xT
Relax Non-Convex
Constraint
LP-S
Schlesinger, 1976
Retain Convex Part
x* = argmin
1 ∑ u (1 + x ) + 1 ∑ P (1 + x + x + X )
ij
i
j
ij
i
i
4
2
∑ xi = 2 - |L|
i  Va
xi [-1,1],
Xij [-1,1]
1 + xi + xj + Xij ≥ 0
∑ Xij = (2 - |L|) xi
j  Vb
LP-S
Outline
• Convex Optimization
• Integer Programming Formulation
• Convex Relaxations
– Linear Programming (LP-S)
– Semidefinite Programming (SDP-L)
– Second Order Cone Programming (SOCP-MS)
• Comparison
• Generalization of Results
SDP-L
Lasserre, 2000
Retain Convex Part
x* = argmin
1 ∑ u (1 + x ) + 1 ∑ P (1 + x + x + X )
ij
i
j
ij
i
i
4
2
∑ xi = 2 - |L|
i  Va
xi {-1,1}
X = x xT
Relax Non-Convex
Constraint
SDP-L
Lasserre, 2000
Retain Convex Part
x* = argmin
1 ∑ u (1 + x ) + 1 ∑ P (1 + x + x + X )
ij
i
j
ij
i
i
4
2
∑ xi = 2 - |L|
i  Va
xi [-1,1]
X = x xT
Relax Non-Convex
Constraint
SDP-L
1
x1
x2
1 x1 x2
...
xn
..
.
=
xn
1
xT
x
X
Xii = 1
Convex
Non-Convex
Positive Semidefinite
Rank = 1
SDP-L
1
x1
x2
1 x1 x2
...
xn
..
.
=
xn
1
xT
x
X
Xii = 1
Convex
Positive Semidefinite
Schur’s Complement
=
I
A
B
BT
C
0
A
BTA-1 I
0
A
0
0
C - BTA-1B
C -BTA-1B
0
I
A-1B
0
I
0
SDP-L
=
1
xT
x
X
1
0
1
x
I
0
0
X - xxT
Schur’s Complement
X - xxT
0
I
xT
0
1
SDP-L
Lasserre, 2000
Retain Convex Part
x* = argmin
1 ∑ u (1 + x ) + 1 ∑ P (1 + x + x + X )
ij
i
j
ij
i
i
4
2
∑ xi = 2 - |L|
i  Va
xi [-1,1]
X = x xT
Relax Non-Convex
Constraint
SDP-L
Lasserre, 2000
Retain Convex Part
x* = argmin
1 ∑ u (1 + x ) + 1 ∑ P (1 + x + x + X )
ij
i
j
ij
i
i
4
2
∑ xi = 2 - |L|
i  Va
xi [-1,1]
Xii = 1
Accurate
X - xxT
SDP-L
0
Inefficient
Outline
• Convex Optimization
• Integer Programming Formulation
• Convex Relaxations
– Linear Programming (LP-S)
– Semidefinite Programming (SDP-L)
– Second Order Cone Programming (SOCP-MS)
• Comparison
• Generalization of Results
SOCP Relaxation
Derive SOCP relaxation from the SDP relaxation
x* = argmin
1 ∑ u (1 + x ) + 1 ∑ P (1 + x + x + X )
ij
i
j
ij
i
i
4
2
∑ xi = 2 - |L|
i  Va
xi [-1,1]
Xii = 1
X - xxT
0
Further Relaxation
1-D Example
X - xxT
0
For two semidefinite matrices,
Frobenius inner product is non-negative
A
X - x2 ≥
0
x2  X = 1
SOC of the form || v ||2  st
A0
2-D Example
X
=
xxT =
X11
X12
X21
X22
x1x1
x1x2
x2x1
x2x2
=
=
1
X12
X12
1
x12
x1x2
x1x2
x22
2-D Example
C1 (X - xxT)  0
1
0
0
0
C1
1 - x12

X12-x1x2
x1  1
-1  x1  1
2
X12-x1x2
1 - x22
0
0
2-D Example
C2 (X - xxT)  0
0
0
0
1
C2
1 - x12

X12-x1x2
x2  1
-1  x2  1
2
X12-x1x2
1 - x22
0
0
2-D Example
C3 (X - xxT)  0
1
1
1
1
C3
1 - x12

X12-x1x2
X12-x1x2
1 - x22
(x1 + x2  2 + 2X12
2
)
SOC of the form || v ||2  st
0
0
2-D Example
C4 (X - xxT)  0
1
-1
-1
1
C4
1 - x12

X12-x1x2
X12-x1x2
1 - x22
(x1 - x2  2 - 2X12
2
)
SOC of the form || v ||2  st
0
0
SOCP Relaxation
Kim and Kojima, 2000
Consider a matrix C1 = UUT
0
C1  (X - xxT)  0
||UTx ||2  X  C1
SOC of the form || v ||2  st
Continue for C2, C3, … , Cn
SOCP Relaxation
How many constraints for SOCP = SDP ?
Infinite.
For all C
0
Specify constraints similar to the 2-D example
xi
Xij
xj
(xi + xj)2  2 + 2Xij
(xi + xj)2  2 - 2Xij
SOCP-MS
Muramatsu and Suzuki, 2003
x* = argmin
1 ∑ u (1 + x ) + 1 ∑ P (1 + x + x + X )
ij
i
j
ij
i
i
4
2
∑ xi = 2 - |L|
i  Va
xi [-1,1]
Xii = 1
X - xxT
0
SOCP-MS
Muramatsu and Suzuki, 2003
x* = argmin
1 ∑ u (1 + x ) + 1 ∑ P (1 + x + x + X )
ij
i
j
ij
i
i
4
2
∑ xi = 2 - |L|
i  Va
xi [-1,1]
(xi + xj)2  2 + 2Xij
(xi - xj)2  2 - 2Xij
Specified only when Pij  0
Outline
• Convex Optimization
• Integer Programming Formulation
• Convex Relaxations
• Comparison
• Generalization of Results
Kumar, Kolmogorov and Torr, JMLR 2010
Dominating Relaxation
≥
A
B
For all MAP Estimation problem (u, P)
A
dominates
B
Dominating relaxations are better
Equivalent Relaxations
=
A
B
For all MAP Estimation problem (u, P)
A
dominates
B
B
dominates
A
Strictly Dominating Relaxation
>
A
B
For at least one MAP Estimation problem (u, P)
A
B
dominates
B
does not dominate
A
SOCP-MS
Muramatsu and Suzuki, 2003
(xi + xj)2  2 + 2Xij
• Pij ≥ 0
• Pij < 0
Xij =
(xi - xj)2  2 - 2Xij
(xi + xj)2
Xij = 1 -
2
-1
(xi - xj)2
2
SOCP-MS is a QP
Same as QP by Ravikumar and Lafferty, 2005
SOCP-MS ≡ QP-RL
LP-S vs. SOCP-MS
Differ in the way they relax X = xxT
Xij [-1,1]
1 + xi + xj + Xij ≥ 0
∑ Xij = (2 - |L|) xi
j  Vb
LP-S
(xi + xj)2  2 + 2Xij
(xi - xj)2  2 - 2Xij
F(LP-S)
SOCP-MS
F(SOCP-MS)
LP-S vs. SOCP-MS
• LP-S strictly dominates SOCP-MS
• LP-S strictly dominates QP-RL
• Where have we gone wrong?
• A Quick Recap !
Recap of SOCP-MS
Xij
xi
C=
xj
1
1
1
1
(xi + xj)2  2 + 2Xij
Recap of SOCP-MS
Xij
xi
C=
xj
1
-1
-1
1
(xi - xj)2  2 - 2Xij
Can we use different C matrices ??
Can we use a different subgraph ??
Outline
• Convex Optimization
• Integer Programming Formulation
• Convex Relaxations
• Comparison
• Generalization of Results
– SOCP Relaxations on Trees
– SOCP Relaxations on Cycles
Kumar, Kolmogorov and Torr, JMLR 2010
SOCP Relaxations on Trees
Choose any arbitrary tree
SOCP Relaxations on Trees
Choose any arbitrary C
0
Repeat over trees to get relaxation SOCP-T
LP-S strictly dominates SOCP-T
LP-S strictly dominates QP-T
Outline
• Convex Optimization
• Integer Programming Formulation
• Convex Relaxations
• Comparison
• Generalization of Results
– SOCP Relaxations on Trees
– SOCP Relaxations on Cycles
Kumar, Kolmogorov and Torr, JMLR 2010
SOCP Relaxations on Cycles
Choose an arbitrary even cycle
Pij ≥ 0
OR
Pij ≤ 0
SOCP Relaxations on Cycles
Choose any arbitrary C
0
Repeat over even cycles to get relaxation SOCP-E
LP-S strictly dominates SOCP-E
LP-S strictly dominates QP-E
SOCP Relaxations on Cycles
• True for odd cycles with Pij ≤ 0
• True for odd cycles with Pij ≤ 0 for only one edge
• True for odd cycles with Pij ≥ 0 for only one edge
• True for all combinations of above cases
The SOCP-C Relaxation
Include all LP-S constraints
0
1
0
0
0
0
1
1
0
0
0
0
0
0
0
1
a
b
b
c
Submodular
Non-submodular
True SOCP
0
0
0
1
1
0
0
0
c
a
Submodular
The SOCP-C Relaxation
Include all LP-S constraints
0
1
0
0
0
0
1
1
0
0
0
0
0
0
0
1
a
b
b
c
Frustrated Cycle
True SOCP
0
0
0
1
1
0
0
0
c
a
The SOCP-C Relaxation
Include all LP-S constraints
0
1
0
0
0
0
1
1
0
0
0
0
0
0
0
1
a
b
b
c
1
0
-1
0
0
-1
1
0
1
a
0
LP-S Solution
-1
b
0
1
0
-1
1
0
-1
b
0
True SOCP
0
0
0
1
1
0
0
0
c
a
c
Objective Function = 0
0
0
-1
0
1
c
a
The SOCP-C Relaxation
Include all LP-S constraints
0
1
0
0
0
0
1
1
0
0
0
0
0
0
0
1
a
b
b
c
1
0
-1
0
0
-1
1
0
1
a
0
LP-S Solution
-1
b
0
1
0
-1
1
0
-1
b
0
True SOCP
0
0
0
1
1
0
0
0
c
a
c
0
0
-1
0
1
c
Define an SOC Constraint using C = 1
a
The SOCP-C Relaxation
Include all LP-S constraints
0
1
0
0
0
0
1
1
0
0
0
0
0
0
0
1
a
b
b
c
1
0
-1
0
0
-1
1
0
1
a
0
LP-S Solution
-1
b
0
1
0
-1
1
0
-1
b
0
True SOCP
0
0
0
1
1
0
0
0
c
a
c
0
0
-1
0
1
c
(xi + xj + xk)2  3 + 2 (Xij + Xjk + Xki)
a
The SOCP-C Relaxation
Include all LP-S constraints
0
1
0
0
0
0
1
1
0
0
0
0
0
0
0
1
a
b
b
c
True SOCP
0
0
0
1
1
0
0
0
c
a
SOCP-C Solution
0.8
-0.6
-0.8
-0.6
0.3
-0.8
0.6
0.6
-0.3
-0.6
b
-0.4
0
-0.3
b
0
0.3
0.6
0.8
a
0
0.4
-0.4
0.6
0
c
-0.6
0.4
c
Objective Function = 0.75
SOCP-C strictly dominates LP-S
a
The SOCP-Q Relaxation
Include all cycle inequalities
a
True SOCP
b
Clique of size n
c
d
Define an SOCP Constraint using C = 1
(Σ xi)2 ≤ n + (Σ Xij)
SOCP-Q strictly dominates LP-S
SOCP-Q strictly dominates SOCP-C
4-Neighbourhood MRF
Test SOCP-C
50 binary MRFs of size 30x30
u ≈ N (0,1)
P ≈ N (0,σ2)
4-Neighbourhood MRF
σ = 2.5
8-Neighbourhood MRF
Test SOCP-Q
50 binary MRFs of size 30x30
u ≈ N (0,1)
P ≈ N (0,σ2)
8-Neighbourhood MRF
σ = 1.125
Conclusions
• Large class of SOCP/QP dominated by LP-S
• New SOCP relaxations dominate LP-S
• But better LP relaxations exist