Frank-Wolfe Algorithms for Saddle Point Problems

Gauthier Gidel¹, Tony Jebara², Simon Lacoste-Julien³

¹ INRIA Paris, Sierra Team
² Department of CS, Columbia University
³ Department of CS & OR (DIRO), Université de Montréal

10th December 2016
Overview

- The Frank-Wolfe algorithm (FW) has gained popularity over the last couple of years.
- Main advantage: FW only needs a linear minimization oracle (LMO).
- We extend FW's properties to solve saddle point problems.
- The extension is straightforward, but the analysis is non-trivial.
Saddle point and link with variational inequalities

Let L : X × Y → R, where X and Y are convex and compact.

Saddle point problem: solve min_{x∈X} max_{y∈Y} L(x, y).
A solution (x*, y*) is called a saddle point.

- Necessary stationarity conditions: for all (x, y) ∈ X × Y,
    ⟨x − x*, ∇_x L(x*, y*)⟩ ≥ 0
    ⟨y − y*, −∇_y L(x*, y*)⟩ ≥ 0
- Variational inequality: ⟨z − z*, g(z*)⟩ ≥ 0 for all z ∈ X × Y,
  where z* = (x*, y*) and g(z) = (∇_x L(z), −∇_y L(z)).
- Sufficient condition: a stationary point is a global solution if L is
  convex-concave, i.e. for all (x, y) ∈ X × Y, x′ ↦ L(x′, y) is convex
  and y′ ↦ L(x, y′) is concave.
Motivations: games and robust learning

- Zero-sum games with two players:
    min_{x∈∆(I)} max_{y∈∆(J)} xᵀ M y
- Generative Adversarial Networks (GANs).
- Robust learning:¹ we want to learn
    min_{θ∈Θ} (1/n) ∑_{i=1}^n ℓ(f_θ(x_i), y_i) + λ Ω(θ)
  with uncertainty regarding the data:
    min_{θ∈Θ} max_{w∈∆_n} ∑_{i=1}^n w_i ℓ(f_θ(x_i), y_i) + λ Ω(θ)
  Minimizing the worst case gives robustness.

¹ J. Wen, C. Yu, and R. Greiner. "Robust Learning under Uncertain Test
Distributions: Relating Covariate Shift to Model Misspecification". In:
ICML. 2014.
Problem with hard projection

The structured SVM:

    min_{ω∈R^d} λ Ω(ω) + (1/n) ∑_{i=1}^n max_{y∈Y_i} (L_i(y) − ⟨ω, φ_i(y)⟩)

where the second term is the structured hinge loss.

Regularization: penalized → constrained.

    min_{Ω(ω)≤β} max_{α∈∆(|Y|)} bᵀα − ωᵀMα

Hard to project when:
- Ω is a structured sparsity norm (e.g. a group-lasso norm).
- The output space Y is structured, hence of exponential size.
Standard approaches in literature

The simplest algorithm to solve saddle point problems is the projected
gradient algorithm:

    x^(t+1) = P_X(x^(t) − η ∇_x L(x^(t), y^(t)))
    y^(t+1) = P_Y(y^(t) + η ∇_y L(x^(t), y^(t)))

For non-smooth optimization, the averaged iterates converge:

    (1/T) ∑_{t=1}^T (x^(t), y^(t)) → (x*, y*) as T → ∞

Faster algorithm: the projected extra-gradient algorithm.
One can use an LMO to compute approximate projections.²

² N. He and Z. Harchaoui. "Semi-proximal Mirror-Prox for Nonsmooth
Composite Minimization". In: NIPS. 2015.
The FW algorithm

Algorithm: Frank-Wolfe
 1: Let x^(0) ∈ X
 2: for t = 0 … T do
 3:   Compute r^(t) := ∇f(x^(t))
 4:   Compute s^(t) ∈ argmin_{s∈X} ⟨s, r^(t)⟩
 5:   Compute g_t := ⟨x^(t) − s^(t), r^(t)⟩
 6:   if g_t ≤ ε then return x^(t)
 7:   Let γ = 2/(2+t) (or do line-search)
 8:   Update x^(t+1) := (1−γ) x^(t) + γ s^(t)
 9: end for

[Figure: f is linearized at the current iterate; the LMO minimizes the
linear approximation ⟨s, ∇f(x)⟩ over the polytope M.]
SP-FW

Algorithm: Saddle point FW
 1: Let z^(0) = (x^(0), y^(0)) ∈ X × Y
 2: for t = 0 … T do
 3:   Compute r^(t) := (∇_x L(x^(t), y^(t)), −∇_y L(x^(t), y^(t)))
 4:   Compute s^(t) ∈ argmin_{z∈X×Y} ⟨z, r^(t)⟩
 5:   Compute g_t := ⟨z^(t) − s^(t), r^(t)⟩
 6:   if g_t ≤ ε then return z^(t)
 7:   Let γ = min(1, (ν/C) g_t) or γ = 2/(2+t)
 8:   Update z^(t+1) := (1−γ) z^(t) + γ s^(t)
 9: end for

- One can define an FW extension with away steps.
- γ_t = 1/(1+t) ⇒ z^(t) = (1/(1+t)) ∑_{i=0}^t s^(i).
- (γ_t = 1/(1+t)) + bilinear objective ↔ the fictitious play algorithm.
Advantages of SP-FW

Same main property as FW: only an LMO (linear minimization oracle) is needed.

Same other advantages as FW:
- Convergence certificate g_t for free.
- Affine invariance of the algorithm.
- Sparsity of the iterates.
- Universal step size γ_t := 2/(2+t); adaptive step size γ_t := (ν/C) g_t.

Main difference with FW:
- No line-search.

When the constraint set is a "complicated" structured polytope, projections
can be hard whereas the LMO may remain tractable.
Theoretical contribution

SP extension of FW with away steps (SP-AFW). Convergence:
- Linear rate with the adaptive step size.
- Sublinear rate with the universal step size.

Similar hypotheses as AFW for linear convergence:
1. Strong convexity and smoothness of the function.
2. X and Y are polytopes.

Additional assumption on the bilinear coupling:

    L(x, y) = f(x) + xᵀ M y − g(y),

with ‖M‖ smaller than the strong convexity constant.

- The proof uses recent advances on AFW.
- Partially answers a 30-year-old conjecture.³

³ J. Hammond. "Solving asymmetric variational inequality problems and
systems of equations with generalized nonlinear programming algorithms".
PhD thesis. MIT, 1984.
Difficulties for saddle point

Usual descent lemma:

    h_{t+1} ≤ h_t − γ_t g_t + γ_t² L‖d^(t)‖²/2,   with g_t ≥ 0.

With γ_t small enough, the sequence decreases.

For saddle point problems, the Lipschitz gradient property only gives

    L_{t+1} − L* ≤ L_t − L* − γ_t (g_t^(x) − g_t^(y)) + γ_t² L‖d^(t)‖²/2,

where g_t^(x) − g_t^(y) has arbitrary sign.

- Cannot control the oscillations of the sequence.
- Must introduce other quantities to establish convergence.
Toy experiments

Figure: SP-AFW on a toy example (d = 30) with the theoretical step size
γ_t = (ν/C) g_t.

Figure: SP-AFW on a toy example (d = 30) with the heuristic step size
γ_t = g_t / (C + 2‖M‖²D²/μ), where C = 2LD².
Conclusion

- SP-FW is one of the first saddle point solvers that works with only an LMO.
- The FW resurgence has led to new structured problems.
- Same hope for SP-FW as for FW.
- Call for applications!
- Still many open theoretical questions.
- With a bilinear objective, the algorithm is highly related to the
  fictitious play algorithm.
- Rich interplay tapping into the game theory literature.
Thank You !
Slides available on www.di.ens.fr/~gidel.