Convex Functions

– Convex Optimization –
3. Convex Functions
Goele Pipeleers
KU Leuven, Feb. 9-13, 2015
Overview
• definition
• properties of convex functions
– Jensen’s inequality
– restriction to line
– first-order condition
– second-order condition
– epigraph
• operations that preserve convexity
→ calculus of convex functions
• quasiconvex functions
• convexity with respect to generalized inequalities
Definition
function f : R^n → R is convex if dom f is convex and
f(θx_1 + (1 − θ)x_2) ≤ θf(x_1) + (1 − θ)f(x_2)
for all x_1, x_2 ∈ dom f, 0 ≤ θ ≤ 1
[Figure: graph of f(x) with the points x_1, x_2 and the values f(x_1), f(x_2) marked]
function f is concave if −f is convex
Examples on R
convex:
• affine: ax + b on R, for any a, b ∈ R
• exponential: e^{ax}, for any a ∈ R
• powers: x^p on R_{++}, for p ≥ 1 or p ≤ 0
• powers of absolute value: |x|^p on R, for p ≥ 1
• negative entropy: x log x on R_{++}
concave:
• affine: ax + b on R, for any a, b ∈ R
• powers: x^p on R_{++}, for 0 ≤ p ≤ 1
• logarithm: log x on R_{++}
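The defining inequality can be spot-checked numerically for the scalar examples above. The following is a minimal sketch, assuming NumPy; the helper name `violates_convexity`, the sampling intervals, and the tolerance are illustrative choices, not part of the lecture. A sampled check can only find counterexamples, it cannot prove convexity.

```python
import numpy as np

def violates_convexity(f, lo, hi, trials=10_000, seed=0, tol=1e-9):
    """Return True if some sampled (x1, x2, theta) breaks
    f(theta*x1 + (1-theta)*x2) <= theta*f(x1) + (1-theta)*f(x2)."""
    rng = np.random.default_rng(seed)
    x1 = rng.uniform(lo, hi, trials)
    x2 = rng.uniform(lo, hi, trials)
    theta = rng.uniform(0.0, 1.0, trials)
    lhs = f(theta * x1 + (1 - theta) * x2)
    rhs = theta * f(x1) + (1 - theta) * f(x2)
    return bool(np.any(lhs > rhs + tol))

# Convex examples from the list: no violation is expected.
print(violates_convexity(lambda x: np.exp(-2.0 * x), -3.0, 3.0))   # e^{ax} with a = -2
print(violates_convexity(lambda x: x * np.log(x), 0.01, 10.0))     # negative entropy
print(violates_convexity(lambda x: np.abs(x) ** 1.5, -5.0, 5.0))   # |x|^p with p = 1.5

# log x is concave: -log x passes the convexity test, log x itself fails it.
print(violates_convexity(lambda x: -np.log(x), 0.01, 10.0))
print(violates_convexity(np.log, 0.01, 10.0))
```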
Examples on R^n and R^{m×n}
affine functions are convex and concave; all norms are convex
examples on R^n
• affine function: f(x) = a^T x + b
• norms: ‖x‖_p = (\sum_{i=1}^{n} |x_i|^p)^{1/p} for p ≥ 1; ‖x‖_∞ = max_i |x_i|
examples on R^{m×n}
• affine function: f(X) = tr(A^T X) + b = \sum_{i=1}^{m} \sum_{j=1}^{n} A_{ij} X_{ij} + b
• spectral norm: f(X) = ‖X‖_2 = σ_max(X) = (λ_max(X^T X))^{1/2}
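The norm formulas above can be compared against library routines. This is a small sketch, assuming NumPy; the test matrix, test vector, and the helper `p_norm` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))   # arbitrary test matrix
x = rng.standard_normal(5)        # arbitrary test vector

def p_norm(x, p):
    """(sum_i |x_i|^p)^(1/p), intended for p >= 1."""
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

sigma_max = np.linalg.svd(X, compute_uv=False)[0]   # largest singular value
lam_max = np.linalg.eigvalsh(X.T @ X)[-1]           # largest eigenvalue of X^T X

print(np.isclose(p_norm(x, 3), np.linalg.norm(x, 3)))   # p-norm formula
print(np.isclose(np.linalg.norm(X, 2), sigma_max))      # spectral norm = sigma_max
print(np.isclose(sigma_max, np.sqrt(lam_max)))          # sigma_max = sqrt(lambda_max(X^T X))
```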
Extended-value extension
extended-value extension f̃ of f is
f̃(x) = f(x), for x ∈ dom f;
f̃(x) = ∞, for x ∉ dom f
often simplifies notation: f is convex if
f̃(θx_1 + (1 − θ)x_2) ≤ θf̃(x_1) + (1 − θ)f̃(x_2)
for all x_1, x_2 ∈ R^n, 0 ≤ θ ≤ 1
Jensen’s inequality
basic inequality: if f is convex, then
f(θ_1 x_1 + θ_2 x_2) ≤ θ_1 f(x_1) + θ_2 f(x_2)
for all x_1, x_2 ∈ dom f, θ_1, θ_2 ∈ R_+, θ_1 + θ_2 = 1
extension to arbitrary convex combinations: if f is convex, then
f(E z) ≤ E f(z)
for any random variable z that takes values in dom f (provided the expectations exist)
basic inequality is special case with discrete distribution
prob(z = x_1) = θ_1, prob(z = x_2) = θ_2
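Jensen's inequality can be illustrated with samples: the empirical distribution of a sample is itself a discrete distribution, so the inequality holds for the sample averages. A minimal sketch, assuming NumPy; the choice f = exp and the normal samples are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
z = rng.normal(loc=0.5, scale=2.0, size=100_000)   # samples of a scalar random variable

f = np.exp                  # a convex function on R
f_of_mean = f(z.mean())     # f(E z) for the empirical distribution of the samples
mean_of_f = f(z).mean()     # E f(z) for the same distribution
print(f_of_mean <= mean_of_f)

# Two-point special case: prob(z = x1) = theta1, prob(z = x2) = theta2.
x1, x2, theta1 = -1.0, 2.0, 0.3
theta2 = 1.0 - theta1
two_point_ok = f(theta1 * x1 + theta2 * x2) <= theta1 * f(x1) + theta2 * f(x2)
print(two_point_ok)
```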
Restriction of a convex function to a line
f : R^n → R is convex if and only if the function g : R → R
g(t) = f(x + tv), dom g = {t | x + tv ∈ dom f}
is convex for any x ∈ dom f, v ∈ R^n
check convexity of f by checking convexity of functions of one variable
example: f : S^n → R with f(X) = log det X on dom f = S^n_{++}
g(t) = log det(X + tV) = log det X + log det(I + t X^{-1/2} V X^{-1/2})
     = log det X + \sum_{i=1}^{n} log(1 + tλ_i)
where λ_i are the eigenvalues of X^{-1/2} V X^{-1/2}
g is concave in t for any choice of X ≻ 0, V; hence f is concave
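The log det identity along a line can be checked numerically. The sketch below assumes NumPy and uses randomly generated X ≻ 0 and symmetric V (hypothetical test data); it compares the direct value of log det(X + tV) with the eigenvalue formula and inspects second differences on a uniform grid as a rough concavity check.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
M = rng.standard_normal((n, n))
X = M @ M.T + n * np.eye(n)          # positive definite by construction
V = rng.standard_normal((n, n))
V = (V + V.T) / 2                    # symmetric direction

# Eigenvalues of X^{-1/2} V X^{-1/2}; X^{-1/2} comes from the eigendecomposition of X.
w, Q = np.linalg.eigh(X)
X_inv_sqrt = Q @ np.diag(w ** -0.5) @ Q.T
lam = np.linalg.eigvalsh(X_inv_sqrt @ V @ X_inv_sqrt)

def g_direct(t):
    sign, logdet = np.linalg.slogdet(X + t * V)
    assert sign > 0, "X + tV must have positive determinant"
    return logdet

def g_formula(t):
    return np.linalg.slogdet(X)[1] + np.sum(np.log(1 + t * lam))

# Stay inside dom g, where 1 + t*lam_i > 0 for every i.
t_max = 0.9 / np.max(np.abs(lam))
ts = np.linspace(-t_max, t_max, 21)
direct = np.array([g_direct(t) for t in ts])
formula = np.array([g_formula(t) for t in ts])

print(np.allclose(direct, formula))
print(np.all(np.diff(direct, 2) <= 1e-10))   # nonpositive second differences
```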
First-order condition
f is differentiable if dom f is open and the gradient ∇f(x) ∈ R^n
∇f(x) = (∂f(x)/∂x_1, ∂f(x)/∂x_2, ..., ∂f(x)/∂x_n)
exists at each x ∈ dom f
1st-order condition: differentiable f with convex domain is convex iff
f(x) ≥ f(x_0) + ∇f(x_0)^T (x − x_0), for all x_0, x ∈ dom f
first-order approximation of f is global underestimator
[Figure: graph of f(x) lying above its first-order approximation f(x_0) + ∇f(x_0)^T (x − x_0) at x_0]
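The global-underestimator property can be tested on a concrete convex function. The sketch below assumes NumPy and uses log-sum-exp as the test function; its gradient is the softmax of x, a standard fact that is not derived in these slides.

```python
import numpy as np

def f(x):
    # log-sum-exp, shifted by the maximum for numerical stability
    m = np.max(x)
    return m + np.log(np.sum(np.exp(x - m)))

def grad_f(x):
    # gradient of log-sum-exp: the softmax of x
    e = np.exp(x - np.max(x))
    return e / e.sum()

rng = np.random.default_rng(3)
gaps = []
for _ in range(1000):
    x0 = rng.normal(size=5)
    x = 3.0 * rng.normal(size=5)
    tangent = f(x0) + grad_f(x0) @ (x - x0)   # first-order approximation at x0
    gaps.append(f(x) - tangent)

min_gap = min(gaps)
print(min_gap >= -1e-12)   # the tangent never exceeds f, up to rounding
```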
Second-order condition
f is twice differentiable if dom f is open and the Hessian ∇²f(x) ∈ S^n
∇²f(x)_{ij} = ∂²f(x) / (∂x_i ∂x_j), i, j = 1, ..., n
exists at each x ∈ dom f
2nd-order condition: twice differentiable f with convex domain is convex iff
∇²f(x) ⪰ 0, for all x ∈ dom f
Examples
quadratic function: f(x) = (1/2) x^T P x + q^T x + r, with P ∈ S^n_+, is convex
∇f(x) = Px + q, ∇²f(x) = P
least-squares objective: f(x) = ‖Ax − b‖_2^2 is convex
∇f(x) = 2A^T (Ax − b), ∇²f(x) = 2A^T A
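quadratic-over-linear: f(x, y) = x^2/y on R × R_{++} is convex
∇²f(x, y) = (2/y^3) \begin{bmatrix} y \\ −x \end{bmatrix} \begin{bmatrix} y \\ −x \end{bmatrix}^T ⪰ 0
[Figure: surface plot of f(x, y) = x^2/y over the x and y axes]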
geometric mean: f(x) = (\prod_{i=1}^{n} x_i)^{1/n} on R^n_{++} is concave
for n = 2: ∇²f(x_1, x_2) = −(1 / (4 (x_1 x_2)^{3/2})) \begin{bmatrix} x_2 \\ −x_1 \end{bmatrix} \begin{bmatrix} x_2 \\ −x_1 \end{bmatrix}^T ⪯ 0
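log-sum-exp: f(x) = log \sum_{i=1}^{n} exp(x_i) is convex
for n = 2: ∇²f(x_1, x_2) = (e^{x_1} e^{x_2} / (e^{x_1} + e^{x_2})^2) \begin{bmatrix} 1 \\ −1 \end{bmatrix} \begin{bmatrix} 1 \\ −1 \end{bmatrix}^T ⪰ 0
[Figure: surface plot of f over the x_1 and x_2 axes]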
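The closed-form Hessian of the quadratic-over-linear function, (2/y^3) [y, −x][y, −x]^T, can be compared with a finite-difference Hessian. This is a sketch assuming NumPy; the helper `numeric_hessian`, the step size, and the evaluation point are illustrative choices.

```python
import numpy as np

def hess_quad_over_lin(x, y):
    # (2 / y^3) * [y, -x][y, -x]^T for f(x, y) = x^2 / y, y > 0
    v = np.array([y, -x])
    return (2.0 / y**3) * np.outer(v, v)

def numeric_hessian(f, p, h=1e-4):
    # central finite differences for all second partial derivatives
    p = np.asarray(p, dtype=float)
    n = p.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            H[i, j] = (f(p + ei + ej) - f(p + ei - ej)
                       - f(p - ei + ej) + f(p - ei - ej)) / (4 * h * h)
    return H

f = lambda p: p[0] ** 2 / p[1]
point = np.array([1.5, 2.0])

H_formula = hess_quad_over_lin(*point)
H_numeric = numeric_hessian(f, point)
eigs = np.linalg.eigvalsh(H_formula)

print(np.allclose(H_formula, H_numeric, atol=1e-5))
print(eigs.min() >= -1e-12)   # positive semidefinite up to rounding
```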
Epigraph
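epigraph of f : R^n → R:
epi f = {(x, t) ∈ R^{n+1} | x ∈ dom f, f(x) ≤ t}
[Figure: epi f, the region on and above the graph of f]
f is convex if and only if epi f is a convex set
1st-order condition for convexity:
t ≥ f(x) ≥ f(x_0) + ∇f(x_0)^T (x − x_0)
for all x_0 ∈ dom f and (x, t) ∈ epi f; hence
\begin{bmatrix} ∇f(x_0) \\ −1 \end{bmatrix}^T ( \begin{bmatrix} x \\ t \end{bmatrix} − \begin{bmatrix} x_0 \\ f(x_0) \end{bmatrix} ) ≤ 0
[Figure: epi f with the point (x_0, f(x_0)) and the normal vector (∇f(x_0), −1)]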
Operations that preserve convexity
analyzing convexity of a function
• verify definition
– often simplified by restricting to a line
• use properties
– 2nd-order condition
– epigraph
• calculus of convex functions: obtain f from simple convex functions by
operations that preserve convexity
– nonnegative weighted sum
– composition with affine function
– pointwise maximum and supremum
– partial minimization
– perspective
– composition
Weighted sum & composition with affine function
for convex functions f, f_1, f_2
nonnegative multiple: αf with α ≥ 0 is convex
sum: f_1 + f_2 is convex; extends to infinite sums, integrals
composition with affine function: f(Ax + b) is convex
examples
• log barrier for linear inequalities
f(x) = −\sum_{i=1}^{m} log(b_i − a_i^T x), dom f = {x | a_i^T x < b_i, i = 1, ..., m}
• any norm of affine function: f(x) = ‖Ax + b‖
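The log barrier is a sum of convex functions −log(·) composed with affine maps. The sketch below, assuming NumPy, evaluates it with the value +∞ outside dom f and checks the convexity inequality on one segment. The box constraints and the two test points are hypothetical data.

```python
import numpy as np

def log_barrier(x, A, b):
    """f(x) = -sum_i log(b_i - a_i^T x); returns +inf outside dom f."""
    slack = b - A @ x
    if np.any(slack <= 0):
        return np.inf
    return -np.sum(np.log(slack))

# Hypothetical data: the box -1 <= x_i <= 1 in R^2 written as A x <= b.
A = np.vstack([np.eye(2), -np.eye(2)])
b = np.ones(4)

x1 = np.array([0.5, -0.2])
x2 = np.array([-0.9, 0.7])
theta = 0.4
mid = theta * x1 + (1 - theta) * x2

lhs = log_barrier(mid, A, b)
rhs = theta * log_barrier(x1, A, b) + (1 - theta) * log_barrier(x2, A, b)
outside = log_barrier(np.array([1.5, 0.0]), A, b)

print(lhs <= rhs)
print(outside)
```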
Pointwise maximum
if f_1, ..., f_m are convex, then
f(x) = max{f_1(x), ..., f_m(x)}
is convex
epi f is intersection of epi f_i, i = 1, ..., m
[Figure: epi f as the intersection of epi f_1 and epi f_2]
examples
• piecewise-linear function: f(x) = max_{i=1,...,m} {a_i^T x + b_i} is convex
• sum of r largest components of x ∈ R^n:
f(x) = x_{[1]} + x_{[2]} + · · · + x_{[r]}
(x_{[i]} is the i-th largest component of x) is convex since
f(x) = max{x_{i_1} + x_{i_2} + · · · + x_{i_r} | 1 ≤ i_1 < i_2 < · · · < i_r ≤ n}
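The representation of the sum of the r largest components as a maximum of linear functions can be checked by brute force for small n. A minimal sketch, assuming NumPy; both helper names are hypothetical.

```python
import itertools
import numpy as np

def sum_r_largest(x, r):
    # sort in decreasing order and add the first r entries
    return np.sort(x)[::-1][:r].sum()

def max_over_subsets(x, r):
    # maximum of x_{i1} + ... + x_{ir} over all index sets i1 < ... < ir
    return max(sum(x[i] for i in idx)
               for idx in itertools.combinations(range(len(x)), r))

rng = np.random.default_rng(4)
x = rng.standard_normal(6)
print(np.isclose(sum_r_largest(x, 3), max_over_subsets(x, 3)))
```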
Pointwise supremum
if f(x, y) is convex in x for each y ∈ C, then
g(x) = sup_{y ∈ C} f(x, y)
is convex
examples
• support function of a set C: S_C(x) = sup_{y ∈ C} y^T x is convex
• distance to farthest point in a set C:
f(x) = sup_{y ∈ C} ‖x − y‖
• maximum eigenvalue of symmetric matrix: for X ∈ S^n,
λ_max(X) = sup_{‖y‖_2 = 1} y^T X y
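The variational formula for the maximum eigenvalue can be illustrated by sampling unit vectors: every sample gives a lower bound on λ_max, and the leading eigenvector attains it. This sketch assumes NumPy and a randomly generated symmetric matrix.

```python
import numpy as np

rng = np.random.default_rng(5)
S = rng.standard_normal((5, 5))
X = (S + S.T) / 2                       # symmetric test matrix

w, Q = np.linalg.eigh(X)                # eigenvalues in ascending order
lam_max = w[-1]

# Random unit vectors y, one per row.
Y = rng.standard_normal((10_000, 5))
Y /= np.linalg.norm(Y, axis=1, keepdims=True)
quad = np.einsum('ij,jk,ik->i', Y, X, Y)   # y^T X y for each row y

best_sampled = quad.max()
attained = Q[:, -1] @ X @ Q[:, -1]         # value at the leading eigenvector

print(best_sampled <= lam_max + 1e-12)
print(np.isclose(attained, lam_max))
```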
Partial minimization
if f(x, y) is convex in (x, y) and C is a convex set, then
g(x) = inf_{y ∈ C} f(x, y)
is convex
epi g is projection of epi f ∩ {(x, y, t) | y ∈ C}
[Figure: epi f over the (x, y) plane with the set C, and its projection epi g over x]
examples
• f(x, y) = x^T A x + 2 x^T B y + y^T C y with
\begin{bmatrix} A & B \\ B^T & C \end{bmatrix} ⪰ 0, C ≻ 0
minimizing over y gives g(x) = inf_y f(x, y) = x^T (A − B C^{-1} B^T) x
g is convex, hence Schur complement A − B C^{-1} B^T ⪰ 0
• distance to a set: dist(x, S) = inf_{y ∈ S} ‖x − y‖ is convex if S is convex
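The Schur complement example can be verified numerically: setting the gradient with respect to y to zero gives the minimizer y* = −C^{-1} B^T x, and substituting it yields x^T (A − B C^{-1} B^T) x. The sketch below assumes NumPy and a randomly generated positive semidefinite block matrix.

```python
import numpy as np

rng = np.random.default_rng(6)
n, m = 3, 2
G = rng.standard_normal((n + m, n + m))
M = G @ G.T                                  # block matrix [[A, B], [B^T, C]], PSD by construction
A, B, C = M[:n, :n], M[:n, n:], M[n:, n:]

def f(x, y):
    return x @ A @ x + 2 * x @ B @ y + y @ C @ y

x = rng.standard_normal(n)
y_star = -np.linalg.solve(C, B.T @ x)        # stationarity: 2 B^T x + 2 C y = 0
schur = A - B @ np.linalg.solve(C, B.T)      # Schur complement of C
g_x = x @ schur @ x

# No other y should do better than y_star.
others = np.array([f(x, rng.standard_normal(m)) for _ in range(1000)])
min_eig = np.linalg.eigvalsh((schur + schur.T) / 2).min()

print(np.isclose(f(x, y_star), g_x))
print(np.all(others >= g_x - 1e-9))
print(min_eig >= -1e-9)
```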
Perspective
the perspective of a function f : R^n → R is the function g : R^n × R → R,
g(x, t) = t f(x/t), dom g = {(x, t) | x/t ∈ dom f, t > 0}
g is convex if f is convex
epi g is the inverse image of epi f under the perspective function:
g(x, t) ≤ s ⇔ f(x/t) ≤ s/t
examples
• f(x) = x^T x is convex; hence g(x, t) = x^T x / t is convex for t > 0
• negative logarithm f(x) = −log x is convex; hence relative entropy
g(x, t) = t log t − t log x is convex on R^2_{++}
• if f is convex, then
g(x) = (c^T x + d) f((Ax + b)/(c^T x + d))
is convex on {x | c^T x + d > 0, (Ax + b)/(c^T x + d) ∈ dom f}
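The perspective construction is easy to express as a higher-order function. The sketch below, assuming NumPy, builds the relative entropy from −log and spot-checks the convexity inequality on random points of R^2_{++}. The helper `perspective` and the sampling ranges are illustrative.

```python
import numpy as np

def perspective(f):
    """Return g(x, t) = t * f(x / t), intended for t > 0."""
    def g(x, t):
        return t * f(x / t)
    return g

rel_entropy = perspective(lambda u: -np.log(u))   # g(x, t) = t log t - t log x

rng = np.random.default_rng(7)
p = rng.uniform(0.1, 5.0, size=(5000, 2))         # points (x, t) with x, t > 0
q = rng.uniform(0.1, 5.0, size=(5000, 2))
theta = rng.uniform(0.0, 1.0, size=5000)
mix = theta[:, None] * p + (1 - theta[:, None]) * q

lhs = rel_entropy(mix[:, 0], mix[:, 1])
rhs = theta * rel_entropy(p[:, 0], p[:, 1]) + (1 - theta) * rel_entropy(q[:, 0], q[:, 1])
worst = np.max(lhs - rhs)

print(worst <= 1e-9)   # no sampled violation of the convexity inequality
```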
Composition with scalar function
composition of g : R^n → R and h : R → R:
f(x) = h(g(x))
f is convex if
– g convex, h̃ convex and nondecreasing, or
– g concave, h̃ convex and nonincreasing
• proof for n = 1, differentiable g, h
f''(x) = h''(g(x)) g'(x)^2 + h'(g(x)) g''(x)
• note: monotonicity must hold for extended-value extension h̃
examples
• exp g(x) is convex if g is convex
• 1/g(x) is convex if g is concave and positive
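Both examples follow from the n = 1 formula for f''. The worked step below is a sketch that assumes g is twice differentiable, an assumption made only for this illustration.

```latex
% h(u) = e^u: h'(u) = h''(u) = e^u > 0, so for convex g (g'' \ge 0)
\[
f''(x) = e^{g(x)}\bigl(g'(x)^2 + g''(x)\bigr) \ge 0 .
\]
% h(u) = 1/u on u > 0: h'(u) = -1/u^2 < 0 and h''(u) = 2/u^3 > 0,
% so for concave, positive g (g'' \le 0, g > 0)
\[
f''(x) = \frac{2\,g'(x)^2}{g(x)^3} - \frac{g''(x)}{g(x)^2} \ge 0 .
\]
```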
Vector composition
composition of g : R^n → R^k and h : R^k → R:
f(x) = h(g(x)) = h(g_1(x), g_2(x), ..., g_k(x))
f is convex if
– g_i convex, h convex, h̃ nondecreasing in each argument, or
– g_i concave, h convex, h̃ nonincreasing in each argument
• proof for n = 1, differentiable g, h
f''(x) = g'(x)^T ∇²h(g(x)) g'(x) + ∇h(g(x))^T g''(x)
examples
• \sum_{i=1}^{m} log g_i(x) is concave if g_i are concave and positive
• log \sum_{i=1}^{m} exp g_i(x) is convex if g_i are convex
Quasiconvex functions
α-sublevel set of f : R^n → R:
C_α = {x ∈ dom f | f(x) ≤ α}
sublevel sets of convex functions are convex (converse is false)
f : R^n → R is quasiconvex if dom f is convex and the sublevel sets C_α are
convex for all α
[Figure: graph of f(x) with the levels α, β and the points a, b, c marked on the x axis]
• f is quasiconcave if −f is quasiconvex
• f is quasilinear if it is quasiconvex and quasiconcave
Examples
• √|x| is quasiconvex on R
• ceil(x) = inf{z ∈ Z | z ≥ x} is quasilinear
• log x is quasilinear on R_{++}
• f(x_1, x_2) = x_1 x_2 is quasiconcave on R^2_{++}
• linear-fractional function
f(x) = (a^T x + b)/(c^T x + d), dom f = {x | c^T x + d > 0}
is quasilinear
• distance ratio
f(x) = ‖x − a‖_2 / ‖x − b‖_2, dom f = {x | ‖x − a‖_2 ≤ ‖x − b‖_2}
is quasiconvex
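Convexity of all sublevel sets is equivalent to the inequality f(θx + (1 − θ)y) ≤ max{f(x), f(y)} for 0 ≤ θ ≤ 1. This is a standard characterization that is not stated on these slides. The sketch below, assuming NumPy, uses it to spot-check the scalar examples; the helper name and sampling intervals are hypothetical.

```python
import numpy as np

def breaks_quasiconvexity(f, lo, hi, trials=10_000, seed=0, tol=1e-12):
    """Return True if some sample violates f(theta*x1+(1-theta)*x2) <= max(f(x1), f(x2))."""
    rng = np.random.default_rng(seed)
    x1 = rng.uniform(lo, hi, trials)
    x2 = rng.uniform(lo, hi, trials)
    theta = rng.uniform(0.0, 1.0, trials)
    mid = theta * x1 + (1 - theta) * x2
    return bool(np.any(f(mid) > np.maximum(f(x1), f(x2)) + tol))

sqrt_abs = lambda x: np.sqrt(np.abs(x))

print(breaks_quasiconvexity(sqrt_abs, -5.0, 5.0))                    # quasiconvex
print(breaks_quasiconvexity(np.ceil, -5.0, 5.0))                     # quasiconvex ...
print(breaks_quasiconvexity(lambda x: -np.ceil(x), -5.0, 5.0))       # ... and quasiconcave
print(breaks_quasiconvexity(np.log, 0.1, 5.0))

# sqrt|x| is quasiconvex but not convex: the midpoint of 0 and 4
# violates the ordinary convexity inequality.
not_convex = sqrt_abs(2.0) > 0.5 * (sqrt_abs(0.0) + sqrt_abs(4.0))
print(not_convex)
```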
Convex representation of sublevel sets
if f is quasiconvex, there exists a family of functions φ_α such that:
• φ_α(x) is convex in x for fixed α
• α-sublevel set of f is 0-sublevel set of φ_α, i.e.,
f(x) ≤ α ⇔ φ_α(x) ≤ 0
examples:
• f(x) = √|x| → φ_α(x) = |x| − α^2 (α ≥ 0)
• f(x) = ceil(x) → φ_α(x) = x − floor(α)
• f(x_1, x_2) = −x_1 x_2 on R^2_{++} → φ_α(x) = −x_1 − α/x_2 (α < 0)
• f(x) = (a^T x + b)/(c^T x + d) → φ_α(x) = (a^T x + b) − α(c^T x + d)
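The equivalence f(x) ≤ α ⇔ φ_α(x) ≤ 0 for the linear-fractional example can be checked on random points of dom f. A minimal sketch, assuming NumPy; the coefficients a, b, c, d are hypothetical test data.

```python
import numpy as np

rng = np.random.default_rng(8)
a, c = rng.standard_normal(3), rng.standard_normal(3)   # hypothetical problem data
b, d = 0.3, 2.0

def f(x):
    return (a @ x + b) / (c @ x + d)

def phi(x, alpha):
    # affine, hence convex, in x for fixed alpha
    return (a @ x + b) - alpha * (c @ x + d)

checked = 0
mismatches = 0
for _ in range(5000):
    x = rng.standard_normal(3)
    if c @ x + d <= 0:              # skip points outside dom f
        continue
    alpha = rng.normal()
    checked += 1
    mismatches += int((f(x) <= alpha) != (phi(x, alpha) <= 0))

print(checked, mismatches)
```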
Convexity with respect to generalized inequalities
f : R^n → R^m is K-convex if dom f is convex and
f(θx + (1 − θ)y) ⪯_K θf(x) + (1 − θ)f(y)
for x, y ∈ dom f, 0 ≤ θ ≤ 1
examples
• R^m_+-convexity means componentwise convexity
• f : S^m → S^m, f(X) = X^2 is S^m_+-convex
– for fixed z ∈ R^m, z^T X^2 z = ‖Xz‖_2^2 is convex in X, i.e.,
z^T (θX + (1 − θ)Y)^2 z ≤ θ z^T X^2 z + (1 − θ) z^T Y^2 z
for X, Y ∈ S^m, 0 ≤ θ ≤ 1
– therefore (θX + (1 − θ)Y)^2 ⪯ θX^2 + (1 − θ)Y^2
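The matrix inequality for f(X) = X^2 can be verified numerically. Expanding the square shows that the gap θX^2 + (1 − θ)Y^2 − (θX + (1 − θ)Y)^2 equals θ(1 − θ)(X − Y)^2, which is positive semidefinite; this identity is an added remark, not taken from the slides. The sketch assumes NumPy and random symmetric matrices.

```python
import numpy as np

rng = np.random.default_rng(9)

def rand_sym(m):
    S = rng.standard_normal((m, m))
    return (S + S.T) / 2

X, Y = rand_sym(4), rand_sym(4)
theta = 0.3
Z = theta * X + (1 - theta) * Y

gap = theta * X @ X + (1 - theta) * Y @ Y - Z @ Z
min_eig = np.linalg.eigvalsh((gap + gap.T) / 2).min()   # symmetrize against rounding
identity_rhs = theta * (1 - theta) * (X - Y) @ (X - Y)

print(min_eig >= -1e-10)                 # gap is positive semidefinite
print(np.allclose(gap, identity_rhs))    # gap = theta (1 - theta) (X - Y)^2
```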