Solutions HW 1

Description of convex sets.
Which of the following sets are convex?
1. A slab, i.e., a set of the form $\{x \in \mathbf{R}^n \mid \alpha \le a^T x \le \beta\}$.
2. A rectangle, i.e., a set of the form $\{x \in \mathbf{R}^n \mid \alpha_i \le x_i \le \beta_i,\ i = 1, \ldots, n\}$. A rectangle is
sometimes called a hyperrectangle when $n > 2$.
3. A wedge, i.e., $\{x \in \mathbf{R}^n \mid a_1^T x \le b_1,\ a_2^T x \le b_2\}$.
4. The set of points closer to a given point than to a given set, i.e.,
\[ \{x \mid \|x - x_0\|_2 \le \|x - y\|_2 \text{ for all } y \in S\}, \]
where $S \subseteq \mathbf{R}^n$.
5. The set of points closer to one set than to another, i.e.,
\[ \{x \mid \operatorname{dist}(x, S) \le \operatorname{dist}(x, T)\}, \]
where $S, T \subseteq \mathbf{R}^n$, and
\[ \operatorname{dist}(x, S) = \inf\{\|x - z\|_2 \mid z \in S\}. \]
6. The set $\{x \mid x + S_2 \subseteq S_1\}$, where $S_1, S_2 \subseteq \mathbf{R}^n$ with $S_1$ convex.
7. The set of points whose distance to $a$ does not exceed a fixed fraction $\theta$ of the distance to $b$,
i.e., the set $\{x \mid \|x - a\|_2 \le \theta \|x - b\|_2\}$. You can assume $a \ne b$ and $0 \le \theta \le 1$.
Solution.
1. A slab is an intersection of two halfspaces, hence it is a convex set (and a polyhedron).
2. As in part 1, a rectangle is a convex set and a polyhedron because it is a finite intersection
of halfspaces.
3. A wedge is an intersection of two halfspaces, so it is a convex set. It is also a polyhedron. It is
a cone if $b_1 = 0$ and $b_2 = 0$.
4. This set is convex because it can be expressed as
\[ \bigcap_{y \in S} \{x \mid \|x - x_0\|_2 \le \|x - y\|_2\}, \]
i.e., an intersection of halfspaces. (For fixed $y$, the set
\[ \{x \mid \|x - x_0\|_2 \le \|x - y\|_2\} \]
is a halfspace; see exercise 2.7).
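5. In general this set is not convex, as the following example in $\mathbf{R}$ shows. With $S = \{-1, 1\}$ and
$T = \{0\}$, we have
\[ \{x \mid \operatorname{dist}(x, S) \le \operatorname{dist}(x, T)\} = \{x \in \mathbf{R} \mid x \le -1/2 \text{ or } x \ge 1/2\}, \]
which is not convex: it contains $-1/2$ and $1/2$ but not their midpoint $0$.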
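6. This set is convex. We have $x + S_2 \subseteq S_1$ if and only if $x + y \in S_1$ for all $y \in S_2$. Therefore
\[ \{x \mid x + S_2 \subseteq S_1\} = \bigcap_{y \in S_2} \{x \mid x + y \in S_1\} = \bigcap_{y \in S_2} (S_1 - y), \]
the intersection of the convex sets $S_1 - y$.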
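7. The set is convex. We have
\[
\begin{aligned}
\{x \mid \|x - a\|_2 \le \theta \|x - b\|_2\}
&= \{x \mid \|x - a\|_2^2 \le \theta^2 \|x - b\|_2^2\} \\
&= \{x \mid (1 - \theta^2) x^T x - 2(a - \theta^2 b)^T x + (a^T a - \theta^2 b^T b) \le 0\}.
\end{aligned}
\]
If $\theta = 1$, this is a halfspace. If $\theta < 1$, dividing by $1 - \theta^2 > 0$ and completing the square shows that it is a ball
\[ \{x \mid \|x - x_c\|_2^2 \le R^2\}, \qquad x_c = \frac{a - \theta^2 b}{1 - \theta^2}, \qquad R^2 = \|x_c\|_2^2 - \frac{a^T a - \theta^2 b^T b}{1 - \theta^2}. \]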
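As a numerical sanity check of part 7 for $\theta < 1$, the following sketch samples random points and compares membership in the original set with membership in a ball. It assumes the center $(a - \theta^2 b)/(1 - \theta^2)$ and the radius $\theta \|a - b\|_2 / (1 - \theta^2)$, both obtained by completing the square in the quadratic inequality. The function names and the sampling box are illustrative choices, not part of the original solution.

```python
import math
import random


def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(t * t for t in v))


def in_original_set(x, a, b, theta):
    """Test ||x - a||_2 <= theta * ||x - b||_2 directly."""
    da = norm([xi - ai for xi, ai in zip(x, a)])
    db = norm([xi - bi for xi, bi in zip(x, b)])
    return da <= theta * db


def ball(a, b, theta):
    """Center and radius of the equivalent ball (assumes 0 <= theta < 1)."""
    s = 1.0 - theta ** 2
    center = [(ai - theta ** 2 * bi) / s for ai, bi in zip(a, b)]
    radius = theta * norm([ai - bi for ai, bi in zip(a, b)]) / s
    return center, radius


def count_mismatches(a, b, theta, trials=10000, seed=0):
    """Count sampled points on which the two descriptions disagree."""
    rng = random.Random(seed)
    center, radius = ball(a, b, theta)
    mismatches = 0
    for _ in range(trials):
        x = [rng.uniform(-5.0, 5.0) for _ in a]
        d = norm([xi - ci for xi, ci in zip(x, center)])
        if abs(d - radius) < 1e-9:
            continue  # skip points numerically on the boundary
        if in_original_set(x, a, b, theta) != (d <= radius):
            mismatches += 1
    return mismatches
```

For example, `count_mismatches([1.0, 0.0], [-1.0, 2.0], 0.5)` should report no disagreement between the two descriptions.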
Euclidean distance matrices.
Let $x_1, \ldots, x_n \in \mathbf{R}^k$. The matrix $D \in \mathbf{S}^n$ defined by $D_{ij} = \|x_i - x_j\|_2^2$ is called a Euclidean
distance matrix. It satisfies some obvious properties such as $D_{ij} = D_{ji}$, $D_{ii} = 0$, $D_{ij} \ge 0$, and
(from the triangle inequality) $D_{ik}^{1/2} \le D_{ij}^{1/2} + D_{jk}^{1/2}$. We now pose the question: When is a matrix
$D \in \mathbf{S}^n$ a Euclidean distance matrix (for some points in $\mathbf{R}^k$, for some $k$)? A famous result answers
this question: $D \in \mathbf{S}^n$ is a Euclidean distance matrix if and only if $D_{ii} = 0$ and $x^T D x \le 0$ for all
$x$ with $\mathbf{1}^T x = 0$.
Show that the set of Euclidean distance matrices is a convex cone. Find the dual cone.
Solution. The set of Euclidean distance matrices in $\mathbf{S}^n$ is a closed convex cone because it is
the intersection of (infinitely many) halfspaces defined by the following homogeneous inequalities:
\[ e_i^T D e_i \le 0, \qquad e_i^T D e_i \ge 0, \qquad x^T D x = \sum_{j,k} x_j x_k D_{jk} \le 0, \]
for all $i = 1, \ldots, n$, and all $x$ with $\mathbf{1}^T x = 0$.
It follows, denoting by $\mathbf{Co}$ the conic hull of a set, that the dual cone is given by
\[ K^* = \mathbf{Co}\left(\{-xx^T \mid \mathbf{1}^T x = 0\} \cup \{e_1 e_1^T, -e_1 e_1^T, \ldots, e_n e_n^T, -e_n e_n^T\}\right). \]
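This can be made more explicit as follows. Define $V \in \mathbf{R}^{n \times (n-1)}$ as
\[ V_{ij} = \begin{cases} 1 - 1/n & i = j \\ -1/n & i \ne j. \end{cases} \]
The columns of $V$ form a basis for the set of vectors orthogonal to $\mathbf{1}$, i.e., a vector $x$ satisfies
$\mathbf{1}^T x = 0$ if and only if $x = V y$ for some $y$. Each generator $-xx^T$ with $\mathbf{1}^T x = 0$ then has the form $-V y y^T V^T$, so the dual cone is
\[ K^* = \{V W V^T + \operatorname{diag}(u) \mid W \preceq 0,\ u \in \mathbf{R}^n\}. \]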
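The characterization quoted above can be illustrated numerically. The sketch below builds $D$ from a list of points, checks $x^T D x \le 0$ on randomly sampled vectors with $\mathbf{1}^T x = 0$, and can be applied to a convex combination of two such matrices to illustrate the cone property. It is a sampling check under the stated characterization, not a proof; the helper names and the sampling scheme are assumptions made for illustration.

```python
import random


def edm(points):
    """Matrix of squared Euclidean distances D[i][j] = ||p_i - p_j||_2^2."""
    return [[sum((p - q) ** 2 for p, q in zip(pi, pj)) for pj in points]
            for pi in points]


def quad_form(D, x):
    """Evaluate x^T D x for a square matrix given as a list of rows."""
    n = len(x)
    return sum(x[j] * x[k] * D[j][k] for j in range(n) for k in range(n))


def max_quad_form_on_zero_sum(D, trials=2000, seed=0):
    """Largest sampled value of x^T D x over random x with 1^T x = 0."""
    rng = random.Random(seed)
    n = len(D)
    worst = float("-inf")
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(x) / n
        x = [xi - mean for xi in x]  # project onto the subspace 1^T x = 0
        worst = max(worst, quad_form(D, x))
    return worst
```

For a matrix built with `edm`, the value returned by `max_quad_form_on_zero_sum` should be nonpositive up to rounding error, and the same should hold for nonnegative combinations of two such matrices of equal size.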
Pointwise maximum and supremum.
Show that the following functions f : Rn → R are convex.
1. $f(x) = \max_{i=1,\ldots,k} \|A^{(i)} x - b^{(i)}\|$, where $A^{(i)} \in \mathbf{R}^{m \times n}$, $b^{(i)} \in \mathbf{R}^m$ and $\|\cdot\|$ is a norm on $\mathbf{R}^m$.
Solution. $f$ is the pointwise maximum of $k$ functions $\|A^{(i)} x - b^{(i)}\|$. Each of those functions
is convex because it is the composition of an affine transformation and a norm.
2. $f(x) = \sum_{i=1}^r |x|_{[i]}$ on $\mathbf{R}^n$, where $|x|$ denotes the vector with $|x|_i = |x_i|$ (i.e., $|x|$ is the absolute
value of $x$, componentwise), and $|x|_{[i]}$ is the $i$th largest component of $|x|$. In other words,
$|x|_{[1]}, |x|_{[2]}, \ldots, |x|_{[n]}$ are the absolute values of the components of $x$, sorted in nonincreasing
order.
Solution. Write $f$ as
\[ f(x) = \sum_{i=1}^r |x|_{[i]} = \max_{1 \le i_1 < i_2 < \cdots < i_r \le n} \left( |x_{i_1}| + \cdots + |x_{i_r}| \right), \]
which is the pointwise maximum of $n!/(r!(n-r)!)$ convex functions.
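The identity used in part 2 can be checked on small examples. The sketch below computes the sum of the $r$ largest absolute values by sorting, and again as a maximum over all index subsets of size $r$. It assumes $1 \le r \le n$; the function names are illustrative.

```python
import itertools


def sum_r_largest_abs(x, r):
    """Sum of the r largest entries of |x|, computed by sorting."""
    return sum(sorted((abs(t) for t in x), reverse=True)[:r])


def max_over_subsets(x, r):
    """Maximum of |x_{i1}| + ... + |x_{ir}| over all index subsets of size r."""
    return max(sum(abs(x[i]) for i in idx)
               for idx in itertools.combinations(range(len(x)), r))
```

The subset enumeration is exponential in general and is meant only to mirror the formula; the sorted version is the practical way to evaluate $f$.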
Products and ratios of convex functions.
In general the product or ratio of two convex functions is not convex. However, there are some
results that apply to functions on R. Prove the following.
1. If f and g are convex, both nondecreasing (or nonincreasing), and positive functions on an
interval, then f g is convex.
2. If f , g are concave, positive, with one nondecreasing and the other nonincreasing, then f g is
concave.
3. If f is convex, nondecreasing, and positive, and g is concave, nonincreasing, and positive,
then f /g is convex.
Solution.
1. We prove the result by verifying Jensen's inequality. $f$ and $g$ are positive and convex, hence
for $0 \le \theta \le 1$,
\[
\begin{aligned}
f(\theta x + (1 - \theta) y)\, g(\theta x + (1 - \theta) y)
&\le (\theta f(x) + (1 - \theta) f(y)) (\theta g(x) + (1 - \theta) g(y)) \\
&= \theta f(x) g(x) + (1 - \theta) f(y) g(y) \\
&\quad + \theta (1 - \theta) (f(y) - f(x)) (g(x) - g(y)).
\end{aligned}
\]
The third term is less than or equal to zero if $f$ and $g$ are both nondecreasing or both nonincreasing.
Therefore
\[ f(\theta x + (1 - \theta) y)\, g(\theta x + (1 - \theta) y) \le \theta f(x) g(x) + (1 - \theta) f(y) g(y). \]
2. Reverse the inequalities in the solution of part 1.
3. It suffices to note that $1/g$ is convex, positive and nondecreasing, so the result follows from
part 1.
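To illustrate why the monotonicity assumption in part 1 matters, the sketch below evaluates the Jensen gap $\theta h(x) + (1 - \theta) h(y) - h(\theta x + (1 - \theta) y)$ for two products on the interval $[0, 3]$. The functions $e^x$, $e^{-x}$ and $x^2 + 1$ are example choices made here, not taken from the exercise: $e^x (x^2 + 1)$ satisfies the hypotheses, while $e^{-x} (x^2 + 1)$ pairs a nonincreasing factor with a nondecreasing one.

```python
import math
import random


def jensen_gap(h, x, y, theta):
    """theta*h(x) + (1-theta)*h(y) - h(theta*x + (1-theta)*y); >= 0 if h is convex."""
    return theta * h(x) + (1 - theta) * h(y) - h(theta * x + (1 - theta) * y)


def min_gap(h, lo, hi, trials=10000, seed=0):
    """Smallest Jensen gap over randomly sampled x, y in [lo, hi] and theta in [0, 1)."""
    rng = random.Random(seed)
    return min(
        jensen_gap(h, rng.uniform(lo, hi), rng.uniform(lo, hi), rng.random())
        for _ in range(trials)
    )


def g(x):
    """Convex, positive, nondecreasing on [0, 3]."""
    return x * x + 1.0


def fg(x):
    """Product of two convex, positive, nondecreasing functions on [0, 3]."""
    return math.exp(x) * g(x)


def h_bad(x):
    """Convex nonincreasing factor times convex nondecreasing factor."""
    return math.exp(-x) * g(x)
```

Sampling with `min_gap(fg, 0.0, 3.0)` should find no negative gap beyond rounding error, whereas `jensen_gap(h_bad, 1.5, 2.5, 0.5)` is negative, so the second product is not convex on $[0, 3]$.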
Conjugates.
Derive the conjugates of the following functions.
1. Max function. $f(x) = \max_{i=1,\ldots,n} x_i$ on $\mathbf{R}^n$.
Solution. We will show that
\[ f^*(y) = \begin{cases} 0 & \text{if } y \succeq 0,\ \mathbf{1}^T y = 1 \\ \infty & \text{otherwise.} \end{cases} \]
We first verify the domain of $f^*$. First suppose $y$ has a negative component, say $y_k < 0$. If
we choose a vector $x$ with $x_k = -t$, $x_i = 0$ for $i \ne k$, and let $t$ go to infinity, we see that
\[ x^T y - \max_i x_i = -t y_k \to \infty, \]
so $y$ is not in $\operatorname{dom} f^*$. Next, assume $y \succeq 0$ but $\mathbf{1}^T y > 1$. We choose $x = t\mathbf{1}$ and let $t$ go to
infinity, to show that
\[ x^T y - \max_i x_i = t \mathbf{1}^T y - t \]
is unbounded above. Similarly, when $y \succeq 0$ and $\mathbf{1}^T y < 1$, we choose $x = -t\mathbf{1}$ and let $t$ go to
infinity.
The remaining case for $y$ is $y \succeq 0$ and $\mathbf{1}^T y = 1$. In this case we have
\[ x^T y \le \max_i x_i \]
for all $x$, and therefore $x^T y - \max_i x_i \le 0$ for all $x$, with equality for $x = 0$. Therefore
$f^*(y) = 0$.
2. Sum of largest elements. $f(x) = \sum_{i=1}^r x_{[i]}$ on $\mathbf{R}^n$.
Solution. The conjugate is
\[ f^*(y) = \begin{cases} 0 & \text{if } 0 \preceq y \preceq \mathbf{1},\ \mathbf{1}^T y = r \\ \infty & \text{otherwise.} \end{cases} \]
We first verify the domain of $f^*$. Suppose $y$ has a negative component, say $y_k < 0$. If we
choose a vector $x$ with $x_k = -t$, $x_i = 0$ for $i \ne k$, and let $t$ go to infinity, we see that
\[ x^T y - f(x) = -t y_k \to \infty, \]
so $y$ is not in $\operatorname{dom} f^*$.
Next, suppose $y$ has a component greater than 1, say $y_k > 1$. If we choose a vector $x$ with
$x_k = t$, $x_i = 0$ for $i \ne k$, and let $t$ go to infinity, we see that
\[ x^T y - f(x) = t y_k - t \to \infty, \]
so $y$ is not in $\operatorname{dom} f^*$.
Finally, assume that $\mathbf{1}^T y \ne r$. We choose $x = t\mathbf{1}$ and find that
\[ x^T y - f(x) = t \mathbf{1}^T y - tr \]
is unbounded above, as $t \to \infty$ or $t \to -\infty$.
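If $y$ satisfies all the conditions, then for $x = (x_1, \ldots, x_n) \in \mathbf{R}^n$ with $x_1 \ge \cdots \ge x_n$,
\[
\begin{aligned}
x^T y - f(x) &= \sum_{i=1}^n x_i y_i - \sum_{i=1}^r x_i \\
&= \sum_{i=1}^r (y_i - 1) x_i + \sum_{i=r+1}^n x_i y_i \\
&\le x_r \left( \sum_{i=1}^r (y_i - 1) + \sum_{i=r+1}^n y_i \right) \\
&= 0,
\end{aligned}
\]
with equality for $x = 0$. Therefore $f^*(y) = 0$.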
3. Piecewise-linear function on $\mathbf{R}$. $f(x) = \max_{i=1,\ldots,m} (a_i x + b_i)$ on $\mathbf{R}$. You can assume that the
$a_i$ are sorted in increasing order, i.e., $a_1 \le \cdots \le a_m$, and that none of the functions $a_i x + b_i$
is redundant, i.e., for each $k$ there is at least one $x$ with $f(x) = a_k x + b_k$.
Solution. Under these assumptions, the graph of $f$ is piecewise-linear, with breakpoints
$(b_i - b_{i+1})/(a_{i+1} - a_i)$, $i = 1, \ldots, m - 1$. We can write $f^*$ as
\[ f^*(y) = \sup_x \left( xy - \max_{i=1,\ldots,m} (a_i x + b_i) \right). \]
We see that $\operatorname{dom} f^* = [a_1, a_m]$, since for $y$ outside that range, the expression inside the
supremum is unbounded above. For $a_i \le y \le a_{i+1}$, the supremum in the definition of $f^*$ is
reached at the breakpoint between the segments $i$ and $i + 1$, i.e., at the point $(b_{i+1} - b_i)/(a_i - a_{i+1})$, so we obtain
\[ f^*(y) = (y - a_i) \frac{b_{i+1} - b_i}{a_i - a_{i+1}} - b_i, \]
where $i$ is defined by $a_i \le y \le a_{i+1}$. Hence the graph of $f^*$ is also a piecewise-linear curve.
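The breakpoint formula for the piecewise-linear conjugate can be compared with a direct evaluation. The sketch below assumes at least two affine pieces with strictly increasing slopes and no redundant piece. It evaluates $f^*$ once with the closed-form expression and once by maximizing $xy - f(x)$ over the breakpoints, which suffices on $[a_1, a_m]$ because the objective is concave and piecewise-linear in $x$. The function names and the example data in the usage note are illustrative.

```python
def pwl(a, b, x):
    """f(x) = max_i (a_i * x + b_i)."""
    return max(ai * x + bi for ai, bi in zip(a, b))


def conjugate_formula(a, b, y):
    """Closed-form f*(y); assumes len(a) >= 2 and a strictly increasing."""
    if y < a[0] or y > a[-1]:
        return float("inf")
    for i in range(len(a) - 1):
        if a[i] <= y <= a[i + 1]:
            # breakpoint between segments i and i + 1
            x_break = (b[i + 1] - b[i]) / (a[i] - a[i + 1])
            return (y - a[i]) * x_break - b[i]


def conjugate_bruteforce(a, b, y):
    """Maximize x*y - f(x) over the breakpoints (valid for a[0] <= y <= a[-1])."""
    breaks = [(b[i] - b[i + 1]) / (a[i + 1] - a[i]) for i in range(len(a) - 1)]
    return max(x * y - pwl(a, b, x) for x in breaks)
```

With `a = [-1.0, 0.0, 2.0]` and `b = [0.0, 1.0, -1.0]`, the function $f$ equals $-x$, $1$, and $2x - 1$ on successive intervals with breakpoints $-1$ and $1$, and the two evaluations should agree for every $y$ in $[-1, 2]$.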
4. Power function. $f(x) = x^p$ on $\mathbf{R}_{++}$, where $p > 1$. Repeat for $p < 0$.
Solution. We'll use standard notation: we define $q$ by the equation $1/p + 1/q = 1$, i.e.,
$q = p/(p - 1)$.
We start with the case $p > 1$. For $y \le 0$ the function $yx - x^p$ is negative and decreasing on $\mathbf{R}_{++}$, and it
approaches its supremum $0$ as $x \to 0$, so $f^*(y) = 0$. For $y > 0$ the function achieves its maximum
at $x = (y/p)^{1/(p-1)}$, where it has value
\[ y (y/p)^{1/(p-1)} - (y/p)^{p/(p-1)} = (p - 1)(y/p)^q. \]
Therefore we have
\[ f^*(y) = \begin{cases} 0 & y \le 0 \\ (p - 1)(y/p)^q & y > 0. \end{cases} \]
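For $p < 0$ similar arguments show that $\operatorname{dom} f^* = -\mathbf{R}_{+}$ and $f^*(y) = (p - 1)(y/p)^q$ (at $y = 0$ the supremum of $-x^p$ is $0$, approached as $x \to \infty$).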
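For the case $p > 1$, the closed-form conjugate can be compared with a crude grid search over $x > 0$. The sketch below is only a numerical illustration: the grid range `hi` and the resolution `steps` are arbitrary assumptions, and the grid value slightly underestimates the true supremum. The case $p < 0$ is not covered.

```python
def conjugate_power(y, p):
    """Closed-form conjugate of f(x) = x**p on x > 0, for p > 1."""
    if y <= 0:
        return 0.0
    q = p / (p - 1.0)
    return (p - 1.0) * (y / p) ** q


def objective(x, y, p):
    """The function y*x - x**p whose supremum over x > 0 defines f*(y)."""
    return y * x - x ** p


def grid_sup(y, p, hi=10.0, steps=100000):
    """Approximate sup over x in (0, hi] on a uniform grid."""
    return max(objective(hi * k / steps, y, p) for k in range(1, steps + 1))
```

For instance, with $p = 2$ the formula reduces to $y^2/4$ for $y > 0$, which the grid search should reproduce to within the grid resolution.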
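5. Geometric mean. $f(x) = -\left(\prod_i x_i\right)^{1/n}$ on $\mathbf{R}^n_{++}$.
Solution. The conjugate function is
\[ f^*(y) = \begin{cases} 0 & \text{if } y \prec 0,\ \left(\prod_i (-y_i)\right)^{1/n} \ge 1/n \\ \infty & \text{otherwise.} \end{cases} \]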
We first verify the domain of $f^*$. Assume $y$ has a nonnegative component, say $y_k \ge 0$. Then
we can choose $x_k = t$ and $x_i = 1$, $i \ne k$, to show that
\[ x^T y - f(x) = t y_k + \sum_{i \ne k} y_i + t^{1/n} \]
is unbounded above as a function of $t > 0$. Hence the condition $y \prec 0$ is indeed required.
Next assume that $y \prec 0$, but $\left(\prod_i (-y_i)\right)^{1/n} < 1/n$. We choose $x_i = -t/y_i$, and obtain
\[ x^T y - f(x) = -tn + t \left( \prod_i \left( -\frac{1}{y_i} \right) \right)^{1/n} \to \infty \]
as $t \to \infty$. This demonstrates that the second condition for the domain of $f^*$ is also needed.
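Now assume that $y \prec 0$ and $\left(\prod_i (-y_i)\right)^{1/n} \ge 1/n$, and $x \succ 0$. The arithmetic-geometric mean
inequality states that
\[ \frac{x^T |y|}{n} \ge \left( \prod_i (-y_i x_i) \right)^{1/n} \ge \frac{1}{n} \left( \prod_i x_i \right)^{1/n}, \]
i.e., $-x^T y \ge -f(x)$, so $x^T y - f(x) \le 0$, and the bound $0$ is approached as $x \to 0$. Hence, $f^*(y) = 0$.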
6. Negative generalized logarithm for second-order cone. $f(x, t) = -\log(t^2 - x^T x)$ on $\{(x, t) \in
\mathbf{R}^n \times \mathbf{R} \mid \|x\|_2 < t\}$.
Solution.
\[ f^*(y, u) = -2 + \log 4 - \log(u^2 - y^T y), \qquad \operatorname{dom} f^* = \{(y, u) \mid \|y\|_2 < -u\}. \]
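We first verify the domain. Suppose $\|y\|_2 \ge -u$. If $y \ne 0$, choose $x = sy$ and $t = \|x\|_2 + 1 = s\|y\|_2 + 1$, with $s \ge 0$. Then
\[ y^T x + tu = s \|y\|_2 (\|y\|_2 + u) + u \ge u, \qquad \log(t^2 - x^T x) = \log(2 s \|y\|_2 + 1). \]
Therefore
\[ y^T x + tu + \log(t^2 - x^T x) \ge u + \log(2 s \|y\|_2 + 1) \]
is unbounded above as $s \to \infty$. If $y = 0$, then $u \ge 0$, and choosing $x = 0$ gives $tu + \log t^2$, which is unbounded above as $t \to \infty$.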
Next, assume that $\|y\|_2 < -u$. Setting the derivative of
\[ y^T x + ut + \log(t^2 - x^T x) \]
with respect to $x$ and $t$ equal to zero, and solving for $t$ and $x$ we see that the maximizer is
\[ x = \frac{2y}{u^2 - y^T y}, \qquad t = -\frac{2u}{u^2 - y^T y}. \]
This gives
\[
\begin{aligned}
f^*(y, u) &= ut + y^T x + \log(t^2 - x^T x) \\
&= -2 + \log 4 - \log(u^2 - y^T y).
\end{aligned}
\]
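The stationary point and the resulting value can be checked numerically for a particular $(y, u)$ with $\|y\|_2 < -u$. The sketch below evaluates the objective $y^T x + ut + \log(t^2 - x^T x)$ at the claimed maximizer and compares it with $-2 + \log 4 - \log(u^2 - y^T y)$. The function names and any test point are illustrative assumptions.

```python
import math


def maximizer(y, u):
    """Claimed maximizer (x, t); assumes ||y||_2 < -u so that d > 0."""
    d = u * u - sum(v * v for v in y)
    return [2.0 * v / d for v in y], -2.0 * u / d


def objective(x, t, y, u):
    """y^T x + u*t + log(t^2 - x^T x), defined where t^2 > x^T x."""
    inner = sum(a * b for a, b in zip(y, x))
    return inner + u * t + math.log(t * t - sum(v * v for v in x))


def conjugate(y, u):
    """Closed-form value -2 + log 4 - log(u^2 - y^T y)."""
    return -2.0 + math.log(4.0) - math.log(u * u - sum(v * v for v in y))
```

Because the objective is concave in $(x, t)$, small perturbations of the maximizer should not increase its value, which gives a second simple check.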