SELECTED SOLUTIONS, SECTION 1.2
1. Prove Sn+ is a closed convex cone with interior Sn++ .
Recall that
\[
S^n_+ = \{\, X \in \mathbf{R}^{n \times n} : X^T = X \text{ and } x^T X x \geq 0 \text{ for all } x \in \mathbf{R}^n \,\}.
\]
• Each of the sets {X ∈ Rn×n : X^T = X} and {X ∈ Rn×n : x^T Xx ≥ 0}
(with arbitrary but fixed x ∈ Rn) is closed, and Sn+ is the intersection of all
these sets. Thus it is closed as well.
• If X, Y ∈ Sn+ and 0 ≤ λ ≤ 1, then λX + (1 − λ)Y ∈ Sn . Moreover, we have
for every x ∈ Rn that
xT (λX + (1 − λ)Y )x = λxT Xx + (1 − λ)xT Y x ≥ 0,
because λ ≥ 0, 1 − λ ≥ 0, and X, Y ∈ Sn+. This shows that λX + (1 − λ)Y ∈ Sn+ as well, proving that Sn+ is convex.
• If X ∈ Sn+ and λ ≥ 0, then λX ∈ Sn and xT (λX)x = λxT Xx ≥ 0 for all
x ∈ Rn , showing that λX ∈ Sn+ . Thus Sn+ is a cone.
• Assume that X ∈ Sn++. Then λn(X) > 0, where λn(X) denotes the smallest eigenvalue of X. Now let Y ∈ Sn satisfy ‖Y − X‖₂ < λn(X). Then for every x ∈ Rn with x ≠ 0 the inequality
\[
x^T Y x = x^T X x + x^T (Y - X) x \geq \lambda_n(X) \|x\|^2 - \|Y - X\|_2 \|x\|^2 > 0
\]
is satisfied, so Y ∈ Sn++ ⊆ Sn+. This proves that Sn++ is contained in the interior of Sn+.
On the other hand, assume that X ∈ Sn+ \ Sn++. Then we can write X as X = U^T Diag(λ(X))U with λn(X) = 0. Now define Xk, k ∈ N, as Xk = U^T Diag(λ(X) − (1/k)e)U, where e is the all-ones vector. Then Xk → X, but λn(Xk) = −1/k < 0, showing that X is not contained in the interior of Sn+. Thus the interior of Sn+ is indeed exactly equal to Sn++.
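Both halves of this interior argument can be checked numerically. The following is a minimal sketch (my addition, assuming NumPy; the test matrices and seed are arbitrary): a symmetric perturbation of operator norm less than λn(X) keeps a positive definite matrix positive definite, while shifting the spectrum of a boundary matrix by −1/k produces matrices outside Sn+ that converge to it.

```python
import numpy as np

rng = np.random.default_rng(0)

# A positive definite X: any symmetric Y with ||Y - X||_2 < lambda_min(X)
# is still positive definite.
A = rng.standard_normal((4, 4))
X = A @ A.T + np.eye(4)                 # symmetric positive definite
lam_min = np.linalg.eigvalsh(X)[0]      # smallest eigenvalue, > 0

E = rng.standard_normal((4, 4))
E = (E + E.T) / 2
E *= 0.9 * lam_min / np.linalg.norm(E, 2)   # now ||E||_2 < lambda_min(X)
assert np.linalg.eigvalsh(X + E)[0] > 0     # perturbation stays in S^n_++

# A boundary point (lambda_min = 0): subtracting 1/k from every eigenvalue
# gives matrices outside S^n_+ converging to it, so it is not interior.
B = np.diag([2.0, 1.0, 0.0])
for k in (1, 10, 100):
    Bk = B - np.eye(3) / k
    assert np.linalg.eigvalsh(Bk)[0] < 0
```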
2. Explain why S2+ is not a polyhedron.
Write a matrix X ∈ S2 as
\[
X = \begin{pmatrix} a & c \\ c & b \end{pmatrix}.
\]
Then X ∈ S2+ if and only if a ≥ 0, b ≥ 0, and det(X) = ab − c^2 ≥ 0. In particular, the intersection of S2+ with the hyperplane given by the equation c = 1 consists of all a, b ≥ 0 satisfying ab ≥ 1. This set has a strictly curved boundary, so it is not a polyhedron in R2; since the intersection of a polyhedron with an affine subspace is again a polyhedron, S2+ cannot be a polyhedron either.
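The strict curvature of the boundary can be illustrated directly (my addition; plain Python, no dependencies): the midpoint of any two distinct boundary points (t, 1/t) of the slice lies strictly inside the region ab ≥ 1, so the boundary contains no line segment, whereas a polyhedron's boundary is piecewise linear.

```python
def midpoint_product(t1, t2):
    # Midpoint of the boundary points (t1, 1/t1) and (t2, 1/t2)
    # of the curve ab = 1; returns the product a*b at the midpoint.
    a = (t1 + t2) / 2
    b = (1 / t1 + 1 / t2) / 2
    return a * b

# For distinct boundary points the midpoint is strictly inside (ab > 1):
for t1, t2 in [(0.5, 2.0), (1.0, 3.0), (0.1, 0.2)]:
    assert midpoint_product(t1, t2) > 1
```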
7. The Fan and Cauchy–Schwarz inequalities.
(a) For any matrices X in Sn and U in On, prove ‖U^T XU‖ = ‖X‖.
(b) Prove the function λ is norm-preserving.
(c) Explain why Fan’s inequality is a refinement of the Cauchy–Schwarz inequality.
(a) We have
\[
\|U^T X U\|^2 = \mathrm{tr}(U^T X U U^T X U) = \mathrm{tr}(U^T X X U) = \mathrm{tr}(U U^T X X) = \mathrm{tr}(X X) = \|X\|^2.
\]
Date: October 5, 2015.
(b) Write X ∈ Sn as X = U^T Diag(λ(X))U with U ∈ On. Then, using part (a),
\[
\|X\|^2 = \|U^T \mathrm{Diag}(\lambda(X)) U\|^2 = \|\mathrm{Diag}(\lambda(X))\|^2 = \mathrm{tr}(\mathrm{Diag}(\lambda(X))^2) = \sum_k \lambda_k(X)^2 = \|\lambda(X)\|^2.
\]
(c) Given X, Y ∈ Sn, the Cauchy–Schwarz inequality (for symmetric matrices with the inner product ⟨X, Y⟩ = tr(XY)) reads as
\[
\langle X, Y \rangle \leq \|X\| \|Y\|.
\]
On the other hand, we have
\[
\langle X, Y \rangle \;\leq\; \lambda(X)^T \lambda(Y) \;\leq\; \|\lambda(X)\| \, \|\lambda(Y)\| \;=\; \|X\| \|Y\|,
\]
where the first inequality is Fan’s inequality, the second is the Cauchy–Schwarz inequality in Rn, and the equality follows from part (b).
Thus Fan’s inequality refines the Cauchy–Schwarz inequality in the sense that it adds an intermediate estimate to this inequality.
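The chain of relations in (a)–(c) lends itself to a numerical spot check. This is a minimal sketch (my addition, assuming NumPy; the dimension and seed are arbitrary) on random symmetric X, Y and a random orthogonal U.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
X = rng.standard_normal((n, n)); X = (X + X.T) / 2   # random symmetric
Y = rng.standard_normal((n, n)); Y = (Y + Y.T) / 2
U, _ = np.linalg.qr(rng.standard_normal((n, n)))     # random orthogonal

def fro(M):
    # Frobenius norm, the norm used throughout this exercise
    return np.linalg.norm(M, "fro")

def lam(M):
    # eigenvalues in decreasing order
    return np.sort(np.linalg.eigvalsh(M))[::-1]

assert np.isclose(fro(U.T @ X @ U), fro(X))          # part (a)
assert np.isclose(np.linalg.norm(lam(X)), fro(X))    # part (b)
assert np.trace(X @ Y) <= lam(X) @ lam(Y) + 1e-9     # Fan's inequality
assert lam(X) @ lam(Y) <= fro(X) * fro(Y) + 1e-9     # Cauchy-Schwarz in R^n
```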
11. For a fixed column vector s in Rn, define a linear map A : Sn → Rn by setting AX = Xs for any matrix X in Sn. Calculate the adjoint map A∗.
Recall that the adjoint A∗ of A is the unique linear map A∗ : Rn → Sn satisfying
\[
\langle A^* x, X \rangle_{S^n} = \langle x, A X \rangle_{\mathbf{R}^n}
\]
for all x ∈ Rn and X ∈ Sn. In the case of this particular mapping A, this means that A∗ is defined by the equation
\[
\mathrm{tr}((A^* x) X) = x^T X s
\]
for all x ∈ Rn and X ∈ Sn . Now note that, given two vectors a, b ∈ Rn ∼ Rn×1 ,
we can write
\[
a^T b = \tfrac{1}{2} \, \mathrm{tr}(a b^T + b a^T).
\]
Thus, using the symmetry of X, we obtain that
\[
\langle A^* x, X \rangle_{S^n} = \mathrm{tr}((A^* x) X) = x^T X s
= \tfrac{1}{2} \mathrm{tr}\bigl(x (X s)^T + (X s) x^T\bigr)
= \tfrac{1}{2} \mathrm{tr}\bigl((x s^T) X^T + X (s x^T)\bigr)
= \tfrac{1}{2} \mathrm{tr}\bigl((x s^T) X + (s x^T) X\bigr)
= \mathrm{tr}\Bigl(\tfrac{1}{2}(x s^T + s x^T) X\Bigr).
\]
This shows that
\[
A^* x = \tfrac{1}{2}(x s^T + s x^T)
\]
for every x ∈ Rn.
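The computed adjoint can be verified against its defining identity. This is a minimal sketch (my addition, assuming NumPy; the vector s, the test data, and the seed are arbitrary).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
s = rng.standard_normal(n)          # the fixed column vector s

def A(X):
    # the linear map A X = X s
    return X @ s

def A_star(x):
    # the adjoint derived above: A* x = (x s^T + s x^T) / 2
    return (np.outer(x, s) + np.outer(s, x)) / 2

x = rng.standard_normal(n)
M = rng.standard_normal((n, n))
X = (M + M.T) / 2                   # a random symmetric test matrix

# defining identity of the adjoint: tr((A* x) X) = <x, A X> = x^T X s
assert np.isclose(np.trace(A_star(x) @ X), x @ A(X))
# A* x must land in S^n, i.e. be symmetric
assert np.allclose(A_star(x), A_star(x).T)
```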
12∗ (Fan’s inequality) For vectors x and y in Rn and a matrix U in On, define
\[
\alpha = \langle \mathrm{Diag}\, x, \, U^T (\mathrm{Diag}\, y) U \rangle.
\]
(a) Prove α = xT Zy for some doubly stochastic matrix Z.
(b) Use Birkhoff’s theorem and Proposition 1.2.4 to deduce the inequality
α ≤ [x]T [y].
(c) Deduce Fan’s inequality (1.2.2).
(a) We can write
\[
\langle \mathrm{Diag}\, x, U^T (\mathrm{Diag}\, y) U \rangle = \mathrm{tr}\bigl((\mathrm{Diag}\, x) U^T (\mathrm{Diag}\, y) U\bigr) = \sum_i x_i \bigl(U^T (\mathrm{Diag}\, y) U\bigr)_{ii}.
\]
Moreover,
\[
\bigl(U^T (\mathrm{Diag}\, y) U\bigr)_{ii} = \sum_k (U^T)_{ik} \bigl((\mathrm{Diag}\, y) U\bigr)_{ki} = \sum_k U_{ki} y_k U_{ki} = \sum_k y_k (U_{ki})^2.
\]
Thus
\[
\langle \mathrm{Diag}\, x, U^T (\mathrm{Diag}\, y) U \rangle = \sum_{i,k} x_i (U_{ki})^2 y_k.
\]
Defining Z ∈ Rn×n by Z_{ik} = (U_{ki})^2, we see that
\[
\langle \mathrm{Diag}\, x, U^T (\mathrm{Diag}\, y) U \rangle = x^T Z y.
\]
Moreover, for every i we have
\[
\sum_k Z_{ik} = \sum_k (U_{ki})^2 = 1
\quad\text{and}\quad
\sum_k Z_{ki} = \sum_k (U_{ik})^2 = 1
\]
because U is orthogonal. Thus Z is doubly stochastic.
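This construction is easy to check numerically. The following is a minimal sketch (my addition, assuming NumPy; the orthogonal matrix and seed are arbitrary) verifying that Z is doubly stochastic and that α = x^T Z y.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
U, _ = np.linalg.qr(rng.standard_normal((n, n)))   # a random orthogonal U
x, y = rng.standard_normal(n), rng.standard_normal(n)

Z = U.T ** 2            # Z_ik = (U_ki)^2, the squared entries of U^T

# doubly stochastic: nonnegative entries, all row and column sums equal 1
assert (Z >= 0).all()
assert np.allclose(Z.sum(axis=0), 1) and np.allclose(Z.sum(axis=1), 1)

# alpha = <Diag x, U^T (Diag y) U> equals x^T Z y
alpha = np.trace(np.diag(x) @ U.T @ np.diag(y) @ U)
assert np.isclose(alpha, x @ Z @ y)
```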
(b) Using Birkhoff’s theorem, we see that we can write
\[
Z = \sum_j \lambda_j P_j
\]
for some permutation matrices Pj ∈ Rn×n and 0 ≤ λj ≤ 1 satisfying Σj λj = 1. Note here that [Pj y] = [y] because Pj y is only a permutation of y. Thus, using the Hardy–Littlewood–Pólya theorem, we see that
\[
\alpha = x^T Z y = \sum_j \lambda_j x^T (P_j y) \leq \sum_j \lambda_j [x]^T [P_j y] = \sum_j \lambda_j [x]^T [y] = [x]^T [y].
\]
(c) Write X = V1^T (Diag λ(X)) V1 and Y = V2^T (Diag λ(Y)) V2. Then, applying part (b) with x = λ(X), y = λ(Y), and U = V2 V1^T, we see that
\[
\mathrm{tr}(XY) = \mathrm{tr}\bigl(V_1^T (\mathrm{Diag}\, \lambda(X)) V_1 \, V_2^T (\mathrm{Diag}\, \lambda(Y)) V_2\bigr)
= \mathrm{tr}\bigl((\mathrm{Diag}\, \lambda(X)) V_1 V_2^T (\mathrm{Diag}\, \lambda(Y)) V_2 V_1^T\bigr)
\leq [\lambda(X)]^T [\lambda(Y)] = \lambda(X)^T \lambda(Y),
\]
because the vectors λ(X) and λ(Y) are already arranged in decreasing order, so that [λ(X)] = λ(X) and [λ(Y)] = λ(Y).
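The conclusion of part (b), α ≤ [x]^T [y], can also be spot-checked directly. This is a minimal sketch (my addition, assuming NumPy; the data and seed are arbitrary).

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
U, _ = np.linalg.qr(rng.standard_normal((n, n)))   # a random orthogonal U
x, y = rng.standard_normal(n), rng.standard_normal(n)

def dec(v):
    # [v]: the components of v rearranged into decreasing order
    return np.sort(v)[::-1]

# alpha = <Diag x, U^T (Diag y) U>, bounded above by [x]^T [y]
alpha = np.trace(np.diag(x) @ U.T @ np.diag(y) @ U)
assert alpha <= dec(x) @ dec(y) + 1e-9
```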