A LITTLE TASTE OF SYMPLECTIC GEOMETRY:
THE SCHUR-HORN THEOREM
TIMOTHY E. GOLDBERG
ABSTRACT. This is a handout for a talk given at Bard College on Tuesday, 1 May
2007 by the author. It gives careful versions of some of the basic definitions from
symplectic geometry, describes the Atiyah/Guillemin-Sternberg and Schur-Horn
theorems, and gives an elementary proof of the latter theorem for the case n = 2.
CONTENTS
1. Statement of the Schur-Horn theorem
2. Careful definitions
3. Symplectic interpretation of the Schur-Horn theorem
4. Proof of the Schur-Horn theorem for n = 2
Section 2 is somewhat technical, but an attempt has been made to make the rest
of this document accessible to a wider audience of math students.
1. STATEMENT OF THE SCHUR-HORN THEOREM
Definition 1.1. Let V be a finite-dimensional real vector space. A subset C ⊂ V
is convex if the line segment between any two points in C is itself contained in C.
For any subset X ⊂ V, the convex hull of X is the minimal convex set C in V that
contains X. In this case we say that C is the convex set generated by X. A subset
P ⊂ V is a convex polytope if it is the convex hull of a finite set of points.
△
Let H(n) be the set of n × n Hermitian matrices, i.e. n × n complex matrices A
such that A∗ = A, where A∗ := Āᵀ is the adjoint of A. Recall from linear algebra
that every Hermitian matrix:
• is diagonalizable;
• has all real eigenvalues;
• has all real diagonal entries.
Let λ = (λ1 , . . . , λn ) be a list of n real numbers, and let Oλ be the subset of H(n)
consisting of Hermitian matrices whose eigenvalues are given by λ. Such a subset
is sometimes called an isospectral set.
Theorem 1.2 (Schur-Horn). Let f : Oλ → Rⁿ be defined by
\[
f(A) = \begin{pmatrix} a_{11} \\ \vdots \\ a_{nn} \end{pmatrix},
\]
where a11 , . . . , ann are the diagonal entries of A. Then f(Oλ ) is the convex polytope
generated by the vectors in Rⁿ whose entries are precisely λ1 , . . . , λn , written in some
order.
Notice that the points that generate the convex polytope all satisfy the equation
\[
x_1 + \dots + x_n = \sum_{i=1}^{n} \lambda_i,
\]
so the polytope itself lies in the hyperplane of Rn corresponding to this equation.
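Example 1.3. Let λ = (3, 2, 1). If we take every Hermitian matrix with these
eigenvalues and record its diagonal entries as a point of R³, the resulting points
fill out the hexagon whose vertices are the six points
\[
\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix},\quad
\begin{pmatrix} 1 \\ 3 \\ 2 \end{pmatrix},\quad
\begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix},\quad
\begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix},\quad
\begin{pmatrix} 3 \\ 1 \\ 2 \end{pmatrix},\quad
\begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix}.
\]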
(See Figure 1.) This hexagon lies in the plane x + y + z = 6.
♦
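Example 1.3 can be probed numerically. The sketch below is only an illustration under stated assumptions: it samples matrices in Oλ as unitary conjugates of diag(3, 2, 1), and it tests membership in the hexagon using the standard majorization description of the convex hull of the permutations of λ (Rado's theorem), which this handout does not state. The function names are made up for this example.

```python
import random

def random_unitary_rows(n, rng):
    """Return n orthonormal rows in C^n (the rows of a unitary matrix) via Gram-Schmidt."""
    rows = []
    while len(rows) < n:
        v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
        for u in rows:
            # Remove the component of v along the unit vector u.
            coeff = sum(uk.conjugate() * vk for uk, vk in zip(u, v))
            v = [vk - coeff * uk for uk, vk in zip(u, v)]
        norm = sum(abs(vk) ** 2 for vk in v) ** 0.5
        if norm > 1e-8:
            rows.append([vk / norm for vk in v])
    return rows

def diagonal_of_conjugate(rows, lam):
    """Diagonal of g * diag(lam) * g^{-1}: entry j is sum_k lam_k * |g_jk|^2."""
    return [sum(l * abs(gjk) ** 2 for l, gjk in zip(lam, row)) for row in rows]

def in_permutation_polytope(x, lam, tol=1e-9):
    """Majorization test: sorted partial sums of x never exceed those of lam,
    and the total sums agree."""
    xs = sorted(x, reverse=True)
    ls = sorted(lam, reverse=True)
    px = pl = 0.0
    for xi, li in zip(xs[:-1], ls[:-1]):
        px += xi
        pl += li
        if px > pl + tol:
            return False
    return abs(sum(xs) - sum(ls)) <= tol

rng = random.Random(0)
lam = (3.0, 2.0, 1.0)
samples = [diagonal_of_conjugate(random_unitary_rows(3, rng), lam) for _ in range(200)]
all_inside = all(in_permutation_polytope(x, lam) for x in samples)
```

Sampling of this kind only tests the containment of f(Oλ) in the hexagon; it does not show that every point of the hexagon is attained.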
This theorem is attributed to I. Schur and A. Horn. In 1970, B. Kostant showed
that this is a special case of a more general theorem about compact Lie groups. In
1982, M. Atiyah and, independently, V. Guillemin and S. Sternberg showed that
Kostant’s result is a special case of a still more general theorem from symplectic
geometry involving Hamiltonian manifolds.
FIGURE 1. The hexagon from Example 1.3
2. CAREFUL DEFINITIONS
Let M be a smooth manifold, and let Diff(M) denote the set of diffeomorphisms
M → M. This forms a group under function composition.
Definition 2.1. An action of a Lie group G on M is a group homomorphism
A : G → Diff(M). This action is called smooth if the associated map G × M →
M, (g, p) ↦ A(g)(p) is smooth. We use the notation A(g)(p) = g · p.
△
Definition 2.2. A symplectic form on M is a differential two-form ω which is
closed and nondegenerate, i.e. the exterior derivative of ω vanishes (dω = 0), and
for each p ∈ M, the bilinear form ωp on the tangent space Tp M is nondegenerate.
(This can be thought of as a collection of nondegenerate, skew-symmetric, bilinear
forms on each tangent space of M, subject to certain conditions.) The pair (M, ω)
is called a symplectic manifold.
Given a smooth function f : M → R, the symplectic gradient of f is the unique
vector field ∇ω f such that, for all p ∈ M and \(\vec{v}_p \in T_p M\) we have
\[
D_{\vec{v}_p} f = \omega_p\bigl(\vec{v}_p, \nabla_\omega f(p)\bigr),
\]
where \(D_{\vec{v}_p}\) is the directional derivative at p in the direction \(\vec{v}_p\).
△
A smooth action of G on M induces a map g → Vec(M), ξ ↦ ξM , where g = Te G
is the Lie algebra of G and Vec(M) is the set of smooth tangent vector fields on M.
Intuitively, this is because the identity e of G corresponds to the identity map on
M, so an infinitesimal displacement from e in G corresponds to an infinitesimal
displacement from each element of M. An infinitesimal displacement is essentially just a tangent vector. The vector field ξM is often called the fundamental
vector field corresponding to ξ.
Definition 2.3. A moment map for the action of G on (M, ω) is a function Φ : M →
g∗ such that Φ is G-equivariant and, for all ξ ∈ g,
∇ω (Φξ ) = ξM ,
where Φξ : M → R is defined by Φξ (p) = Φ(p)(ξ) for all p ∈ M. By G-equivariant,
we mean that
Φ(g · p) = g · Φ(p)
for all g ∈ G and p ∈ M, where g · Φ(p) is an instance of the coadjoint action
of G on the dual of its Lie algebra, g∗ . If a moment map for the action of G on
M exists, then the action is called Hamiltonian, and M is called a Hamiltonian
G-manifold.
△
Example 2.4. Every Lie group G acts on its Lie algebra g by the adjoint action, and this
action induces an action on the dual of the Lie algebra, g∗ , called the coadjoint action.
The coadjoint orbit of G through the
element λ ∈ g∗ is the subset Oλ ⊂ g∗ defined by
Oλ = {g · λ | g ∈ G}.
It is known that every coadjoint orbit can be given the structure of a symplectic manifold, canonically, and of course G acts on it by the coadjoint action. This
action of G on Oλ is actually Hamiltonian, with a moment map given by the inclusion Oλ ↪ g∗ .
♦
Definition 2.5. A torus is a compact, connected, abelian Lie group. Equivalently
(although not obviously nor trivially), a torus is a Lie group T which is isomorphic
to a product S1 ×. . .×S1 of the circle group with itself some number of times. Here,
the isomorphism is required to be a smooth group isomorphism.
△
Theorem 2.6 (Atiyah/Guillemin-Sternberg). Let T be a torus, and let M be a compact
and connected Hamiltonian T -manifold with moment map Φ : M → t∗ , where t is the Lie
algebra of T . Then Φ(M) is a convex polytope generated by the image under Φ of the fixed
point set M^T of T :
M^T = {p ∈ M | t · p = p for all t ∈ T }.
This theorem was considerably strengthened to apply to nonabelian Lie groups
by F. Kirwan, and these convexity theorems have since been generalized to many,
many, many different contexts. (For instance, the author has been working to apply this to real Lagrangian subsets of invariant subvarieties of Hamiltonian Kähler
manifolds, whatever all of that means.)
3. SYMPLECTIC INTERPRETATION OF THE SCHUR-HORN THEOREM
Let U(n) denote the n×n unitary matrices, i.e. the set of n×n complex matrices
a such that a∗ = a⁻¹. This is a compact connected Lie group of dimension n². Its
Lie algebra is denoted u(n), and is the set of skew-Hermitian matrices (X∗ = −X).
The adjoint action of U(n) on its Lie algebra u(n) is simply conjugation:
g · X = gXg⁻¹
for all g ∈ U(n) and X ∈ u(n), where the multiplication on the right hand side is
matrix multiplication.
The set H(n) of Hermitian matrices can be identified with the dual space u(n)∗
via the linear isomorphism L : H(n) → u(n)∗ defined by L(A)(X) = Tr(iAX), for
all A ∈ H(n) and X ∈ u(n). This isomorphism is actually U(n)-equivariant, so the
coadjoint action of U(n) on u(n)∗ induces an action of U(n) on H(n), which just
turns out to be conjugation.
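The equivariance claim can be verified directly from the cyclic property of the trace. The following is a short check, assuming the usual convention (g · φ)(X) = φ(g⁻¹Xg) for the coadjoint action, which this handout does not spell out:

```latex
\begin{align*}
L(gAg^{-1})(X) &= \operatorname{Tr}\bigl(i\,gAg^{-1}X\bigr) \\
               &= \operatorname{Tr}\bigl(i\,A\,g^{-1}Xg\bigr) \\
               &= L(A)\bigl(g^{-1}Xg\bigr) \\
               &= \bigl(g \cdot L(A)\bigr)(X).
\end{align*}
```

A similar one-line computation, taking the complex conjugate of Tr(iAX) and using A∗ = A and X∗ = −X, shows that L(A)(X) is a real number, as it must be for L(A) to lie in u(n)∗.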
Let λ = (λ1 , . . . , λn ) be a list of n real numbers, and let Aλ denote the diagonal
matrix with entries given by λ. Notice that Aλ is a Hermitian matrix. We have the
following very powerful fact from linear algebra.
Fact. Conjugation by unitary matrices is a transitive action on each isospectral set of
Hermitian matrices.
If you unpack this fact, you obtain the following.
• For all g ∈ U(n) and A ∈ Oλ , we have gAg⁻¹ ∈ Oλ .
• For all A ∈ Oλ , there exists g ∈ U(n) such that gAg⁻¹ = Aλ .
• For all A, B ∈ Oλ , there exists g ∈ U(n) such that gAg⁻¹ = B.
Therefore the coadjoint orbit OAλ of U(n) through Aλ is equal to the isospectral
set Oλ of Hermitian matrices with eigenvalues given by λ. Hence Oλ is a Hamiltonian U(n)-manifold.
There is a very nice torus T sitting inside U(n) as a subgroup. Let T be the set
of diagonal complex matrices
\[
\begin{pmatrix} z_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & z_n \end{pmatrix}
\]
with ‖z1‖ = . . . = ‖zn‖ = 1. The
Lie algebra t of T is the set of diagonal matrices with pure imaginary entries, and
its dual t∗ can be identified with the space of diagonal matrices with real entries,
via the linear isomorphism L : H(n) → u(n)∗ defined above. The inclusion map
t ↪ u(n) induces a map Φ : H(n) ≅ u(n)∗ → t∗ which projects a matrix A to the
diagonal matrix with the same diagonal entries as A. The isospectral set Oλ is a
Hamiltonian T -manifold, and the restriction of Φ to this set is a moment map for
the action. If we identify each diagonal matrix with its diagonal (thought of as a vector
in Rⁿ ), then this moment map is exactly the function f : Oλ → Rⁿ from the Schur-Horn theorem.
The Atiyah/Guillemin-Sternberg theorem tells us that f(Oλ ) is a convex polytope. Because T is a group of diagonal matrices and T acts on Oλ by conjugation, it
can be shown that a matrix A ∈ H(n) is fixed under conjugation by every element
of T if and only if A is diagonal. Because the eigenvalues of a diagonal matrix
are simply its diagonal entries, the only diagonal matrices in Oλ are those with diagonal entries λ1 , . . . , λn , in some order. Therefore these are exactly the elements
of the fixed point set Oλ^T , so the diagonals of these matrices generate the convex
polytope f(Oλ ). This is exactly the statement of the Schur-Horn theorem.
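The fixed-point claim used above can be checked entrywise: conjugating A by t = diag(z1, . . . , zn) multiplies the (j, k) entry of A by zj times the conjugate of zk, so the diagonal never moves and an off-diagonal entry is fixed by every t only if it is zero. A minimal sketch in plain Python, where the helper name conjugate_by_torus and the sample matrices are made up for this illustration:

```python
import cmath

def conjugate_by_torus(A, angles):
    """Return t A t^{-1} for t = diag(exp(i*angle_1), ..., exp(i*angle_n)).

    For a diagonal unitary t, the (j, k) entry of t A t^{-1} is
    z_j * a_jk * conj(z_k), because t^{-1} is the complex conjugate of t.
    """
    z = [cmath.exp(1j * th) for th in angles]
    n = len(A)
    return [[z[j] * A[j][k] * z[k].conjugate() for k in range(n)] for j in range(n)]

def close(P, Q, tol=1e-12):
    """Entrywise comparison of two square matrices up to a tolerance."""
    return all(abs(p - q) < tol for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

angles = [0.7, -1.1]

# A Hermitian matrix with a nonzero off-diagonal entry.
A = [[3.0, 1 + 2j], [1 - 2j, 1.0]]
B = conjugate_by_torus(A, angles)

# The diagonal is unchanged, so the diagonal map is constant on T-orbits.
diag_unchanged = all(abs(B[j][j] - A[j][j]) < 1e-12 for j in range(2))
# The off-diagonal entry is rotated, so A is not a fixed point of T.
A_moved = not close(A, B)

# A diagonal matrix is left fixed by this element of T.
D = [[3.0, 0.0], [0.0, 1.0]]
D_fixed = close(D, conjugate_by_torus(D, angles))
```

This checks a single element of T on 2 × 2 examples; it illustrates the entrywise formula rather than proving the general statement.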
4. PROOF OF THE SCHUR-HORN THEOREM FOR n = 2
An arbitrary 2 × 2 Hermitian matrix is of the form
\[
A = \begin{pmatrix} a & c + di \\ c - di & b \end{pmatrix}
\]
with a, b, c, d real. The
eigenvalues of A are the roots of its characteristic polynomial:
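\begin{align*}
\det\left( \begin{pmatrix} a & c + di \\ c - di & b \end{pmatrix} - x \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right)
  &= \det \begin{pmatrix} a - x & c + di \\ c - di & b - x \end{pmatrix} \\
  &= (a - x)(b - x) - (c + di)(c - di) \\
  &= x^2 - (a + b)x + (ab - c^2 - d^2).
\end{align*}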
Since the leading coefficient of this quadratic is 1, the sum of its roots is (a + b)
and the product of its roots is (ab − c² − d²).
Let λ = (λ1 , λ2 ) be a pair of real numbers. If A ∈ Oλ then
λ1 + λ2 = a + b
and
λ1 λ2 = ab − c² − d².
Since c² + d² ≥ 0, the second equation gives ab = λ1 λ2 + c² + d² ≥ λ1 λ2 .
Hence a + b = λ1 + λ2 and ab ≥ λ1 λ2 . Define α, β : R² → R by α(x, y) = x + y and
β(x, y) = xy for (x, y) ∈ R². Then our equations can be expressed as
α(a, b) = α(λ1 , λ2 ) and
β(a, b) ≥ β(λ1 , λ2 ).
Note that (λ1 , λ2 ) is contained in the following intersection of solution sets
{(x, y) ∈ R² | α(x, y) = λ1 + λ2 } ∩ {(x, y) ∈ R² | β(x, y) = λ1 λ2 },
and (a, b) is contained in the intersection
{(x, y) ∈ R² | α(x, y) = λ1 + λ2 } ∩ {(x, y) ∈ R² | β(x, y) ≥ λ1 λ2 }.
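Substituting y = λ1 + λ2 − x into the inequality xy ≥ λ1 λ2 gives (x − λ1 )(x − λ2 ) ≤ 0, which holds exactly when x lies between λ1 and λ2 . So the latter intersection is exactly the line segment between the points (λ1 , λ2 ) and (λ2 , λ1 ), as can also be seen by plotting these curves. (Figure 2 shows this plot under the
assumption 0 ≤ λ1 ≤ λ2 . For each other case there is an entirely analogous picture.) Of course, this line segment is exactly the convex hull of these two points.
Hence f(Oλ ) is contained in this convex hull. Conversely, if (a, b) lies on the segment then ab ≥ λ1 λ2 , so we may choose real numbers c and d with c² + d² = ab − λ1 λ2 . The resulting Hermitian matrix has trace λ1 + λ2 and determinant λ1 λ2 , hence eigenvalues λ1 and λ2 , and its diagonal is (a, b).
This proves the Schur-Horn theorem for n = 2.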
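The n = 2 case is small enough to check by direct computation. The sketch below is an illustration only, with several assumptions: it restricts to real symmetric matrices (d = 0), samples Oλ by conjugating diag(λ1, λ2) with rotations, and uses the sample values λ = (1, 4); all function names are made up for this example.

```python
import math

def diagonal_from_angle(lam1, lam2, theta):
    """Diagonal (a, b) of g * diag(lam1, lam2) * g^{-1}, g the rotation by theta."""
    c2 = math.cos(theta) ** 2
    s2 = math.sin(theta) ** 2
    return lam1 * c2 + lam2 * s2, lam1 * s2 + lam2 * c2

def eigenvalues_2x2(a, b, c, d):
    """Roots of x^2 - (a + b)x + (ab - c^2 - d^2), the characteristic polynomial."""
    trace = a + b
    det = a * b - c * c - d * d
    root = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    return (trace - root) / 2.0, (trace + root) / 2.0

def hermitian_with_diagonal(lam1, lam2, a):
    """For a between lam1 and lam2, return (a, b, c, d) with d = 0 and
    c^2 = ab - lam1*lam2, so the matrix has eigenvalues lam1 and lam2."""
    b = lam1 + lam2 - a
    c = math.sqrt(max(a * b - lam1 * lam2, 0.0))
    return a, b, c, 0.0

lam1, lam2 = 1.0, 4.0
tol = 1e-9

# Forward direction: every sampled diagonal satisfies the two constraints
# and therefore lies on the segment between (lam1, lam2) and (lam2, lam1).
forward_ok = True
for k in range(100):
    a, b = diagonal_from_angle(lam1, lam2, 2.0 * math.pi * k / 100)
    forward_ok &= abs((a + b) - (lam1 + lam2)) < tol
    forward_ok &= a * b >= lam1 * lam2 - tol
    forward_ok &= lam1 - tol <= a <= lam2 + tol

# Converse direction: the midpoint (2.5, 2.5) of the segment is attained.
low, high = eigenvalues_2x2(*hermitian_with_diagonal(lam1, lam2, 2.5))
```

For the midpoint, the construction gives c = 1.5, and the recovered eigenvalues are 1 and 4, matching λ.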
FIGURE 2. The dark line segment must contain the point (a, b).
Remark 4.1. Note that if λ1 = λ2 , then the line segment is actually just a point. So
in this case, both diagonal entries of every matrix in Oλ are λ1 = λ2 .
♦
Timothy E. Goldberg
graduate student
Department of Mathematics
Cornell University
Ithaca, New York
[email protected]